The general pattern is a fixed set of resources that are consumed/retired at a fixed maximum rate, where the optimal design consistently gets as close to the maximum rate as possible without exceeding the resource limit. There are (at least) two other places in software where backpressure is used, both commonly found in the database world:
Storage I/O scheduling, which unlike the network case is often not interrupt-driven. As you approach the system's IOPS limit due to high-priority tasks, the rate of IOPS scheduled for low-priority tasks, like background write-back in databases, is dynamically reduced to maintain headroom for the high-priority tasks. This is implemented as backpressure on the low-priority tasks.
In query processing, a query, as a user would understand it, can materialize upwards of millions of sub-operations on the same server that can run concurrently given adequate system resources. The number of sub-operations that can be in-flight and retired per second is approximately fixed and shared across all concurrent user queries. For large queries, you incrementally materialize these sub-operations at a rate based on the instantaneous capacity of the system to handle new sub-operations. This is backpressure based on execution-slot (and related memory) availability; both of these cases are sketched in code below.
Writing software for barrel processors takes this to the extreme. The entire software design principle is to consistently generate fine-grained concurrent threads of execution at runtime that stay close to the hardware concurrency limit globally (which is very high) but never exceed it. Highly optimized code quickly gets pretty weird, but it is basically all backpressure mechanics to maintain throughput.
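Two quick sketches of the storage and query cases above, in Go; every name and number here is invented for illustration rather than taken from a real system. First, the I/O scheduler: a fixed per-interval budget (`perTick`) where low-priority submissions start waiting before the budget is exhausted, which is exactly the headroom reserved for high-priority work:

```go
package main

import (
	"sync/atomic"
	"time"
)

const (
	perTick  = 1000        // I/Os allowed per 100 ms interval (assumed system ceiling)
	headroom = perTick / 5 // slice of each interval reserved for high-priority work
)

var issued atomic.Int64 // I/Os issued in the current interval

// submitHighPri issues an I/O as soon as the hard per-interval ceiling allows.
func submitHighPri(op func()) {
	for issued.Load() >= perTick {
		time.Sleep(time.Millisecond)
	}
	issued.Add(1)
	op()
}

// submitLowPri backs off earlier, so high-priority work always finds headroom.
// The waiting loop is the backpressure on background tasks.
func submitLowPri(op func()) {
	for issued.Load() >= perTick-headroom {
		time.Sleep(time.Millisecond)
	}
	issued.Add(1)
	op()
}

func main() {
	go func() { // reset the budget every interval
		for range time.Tick(100 * time.Millisecond) {
			issued.Store(0)
		}
	}()
	go func() { // background write-back: low priority, throttled automatically under load
		for {
			submitLowPri(func() { /* write one dirty page */ })
		}
	}()
	for i := 0; i < 5000; i++ { // foreground reads: high priority
		submitHighPri(func() { /* read one block */ })
	}
}
```

And the query-processing case: a server-wide pool of execution slots, modeled as a buffered channel, shared by all concurrent queries. Sub-operations are only materialized as fast as earlier ones retire and free slots; `executionSlots` and `runQuery` are again made-up names:

```go
package main

import "sync"

// executionSlots caps the number of in-flight sub-operations across *all*
// queries on the server (4096 is an arbitrary stand-in).
var executionSlots = make(chan struct{}, 4096)

// runQuery materializes sub-operations incrementally: the next one is only
// launched once a slot frees up, never by expanding the whole plan up front.
func runQuery(subOps <-chan func()) {
	var wg sync.WaitGroup
	for op := range subOps {
		executionSlots <- struct{}{} // blocks while the server is at capacity: backpressure
		wg.Add(1)
		go func(op func()) {
			defer wg.Done()
			defer func() { <-executionSlots }() // retiring the sub-operation frees its slot
			op()
		}(op)
	}
	wg.Wait()
}

func main() {
	subOps := make(chan func())
	go func() {
		for i := 0; i < 1_000_000; i++ {
			subOps <- func() {} // stand-in for a real sub-operation
		}
		close(subOps)
	}()
	runQuery(subOps)
}
```

In both sketches the backpressure "signal" is nothing more exotic than a blocked goroutine; the real design work is choosing the budget and the slot count.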
Here's something I do regularly: I have B bits of data, where B is multiple orders of magnitude larger than the RAM I have available. I can process chunks of B in parallel with a near linear improvement in throughput, but still at an overall lower rate than I can read it from storage.
In other words, I/O is faster than CPU for this task.
A naive design where I read the data as fast as possible on a dedicated thread, and dump it into an unbounded queue that thread-per-core workers consume, will quickly run out of memory.
By putting a limit on queue depth, the queue can communicate back to the storage reader that it can't accept more data. This is backpressure. The reader in turn can decide whether to wait, slow down, discard, etc., as appropriate for the use case.
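A minimal Go sketch of that bounded-queue design, assuming the data comes in through an `io.Reader` (the chunk size and queue depth are arbitrary). Here the reader's policy is simply to wait when the workers fall behind:

```go
package main

import (
	"io"
	"runtime"
	"strings"
	"sync"
)

func process(chunk []byte) { /* CPU-heavy work on one chunk */ }

func main() {
	// Assumption: in real code r would be a file or block device, not a string.
	var r io.Reader = strings.NewReader("stand-in for a very large input")

	chunks := make(chan []byte, 64) // bounded queue: at most 64 chunks waiting

	var wg sync.WaitGroup
	for i := 0; i < runtime.NumCPU(); i++ { // thread-per-core consumers
		wg.Add(1)
		go func() {
			defer wg.Done()
			for c := range chunks {
				process(c)
			}
		}()
	}

	// Dedicated reader: the send blocks when the queue is full, so memory
	// stays bounded by 64 chunks plus whatever the workers currently hold.
	for {
		buf := make([]byte, 1<<20) // 1 MiB chunks
		n, err := r.Read(buf)
		if n > 0 {
			chunks <- buf[:n]
		}
		if err != nil { // io.EOF (or a real error) ends the read loop
			break
		}
	}
	close(chunks)
	wg.Wait()
}
```

Replacing the blocking send with a `select` that has a `default` branch would turn the "wait" policy into "discard".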
You won't/can't have it in a direct function-call style of programming. E.g., if you have a control loop like "call A, pass result to B, pass result to C", then it's impossible for A to be "too fast."
Network calls are the biggest source of asynchronously queued execution, but you can find models where you have it on a local machine too with multiprocessing. A trivial, silly, non-network, single-machine example might be something like unpacking compressed files and then doing [thing] with their contents - maybe you have enough CPU to do them in parallel, but you don't want to blow up your disk by unpacking all of them with no throttling (there's a sketch of the throttled version below). Even in your lexer/parser example, if you wanted to parallelize those steps with a queue in between them, in theory you could have such a huge input that you ran out of memory... in practice, nah, that's not very likely the way you'd do it, or a problem you'd have.
Sometimes "just drop things" or "just make the slow part faster" still aren't really easy/feasible/acceptable even without distributed systems.
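Going back to the unpacking example, a toy Go sketch in which `unpack` and `doThing` are placeholders rather than a real archive API. A small counting semaphore (a buffered channel) caps how many archives are in flight, and it's the submission loop itself that gets slowed down:

```go
package main

import "sync"

func unpack(path string) string { return path + ".unpacked" } // placeholder
func doThing(dir string)        {}                            // placeholder

func main() {
	archives := []string{"a.tar.gz", "b.tar.gz", "c.tar.gz" /* ... */}

	limit := make(chan struct{}, 4) // at most 4 archives being unpacked at once
	var wg sync.WaitGroup
	for _, a := range archives {
		limit <- struct{}{} // the submission loop blocks here when 4 are in flight
		wg.Add(1)
		go func(a string) {
			defer wg.Done()
			defer func() { <-limit }()
			doThing(unpack(a))
		}(a)
	}
	wg.Wait()
}
```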
I dunno if I'd go as far as the linked article, though: "But other forms of backpressure can happen too: for example, if your software has to wait for the user to take some action."
Ehhh, I think labeling user input as backpressure, because the software is waiting for _input_, is somewhere between confusing and inaccurate. When I have seen backpressure discussed in my day job, it has always involved a (theoretical or real) slow consumer, and therefore some queue in front of that consumer. I agree that "networking" or not is irrelevant.
Unix pipes have built-in backpressure. A very simplified view, assuming blocking calls and single threads: if the process on the left side of the pipe produces data too fast and fills up the buffer, the operating system won't give it any additional CPU cycles until the program on the right side of the pipe has caught up with reading and freed up space in the buffer (see the first sketch below).
Things become more complicated when multi-threading and/or asynchronous IO gets involved. If the producer on the left side of the pipe has multiple threads, e.g., one writer thread and multiple worker threads producing data for the writer thread, then it has to invent its own signaling between the different threads to avoid throwing away data when its internal buffers fill up. This is effectively a kind of backpressure.
In your example with the lexer and the parser: if the lexer is multi-threaded, you either need an unlimited buffer for the tokens or some signaling for the threads to slow down when the parser cannot keep up. This example is a bit artificial, since most languages don't allow parallel lexers. But with parallel compilers and one (incremental) linker, this becomes more realistic.
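Two hedged Go sketches for this. First, the pipe case: an OS pipe where the writer races ahead until the kernel's pipe buffer (typically 64 KiB on Linux) is full, after which each write only completes once the slow reader frees space - the same blocking you'd observe from write(2) in C:

```go
package main

import (
	"fmt"
	"os"
	"time"
)

func main() {
	r, w, err := os.Pipe()
	if err != nil {
		panic(err)
	}

	go func() { // fast producer
		chunk := make([]byte, 4096)
		for i := 0; ; i++ {
			w.Write(chunk) // blocks once the kernel's pipe buffer is full
			fmt.Println("wrote chunk", i)
		}
	}()

	buf := make([]byte, 4096)
	for i := 0; i < 10; i++ { // deliberately slow consumer
		time.Sleep(time.Second)
		r.Read(buf) // each read frees space and lets one more write complete
	}
}
```

Second, the hypothetical parallel lexer: several lexer goroutines feed one parser through a bounded channel, and a full channel is the only "signaling" the threads need in order to slow down. `Token`, `lex`, and `parse` are invented stand-ins:

```go
package main

import "sync"

type Token struct{ Text string }

// lex trivially "tokenizes" its chunk rune by rune; the send blocks whenever
// the parser falls behind, which is all the flow control the lexers need.
func lex(chunk string, out chan<- Token) {
	for _, r := range chunk {
		out <- Token{Text: string(r)}
	}
}

// parse is the single slow consumer on the other end of the bounded channel.
func parse(in <-chan Token) {
	for range in {
		// build the AST, run semantic checks, ...
	}
}

func main() {
	tokens := make(chan Token, 1024) // the bounded buffer *is* the signaling

	chunks := []string{"func main() {", "fmt.Println(42)", "}"}
	var wg sync.WaitGroup
	for _, c := range chunks {
		wg.Add(1)
		go func(c string) { defer wg.Done(); lex(c, tokens) }(c)
	}
	go func() { wg.Wait(); close(tokens) }()

	parse(tokens)
}
```

(A real parser would also need the tokens back in source order, which is part of why parallel lexing is artificial for most languages.)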
Not a thing in software outside of pieces communicating over a network and not using a protocol with implicit flow control, like TCP.
E.g. words like "the lexer was producing tokens too fast, so the parser applied backpressure" have never been heard.