A lot of this is down to configuration. For example, you say “Emacs TRAMP uses one ssh connection per command”, but that’s only true if you’re not using the ControlMaster SSH option. Add this to your ~/.ssh/config file:
ControlMaster auto
ControlPersist yes
ControlPath ~/.ssh/control/%C
Then run mkdir -p ~/.ssh/control/, if you haven’t already. Why this isn’t the default I don’t know, but once configured correctly you won’t have any problems from TRAMP.
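One way to confirm the shared connection is really in use is to ask ssh about the master process with its `-O check` control command. A minimal Python sketch, assuming ssh is on your PATH; the host alias `devbox` in the comment is made up, and the helper names are mine:

```python
import subprocess


def master_check_command(host: str) -> list[str]:
    """Build the ssh invocation that asks whether a master connection exists."""
    # "-O check" is ssh's control command for querying a running master.
    return ["ssh", "-O", "check", host]


def master_is_running(host: str) -> bool:
    """Return True if ssh reports a live master connection for this host."""
    result = subprocess.run(
        master_check_command(host),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode == 0


# Usage, with a hypothetical host alias:
#     master_is_running("devbox")
```

If this returns False right after you have connected once, the ControlPath directory is probably missing or the options are not being picked up for that host.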
I have this, yet I still see emacs sending individual I/O operations over SSH and blocking on their completion.
Everything that requires remote file system interaction is super slow. Emacs just seems to be doing lots of individual I/O operations.
VS Code has a remote server that batches these operations before sending updates over the network.
The client sends operations to the server and queries the server for updates asynchronously. The server can perform multiple operations, and batch them into one response.
Stuff like "regex on all files in this directory" is performed by Emacs as "list all files in directory, wait, for each file, regex that file, wait". VS Code just sends the "regex all files" request, and the server handles everything locally and sends one update back.
The difference is < 100ms for VS Code vs 5 seconds for Emacs. Night and day.
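To make the arithmetic behind that gap concrete, here is a rough latency model in Python. The numbers (100 files, 50 ms round trip) are assumptions picked for illustration, not measurements, and the model ignores the time the remote side spends doing the actual work:

```python
def per_operation_ms(n_files: int, rtt_ms: int) -> int:
    """One round trip to list the directory, then one per file searched."""
    return (1 + n_files) * rtt_ms


def batched_ms(rtt_ms: int) -> int:
    """One request out, one combined response back."""
    return rtt_ms


# Assumed values, for illustration only.
N_FILES = 100
RTT_MS = 50

print(per_operation_ms(N_FILES, RTT_MS))  # 5050
print(batched_ms(RTT_MS))                 # 50
```

Under those assumptions the blocking, one-at-a-time approach costs about 5 seconds while the batched request costs a single round trip, which is the same order of magnitude as the difference described above.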
The same happens for pretty much every modern feature (git status, diffs, blames and updates, autocompletion, correctness checks / intellisense, etc.).
This varies from command to command, but M-x grep literally just runs the program grep on the remote machine. It doesn’t enumerate the files and search each one individually. If you’re using something other than M-x grep, then sure, it might be written badly.
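As a rough picture of what "runs grep on the remote machine" means in terms of traffic, the whole search can be a single ssh invocation. This is only a sketch: the function name, the grep flags, and the host alias in the comment are my own choices, not what M-x grep or TRAMP actually emit.

```python
import shlex


def remote_grep_command(host: str, pattern: str, directory: str) -> list[str]:
    """Build one ssh command that runs a recursive grep entirely on the remote."""
    # ssh hands the remote command to a shell on the other side,
    # so the pattern and path are quoted before being joined.
    remote = shlex.join(["grep", "-rn", "-e", pattern, directory])
    return ["ssh", host, remote]


# Usage, with a hypothetical host alias:
#     remote_grep_command("devbox", "foo bar", "src")
```

The file enumeration and matching all happen on the remote side; only the matching lines come back over the connection.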
You might have already tried this: I’ve had fantastic performance starting a remote Emacs server and using Emacs (often in terminal mode) through ssh. It does have a one-time cost: you have to set up your local terminal so that all keys go through, and sync your init files. I still use TRAMP for certain rare cases, but most work happens on 3 remote machines and 1 local machine with four different Emacs servers running, plus a couple of additional non-development machines that I simply ssh into from shells inside Emacs.
> The same happens for pretty much every modern feature (git status, diffs, blames and updates, autocompletion, correctness checks / intellisense, etc.).
Could it be a configuration issue, at least in part? It does start to sound like it. For example, git status has never been slow for me in Emacs, except for huge diffs with lots of changes in lots of files. Same for diffs. Autocompletion depends on the language, the tools used for it, and which sources are checked for completion candidates. It is possible to limit autocompletion to only some sources, or to make it use a language server for some languages. When developing Rust, Python or TypeScript in Emacs, I did not experience slow correctness checks (I assume you mean type checks and unused-variable kind of stuff).
I searched a little and found the following blog post, which claims that connection sharing has issues under heavy data transfers, for example when running many rsyncs at the same time:
I suppose that’s a good point. With multiple large simultaneous flows, multiple TCP connections probably will be better, unless SSH goes to all of the trouble to reimplement all of TCP’s nicer features. And at that point you should just be using a real VPN anyway.
I wonder if ssh shouldn’t have a sensible default like ~/.ssh/control/%C for ControlPath, so that you could just turn on ControlMaster and have it just work. Then TRAMP could set the ControlMaster option on the command line when it runs ssh. At least then people wouldn’t have to mess with their SSH config, and they wouldn’t have to consider whether it will break something else.
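For what that could look like: the same three settings can be given per invocation with ssh's `-o` flag, so a caller would not need the user's config file touched. A sketch in Python, not how TRAMP actually builds its command line; the function name and the host alias in the comment are hypothetical:

```python
def ssh_with_connection_sharing(host: str) -> list[str]:
    """Build an ssh command line that enables connection sharing via -o flags."""
    # Same values as the ~/.ssh/config snippet earlier in the thread.
    options = {
        "ControlMaster": "auto",
        "ControlPersist": "yes",
        "ControlPath": "~/.ssh/control/%C",
    }
    argv = ["ssh"]
    for key, value in options.items():
        argv += ["-o", f"{key}={value}"]
    argv.append(host)
    return argv


# Usage, with a hypothetical host alias:
#     ssh_with_connection_sharing("devbox")
```

The ~/.ssh/control/ directory still has to exist, which is part of why a built-in default for ControlPath would be nicer than every caller passing its own.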