I love tailscale, but the performance overhead on file transfer (my primary use case for it) is very real.
Samba transfers take a 15 megabyte per second hit over tailscale even with a fairly fast CPU on both ends (Ryzen 3600 and Ryzen 7900X3D) on my local network
Try netbird, it's the same idea but with support for using kernel-mode WireGuard when one of the peers is able to connect to another one directly without doing NAT tricks (so either both peers are on the same subnet, or at least one of them has a public IP).
WG is quite fast. It can't be the limiter. Like this guy, I've easily driven 1 G on a 7950 and an Epyc 9654. I think I did 10 G too, but I can't recall, because at some point I just moved everything local and did 40 G. But I'm sure it would work on CPU on a reasonable machine.
A very likely culprit is the packet encapsulation changing things for the worse. An informative test would be to tcpdump (wireshark, etc) the packet stream with and without tailscale. Look at packet sizes, etc.
The overhead shouldn't be 15% but there could be some weird interaction with the link MTU for the VPN causing, e.g., smaller packets to be sent with more overhead.
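To put a rough number on the encapsulation cost, here is a back-of-the-envelope sketch. It assumes an outer IPv4 path, a 1500-byte LAN MTU, a 1280-byte tunnel MTU (Tailscale's default, as far as I know), full-size TCP segments with no TCP options, and the usual 60 bytes of WireGuard overhead. The function name and constants are mine, purely for illustration.

```python
# Per-frame Ethernet cost on the wire: preamble 8 + header 14 + FCS 4 + inter-frame gap 12
ETHERNET_OVERHEAD = 38
# Inner headers: IPv4 20 + TCP 20 (assumes no TCP options such as timestamps)
IP_TCP_HEADERS = 40
# WireGuard over IPv4: outer IP 20 + UDP 8 + WireGuard data header 16 + auth tag 16
WIREGUARD_OVERHEAD = 60


def goodput_fraction(mtu, tunnel_overhead=0):
    """Fraction of wire bytes that carry application payload for full-size packets."""
    payload = mtu - IP_TCP_HEADERS
    wire_bytes = mtu + tunnel_overhead + ETHERNET_OVERHEAD
    return payload / wire_bytes


plain = goodput_fraction(1500)                          # no tunnel, LAN MTU
tunneled = goodput_fraction(1280, WIREGUARD_OVERHEAD)   # assumed tunnel MTU of 1280
relative_loss = 1 - tunneled / plain

print(f"plain:    {plain:.3f}")
print(f"tunneled: {tunneled:.3f}")
print(f"relative throughput loss: {relative_loss:.1%}")
```

Under those assumptions, encapsulation alone costs roughly 5% of throughput, so a larger gap would have to come from something else (fragmentation, smaller-than-expected packets, userspace processing, or a relay). A packet capture would show whether the real packet sizes match these assumptions.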
15 MiB/s is trivially handled by any CPU you're likely to run. Indeed, 100 MiB/s seems reasonable. A 15 MiB/s cap suggests that either the protocol being used is doing too many round trips (assuming the machines you're testing with are far apart) or the network that's being set up requires routing through Tailscale's infra for hole punching.
It sounds like the traffic gets routed through a Tailscale relay because all attempts at a direct connection failed. A direct connection would have been about as fast as not using Tailscale at all.
Ok, a 12% differential on a LAN is kind of surprising. I wonder what Tailscale could possibly be doing that would cause this, because aside from the control plane I don't believe they're in the data path all that much. Maybe WireGuard on Windows isn't as optimized as it is on Linux?
IME it adds about (at least) 1ms of latency over local networks. You should be able to use a different dns suffix to use the LAN interface instead of Tailscale.