Thanks for the feedback - we're actually planning to open-source a version of our work that significantly improves on the original UDT project: https://udt.sourceforge.io/.
Look forward to it. I'd be interested to hear how your tool compares with Facebook WDT[1], as that would be my go-to right now if someone asked me for a fast point-to-point data transfer solution.
I don't have the expertise to determine whether it is "abandonware" or simply "done", I'm afraid. I found it a bit of a pain to use, which is why an alternative sounds interesting.
This article was interesting and also frustrating to read.
1. There are very few numbers. In particular, improvement in performance under various circumstances is _not_ given! If you dig around you can find their transfer time application [1], but there is no discussion on that page.
2. The basis for the improvement is not spelled out. (References are given, but you have to know the field - "acronyms only".) If I understand correctly, their contribution is the improved measures of congestion used. Their landing page just touts "don't use TCP"... which sounds like Step 0 of a very long process.
I admit, the title is basically accurate: "how to build" not "the performance of".
tl;dr: Start with existing work, simulate and improve incrementally.
I don't know anything about the field, but this article didn't lead me to understand any better. I'd love to know the real numbers they observed, which approaches didn't pan out, and whether they're effectively using an error correcting code.
Anyway, it's certainly not an academic paper - just an advertisement.
I just tried the calculator. It seems that if you're in "US Metro" or "Europe", then the transfer protocol is just as fast as TCP - is this correct? I wonder why this is the case. Is it because the routers play more fairly?
I would expect it means your service provider isn't dropping packets. Their protocol seems to just be more aggressive about not backing off in the face of packet loss, which is helpful if one of your links is a marginal radio connection.
The cynic in me thinks they achieve better throughput because they don't play nice with TCP and monopolize the link while everybody else gets backed off.
I'm inclined to agree. Perhaps there's a place for a "fast" FTP protocol but I wouldn't want to be sharing the link with them ... imagine the cries going around the office ... "has the internet gone down again!?"
For TCP transfers, I expect to see near 100% utilisation (if I'm not sharing bandwidth) but that's on wired (reliable) links.
Communication links are generally a shared resource (a common good), and the experience will degrade for everyone if individual transfers/users aren't backing off voluntarily.
You can do better than TCP and still "play nice". Just using up exactly the bandwidth TCP refuses to use, as FASP does by default, gets you much faster transmission.
Then let's not say it. Clearly you miss the point: any provisioned backbone bandwidth that does not carry a packet just goes to waste. Network backbone channels are not provisioned like your consumer-grade cable TV network endpoint hubs. Carriers pay for guaranteed capacity. When TCP fails to use bandwidth capacity, it is failing to use backbone bandwidth, not dodgy last-mile bandwidth.
Even more so: when so configured, FASP has exactly zero effect on the actual bit rate of the TCP connections. Another TCP connection would compete for the same packet slots, slowing them all down. The FASP traffic does not slow down any of the TCP connections; it only competes with other FASP (or similar) traffic.
And if you are fully saturating exactly the packet slots TCP would not try to use, then you are not introducing packet loss. That is just how FASP works.
Thanks for taking the time to read it! To address your concerns:
1. To give you an idea of the speed improvements: we transferred a 2GB file between Ohio and Singapore on AWS in 26 seconds using our protocol, vs 2 minutes 15 seconds for SCP.
2. The basis for improvement is taking into account changes in round-trip time for a particular network path; these temporary increases are used as the primary congestion signal.
We are not using error correcting codes, which are good for preventing the retransmission of packets but do not address the underlying problem of avoiding congestion in a network.
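To make that a bit more concrete, here is a rough sketch of the delay-based idea in Python (illustrative only, not our actual implementation - the class name, thresholds, and gains are made up):

    # Sketch: treat RTT growth above the observed baseline as the congestion
    # signal and adjust the sending rate, instead of waiting for packet loss.
    class DelayBasedRateController:
        def __init__(self, start_rate_mbps=100.0):
            self.rate = start_rate_mbps     # current sending rate in Mbps
            self.min_rtt = float("inf")     # lowest RTT seen = uncongested baseline

        def on_rtt_sample(self, rtt_ms):
            self.min_rtt = min(self.min_rtt, rtt_ms)
            queuing_delay = rtt_ms - self.min_rtt
            if queuing_delay > 5.0:         # queues building up: ease off
                self.rate *= 0.95
            else:                           # path looks clear: probe for more bandwidth
                self.rate *= 1.05
            return self.rate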
There is no meaningful difference; at the end of the day they'll both be fine for measuring the base case of single-threaded TCP connection performance.
I prefer rsync as well, but SCP works on more machines by default (no dependency on rsync being installed on the target host).
Hey HN! I'm Mahamad, co-founder of Tachyon Transfer, where we're building faster file transfer tools for developers. We've spent the last year building an ultra-fast FTP replacement, and we thought we'd show you guys what our technical process was like. Let me know if you have any questions!
Please show performance tests versus hpn-ssh, GridFTP (aka the de facto tool of the particle physics and genetics research communities), and simpler systems like wget2's multi-threaded mode.
Standard SSH uses TCP over port 22 by default, so it wouldn't be possible without modifying SSH to use a different protocol. That said, our protocol uses TLS over UDP via the OpenSSL libraries, so it is secure by default. We also offer a BSD-style socket interface that you can use if you want a drop-in replacement for TCP sockets. Shoot me a note at mahamad _at_ trytachyon _dot_ com if you want to chat!
What's the difference between "TLS over UDP" and QUIC? Did you compare your solution to some QUIC implementation?
Also, how was CPU usage? In my experience UDP-based stream protocols do significantly worse (I assume due to user-space scheduling and unoptimized hot paths). This may be particularly relevant for laptops and mobile.
At the moment we offer both options. We offer our own network with a pricing plan similar to massive.io (though 10c per GB vs 25c). Our licensing is cheaper but requires large volumes.
The article has essentially no technical information on how the protocol works, besides the unlabeled flow rate graphs which show TCP rate popping up and down and theirs more or less constant. Their rate appears much less stable than FASP's. OTOH, FASP is extremely expensive.
I have updated the Wikipedia page on the FASP protocol they compete with, to provide more detail on how it works. Theirs might work similarly.
Curiously, FASP usage only really took off after the product got a comprehensive GUI file management app, with scheduling and a speed control you could drag up and down. Being able to transfer files overseas several times as fast as the competition was not enough.
Hey, thanks for your comment - you're right, there aren't too many details on the protocol changes we made, partly because we will follow up on that in another post. This blog post was getting too long as it is, and we wanted to focus more on the need to simulate and test. Your comments on the Hacker News post you link to actually partially served to inspire us!
The original design for FASP's flow control was developed in China and applied in a router: you would have one at each end, and it would spoof TCP for clients. That was a huge flop, because it didn't say Cisco on the nameplate.
The Aspera principals realized they could implement it in user space using UDP, bypassing the network infrastructure purchasing cabal, and sell directly to the users who had data to move.
I always wanted to get the astronomy community using it (e.g. to Antarctica and Atacama, Chile) under a free license, but never quite got there.
> Building this infrastructure took a substantial amount of time..
> If anyone is interested in trying the Tachyon Transfer Algorithm we offer a storage transfer acceleration API like AWS does. Our SDK includes node, c++ and objc and could be used in a wide variety of applications
So it was a lot of effort, and now they're inviting Big-G and Cloudflare to contact them to possibly achieve a paltry 30%-ish speed increase for certain scenarios? Or are they inviting app devs who want faster video uploads to reach out? What is the actual use case where the sometimes 30% improvement matters and actually moves the needle?!
Why hasn't Tachyon been working with their prospective customers and warming them up the whole time, or at least working the social and investor nets and reaching out proactively already?
This strategy is kind of like being a dweeb at a poorly lit school dance and hoping the most popular girl at the dance somehow notices you're wearing shoes that let you float a centimeter in the air. Cool trick, bud.
Presumably it's not a $10/mo service contract. Is this really an effective strategy when building and selling to enterprise these days? To me it sounds like a risky and hard way to make less money than what is possible using tried and true product development strategies. To be fair, I have also made this mistake before. It was embarrassing enough as a solo-founder, and seems less forgivable with larger founding group sizes, because it means more folks agreed to support and follow such a sub-optimal harebrained scheme :)
You all sound like very capable software engineers, and I know it's both fun and satisfying to build and make The Thing.
Good luck, sincerely.
p.s. You may also consider pursuing some of the medium sized targets like Backblaze, Rackspace, Larry Ellison's Oracle OCI, or Microshaft Azure.
(sorry, I couldn't resist having some fun at the end, though the suggestion is serious!)
Always good to see more in this space! Long fat networks (LFNs or "elephants") are everywhere, especially once you start moving data between continents.
I've had success personally with UFTP, but you have to explicitly set the transmit rate. Don't forget to enable encryption/authentication if you want the downloads to be verified! You'll get silent UDP corruption otherwise: http://uftp-multicast.sourceforge.net/
I actually read the entire article and was specifically looking for a reference to hpn-ssh, which I think is the most standard way to approach this… can OP comment here on that tooling and how it compares and contrasts?
I haven't seen hpn-ssh before, but from a cursory look at the project page the main improvements appear to be speeding up encryption with multi-threading and increasing ssh/scp buffer sizes. These are certainly good improvements over standard ssh/scp (and setting TCP buffers to the bandwidth-delay product of a particular network path is a well known way to squeeze some perf out of TCP), but they do not address the root cause of slowdown in window-based, loss-based congestion control.
In order to be fair to other flows, exponential back-off is required on detection of congestion, and packet loss is both a lagging indicator of congestion and has a very low signal-to-noise ratio on high-throughput, lossy networks.
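For anyone unfamiliar, the bandwidth-delay product tuning mentioned above is just this arithmetic (the link speed and RTT below are arbitrary example values):

    # Bandwidth-delay product: the amount of data "in flight" needed to keep
    # a path full, and therefore the socket buffer size TCP needs.
    def bdp_bytes(bandwidth_bits_per_sec, rtt_sec):
        return bandwidth_bits_per_sec * rtt_sec / 8

    # Example: a 1 Gbps path with 200 ms RTT needs ~25 MB of buffer,
    # far above typical OS defaults, or TCP can never fill the pipe.
    print(bdp_bytes(1e9, 0.200))  # 25000000.0 bytes = 25 MB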
hpn-ssh is specifically designed for high latency, high bandwidth file transfer and is more than just "big buffers and multi-threaded." And the question remains: how does your solution compare in simulated and real-world testing?
It's a little strange that you "conducted an extensive literature review" of congestion algorithms but you aren't aware of basic common tools like hpn-ssh, wget2's multithreading mode, or GridFTP which is used extensively in particle physics and genetics research communities.
Thanks for the feedback. The file transfer ecosystem is very large, and conducting a thorough review of the application-level tools was not the goal of this project, as the overwhelming majority of them focus on differences at the application layer, not the transport layer.
We are specifically focusing on rebuilding a congestion control algorithm from the ground up that can better tolerate modern network conditions, including things like high bandwidth, high packet loss, and high latency.
With respect to GridFTP, wget2 multi-threading, and other multi-flow approaches: the problem with getting performance increases out of multiple, distinct traffic flows is that you become more and more unfair to other packet traffic as you increase the number of flows you are using. For example, if you use 9 TCP (or any other AIMD) flows to send a file over some link, and a tenth connection is started, you are now taking up to 90% of the available bandwidth (because AIMD flows are designed to be fair amongst themselves).
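The arithmetic behind that 90% figure, spelled out (a simplification that assumes all flows converge to an equal share):

    # With N AIMD flows sharing a bottleneck, each converges to roughly 1/N
    # of the capacity, so a transfer that opens 9 flows against 1 other flow
    # ends up holding about 9/10 of the link.
    def share_of_link(my_flows, other_flows):
        return my_flows / (my_flows + other_flows)

    print(share_of_link(9, 1))  # 0.9 -> the multi-flow transfer takes ~90%
    print(share_of_link(1, 1))  # 0.5 -> a single flow would get ~50%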
I tried a couple of different programs for personal use, and udt/wdt wouldn't compile for me even after 2+ hours of messing with it.
CERN's fdt https://github.com/fast-data-transfer/fdt was way better. I did use it over ssh port forwarding, which artificially slowed things down, but it saturated my uplink at 300-400 MB/s.
IMVHO the main issue in file transfer today is that in 2022 most people still do not have a public IP (like a global IPv6 address), so most people still have NAT traversal issues and need to rely on third parties or not-so-performant, more or less distributed networks...
The second main issue is that most do not own a personal domain name with a subdomain per personal host (like {desktop,craphone,laptop}.mydomain.tld etc).
Those two issues are so big IMVHO that push all others aside...
Yep. Most people do not have an internet connection. They only have a web+ connection from a mobile telco further restricted by not having control of their hardware. So, if we care about the internet we can ignore them and just develop as we always have for actual computers on the internet.
But if you're only profit motivated then this isn't reasonable and you should only target smartphones without internet access and gimp your software to make it possible to run on such limited platforms.
That's why some use FLOSS and others have invented the concept of public research, and that's why they are fought, parasitized, hindered, and denatured from the inside.
The trick is to know how much "little" critical mass is needed to succeed...
I worked at a tier 2 ISP about 15 years ago that developed multiple products trying to sell accelerated transfers as a service. They worked similarly to what this article describes. The problem was that there were very few buyers. It's easier to sell transparent acceleration boxes as an appliance, and even then it's very niche.
Never understood it: once SMB gets going it's pretty fast, but it takes agessss to list a directory. Why can't it just pipe the output of dir() or ls (when Samba) out over the network?
So congestion and packet loss are different problems; it is true that forward error correction could be a good way to avoid retransmitting lost packets, but the only way to avoid congestion is to adjust the congestion window (for window based congestion control) or packet sending rate (for rate based congestion control) based on some indicator of congestion.
The canonical UDT implementation does not come with encryption, however there are some older open source GitHub repos that have attempted to add TLS to UDT. The original author of UDT, Yunhong Gu, has a project called Sector/Sphere that adds some application-level encryption to file transfer if you want to check it out: http://sector.sourceforge.net/. We've added encryption for our algorithm though!
With respect to QUIC, I believe it was designed specifically to reduce the latency of HTTP connections by using multiple UDP flows and building the reliability/ordering guarantees at the application layer.
The problem with getting performance increases out of multiple, distinct traffic flows is that you become more and more unfair to other packet traffic as you increase the number of flows you are using. For example, if you use 9 TCP (or any other AIMD) flows to send a file over some link, and a tenth connection is started, you now are taking up to 90% of the available bandwidth (because AIMD flows are designed to be fair amongst themselves).
> The problem with getting performance increases out of multiple, distinct traffic flows
Is this in response to Quic or just multiple TCP streams? Afaik quic multiplexes everything over a single connection on a single port.
As for fairness, the reality (aiui) is that neither tcp nor udp is guaranteed more bandwidth. It's all up to middleboxes, and I think there are quite a few that assume udp is "less important"/"can deal better with degraded performance" (eg video conferencing, live streams) and instead prefers tcp, in which case there's no recourse. Did you ever observe any conditions like this?
> a feedback control algorithm best known for its use in TCP congestion control. AIMD combines linear growth of the congestion window when there is no congestion with an exponential reduction when congestion is detected.
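In pseudocode, that is roughly the following (a generic sketch, not any particular TCP variant; the constants are illustrative):

    # AIMD: additive increase while the network looks fine, multiplicative
    # decrease when congestion (e.g. packet loss) is detected. Repeated
    # decreases give the "exponential reduction" described above.
    def aimd_update(cwnd, congestion_detected, increase=1.0, decrease_factor=0.5):
        if congestion_detected:
            return max(1.0, cwnd * decrease_factor)  # cut the window
        return cwnd + increase                       # otherwise grow linearly per RTT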
> If your internet connection is 1Gps and you are transferring a 10Gb file, it should theoretically take 10 seconds to transfer
Err what? I don't know what this "Gps" unit is, but if it's 1Gbps (gigabit per second), and a 10GB (gigabyte) file, that's not how it works... it would be 80 seconds.
It should be OK. It says "10Gb", not "10GB", i.e. it is a 10 gigabit file. (While it is untraditional to measure file size in bits, it should be perfectly fine.)
Sorry about the typo - you are right, it should be Gbps. As for the transfer time, we are just using bits for the file size to make the mental math easier.
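For anyone following along, the mental math both ways (assuming an idealized link with no protocol overhead):

    link_bps = 1e9                  # 1 Gbps link

    file_bits = 10e9                # 10 Gb (gigabits), as written in the article
    print(file_bits / link_bps)     # 10.0 seconds

    file_bits_gb = 10e9 * 8         # if it had meant 10 GB (gigabytes)
    print(file_bits_gb / link_bps)  # 80.0 seconds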