Hacker News
How we built a new, fast file transfer protocol (trytachyon.com)
90 points by mcharawi on March 26, 2022 | 63 comments



Interesting, but somewhat misses the point; the reason people want an alternative to Aspera is that no-one wants to pay for file transfer tools.


Thanks for the feedback! We're actually planning to open-source a version of our work that significantly improves on the original UDT project: https://udt.sourceforge.io/.


Look forward to it. I'd be interested to hear how your tool compares with Facebook WDT[1], as that would be my go-to right now if someone asked me for a fast point-to-point data transfer solution.

[1] https://github.com/facebook/wdt


Interesting project; is it still maintained? The last release was in 2016.


I don't have the expertise to determine whether it is "abandonware" or simply "done", I'm afraid. Found it a bit of a pain to use, which is why an alternative sounds interesting.


did you describe your UDT improvements somewhere? why not push them upstream?


This article was interesting and also frustrating to read.

1. There are very few numbers. In particular, improvement in performance under various circumstances is _not_ given! If you dig around you can find their transfer time application [1], but there is no discussion on that page.

2. The basis for the improvement is not spelled out. (References are given, but you have to know the field - "acronyms only".) If I understand correctly, their contribution is the improved measures of congestion used. Their landing page just touts "don't use TCP"... which sounds like Step 0 of a very long process.

I admit, the title is basically accurate: "how to build" not "the performance of".

tl;dr: Start with existing work, simulate and improve incrementally.

I don't know anything about the field, but this article didn't lead me to understand any better. I'd love to know the real numbers they observed, which approaches didn't pan out, are they effectively using an error correcting code?

Anyway, it's certainly not an academic paper - just an advertisement.

[1] https://www.trytachyon.com/file-transfer-calculator


I just tried the calculator. It seems that if you're in "US Metro" or "Europe", then the transfer protocol is just as fast as TCP, is this correct? I wonder why this is the case. Is it because the routers play more fairly?


I would expect it means your service provider isn't dropping packets. Their protocol seems to just be more aggressive about not backing off in the face of packet loss, which is helpful if one of your links is a marginal radio connection.

The cynic in me thinks they achieve better throughput because they don't play nice with TCP and monopolize the link while everybody else gets backed off.


I'm inclined to agree. Perhaps there's a place for a "fast" FTP protocol but I wouldn't want to be sharing the link with them ... imagine the cries going around the office ... "has the internet gone down again!?"

For TCP transfers, I expect to see near 100% utilisation (if I'm not sharing bandwidth) but that's on wired (reliable) links.


nailed it.

communication links are generally a shared resource (a common good) and the experience will degrade for everyone if individual transfers/users aren't backing off voluntarily.


You can do better than TCP and still "play nice". Just using up exactly the bandwidth TCP refuses to use, as FASP does by default, gets you much faster transmission.


> Just using up exactly the bandwidth TCP refuses to use

There's not much good to say about people that "use up the space others refuse to use" at the queue in a store, though.

i.e. our network links are generally not designed to be used at full capacity b/c queueing is hard and buying faster/more gear is easy.


Then let's not say it. Clearly you miss the point: any provisioned backbone bandwidth that does not carry a packet just goes to waste. Network backbone channels are not provisioned like your consumer-grade cable TV network endpoint hubs. Carriers pay for guaranteed capacity. When TCP fails to use bandwidth capacity, it is failing to use backbone bandwidth, not dodgy last-mile bandwidth.

Even more so: when so configured, FASP has exactly zero effect on the actual bit rate of the TCP connections. Another TCP connection would compete for the same packet slots, slowing them all down. The FASP traffic does not slow down any of the TCP connections, but only competes with other FASP (or similar) traffic.


Anything that causes packet loss will slow down TCP. If you are fully saturating a link you will introduce packet loss. It's just how TCP works.


And if you are fully saturating exactly the packet slots TCP would not try to use, then you are not introducing packet loss. That is just how FASP works.


there is just no way for end-points to know about the 'unused slots' much less use them specifically (w/o introducing packet loss or latency).


> The cynic in me thinks they achieve better throughput because they don't play nice with TCP and monopolize the link while everybody else gets backed off.

sounds like BBRv1


Thanks for taking the time to read it! To address your concerns:

1. To give you an idea of the speed improvements, we transferred a 2GB file between Ohio and Singapore on AWS in 26 seconds using our protocol, vs. 2 minutes 15 seconds for SCP.

2. The basis for improvement is taking into account the changes in round-trip-time for a particular network path; these temporary increases are used as the primary congestion signal.

We are not using error correcting codes, which are good for preventing the retransmission of packets but do not address the underlying problem of avoiding congestion in a network.
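
To make that concrete, here is a toy sketch of a delay-based rate controller in Python (illustrative only, not our actual implementation; the smoothing factor, threshold, and step sizes are made-up numbers). The sender treats the lowest RTT it has seen as the uncongested baseline and backs the sending rate off when the smoothed RTT inflates above it, rather than waiting for packet loss:

    # Toy delay-based rate controller: RTT inflation, not packet loss,
    # is the congestion signal. All constants are illustrative.
    class DelayBasedRateController:
        def __init__(self, initial_rate_mbps=100.0):
            self.rate = initial_rate_mbps  # current sending rate (Mbit/s)
            self.base_rtt = None           # lowest RTT seen ~= empty-queue RTT
            self.srtt = None               # smoothed RTT

        def on_ack(self, rtt_ms):
            # Track the minimum RTT as the uncongested baseline.
            self.base_rtt = rtt_ms if self.base_rtt is None else min(self.base_rtt, rtt_ms)
            # Exponentially weighted moving average of the measured RTT.
            self.srtt = rtt_ms if self.srtt is None else 0.9 * self.srtt + 0.1 * rtt_ms

            queuing_delay = self.srtt - self.base_rtt
            if queuing_delay > 0.1 * self.base_rtt:
                # Queues are building: back off in proportion to the RTT inflation.
                self.rate *= max(0.7, self.base_rtt / self.srtt)
            else:
                # Path looks uncongested: probe for more bandwidth additively.
                self.rate += 5.0
            return self.rate

The point of this structure is that a random loss on a marginal link never touches the rate; only sustained RTT growth does.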


Can I ask a dumb question? Why SCP and not rsync?


There is no meaningful difference; at the end of the day they'll both be fine for measuring the base case of single-threaded TCP connection performance.

I prefer rsync as well, but SCP works on more machines by default (no dependency on rsync being installed on the target host).


How much was SCP affected by TCP buffer size tuning?


Hey HN! I'm Mahamad, co-founder of Tachyon Transfer, where we're building faster file transfer tools for developers. We've spent the last year building an ultra-fast FTP replacement, and we thought we'd show you guys what our technical process was like. Let me know if you have any questions!


Please show performance tests versus hpn-ssh, GridFTP (a.k.a. the de facto tool of the particle physics and genetics research communities) and simpler systems like wget2's multi-threaded mode.


Would also be nice to compare against the different standard TCP congestion avoidance algs, of which there are plenty.

It is, after all, a very well researched area.


Can I tunnel this over SSH and use it the same way, as a faster drop-in replacement for SFTP? (Why not?)


Standard SSH uses TCP over port 22 by default, so it wouldn't be possible without modifying SSH to use a different protocol. That said, our protocol uses TLS over UDP via the OpenSSL libraries, so it is secure by default. We also offer a BSD-style socket interface that you can use if you want a drop-in replacement for TCP sockets. Shoot me a note at mahamad _at_ trytachyon _dot_ com if you want to chat!


What's the difference between "TLS over UDP" and Quic? Did you compare your solution to some Quic implementation?

Also, how was CPU usage? In my experience UDP-based stream protocols do significantly worse (I assume due to user-space scheduling and unoptimized hot paths). This may be particularly relevant for laptops and mobile.


> What's the difference between "TLS over UDP" and Quic?

QUIC is much much more than TLS over UDP.


how much?


Ha! Around 4 RFCs and 20 Internet Drafts more: https://datatracker.ietf.org/wg/quic/documents/


Can you share some actual performance numbers across whatever are the key metrics that you observe?


Is this software that one licenses and uses on any arbitrary network or do you run a network of some kind that users pay to access?

Or both ?

I think this is a software package but the tl;dr doesn’t make that clear to me…


At the moment we offer both options. We offer our own network with a pricing plan similar to massive.io (though 10c per GB vs 25c). Our licensing is cheaper but requires large volumes.


The article has essentially no technical information on how the protocol works, besides the unlabeled flow rate graphs which show TCP rate popping up and down and theirs more or less constant. Their rate appears much less stable than FASP's. OTOH, FASP is extremely expensive.

I have updated the Wikipedia page on the FASP protocol they compete with, to provide more detail on how it works. Theirs might work similarly.

Curiously, FASP usage only really took off after the product got a comprehensive GUI file management app, with scheduling and a speed control you could drag up and down. Being able to transfer files overseas several times as fast as the competition was not enough.

FASP was on HN a couple of years ago, https://news.ycombinator.com/item?id=21898072


Hey, thanks for your comment! You're right, there aren't too many details on the protocol changes we made, in part because we will follow up on that in another post. This blog post was getting too long as it is, and we wanted to focus more on the need to simulate and test. Your comments on the Hacker News post you link to actually partially served to inspire us!


I am glad to have somebody acting on this stuff.

The original design for FASP's flow control was developed in China and applied in a router: you would have one at each end, and it would spoof TCP for clients. That was a huge flop, because it didn't say Cisco on the nameplate.

The Aspera principals realized they could implement it in user space using UDP, bypassing the network infrastructure purchasing cabal, and sell directly to the users who had data to move.

I always wanted to get the astronomy community using it (e.g. to Antarctica and Atacama, Chile) under a free license, but never quite got there.


> It took us a little while to build UDT..

> Building this infrastructure took a substantial amount of time..

> If anyone is interested in trying the Tachyon Transfer Algorithm we offer a storage transfer acceleration API like AWS does. Our SDK includes node, c++ and objc and could be used in a wide variety of applications

So it was a lot of effort, and now they're inviting Big-G and Cloudflare to contact them to possibly achieve a paltry 30%-ish speed increase for certain scenarios? Or are they inviting app devs who want faster video uploads to reach out? What is the actual use case where the sometimes 30% improvement matters and actually moves the needle?!

Why hasn't Tachyon been working with their prospective customers and warming them up the whole time, or at least working the social and investor nets and reaching out proactively already?

This strategy is kind of like being a dweeb at a poorly lit school dance and hoping the most popular girl at the dance somehow notices you're wearing shoes that let you float a centimeter in the air. Cool trick, bud.

Presumably it's not a $10/mo service contract. Is this really an effective strategy when building and selling to enterprise these days? To me it sounds like a risky and hard way to make less money than what is possible using tried and true product development strategies. To be fair, I have also made this mistake before. It was embarrassing enough as a solo-founder, and seems less forgivable with larger founding group sizes, because it means more folks agreed to support and follow such a sub-optimal harebrained scheme :)

You all sound like very capable software engineers, and I know it's both fun and satisfying to build and make The Thing.

Good luck, sincerely.

p.s. You may also consider pursuing some of the medium sized targets like Backblaze, Rackspace, Larry Ellison's Oracle OCI, or Microshaft Azure.

(sorry, I couldn't resist having some fun at the end, though the suggestion is serious!)


sarcasm aside, FANGs have had something similar for some time now.

this is aimed at consumers (scientists)


Always good to see more in this space! Long fat networks (LFNs or "elephants") are everywhere, especially once you start moving data between continents.

I've had success personally with UFTP, though you have to explicitly set the transmit rate. Don't forget to enable encryption/authentication if you want the downloads to be verified! You'll get silent UDP corruption otherwise: http://uftp-multicast.sourceforge.net/


I actually read the entire article and was specifically looking for a reference to hpn-ssh, which I think is the most standard way to approach this … can OP comment here on that tooling and how it compares and contrasts?


Thanks for reading!

I haven't seen hpn-ssh before, but from a cursory look at the project page it looks like the main improvements are targeted at improving the speed of the encryption using multi-threading, and increasing ssh/scp buffer sizes. These are certainly good improvements over standard ssh/scp (and setting TCP buffers to the value of the bandwidth delay product for a particular network path is a well known way to squeeze some perf out of TCP) but do not address the root cause of slowdown in window-based, loss-based congestion control.

In order to be fair to other flows, exponential back-off is required on detection of congestion, and packet loss is both a lagging indicator of congestion and one with a very low signal-to-noise ratio on high-throughput, lossy networks.
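
For anyone curious, the buffer tuning mentioned above amounts to sizing the socket buffers to the bandwidth-delay product of the path. A quick Python sketch with example numbers (the link speed and RTT here are illustrative, and the OS may clamp the requested sizes):

    import socket

    # Bandwidth-delay product: the amount of data that must be "in flight"
    # to keep a high-bandwidth, high-latency path full with one TCP stream.
    bandwidth_bps = 1_000_000_000               # 1 Gbit/s link (example)
    rtt_s = 0.200                               # 200 ms round trip (example)
    bdp_bytes = int(bandwidth_bps / 8 * rtt_s)
    print(f"BDP: {bdp_bytes / 1e6:.0f} MB")     # 25 MB

    # A single connection cannot saturate the path unless the send and
    # receive buffers (and the OS limits behind them) are at least this big.
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bdp_bytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bdp_bytes)

None of that changes the congestion controller itself, which is where the behaviour under loss is decided.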


hpn-ssh is specifically designed for high latency, high bandwidth file transfer and is more than just "big buffers and multi-threaded." And the question remains: how does your solution compare in simulated and real-world testing?

It's a little strange that you "conducted an extensive literature review" of congestion algorithms but you aren't aware of basic common tools like hpn-ssh, wget2's multithreading mode, or GridFTP which is used extensively in particle physics and genetics research communities.


Thanks for the feedback. The file transfer ecosystem is very large, and conducting a thorough review of the application-level tools was not the goal of this project, as the overwhelming majority of them focus on differences at the application layer, not the transport layer.

We are specifically focusing on rebuilding a congestion control algorithm from the ground up that can better tolerate modern network conditions, including things like high bandwidth, high packet loss, and high latency.

With respect to GridFTP, wget2 multi-threading, and other multi-flow approaches: the problem with getting performance increases out of multiple, distinct traffic flows is that you become more and more unfair to other packet traffic as you increase the number of flows you are using. For example, if you use 9 TCP (or any other AIMD) flows to send a file over some link, and a tenth connection is started, you are now taking up to 90% of the available bandwidth (because AIMD flows are designed to be fair amongst themselves).
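
To put numbers on it (a toy calculation assuming ideal AIMD fairness, i.e. every flow converges to an equal slice of the bottleneck link; the function name is just for the example):

    # Share of the bottleneck link a transfer gets when every flow is an
    # equally "fair" AIMD flow: N parallel flows simply claim N slices.
    def share_of_link(my_flows, other_flows):
        return my_flows / (my_flows + other_flows)

    print(share_of_link(9, 1))   # 0.9 -> a 9-flow transfer vs one other user
    print(share_of_link(1, 1))   # 0.5 -> two single-flow transfers split evenly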


glad to see you reflect on the unfairness problem.

since you did not know about hpn-ssh, Google's BBR approach might also be of interest.


I tried a couple of different programs for personal use, and UDT/WDT wouldn't compile for me even after 2+ hours of messing with it.

CERN's fdt (https://github.com/fast-data-transfer/fdt) was way better. I did use it over SSH port forwarding, which artificially slowed things down, but it saturated my uplink at 300-400 MB/s.


IMVHO the main issue in file transfer today is that in 2022 most people still do not have a public IP (like a global IPv6 one), so most people still have NAT traversal issues and need to rely on third parties or not-so-performant, more or less distributed networks...

The second main issue is that most do not own a personal domain name with a subdomain per personal host (like {desktop,craphone,laptop}.mydomain.tld etc).

Those two issues are so big IMVHO that push all others aside...


Yep. Most people do not have an internet connection. They only have a web+ connection from a mobile telco further restricted by not having control of their hardware. So, if we care about the internet we can ignore them and just develop as we always have for actual computers on the internet.

But if you're only profit motivated then this isn't reasonable and you should only target smartphones without internet access and gimp your software to make it possible to run on such limited platforms.


That's why some use FLOSS and others have invented the concept of public research, and that's why they are fought, parasitized, hindered, and denaturized from the inside.

The trick is to know how much "little" critical mass is needed to succeed...


I worked at a tier 2 ISP about 15 years ago that developed multiple products trying to sell accelerated transfers as a service. They worked similarly to what this article describes. The problem was that there were very few buyers. It's easier to sell transparent acceleration boxes as an appliance, and even then it's very niche.


Never understood this: once SMB gets going it's pretty fast, but it takes agessss to list a directory. Why can't it just pipe the output of dir or ls (when using Samba) out over the network?


comparing a ford-fiesta with a hypersonic rocket-car; totally different applications.

parallelism, integrity/reliability and efficiency get somewhat more important once you regularly have to shuffle petabytes around the globe.


Some basic transfer based on UDP with forward error correction is a really good solution to tackle packet loss and avoid TCP congestion entirely.


So congestion and packet loss are different problems; it is true that forward error correction could be a good way to avoid retransmitting lost packets, but the only way to avoid congestion is to adjust the congestion window (for window based congestion control) or packet sending rate (for rate based congestion control) based on some indicator of congestion.
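
For reference, the window-based mechanism being referred to is plain AIMD; a textbook sketch (not any particular TCP implementation) looks like this:

    # Textbook AIMD congestion window, counted in segments: grow linearly
    # while ACKs arrive, halve when the congestion signal (here, loss) fires.
    def aimd_step(cwnd, congestion_detected):
        if congestion_detected:
            return max(1, cwnd / 2)   # multiplicative decrease
        return cwnd + 1               # additive increase (roughly once per RTT)

    cwnd = 10
    for loss in [False, False, True, False, False]:
        cwnd = aimd_step(cwnd, loss)
        print(cwnd)                   # 11, 12, 6.0, 7.0, 8.0

Rate-based schemes replace the window with an explicit packets-per-second target, but either way the increase/decrease logic has to be driven by some congestion signal; forward error correction does not provide one.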


You need to show detailed benchmark examples against other protocols, including S3.

Otherwise, it's empty paper-ware.


Does UDT come with encryption? If so, how does it compare to QUIC?


The canonical UDT implementation does not come with encryption, however there are some older open source GitHub repos that have attempted to add TLS to UDT. The original author of UDT, Yunhong Gu, has a project called Sector/Sphere that adds some application-level encryption to file transfer if you want to check it out: http://sector.sourceforge.net/. We've added encryption for our algorithm though!

With respect to QUIC, I believe it was designed specifically to reduce the latency of HTTP connections by using multiple UDP flows and building the reliability/ordering guarantees at the application layer.

The problem with getting performance increases out of multiple, distinct traffic flows is that you become more and more unfair to other packet traffic as you increase the number of flows you are using. For example, if you use 9 TCP (or any other AIMD) flows to send a file over some link, and a tenth connection is started, you now are taking up to 90% of the available bandwidth (because AIMD flows are designed to be fair amongst themselves).


> The problem with getting performance increases out of multiple, distinct traffic flows

Is this in response to Quic or just multiple TCP streams? Afaik quic multiplexes everything over a single connection on a single port.

As for fairness, the reality (aiui) is that neither tcp nor udp is guaranteed more bandwidth. It's all up to middleboxes, and I think there are quite a few that assume udp is "less important"/"can deal better with degraded performance" (eg video conferencing, live streams) and instead prefers tcp, in which case there's no recourse. Did you ever observe any conditions like this?


> AIMD

Additive Increase / Multiplicative Decrease (for others wondering)

> a feedback control algorithm best known for its use in TCP congestion control. AIMD combines linear growth of the congestion window when there is no congestion with an exponential reduction when congestion is detected.

-- https://en.wikipedia.org/wiki/Additive_increase/multiplicati...


Where is the download for the SDK?


> If your internet connection is 1Gps and you are transferring a 10Gb file, it should theoretically take 10 seconds to transfer

Err what? I don't know what this "Gps" unit is, but if it's 1Gbps (gigabit per second) and a 10GB (gigabyte) file, that's not how it works... it would be 80 seconds.


It should be OK. It says "10Gb" not "10GB", i.e. it is a 10-gigabit file. (While it is unconventional to measure file size in bits, it should be perfectly fine.)
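
Spelled out, the unit bookkeeping is just (Python, with the numbers from the thread):

    # transfer time = file size in bits / link rate in bits per second
    link_bps = 1e9                  # 1 Gbit/s

    print(10e9 / link_bps)          # 10 Gb (gigabit) file  -> 10.0 seconds
    print(10 * 8e9 / link_bps)      # 10 GB (gigabyte) file -> 80.0 seconds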


Sorry about the typo! You are right, it should be Gbps. As for the transfer time, we are just using bits for the file size to make the mental math easier.



