FTP is 50 years old (filestash.app)
417 points by elvis70 on April 16, 2021 | 183 comments



Amazing.

I never thought I would say this, but I actually implemented an FTP server in 2020. This was needed to support firmware updates to specific hardware (Electric Vehicle charging stations). Apparently embedded software developers choose FTP whenever a spec doesn't specify how binary file transfers should work.

It was kind of amusing getting FTP to work in a modern cloud environment. I run a single Kubernetes pod with a Node.js based FTP server optimized for one thing: Transferring files between FTP and Google Cloud Storage. A series of ports are specified in the Docker file to enable passive FTP transfers.
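For illustration, that Dockerfile arrangement looks roughly like this (a minimal sketch; the exact passive range is an assumption, not from the comment, and the FTP server itself must be configured to advertise the same range in its PASV replies):

    # control connection
    EXPOSE 21
    # passive-mode data connections (hypothetical range; must match the
    # range the FTP server hands out to clients)
    EXPOSE 30000-30100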

Even more amusing was the number of varieties in which FTP was implemented by different hardware manufacturers. I regularly had to dive into the FTP libraries to add support for crazy edge cases (tcpflow in kubectl exec -it is your friend!). Example: one device added a newline in the middle of a command (USER\n myusername).
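A minimal sketch of the kind of tolerance that edge case forces (not the author's code; plain Node.js, collapsing stray newlines before splitting verb and argument):

    // Normalize a raw control line, tolerating a stray newline inside
    // the command, e.g. "USER\n myusername".
    function parseCommand(raw) {
      const cleaned = raw.replace(/[\r\n]+/g, ' ').trim();
      const space = cleaned.indexOf(' ');
      if (space === -1) return { verb: cleaned.toUpperCase(), arg: '' };
      return {
        verb: cleaned.slice(0, space).toUpperCase(),
        arg: cleaned.slice(space + 1).trim(),
      };
    }

    parseCommand('USER\n myusername'); // { verb: 'USER', arg: 'myusername' }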

The latest curve ball I received this week is that a certain firmware version of a Qualcomm modem chip cannot deal with the size of the IP packets coming from our FTP server... Fun stuff!


Implementing an FTP server from scratch that had to be compatible with lots of clients in 2020 was an interesting choice. Just to have it in javascript? Perhaps security-motivated? There are probably battle tested implementations in e.g. Python, Java or other safe-ish languages to build on?

I learned this lesson in the mid 90s when fixing client compatibility bugs in an FTP server module we had built in an interpreted language, because, how hard could it be...

> The latest curve ball I received this week is that a certain firmware version of a Qualcomm modem chip cannot deal with the size of the IP packets coming from our FTP server... Fun stuff!

Right.


> There are probably battle tested implementations in e.g. Python, Java or other safe-ish languages to build on?

My experience... there aren't a ton of choices in this space. There are a few FTP servers designed to power B2B backend services. Many of the options are designed only to provide access to the local filesystem.


> My experience... there aren't a ton of choices in this space. There are a few FTP servers designed to power B2B backend services. Many of the options are designed only to provide access to the local filesystem.

The "local filesystem" doesn't have to be a local file system. It's just a good, common abstraction useful for interoperability. Why not "rclone mount" your Google Drive, or use some other FUSE based file system to get easy interoperability between legacy FTP servers and modern storage options?


I don't think I agree that the local filesystem is a good, common abstraction. It's a serviceable abstraction most of the time, and a terrible abstraction at other times. This isn't just some contrarian stance I'm taking--I've just spent too much time fighting with filesystem semantics on too many OSs.

Going through FUSE is, in my mind, a last resort.


How much better have we gotten at specifying protocols? Have we learned how to make protocols less ambiguous and less susceptible to crazy edge cases which make it burdensome to implement support in practice once there are lots of sloppy implementations in the field?


1. In many cases the misbehaving clients and servers are obviously wrong, but Postel's law means they "worked for me" when the developer tested them 10 years ago before abandoning them.

2. The FTP protocol got a lot of cruft added to it that modern clients don't implement (e.g. any transfer mode other than stream).

3. FTP over TCP predates NATs and firewalls, which caused a lot of problems as well.

4. FTP was designed for human-readable, not machine-readable output. In particular the output of a LIST command is woefully underspecified.

I think #1 is the biggest issue for long-term viability of protocols. Not following Postel's law is a recipe for death (same reason why it's suicide for a browser to unilaterally untrust a major CA; any site that doesn't work in browser X is assumed to be browser X's fault), but following Postel's law is a recipe for undocumented de-facto standards with crazy edge cases.


Thanks, wonderful reply!

I see that there's been a debate over applying Postel's Law, a.k.a. the Robustness Principle, for some time:

https://en.wikipedia.org/wiki/Robustness_principle#Criticism

> a defective implementation that sends non-conforming messages might be used only with implementations that tolerate those deviations from the specification until, possibly several years later, it is connected with a less tolerant application that rejects its messages.

That Wikipedia page led me on to this IETF draft from 2019 on protocol maintenance

https://tools.ietf.org/html/draft-iab-protocol-maintenance-0...

> Abstract

> The robustness principle, often phrased as "be conservative in what you send, and liberal in what you accept", has long guided the design and implementation of Internet protocols. The posture this statement advocates promotes interoperability in the short term, but can negatively affect the protocol ecosystem over time. For a protocol that is actively maintained, the robustness principle can, and should, be avoided.

It seems that you need both for the protocol to be unambiguously, fully specified, and for popular implementations to avoid applying Postel's Law! But we've seen how market forces conspire to work against that.

Brainstorming opportunities for improvement beyond the suggestions in the IETF doc:

• Accompany the protocol with a validation test suite.

• Provide a validation service.

• Treat non-validating messages skeptically ("quirks mode").


Just curious on the motivation.

Why not run a regular FTP server and have your application periodically look for new files to process? For horizontal scaling, you just take a distributed lock on the file name.


That honestly sounds more complicated. FTP isn't that difficult of a protocol; especially if you only need to support one known client, you can take all sorts of shortcuts.

If you deploy an existing FTP server, and _then_ integrate with it at the filesystem level you now have two components, and your sysadmin requirements grow. Now you gotta administrate an FTP server that's probably written for classic UNIX single server usage, gotta handle filesystem permissions, gotta somehow hook up your distributed locks to the filesystem, sanitize filenames for your chosen filesystem.

Honestly filesystems suck, there's so many gotchas from a security perspective, when all you really want is to pipe binary data in this side, and out the other side.

I implemented an IRC bot in a few hours in javascript one day. Those classic IETF text based protocols are actually really fun and easy to implement, especially in a language that makes strings safe and easy (i.e. not C).

I could easily see figuring out all the deployment concerns around integrating with an existing FTP server end up taking way longer than just integrating the subset needed for this use case.
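As a rough illustration of how small that subset can be, a line-oriented FTP-style responder in plain Node.js might start out like this (a sketch handling only the verbs one known client sends; a real deployment still needs data connections, auth, and so on):

    const net = require('net');

    net.createServer((socket) => {
      socket.write('220 ready\r\n');
      let buffer = '';
      socket.on('data', (chunk) => {
        buffer += chunk.toString('latin1');
        let i;
        // FTP control commands are CRLF-terminated lines.
        while ((i = buffer.indexOf('\r\n')) !== -1) {
          const line = buffer.slice(0, i);
          buffer = buffer.slice(i + 2);
          const verb = line.split(' ')[0].toUpperCase();
          if (verb === 'USER') socket.write('331 password required\r\n');
          else if (verb === 'PASS') socket.write('230 logged in\r\n');
          else if (verb === 'QUIT') socket.end('221 bye\r\n');
          else socket.write('502 not implemented\r\n');
        }
      });
    }).listen(2121);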


Strings in C aren't that bad with stuff like bstring or glib. Just gotta be careful with deallocations so you don't get leaks. Much better than all the security issues with standard C strings. I mean, not anywhere close to Python/Ruby/JavaScript easy, but it's not as bad as HN likes to declare it.


Active mode is pretty weird. Coordinating a single client across two ports sounds difficult to me, but I've never implemented it. If that's not a difficult protocol, then what is?
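For context on what the coordination involves: in active mode the client listens on a port and tells the server where to connect back using the PORT command, whose argument packs the IP and port into six decimal bytes (RFC 959). A tiny sketch of that encoding:

    // PORT h1,h2,h3,h4,p1,p2 -- four IP bytes, then the port split
    // into high and low bytes.
    function portArgument(ip, port) {
      return ip.split('.').concat([port >> 8, port & 0xff]).join(',');
    }

    portArgument('192.168.1.2', 54321); // "192,168,1,2,212,49"

The server then connects back to that address from its own data port, which is exactly the behavior NATs and firewalls dislike.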


HTTP/3, IMAP, CalDAV and MAPI?

Coordinating a client across two ports sounds trivial compared to, for example, properly implementing client and server versions of IMAP search commands when no client or server follows the specification.


Until you realize that the two ports commonly involve dealing with middleware that can't handle it properly.


You will have similar issues with other protocols as well. I have seen physical bandwidth limiters, windows drivers, corrupt winsock LSPs etc mess with IMAP traffic. I have seen middleware replacing SMTP commands sent from a client to server for no good reason, such as replacing "EHLO hostname" with "EHLO **".


The lock would be handled by something like Redis or a DB.
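A minimal sketch of that lock with the `redis` npm client (v4-style API; key names and timeouts are assumptions), relying on SET NX so only one worker wins:

    const { createClient } = require('redis');

    // Assumes `client` is already connected via client.connect().
    async function tryLock(client, filename) {
      // NX: only set if the key is absent; PX: auto-expire after 60s
      // so a crashed worker doesn't hold the lock forever.
      const ok = await client.set(`lock:${filename}`, 'worker-1', {
        NX: true,
        PX: 60000,
      });
      return ok === 'OK'; // null means another worker holds the lock
    }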

But yeah, if you only need to support one client, I can see the reasoning. It would never have flown at any of the places I've worked, though, having to support tons of clients.


Yeah, totally, I hear you. If you have to support many external clients, using an existing FTP implementation makes a lot more sense.


I had considered this, but decided against it for a couple of reasons:

- Scaling requirements are relatively low. Even though we're dealing with tens of thousands of devices, the number of firmware updates going to those devices at any given time is minimal. Our main scaling challenges are around OCPP over websockets. Story for another day.

- I have bad memories of ProFTPd etc buffer overflow exploits.

- I wanted something simple that could bridge between FTP and our cloud persistence (MongoDB and Cloud Storage).

- I found this Node.js library that I've since forked: https://github.com/autovance/ftp-srv - The great thing about this library is that it allows a quick implementation of a custom filesystem.

- For Kubernetes pods the file system should really be treated as a /tmp - which we are doing.

- When a charge station connects, the FTP username/password is a temporary set of generated tokens that is checked against our MongoDB.

Essentially, I'm using FTP as a throwaway here.

If you think through this you can imagine it would be quite a lift to accomplish this with an existing FTP server.
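For a sense of the shape this takes, here is a sketch along the lines of the ftp-srv README (`checkToken` and `GcsFileSystem` are hypothetical stand-ins for the MongoDB token check and Cloud Storage bridge described above, not the author's actual code):

    const { FtpSrv } = require('ftp-srv');

    const ftpServer = new FtpSrv({ url: 'ftp://0.0.0.0:21' });

    ftpServer.on('login', async ({ username, password }, resolve, reject) => {
      // Temporary, generated credentials checked against the database.
      if (!(await checkToken(username, password))) {
        return reject(new Error('bad credentials'));
      }
      // ftp-srv lets you resolve with a custom filesystem implementation
      // instead of exposing the local disk.
      resolve({ fs: new GcsFileSystem(username) });
    });

    ftpServer.listen();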


They describe why: tons of edge cases from poorly implemented FTP clients. They needed full control over the protocol to handle weird edge cases coming from multiple embedded clients.


Because nodejs kubernetes modern cloud.


The main reason for using Node.js is because the rest of our stack is Node.js: https://bedrock.io

We use MongoDB as persistence and have existing wrappers for dealing with Google Cloud Storage.

Since it's an isolated service we could've used a different implementation language.

In our case, Node.js in our existing Kubernetes environment was the least amount of friction.


> Because nodejs kubernetes modern cloud.

Not necessarily so. The history of FTP servers is riddled with bugs, with practically no exceptions. At some point some folks decided to finally write a bug-free implementation and even dared to call it "Very Secure FTPd." Needless to say, it turned out to have bugs, too.

As most of these bugs were related to buffer overflows and similar issues, implementing a new FTP server in a safer language is not such a bad idea, and today's JavaScript is efficient enough to make it a reasonably well-working implementation. I pity the author, though, for the bugs they'll encounter and the workarounds that will need to be implemented.


I agree, but I don't see why we would then assume that forking some FTP server library from npm would fare any better, security-wise.

I see a fairly alarming open issue: https://github.com/autovance/ftp-srv/issues/167


Exactly my reasoning. See my comment above for more info.

I'm making the maintenance of this less painful by doing a hacking/debugging session with manufacturers once a month where we hook up many devices and fix issues. After addressing most edge cases, fewer are coming up now (despite a relentless stream of new cheaply manufactured devices).


I’m just glad they used FTP over TFTP. Maybe someday they’ll use FTPS, but I have my doubts it’ll ever catch on given the popularity of SFTP.


Just to note for those who don't know, FTPS and SFTP are completely different protocols.

The similarity of the names often causes confusion, but SFTP has nothing to do with the venerable FTP.

SFTP stands for "SSH file transfer protocol" and it's a completely different beast. IMHO, a somewhat unfortunate naming choice, but that's water under the bridge.

(...while FTPS stands for "FTP over SSL", and that actually uses plain old FTP with an additional SSL/TLS layer...)


These standards are so very different, and they don't scale well.

TFTP is actually over UDP, guarantees only one data packet on the wire at any one time (no sliding window), does not support listing a remote directory, and is extreme in simplicity.
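That lock-step design caps throughput at one block per round trip, no matter how fat the pipe. A back-of-envelope sketch (the RTT figure is an assumption):

    const blockSize = 512;      // bytes per TFTP DATA packet
    const rttSeconds = 0.05;    // assumed 50 ms round trip
    // One block must be ACKed before the next is sent:
    const maxBytesPerSecond = blockSize / rttSeconds; // ~10 KB/s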

FTPS has such arbitrary controls for TLS optional versus required status over control and data channels that it is easy to misconfigure.

SFTP lacks two key features (amidst jump hosts and other scope-creep frenzy): anonymous mode and URL support in a browser.

A new file transfer protocol, restricted to DJB ciphers a la Wireguard, able to run over TCP or UDP would likely be best. If Chrome and Safari both added browser clients, the server world would likely dump most FTP the next day.

https://mywiki.wooledge.org/FtpMustDie


SFTP supports anonymous access. I actually just shut down my sftp server to move it or I would be able to show you, but it's super easy on CentOS. Just set up chroot and set a null pw for the usernames of your choice. You can use posix permissions to hide subdirs or files if you wish. You can use chattr or mount permissions to make it read-only or write-only. The only thing missing is browser support. I might have time to put it back online later today and will update this thread.


Ideally, SFTP would include an emulation of FTP's anonymous mode: accept any password for FTP/anonymous, recorded to /var/log/secure (maybe checking for an "@" character followed by some dots, hoping for an email address).

Forcing the null password up the stack to /etc/shadow (or other credential sources) potentially compromises PAM and other applications that may depend upon it.

It sounds like you've implemented a separate SSH server within a chroot for this to protect the base OS; I've done the same for tinyssh with nspawn for an internal project. This is not easy.

Anonymous access for SFTP doesn't scale to the extent used in FTP, even omitting browser access.


FTP is certainly more flexible and virtual users are far more secure than adding folks to /etc/passwd. PureFTPd [1] was my favorite for that very reason. There have been a few FTP daemons that supported the SFTP protocol and had virtual users, but they had too many bugs for me. I believe ProFTPd was one of them.

Regarding SFTP and null passwords, I do not use a separate sshd. I just use the "Match" stanza in OpenSSH. Any SFTP users I add are in the sftpusers group and don't have a shell. SELinux will block some nonsense. For a few years, I had a cron job that was dynamically adding any account that bots would try. I think I was up to about 23k SFTP accounts. I will fire it back up either today or tomorrow and you are welcome to do a pen-test on it. I will also post the sshd_config.

[1] - https://www.pureftpd.org/project/pure-ftpd/
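For reference, the kind of Match stanza described above looks roughly like this (a sketch using stock OpenSSH directives; the group name and chroot path are assumptions):

    Match Group sftpusers
        ChrootDirectory /srv/sftp/%u
        ForceCommand internal-sftp
        AllowTcpForwarding no
        X11Forwarding no
        PermitTunnel no

ForceCommand internal-sftp is what keeps these accounts from ever getting a shell, even without a separate sshd.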


I was forced to implement chroot() for SFTP users under Oracle/RedHat Linux 5. We are, alas, still running it.

The OpenSSH 4.3 release on this platform does not support the "Match" keyword, but I was able to coerce it to run a separate SFTP-only instance on port 24, where I constrained the SFTP-specific accounts. I find that I prefer this approach.

My wily users then discovered that the working passwd entry also let them login with FTP on port 21, so careful control of allowed groups for both protocols was eventually required. Afterwards there is always the nagging suspicion that something was missed.

OpenSSH would also be much better with localized SFTP accounts that were not defined in /etc/passwd. Add that to the wishlist.


Makes sense. I also had to implement a workaround ("scponly") for CentOS 5. Not fun.

I put an SFTP server back up. Feel free to play around with it. This is a single sshd instance, and a copy of the config is in the /pub directory of the anonymous user. I did not change anything in PAM. The SFTP users are SELinux-confined as user_u.

    server:  45.79.100.12
      port:  22
  username:  anonymous, anon, pub, public
        pw:  (null) just hit enter
This message probably won't age well if I remove that node.


Updated this system. It should create accounts if you try to sftp to it. Just wait 2 mins and try the same name again. Any non-system account name of 2 to 16 chars starting with a letter should work.


It uses PAM, so you can configure it however you want or even write your own module.

It's dangerous though; be sure to check these users don't get shell access or the ability to create network pipes.


Also symlinks and hardlinks. You can restrict these in the internal-sftp subsystem, assuming you force SFTP.


>SFTP lacks two key features (amidst jump hosts and other scope-creep frenzy): anonymous mode and URL support in a browser.

You forgot the abysmal performance compared to FTPS due to the SSH flow control conflicting with the underlying TCP flow control.


Keep in mind that I have no control over the client. No manufacturer has implemented TLS, let alone SFTP.

If I had any influence on the protocol it would be HTTPS. This is why I wasn't expecting to build an FTP server in 2020.


The good old days ...

Username: anonymous

Anonymous login accepted, enter e-mail address as password.

Password: aoeu@aoeu.com

I bet aoeu.com and asdf.com got a good amount of unwanted mail back then.


I get ASDF, but what is AOEU based on?


https://www.aoeu.com/

> It resulted from a typo. I was writing a domain registration system for an ISP and, during some debugging, forgot to use the --no-act command line flag and ended up registering aoeu.com for real. Since it was so easy to type, I kept it.

The letters A O E U are the first four characters of the left hand on a Dvorak keyboard, which is the layout I use.


Dvorak


This is why example.com exists.


Will Kubernetes or Node.js make it to 50?


I sincerely hope that insecure FTP is either running over TLS or a VPN...


Yep, VPN. The devices don't support TLS. (We have Cloudflare in front of other services.)


Doesn't that open up like a ton of man-in-the-middle attacks? I realize it may just be one of those "no other option" things, but dang gritted-teeth-emoji


> I never thought I would say this, but I actually implemented an FTP server in 2020.

If you did this for work, junior engineer move in my opinion. This practice is called "not invented here" syndrome.


This very much depends on the ecosystem they had in their specific embedded environment. Some have a few kilobytes of working memory: so they'd have to download the file and write it straight to flash (yes, security is a problem but manageable). There are many common cases where, for instance, malloc is disabled.

You don't always have off-the-shelf packages for every conceivable environment you work in.


It’s a 50-year-old protocol; chances are you’re not the first one who has this problem.


From RFC 801 (transition from NCP to TCP/IP): "FTP: This is specified in RFC 765. It is very similar to the FTP used with the NCP. The primary differences are that in addition to the changes for Telnet, that the data channel is limited to 8-bit bytes so FTP features to use other transmission byte sizes are eliminated."

So FTP is older than the standardisation of byte size :-)


I feel like 7-bit mode stuck around for a while. I remember seeing the option in modern FTP clients.


I believe that option was more to do with transparently converting between DOS and Unix line terminators. I definitely recall several ruined 2400bps modem downloads in the mid-nineties, because I'd failed to issue a TYPE I command.


Not so surprising when you consider it was developed in the era of 36-bit word machines.


I thought a byte meant “by eight”.

What’s a nibble, if a byte is 8 bits?


Wikipedia has some info (https://en.wikipedia.org/wiki/Byte#History_of_the_conflictin...):

"The size of the byte has historically been hardware-dependent and no definitive standards existed that mandated the size. Sizes from 1 to 48 bits have been used.[4][5][6][7] The six-bit character code was an often-used implementation in early encoding systems, and computers using six-bit and nine-bit bytes were common in the 1960s. These systems often had memory words of 12, 18, 24, 30, 36, 48, or 60 bits, corresponding to 2, 3, 4, 5, 6, 8, or 10 six-bit bytes. In this era, bit groupings in the instruction stream were often referred to as syllables[a] or slab, before the term byte became common."

If you want to be specific, you could say "octet" to describe a set of 8 bits.


AFAIK, before everything was byte-addressable and this became standard, people used to talk about "word" sizes. Some computers would have 36 or 38-bit words. The CDC 6000 had 60-bit words, where that was the smallest size value you could address in memory. For things like characters, you would have to pack multiple chars per word. The Symbolics 3600 Lisp Machine had 36-bit words, and it used some of those bits to tag different data types so every value in memory had some basic type information attached to it.


By the way, the handy word for an 8-bit byte is “octet” (I have not observed octet used in the software industry, only in school. Don’t use it at work or you might come across as an ass...)

I was curious and searched HN and found this 2012 comment explaining octet is essentially an anachronism, as bytes have been standardized at 8 bits: https://news.ycombinator.com/item?id=4649528 . Feeling old!


> By the way, the handy word for an 8-bit byte is “octet” (I have not observed octet used in the software industry, only in school. Don’t use it at work or you might come across as an ass...)

I see it used (and use it myself) from time to time in contexts where you want to differentiate between 7/8/9-bit bytes, mostly when working with embedded/low-level software.


The place I see it most is in networking RFCs.


Beat me to it. "Octet" is used in a lot of networking documentation, and it also nicely decouples wire formats from any meaning the host system might have for the information. From IP's point of view, an octet is literally that: a bag of 8 bits, without meaning. It's not part of a floating point number, or a UTF character, or anything else but just plain undifferentiated data.


Thanks. My application layer bias is showing. We will always need octet in the low-level networking domain, huh.


> I have not observed octet used in the software industry, only in school.

Commonly, the word octet is used to refer to IP addresses, and the permission octets of files in nix systems.


application/octet-stream


Octet is the word used for bytes in French. Data sizes in French locale are expressed most often in ko, Mo, Go, To ...


I remember my Commodore 64's book (in Spanish) mentioned octets. This brings back memories!


A nibble is half a byte or 4 bits.



Quote:

    > nybble: /nib´l/, nibble, n.
                        ^---- see that word over here?


I hereby declare 256 bytes a whopper, and 512 bytes a double whopper


1024 bytes is a vomit, 4096 is a heart-attack.


SHA-whopper


Nibble and nybble are alternate spellings.


> I thought a byte meant “by eight”

There are differing sizes for bytes. This isn't so common anymore, but that was the point being made.

> What’s a nibble, if a byte is 8 bits?

A nibble is 4 bits, not 8 bits.


On the other side, the upcoming Firefox 90 is removing support for FTP:

https://blog.mozilla.org/addons/2021/04/15/built-in-ftp-impl...


This could just be auto-tested, as the code would rarely change. It's like Firefox development has a dartboard of unique and useful features to be randomly removed.


Aww man, I wonder what the reason is. Firefox is always my go-to application for opening things I only use every once in a while.


Because -very- few people use it, per their telemetry, so they don't want to maintain it anymore. It makes sense to throw out a sizeable chunk of code that is only being used by 0.1% of your users. lftp is a great command-line replacement for it and extremely scriptable.


They kinda need to stop acting like their telemetry is the full picture. Their telemetry led them to believe that compact mode was barely used, yet look at the community backlash (which they summarily ignore because "muh telemetry").


Not that I think we should go back, but I do miss the wildwest days of web development when it was still acceptable to FTP untested code straight to production.

For my first dev job we would develop on production, using an FTP client to push up changes on save. One day I was writing an SQL UPDATE statement and I forgot to include a WHERE clause. I basically nuked the entire production DB, and it took days to recover because it was also not unheard of to not have regular DB backups. No one really questioned it too much though. Stuff like that just happened from time to time back then.

I rarely ever use FTP today. I wonder if students learning to code today even know what FTP is? It was one of the first things I learnt when learning to build websites as a teenager, but I don't think that's the case anymore.

It's kind of interesting how processes in tech have evolved as much as the technology over the last few decades. It's hard to think of a good use case for FTP anymore, but just a couple of decades ago it was used everywhere. Is anyone still using it for anything?


The backend team in one of my previous jobs (around 2010) had an interesting file locking mechanism.

All the developers sat around one table, but you couldn’t see each other behind the dual screens. So every hour or so, somebody stood up to inform the team that he was going to be editing file xyz, so please don’t touch.

It wasn’t production though, everyone was just working on a shared staging system.


>an interesting file locking mechanism

Only 5 years ago, we had an 'editing stick'. Only the person physically in possession of the editing stick was allowed to edit the configs of our (early-90s-vintage) SCADA system.

Frankly the system worked very well: "if it ain't broke, don't fix it".

We only got rid of the editing stick when the machinery in question was scrapped.


Don’t worry, SFTP is the backbone of the US financial system.


That’s a lot more modern than the expected CSV-over-SMTP solutions.


But SFTP is nothing like FTP (in terms of protocol).


Preach. I was setting up SFTP batch jobs for financial data last year. New jobs, not migrations.


Hopefully with a modern encryption selection that doesn't have known short circuit attacks?


...and the US healthcare system.


And big mobile telecoms as well.


I’m a youngin’ and I regularly use it to transfer ROMs to hacked portable systems (PSP, Vita, 3DS, Wii). It’s convenient to transfer larger ROMs this way.


I still use it to sync folders to my iPad (from Documents by Readdle). Since walled gardens don't leave many options open, it's basically either that or a cloud provider -- and no free cloud plan has enough space to store these folders.

So FTP wins easily this one. It's free with unlimited storage, always has been, always will be.


Is that FTP, or SFTP? Despite the similar name they have almost nothing in common.


I just use FTP.


It's still used a lot in finance for file delivery.


I like the user-facing simplicity of FTP as a text-based interface for browsing and downloading/uploading files. No broken links, as the directory layout generally doesn’t change every other year, no eternal-beta web UI, diverse choice of powerful native clients. It’s unfortunate that the protocol itself didn’t hold up so well.


FTP is still used by some photographers at major sporting events. The cameras they use actually have RJ jacks in the side, and they run Ethernet cables back to a router, or maybe a second operator with a computer. The cameras have built-in FTP clients, and can upload image files as they are taken. It's apparently important for still images to appear in near-real-time on sports web sites.


Fun trivia: in the early days of the ARPAnet, mail was delivered (at least on PDP-10s) by FTP-appending a message into the recipient's mailbox file (which was protected as append-only to world (owner, group, world)).

Didn't last long, obviously, but that was back in the days when every site had a well-publicized guest login and you could telnet anywhere. (Well, OK, in 1972 there were only a few dozen nodes, but there were some really interesting ones to play with.)

We used to play a game with telnet from HARV-10 where we'd telnet-chain around the world until someone dropped the connection or it got too slow.


Back in the days of X.25 and the UK "coloured book" protocols, grey book email went over blue book "network independent FTP" (in batch). I don't know how NIFTP compared with ARPANET FTP as a protocol, though.


Nice, thanks for sharing this! Sometimes I forget 50 years doesn't refer to the 50s or 60s anymore :)


On reading the headline my first thought was "wow, FTP is way older than I expected" and when I realised it's not, my second thought was: "damn, I'm old..." :)


it will again soonish


Tangentially related. About 10 years ago I considered an idea that would allow websites to accept large files. The web admin would either embed our company's page as an iframe or just link to our whitelabeled url, such as "mydomain.ftphub.com". The admin would then get an email with a download link. Or they could reverse the process and send a link to a customer the way yousendit does.

I still own the domains ftphub.com and ftphub.net, and I put the domains at auction since it looks like I'll never get around to it.

Should I work on this? Or has this opportunity been commoditized to the point where there is no way it could turn a profit? Also, has anyone under 30 even heard of "ftp"? The abbreviation is in our brand, but I'm not sure it's meaningful anymore. Thanks, and sorry if this is too off topic.


Under 30, and yes, I've heard of FTP. I used it at school to submit assignments for some classes, and I use FTP (albeit sparingly) at my current place of work.


> Jimmy Hendrix died 6 months ago

Pedantry: It's "Jimi".


I wouldn't call it pedantry, it's the guy's name


This places the creation of FTP closer to the collapse of the Ottoman Empire than the present day


And the author of the protocol is still alive:

https://en.wikipedia.org/wiki/Abhay_Bhushan


What's the oldest protocol that's still regularly used?


Perhaps the oldest protocol for automated (in terms of encoding/decoding) text communication that's still being regularly used?

https://en.wikipedia.org/wiki/Baudot_code#ITA2

> In 1924, the CCITT introduced the International Telegraph Alphabet No. 2 (ITA2) code[14] as an international standard, which was based on the Western Union code with some minor changes.

> ITA2 is still used in telecommunications devices for the deaf (TDD), Telex, and some amateur radio applications, such as radioteletype ("RTTY").

It's interesting that the fundamentals of this were invented in 1870s. Who needs semiconductors when you have gears and levers?

https://en.wikipedia.org/wiki/Émile_Baudot


British naval signal and letter flags codified 1817.

Morse code from the 1840s. Ham radio still uses it when the signal is too weak for voice. Proficiency was required for a ham license until 2006.


In my college comp-sci networking class, one of our projects was to figure out a system for sending messages visually between two parties spaced 100 yards (91 meters) apart as quickly as possible.

Each team of four came up with their own signaling method, trying to balance how many bits were transmitted with each symbol with how quickly we could encode/decode the message.

Binoculars were not allowed. We were outside, so we didn't know how windy it would be.

I seem to recall on my team we were sending two bits per symbol. I think it was just a large cardboard box; we colored each of the four flaps a visually distinct color, then unfolded one of the flaps for each symbol.

I don't recall a single team thinking to re-purpose naval flags.

Duh.


https://en.wikipedia.org/wiki/Flag_semaphore would have been good if you needed to send arbitrary text.

Naval flags would be higher bandwidth if you could pre-arrange a modest-sized dictionary of the words and phrases you’d need.

https://www.navalgazing.net/Signalling-Part-1 is a good read.


It’s been over two decades, but I think the requirement may have been an arbitrary bitstream; if not that, it was likely any of the visible ASCII characters.


There is also flag semaphore which is very fast and very recognizable and gives you about 5 bits per signal.


Morse may not be the oldest but my bet is on it being by far the most widely still used of anything nearly two centuries old. I got into ham radio a decade ago and learned Morse (poorly) just because. It already wasn't a requirement, but so many still use it.


Hopefully there's an app for listening to it and interpreting it? I bet those old-timers get pretty fast with the Morse.


Braille is originally from a similar timeframe and carries six bits. Modern enhancements have focused on the protocol above the payload to make it more efficient.


If that counts, then so do Chinese characters (1200 BC or older)


Interesting question and I guess it depends what you mean by "protocol"[1]. If simple characters-over-the-wire counts, then ASCII was developed throughout 60s and was itself derived from much older telegraph standards. Teletype is arguably more of a "protocol" and is actually even older, going back to the 50s, and arguably the 30s in some form, and still forms the basis of all terminal computing today including (especially) across networks.

[1] Sometimes it seems like every interesting discussion boils down to definitions, doesn't it?


If we're limiting it to "used by computers", and "actually used in some meaningful non-hobby volume" maybe the T1/DS1 TDMA protocol? That dates back to 1962.

Or maybe Morse code counts as a protocol? It's still around in VOR (VHF omnidirectional range) and NDB (non-directional beacon) in the aviation world.


RS-232 (1960) is still going strong.


Snail-mail ain't dead yet.


I believe Semaphore predates it


How so? Mail couriers are surely much older than semaphore systems of any kind.


IBM z/OS (né OS/360) is 57 years old and still actively developed, so maybe something used there?


"What's the oldest protocol that's still regularly used?"

People are answering with old telephone codes and British naval signals, etc.

You should be thinking:

- The muslim call to prayers.

- The affirmation of faith, as it is spoken in mass or (protestant) church ... or the first and second readings, followed by a Gospel reading.

- The Jewish rite of circumcision (which is, among other things, almost certainly a signaling mechanism).


If we're abandoning automation:

https://en.wikipedia.org/wiki/Protocol_(diplomacy)

> The term protocol is derived, via French and Medieval Latin, from the Greek word πρωτόκολλον protokollon "first glued sheet of or onto a papyrus-roll".

> The rules of protocol create space where meetings can take place.

https://www.protocol.dubai.ae/About/Protocol-History

> The diplomatic relations that existed between Egypt and Babel, which started in 1450 B.C., included highlights on the application of standards for Protocol and Etiquette that were related to both diplomatic immunities as well as receptions and ceremonies.


I would hate to be involved with any generic means of communication that required circumcisions to impart information. That sounds like a rather gory way to tell my wife I'll be late for dinner.


But it was a great way to tell your wife not to make pork for dinner. At least until it caught on beyond the intended market.


That depends on how you define "protocol". ASCII dates from 1963, and that's still the baseline for text-based protocols.

FTP predates Ethernet, Token Ring and ARCNET. It predates TCP/IP. For an actual protocol, and not just a simple format specification like ASCII, it doesn't get older than that in current use.


ASCII is a very obvious format. If you take the English alphabet and symbols, it fits almost perfectly in 7 bits. In 5th grade I tried encoding English in binary, and noticed how I had accidentally reinvented ASCII.


I wouldn't say it's obvious. There's some careful decisions in the design that might not be obvious at first sight, such as having upper and lowercase characters be the same save for a single bit.


Or the numbers 0 to 9 at 48 to 57 (i.e. just mask out bits 4 and 5)
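Both tricks in one small sketch (plain JavaScript):

    // Case differs by one bit; digits decode with a simple mask.
    String.fromCharCode('A'.charCodeAt(0) | 0x20);  // 'a' (set bit 5)
    String.fromCharCode('a'.charCodeAt(0) & ~0x20); // 'A' (clear bit 5)
    '7'.charCodeAt(0) & ~0x30;                      // 7  (clear bits 4 and 5)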


I see. I would retract my statement if HN let me edit.


Retract it at the bottom where it is least likely to be seen like we do in modern papers/news sites.


Encode English in Binary in 5th grade? You must be a smart cookie.

In 5th grade, I had the attention span of a fruit fly. Probably would have tried drawing phalluses in ascii if I had known about it.


Morse would be older.



Ha, I was like "REALLY!?!?" before actually clicking the link. I was thinking of: https://en.wikipedia.org/wiki/Shibboleth_Single_Sign-on_arch...

...which is an unlikely contender for oldest protocol!


You can consider it a successor protocol :-)


DTMF “touch tone” telephone dialling dates from 1963


100V DC T1 lines still exist, I'm told


XON/XOFF?


Just in time for it to be completely removed from Chrome, Firefox, etc.


I don't know. Maybe I'm just of a different time, but I feel like FTP is outside the scope of a browser.


Choosing an FTP server in 2021 is a decision between two options, bad and worse. vsftpd is easier to set up than ProFTPD, but its last release was 6 years ago...



Does it matter if the last release was six years ago, as long as there are no security bugs? It's not like FTP servers need new features or anything...


One of my first real internet experiences was downloading random crap via FTP from WSMR (https://en.wikipedia.org/wiki/Simtel)


Just adding to the praise. I still use FTP regularly and greatly appreciate it.

It's one of those things that lives up to its name: it's a file transfer protocol, and is still, practically, the best.


And just the other month, we switched a transmission from SFTP to FTP because we couldn't be bothered to deal with the bottlenecks in the SFTP stack. HTTP? Maybe next year :-)


The worst part is not that it's 50 years old. The worst part is that some places still INSIST on using it. For instance, I have to deal with FOTA and it's a pain.


Older technologies are more reliable and last longer, because they have survived the changes of time. That also means if you learn older technology (my rule of thumb is 15+ years; think Elixir, Lisp, Haskell, Python, Java, C++), your knowledge is unlikely to expire.


FTP is not reliable. It has survived (though it's nearly dead) only due to entrenchment.


I was surprised to discover that FTP (or SFTP) is not reliable. A couple of years ago, I discovered that there were mismatches between files sent to a remote SFTP server using Perl’s `Net::SFTP` and the checksum that was sent after the complete set had been transmitted. The uploaded files were a few bytes smaller than they should have been.

I didn’t have the time/resources for a deep dive to determine the root cause of the problem so my work-around at the time was to use `stat` to compare file sizes of the local and uploaded file. If they didn’t match, the file was simply re-transmitted – and it always worked the second time. ¯\_(ツ)_/¯


AFAIK SFTP is not the same as FTPS. SFTP is based on SSH, whereas FTPS just adds SSL/TLS on top of FTP, so that wouldn’t be an issue with FTP, but with SSH.


> though it's nearly dead

There are still millions of FTP servers running in the wild: https://www.shodan.io/search?query=ftp


I think you mean Erlang there and not Elixir. Elixir itself is only 10 years old, Erlang is 35. That said, Elixir does seem to be pretty stable, I've not run into major issues when learning it with regard to dated material being "wrong" (either actually wrong with regard to the current language incarnation, incomplete like pre-`go mod` golang materials, or "off" from the current idiomatic use of the language).


I had a finance-industry client require we upload to their FTP server because they couldn't trust downloading the file over HTTPS from our server with a secure/unique URL.


Did they have any rationale for that?


I'm not in finance, but aerospace/defense is similarly change averse. Often things only change when the cost of the old way becomes too great (and even then it's hard) or causes a failure of some sort (security, bad release, missed milestone costing money). I recall at my first job that we had to sit a senior engineer down (in the hierarchy a level or two above the software team) and ask him to flip a switch on a relay board, reliably, every 1/10th of a second to simulate a test scenario. That was the thing that finally persuaded them to ok the purchase of equipment to automate those tests.

And this was a year or so after a major, and costly, rework was required due to a real-world system failure (no fatalities or injuries, fortunately) that would have been caught if testing had been more thorough. But testing wasn't more thorough because we simply couldn't actually flip switches fast enough to simulate a wide enough variety of behaviors, so the testing was woefully incomplete and the timing issue at the heart of the failure was sufficiently complex to be non-discoverable in code and design reviews. Literally having to spend several million out of pocket to fix a broken system wasn't enough on its own to cause them to let us improve our testing methods.

Worth noting, the cost of the new testing equipment was only in the 5-figure range, not the 7-figure range of the rework.


Security was their whole rationale given.

I had some related thoughts. They may have a strong content filter on their incoming web. I was sending them a CSV but we had to talk about it like it was a proprietary-format Excel document for the sake of those who would eventually be loading it with Excel.

My best guess eventually was that FTP was considered the best way to go around their content filter on purpose. Makes negative sense to me.


It's still pretty huge in the enterprise behind-the-scenes world of EDI exchange. I work in the railroad industry and use it pretty much daily (in addition to sftp, ftps, and other protocols).


I'm 54, so news like this is a little bit of a scary reminder of my mortality. :-(

Still, a couple of years back now I had to set up an IP-limited SFTP server for our comms team. But I made them 'sign in blood' that it was a temporary solution until their supplier could implement a more secure alternative. Like I said, it was a few years ago...

FTP definitely still has its uses, it just needs careful thought and setup.


Can confirm, it is still very popular, so much so we even built a cloud service to connect FTP to your S3 or Azure buckets, to future-ise it somewhat. Come check it out if it's something you might need, https://docevent.io


It's been about a decade since I've used FTP. I'm trying to remember if there is anything it provides that SFTP/scp does not? I remember the egress ports being a pain, and encrypting the traffic required SSL certs and was even more of a pain.


The FTP protocol has support for this:

    $ /usr/bin/ftp
    ftp> help
    Commands may be abbreviated.  Commands are:
    ...proxy...
While I've never been able to make that work correctly, it supposedly allows transfers between two remote hosts (all cleartext of course). The logins work, but the gets and puts don't.

It didn't seem worth the time to set up, and similar functionality has been added to SSH file transfer.

The SITE command can also run arbitrary programs on the server (becoming something like an rsh), which will never be added to SFTP. I've actually written a couple web applications that drive a VAX with "site spawn" and some DCL.


> allows transfers between two remote hosts

That’s commonly known as “FXP”, and has been disabled for security reasons on all modern platforms.


> It's been about a decade since I've used ftp. I'm trying to remember if there is anything it provides that sftp/scp does not?

The FTP protocol provides lots of features that SFTP doesn't. However, most of those distinctive features are rarely implemented in FTP clients for Unix-like systems or Windows. It is more common to find them implemented in mainframe or minicomputer FTP implementations:

- Record-oriented files (STRU R). Unix-like systems and Windows don't have any concept of record-oriented files (where the filesystem is aware of record boundaries–on Unix and Windows, record boundaries are an application-level concept only.) Platforms such as IBM mainframe operating systems, and OpenVMS RMS, do have such files, and so FTP implementations for them often support STRU R

- Block mode file transfer (MODE B). This is where transfers are broken into blocks and each block has a header with a length and flags (the framing is sketched below). You need this to transfer STRU R files (especially binary STRU R files). Unix and Windows FTP programs generally only support stream transfer mode (MODE S)

- Flagging data blocks as corrupt (in a MODE B transfer). This was intended to be used when the file being FTPed is being read directly from a tape. If one of the tape blocks has an invalid checksum, and repeated reads fail to read it with a valid checksum, you can transfer what you read from the block, but set a bit in the block header to indicate the data could be corrupt

- Compressed transfers (MODE C). This is like MODE B but blocks can be compressed. Unfortunately the compression is really basic, just run-length encoding. Some non-standardised extensions to FTP add support for better compression formats (e.g. "MODE Z" for zlib), but those generally are adding compression to MODE S not MODE B.

- Built-in ASCII-EBCDIC conversion (TYPE A vs TYPE E). Unfortunately, this is not aware of the existence of different variants of ASCII and EBCDIC ("code pages"), although there are some non-standardised extensions to add that (which IBM mainframe FTP servers/clients commonly implement). There was also an Internet draft to define a TYPE U explicitly for UTF-8, but it never advanced to an RFC [0]

- Metadata to indicate the carriage control format used in a text file ("FORM"). ("Carriage control" is about telling old-fashioned line printers how to print each line.) You can mark a file as using either ASA carriage control [4] (TYPE A A, TYPE E A), or TELNET carriage control (TYPE A T, TYPE E T; this means carriage control using CR, LF, HT, VT, FF, etc). This feature is rarely implemented except on mainframe platforms where this metadata is supported by the filesystem. (It isn't in the FTP standard, but I believe some IBM mainframe FTP clients/servers also support the IBM proprietary 'machine code' [5] carriage control as well.)

- Non-8 bit bytes. This was commonly used with 36-bit operating systems, for example TOPS-10 or TOPS-20, to transfer files made up of 36-bit words (TYPE L 36).

- "Paged files" (STRU P). These are files composed of "pages" (blocks) where each page has a header, you can have sparse files (some pages not present), even different access control for each page. Unfortunately, the specification is overly specific to the needs of TOPS-10, and can't really be used on other platforms that have the same concept but implement it in a different way. RFC1123 recommends not to implement this feature.

- Account numbers (ACCT). This lets you supply an account number, as well as username (USER) and password (PASS). This was used on some mainframe systems, to bill each transfer to a particular account. (The same user may have access to multiple accounts, and this lets them choose which one to bill the transfer to – consider the case of an academic working on multiple research projects simultaneously.)

- Structure Mount (SMNT). This lets you mount a filesystem over FTP. Most commonly used for DOS/Windows FTP servers to change drive letters.

- Store Unique (STOU). Allows you to upload a file without choosing a name for it. The server chooses a unique name for you, and at the end of the transfer tells you the name it chose.

- Allocate space for a file (ALLO). This allows you to allocate disk space for a file upfront before you transfer it. For record-oriented files (STRU R) or page-oriented files (STRU P), also supports specifying the maximum size of a record/page. Especially used in IBM mainframe operating systems where you are expected to say how big your file (dataset) is going to be before you write any data to it. (In principle, this could be implemented on Linux using fallocate, but in practice Linux FTP servers just ignore the ALLO command.)

It is common for IBM mainframe systems to prefer FTPS (FTP over SSL/TLS) over SFTP, because they actively implemented and use some of the above features which are unique to FTP and lacking in SFTP. IBM mainframe FTP servers also often offer lots of proprietary (non-standard) features in their FTP servers, such as setting dataset allocation parameters, submitting batch jobs (JCL), checking batch job status and retrieving batch job output, executing MVS console commands, running SQL queries (especially with DB2). Since FTP commands are ASCII, it is easy to invoke commands not supported by a particular FTP client simply with "QUOTE SITE". That generally isn't possible with SFTP, since its commands are in a binary format.

There are also some interesting FTP extensions defined for use with high-performance computing – GridFTP [1] [2] [3], including an extended version of block mode (MODE E) with support for "striped" transfers, in which multiple connections (possibly even running across different hosts) transfer different parts of a very large file.
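To make the MODE B framing mentioned above concrete: each block on the data connection is a 3-byte header followed by the payload (RFC 959). A minimal sketch in Node.js:

    // Descriptor bits (RFC 959): 128 = end of record, 64 = end of file,
    // 32 = suspected errors in block, 16 = restart marker.
    function blockHeader(descriptor, byteCount) {
      const buf = Buffer.alloc(3);
      buf.writeUInt8(descriptor, 0);   // flags for this block
      buf.writeUInt16BE(byteCount, 1); // payload length in bytes
      return buf;
    }

    blockHeader(64, 1024); // header for a final 1024-byte block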

[0] https://tools.ietf.org/html/draft-klensin-ftpext-typeu-00

[1] https://www.ogf.org/documents/GFD.20.pdf

[2] https://www.ogf.org/documents/GFD.21.pdf

[3] https://www.ogf.org/documents/GFD.47.pdf

[4] https://en.wikipedia.org/wiki/ASA_carriage_control_character...

[5] https://en.wikipedia.org/wiki/IBM_Machine_Code_Printer_Contr...


This might be the best reply I have ever gotten. Thank you. This certainly explains why I hear about some companies enforcing FTPS. It kind of makes me wish SFTP was just full-fledged FTP tunneled through SSH. I cringe when I remember trying to explain active/passive FTP to users when I did tech support for a local ISP, or later to server admins when I managed firewalls.


The one big mistake the original protocol made: when sending a file, it just starts streaming bytes without first sending the total size of the file. That makes it impossible to know whether the transfer completed 100%. Hence you have to be able to resume a partial transfer.



So, do we still have to give our email to enter in Anonymous mode? And can we use PASV?


I always thought it was clever that the PNG format was designed so that readers could catch various kinds of transfer errors, including the error of accidentally using FTP ASCII mode (rather than binary mode). That way you get a very clear error message, rather than some confusing CRC failure.

http://www.libpng.org/pub/png/spec/1.2/PNG-Rationale.html#R....

(ASCII mode automatically translates line-endings, which can break non-text formats in confusing ways)
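Concretely, the 8-byte PNG signature contains a CRLF pair, a lone LF, and a byte with the high bit set, so line-ending translation (or a 7-bit channel) corrupts it immediately. A small sketch of the check in Node.js:

    // \x89 P N G \r \n \x1a \n -- deliberately fragile under
    // line-ending translation and 7-bit transfers.
    const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

    function looksLikePng(buf) {
      return buf.subarray(0, 8).equals(PNG_SIGNATURE);
    }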


You never had to enter a real email address (or anything that looked like an email address) on any anonymous FTP server I've ever used. You could usually leave it blank, but if a server really wanted me to enter something and I didn't feel like mashing the keyboard, I would sometimes use the address of the server's administrator or some well-known email address that wasn't mine.


You can just put webmaster@ (no domain) and that almost always works if there's any check. Maybe ftpmaster@ would be more appropriate though.


I always used me@myhouse.org :-)


Old but not obsolete.


Definitely, there are about 10 million servers with port 21 open (source: https://www.shodan.io/search?query=port%3A21)


One of those is definitely mine. I have an anonymous FTP server up and serving a bunch of old goldsrc game maps and mods, mostly for my own nostalgia but sharing for anyone else who liked the original Half-Life series and mods as much as I did.


Modern cryptography is almost 50 years old.


Happy Birthday!


Thank you FTP for giving us Spotify.

FTP -> Ratio servers -> Napster -> MP3.com -> iTunes -> Spotify.


FTP's contribution to music industry:

FTP -> Ratio servers -> Napster -> MP3.com -> 99c songs on iTunes -> 24x7 streaming music.



