Decommissioning a free public API (cambus.net)
170 points by andrewaylett on Feb 1, 2016 | 59 comments



As someone who has been running a free API for coming up on 5 years now, I really don't think you handled the shutdown very well.

I understand you were offering a free service, but nowhere on your site did you ever mention it was an "experiment" and that you might shut it down at any time. If that was the goal, I think it's only fair to give your potential users notice. Is it a smart idea to build anything on a free service? No, and this serves as a prime example of why not. I think we need to be a bit more responsible with how we handle these shutdowns. If you have a heavily used (180 million requests per day!), free, public service, you should really plan on decommissioning it slowly, rather than giving a two-week heads-up before an abrupt hard shutdown.

I feel for the guys who had to come up with a quick patch for the set top boxes. They most likely had nothing to do with implementing your API, but now there are thousands of customers whose set top boxes are crashing because you were unwilling to work something out with them. It's not your fault, but it's not their customers' fault either. In this case, minimal work on your part would have given them the chance to patch and push a firmware update out. I'm not sure what really happened, but from what you wrote it didn't sound like you really gave them any options, which seems pretty irresponsible.


Their set top boxes were crashing because they couldn't handle an error code properly; any other position is untenable.

These same set top boxes should have also crashed in the case that a connection timed out or the service was temporarily unreachable.

You don't make a call to a service over the internet (whether you're paying for that service or not, whether it's your own service or not) without handling the case where it didn't succeed. There is no such thing as 100% uptime for a service (or a link between you and a service) and not defending against failure here is just tempting the fate that they ultimately suffered. I do not sympathise with them in the least.
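A minimal sketch of that kind of defensive client, assuming a hypothetical `fetch` callable that performs the HTTP request and raises `OSError` on failure (as `urllib.request.urlopen` does for timeouts, refused connections and HTTP error statuses). The names `lookup_with_fallback` and `DEFAULT_LOCATION` and the retry parameters are illustrative, not taken from the article:

```python
import time

# Hypothetical default used when the remote lookup is unavailable.
DEFAULT_LOCATION = {"country": "unknown"}


def lookup_with_fallback(fetch, retries=3, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call fetch(); on failure back off exponentially, then fall back to a default."""
    delay = base_delay
    for attempt in range(retries):
        try:
            return fetch()
        except OSError:
            # Covers timeouts, refused connections and urllib's HTTPError (403, 404, ...).
            if attempt < retries - 1:
                sleep(delay)
                delay = min(delay * 2, max_delay)
    # Every attempt failed: degrade gracefully instead of crashing.
    return DEFAULT_LOCATION
```

The point is only that a failed call ends in a bounded number of spaced-out retries and a usable default, not in a crash or a tight retry loop.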

And that's disregarding the error of having SLA expectations for a service where you don't have an SLA because you are not a customer.


I never disagreed with that. I agree that not only was their implementation of the API unwise, but it was done incorrectly. However, with that being said, the author had a choice to either help find a temporary solution or leave them high and dry. He opted for the latter, and that's the part I disagree with. It wasn't his fault, but he had a chance to make a significant impact for fellow developers (who probably did not implement the API), and more importantly a ton of unsuspecting people at home, and chose not to.


I don't know why he didn't offer to host it for the two-week period in exchange for reimbursement for the cost of running the service, or cost+20% if he wasn't feeling charitable. That seems like the obvious mutually-beneficial solution. And if they turned him down, then it's purely on them at that point.


This. This is what I would have done. I get a lot of emails with special requests along these lines. I used to do the simple ones for free and just ignore the larger ones, but a dear friend told me to just tell them my hourly rate. It turns away the non-serious and keeps the ones that really need it, and either way benefits you.


I speculate that after a week-end of nightmare, he was just happy to have implemented a solution that offloaded those trillions of requests onto GitHub, has put it behind him, and was not quite eager to make it his problem again.


You want the author to do more than just minimal work. The minimal work was the two weeks notice. "I stopped the service as planned, on November 15th, after a two weeks notice." They had a chance to patch and didn't take advantage of it.

If a company is using an external business-critical API and isn't bothering to pay enough attention to it that they see and respond to a two week shutdown notice, nothing is going to make them transition except a hard shutdown. And the service provider's request, to redirect the author's domain to their own endpoint, is obviously a non-starter. Maybe they could have offered to pay the growing hosting bills instead.


> They had a chance to patch and didn't take advantage of it.

Did they though? From what I can tell, the author never required users to 'sign up' for the API, he simply said here's the end point, here's the format, go at it. That means he is/was unable to reach out directly and notify his users via email. He's relying on them visiting the website on a daily/weekly basis just in case something changes. I don't think that's a very fair expectation to have out of anyone.


> he simply said here's the end point, here's the format, go at it.

I think this is the real lesson. Don't encourage people to take dependencies on you without requiring they give you a way to contact them.

Another lesson here is the value of the "scream test". If he'd turned off the service for a 24-hour period, it would have been enough to get most people's attention. This is doubly-true if he returned the 403 with a body that stated when the hard cutoff was.
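As a rough sketch of such a response, assuming a JSON body and a hypothetical `shutdown_response` helper (the field names and the info URL are made up for illustration; the cutoff date is the one from the article):

```python
import json
from datetime import date
from http import HTTPStatus


def shutdown_response(cutoff, info_url):
    """Build (status, headers, body) for a request made during a scream-test outage."""
    body = json.dumps({
        "error": "This API is being decommissioned.",
        "hard_cutoff": cutoff.isoformat(),
        "details": info_url,
    })
    headers = {"Content-Type": "application/json"}
    # 403 matches the status mentioned above; the body carries the actual message.
    return HTTPStatus.FORBIDDEN.value, headers, body


status, headers, body = shutdown_response(date(2015, 11, 15), "https://example.com/shutdown")
```

Anyone who looks at the failing response, rather than only the status code, then sees the date without having to visit the website.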


I think that's a pretty big red flag for anyone using the API to think that maybe they should prepare for this thing to go down at any moment.


What would you suggest would have been fair in this case?


It's totally up to the individual, but if you can't contact your users directly, then I'd start by putting the shut down notice on your page with probably 3 months notice, as well as a warning that there will be a 24-72 hour outage in 30 days time, and another one in 2 months that would be closer to a week long. The outage should be long enough to cause major problems for people relying on it. The emails will come flooding in, turn the service back on, notify the users that contacted you, and they now have X months notice to get it fixed up. At least that way you can feel good that you put an honest effort into trying to notify your users. An update with two weeks notice on your website that the developers have long since forgotten just doesn't cut it.
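That schedule could be sketched roughly as follows. The function names are hypothetical, and the concrete numbers are assumptions picked from the ranges above: a 48-hour outage after 30 days, a week-long one after 60 days, and the final shutdown after 90 days.

```python
from datetime import datetime, timedelta


def planned_outages(announced):
    """Return (start, end) windows: a 48 h outage after 30 days, a week-long one after 60."""
    first = announced + timedelta(days=30)
    second = announced + timedelta(days=60)
    return [
        (first, first + timedelta(hours=48)),
        (second, second + timedelta(days=7)),
    ]


def is_available(now, announced):
    """False during a planned outage or after the final shutdown at 90 days."""
    if now >= announced + timedelta(days=90):
        return False
    return not any(start <= now < end for start, end in planned_outages(announced))
```

A request handler would call `is_available(datetime.now(), announced)` and serve the shutdown notice whenever it returns `False`.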



Great writeup, and very interesting lessons to learn! For me, the most surprising part was the fact that a blocked service might cause an insane increase in traffic (due to retries from badly written clients).

However, two things look a bit strange to me:

> On the day following the API termination, I received more than 1TB of incoming requests, [...] it doesn’t make sense for me to keep paying for data transfer overcharges.

This looks like the author chose a very bad hosting plan. For example, Hetzner simply lowers your connection speed if you get above the limit, and you only pay for additional traffic (and speed) if you really want to. Most other providers handle the situation in a similar way.

I know that those plans existed since at least ~15 years ago. At that time, our main criterion on a hosting provider was their traffic payment plan. We considered only providers who had a clear upper bound on the monthly costs.

> I stopped the service as planned, on November 15th, after a two weeks notice.

This seems to be very short period for closing down a service. Of course, there were no paying customers, so they got more than they paid for. However, some trouble could have been avoided by a longer closing period. In particular, it is much easier to get rid of annoying after-period users if you are able to deny their requests with a simple "You had 3 months to prepare for the shutdown!"


> This looks like the author chose a very bad hosting plan. For example, Hetzner simply lowers your connection speed if you get above the limit, and you only pay for additional traffic (and speed) if you really want to. Most other providers handle the situation in a similar way.

As mentioned elsewhere in the text, the core problem was that everything was hosted on the main domain (and IP). Hetzner at least only allows throttling per IP, which would have crippled the whole server. With everything tied together like that, there's little wiggle room.

Takeaway here, use distinct (sub-)domains for every project/product/whatever to prepare for cases like this.

> This seems to be very short period for closing down a service. Of course, there were no paying customers, so they got more than they paid for.

Especially as setting up your own Telize instance seems to take about 15 minutes including OS installation.

> However, some trouble could have been avoided by a longer closing period. In particular, it is much easier to get rid of annoying after-period users if you are able to deny their requests with a simple "You had 3 months to prepare for the shutdown!"

Users dense enough to make their products break on a 404 are rarely troubled by reason. Or deadlines. I doubt they'd have been deterred by a six-month period; embedded code like that is a one-off affair and nobody checks on it until it breaks.


> This seems to be very short period for closing down a service.

That's because "things changed when [Frederic] discovered Telize was being used by malware and ransomware."

> However, some trouble could have been avoided by a longer closing period.

Are you suggesting he should have willfully and intentionally continued allowing his service to be used by malware and ransomware to harm innocent web users? Seems like more trouble was prevented by shutting down as quickly as possible.


Regarding the notice period - it's actually quite interesting, I mean, how were the customers / users notified? I think there was a notice on the website, but it's not like people are going to be checking there very frequently.

In my case, I was actually using Telize for the Geo IP lookup for some non-mission-critical stuff which only ran on demand. I only noticed it was offline in mid-December when I wanted to use it. It took a few minutes to read the notice, and another 5 to find a similar free provider and parse their JSON. But the point remains: I don't think most people were aware of the impending shutdown until it happened. Perhaps switching it off for a day or two and then back on again would have got more people to notice. Either way, it was a free service, so I guess you can't complain much.


I wonder if anyone has proposed a new 4xx error code for the HTTP standard that's something like "Deprecated?" Sort of like 410/Gone but actually meaning "going away soon?" You could continue to return valid content in the response body but any sane client would also notice the 4xx and choke on it.

As a consumer of the API I could then (temporarily) ignore those errors and keep using the message body while I make plans to get off the platform.

EDIT: How does one go about proposing a new HTTP status code, anyway?


Or just set a date and start rejecting 1% of all requests on that date, 2% the next day, etc. Software that retries immediately will get slower and slower, which will hopefully cause someone to investigate before bandwidth costs become ridiculous. Even if it doesn't help the API users to notice it, you can estimate better what it will do when you turn it off for 100% of the requests.
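A minimal sketch of that ramp, assuming one percentage point per day starting at 1% on the chosen date (the function names are hypothetical, and `rng` is injectable only to make the behaviour testable):

```python
import random
from datetime import date


def rejection_probability(today, start):
    """0 before the start date, then 1% on the first day, 2% on the next, capped at 100%."""
    days_elapsed = (today - start).days
    if days_elapsed < 0:
        return 0.0
    return min((days_elapsed + 1) / 100, 1.0)


def should_reject(today, start, rng=random.random):
    """Decide whether to reject a single request; rng is injectable for testing."""
    return rng() < rejection_probability(today, start)
```

With this schedule the service rejects everything from the 100th day onward.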


An API failing intermittently like that would be a nightmare for anyone trying to debug the situation.


You check the website to find out if they are having server problems and you see they have deprecated the API. Problem solved.


(Replying to my own comment)

Looks like you submit proposals here: https://datatracker.ietf.org/wg/httpbis/documents/ I'll lurk a while and learn the culture before entertaining submitting a proposal. ^_^


That's the thing isn't it. Obviously you can serve extra "note: 'service is about to shutdown'" in your JSON, but since it's an API a lot of people aren't going to be seeing it. I guess best option would have been to collect everyone's email addresses using the service and email them, but it might have been flagged as spam.

Imo two weeks is plenty of time since you probably shouldn't be using free services like this in any actual product (or at least not without a backup or at very least proper logging)


It's always interesting to me when people build on top of a service that is free with no guarantee of it existing in the future. 4 years ago I wrote a tiny bit of code to play (it was so simple it can barely be called that) with a framework I came across. It simply took a zip code, looked it up in a sqlite DB and then spit back city/state. If you already had a backend you were better off just taking the sqlite DB and adding it or its data to your own code (it's all on github but I won't link because it's not something I'd ever want to showcase). I threw up a website to play with it and without any effort on my part people started using it. I assume they found it through my github but I can't be sure.

I've received more emails over the years requesting features (JSONP/CORS) and asking questions about it than about any other piece of work I've open sourced. The code is literally 50 lines of custom code and could probably be less than 20 if I was writing it today; hell, it could be less than 50 lines total (no framework needed), yet it is the most starred/most used thing I've written and made publicly available... It gets about 4 million hits a month and I'm sure it's being used in production environments in some places, which blows my mind, but I don't care enough to add logging to check where. It's on a shared host that I've had since I got started, and since I have a number of my own tiny sites and sites I've hosted for friends/family, its cost is pretty much $0. The only cost is renewing the domain name, and at this point $10/yr is worth not dealing with pissed off people (even if it was free and had no guarantee of service).


Sounds like how Telize started. Look at the first[1] article from the author. He decided to bring the service down after being contacted by a malware research company telling him that his service was being used by ransomware.

Would you like to risk someone contacting you and possibly threatening with legal action because they would believe you might have taken part in their data being held hostage?

There was a story floating around with the curl author receiving emails like that just because he showed up in credits. I personally would pull the plug from a free service immediately if I found out it was used by malware and I would have no accountability for the person I was enabling with my software.

[1] - http://www.cambus.net/adventures-in-running-a-free-public-ap...


>Would you like to risk someone contacting you and possibly threatening with legal action because they would believe you might have taken part in their data being held hostage?

My dad used to tell me, "You haven't hit the big time until someone tries to sue you." Think about all those unicorns getting their first big funding checks followed immediately by patent trolling lawsuits.

I wonder how someone could simply shut down a service with 180 million hits a day like that. Just cold turkey.

What if he had instead given two weeks' notice of a paywall? Those set top box people would have forked over whatever he asked for a couple of weeks at least while they scrambled for a cheaper solution. Keep the price low enough and they might just stay forever.


> Those set top box people would have forked over whatever he asked for a couple weeks at least while they scrambled for a cheaper solution

The cheaper solution is to let their customers suffer the outage. Android-based STBs are not the most profitable of businesses (their business model is to aggregate copyright-infringing streams from the Internet and slap a self-contained UI on top of that). They're like $20 a pop delivered.


"It's always interesting to me when people build on top of a service that is free with no guarantee of it existing in the future."

1. free 2. no guarantees

Think of a web site that is free to access, e.g., one that lets a user query a web crawl or a database of user-submitted personal photos, that also sells ads to anyone wanting to buy.

"Customers" are allowed to buy ads, but how far into the future?

Can the web site change its terms at any time?

Can you think of an example?


This is strikingly similar to previous incidents with the NTP protocol (https://en.wikipedia.org/wiki/NTP_server_misuse_and_abuse). Several of these incidents also displayed the "retry immediately on failure" behavior.

In this case, since it's TCP, there might have been an alternative solution: use the iptables hashlimit module (or a similar facility in his operating system) to drop the SYN packet if the client retries too fast. The sending TCP stack should then wait for a reasonable timeout before trying again, assuming the application only retries after the previous connection attempt has finished.
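To illustrate the per-source limiting idea outside the kernel: the sketch below is not iptables configuration, just a hypothetical in-process token bucket keyed by client address, showing the kind of allow/drop decision such a rule makes for each new connection attempt. The class name, rate and burst values are arbitrary assumptions.

```python
import time


class PerSourceLimiter:
    """Token bucket per source address: allow `burst` attempts, refilled at `rate` per second."""

    def __init__(self, rate=1.0, burst=5, clock=time.monotonic):
        self.rate = rate
        self.burst = burst
        self.clock = clock
        self.buckets = {}  # source address -> (tokens, last_seen)

    def allow(self, source):
        now = self.clock()
        tokens, last = self.buckets.get(source, (float(self.burst), now))
        # Refill in proportion to elapsed time, never beyond the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        allowed = tokens >= 1.0
        if allowed:
            tokens -= 1.0
        self.buckets[source] = (tokens, now)
        return allowed
```

A client that retries immediately exhausts its own bucket and gets dropped, while other sources are unaffected.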


Another possibility would be to use the tarpit netfilter extension:

http://www.netfilter.org/projects/patch-o-matic/pom-external...


You should have asked the manufacturer for money! They were probably a large enough company that you could have named your hosting expenses and they would still have paid, just to avoid unhappy customers. Maybe this is unethical, but with '1TB of incoming requests', desperate times call for desperate measures.


It's hardly unethical if his original intention was as he stated: to run a free API as a proof of concept.

It would only have been unethical if his intention was to sucker people into relying on it then switching to a pay-only system via extortion.


That was my first thought too, but maybe it's because I've been doing contracting so long. My first answer would have been "Sure, I can do that... for a price."


Seems like it makes sense to strongly limit APIs that are only intended to be demonstrations, e.g. by limiting their request rate, making them artificially slow, dynamically changing endpoints... Still allowing users to try them out, but making them unattractive for "real" usage?

For free "production" tiers (that the author probably never intended to offer), require registration and authentication, giving you a way to contact users and having them confirm some kind of ToS before they use it.

I hope the stress goes away quickly for the author, since none of it is really his fault.


What I find interesting is that from the service point of view, he noticed he should have put his API on a sub-domain like api.telize.com or equivalent. By doing so, he could have just removed the DNS record to stop the flux of data instead of migrating to another provider and let the other provider handle the flow.

As I am providing free API services directly under a main www domain, I take notice!


> In an ideal world, maybe. In practice, it triggered a huge amount of retries on failed requests from poorly coded scripts, effectively creating a DDoS

I had this happen in production with a client once. Their backend uses a service of ours, and makes roughly 20mil requests/month. We had an outage of about 15 minutes, and their systems ramped up to make over 40mil requests in this time.

At the end of the month, they tried emailing me that the service had 66% downtime, because their logs showed 20mil 200-Ok and 40mil 500-Errors responses.
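The gap between the two ways of counting is easy to check with the figures above. This is a small worked example, assuming a 30-day month (the function names are just for illustration):

```python
def availability_by_requests(ok, failed):
    """Share of requests that succeeded, which is what the client's logs measured."""
    return ok / (ok + failed)


def availability_by_time(outage_minutes, days_in_month=30):
    """Share of the month during which the service was actually up."""
    total_minutes = days_in_month * 24 * 60
    return 1 - outage_minutes / total_minutes


# 20 million OK responses vs 40 million errors produced by retries.
downtime_by_requests = 1 - availability_by_requests(20_000_000, 40_000_000)
# A single 15-minute outage in a 30-day month (43,200 minutes).
downtime_by_time = 1 - availability_by_time(15)

print(f"{downtime_by_requests:.1%} of requests failed")
print(f"{downtime_by_time:.3%} of the month was downtime")
```

Counting by requests gives about two thirds, while counting by wall-clock time gives roughly 0.035%, because the retry storm inflates the number of failed requests.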


I would be fascinated to know which manufacturer of set-top boxes coded a call to a third party free location API into their firmware. So as to avoid said manufacturer.


If you require quality code and sound design decisions, I'm afraid you may end up without a set-top box.


The programmers are in Sofia; if you look on Twitter you'll see the interaction.


I'm surprised he didn't calculate the cost of doing what they asked, double it, add a consultancy fee and send them the quote.


Do you think they would pay? I have a feeling the cheque would bounce.


Don't take a cheque ;)

Really depends on who is calling the shots: the individual developer might not have the authority, but for someone higher up paying a fee to make the problem go away might be very reasonable.

(Although it is an interesting question, how would you have a client you only trust somewhat pay quickly for something like this if neither side is prepared? Western Union?)


At least there are also some nice twitter interactions around that, other people saying thanks and apologizing for hitting the service after it shut down.


hm, could you link please? I can't seem to find it.



Thank you!


A couple of things to take from this:

- it may be better to phase out an API by having it return empty results rather than a failure.

- just because a service is free doesn't mean that people won't have unreasonable support demands for it. We've seen this before in open source / Free software development.


If anything, a service being free means there will be more unreasonable support demands. Things being free tends to make people not value it, and thus not value the time of the people working on it.


"[...] 403 Forbidden HTTP status code. This should have been it, right?"

410 Gone would probably have been more fitting.


The author gave users two weeks of advance notice. Some are (rightly) pointing out that this doesn't give the users much time to react, but would a three-month or six-month notice really have been that much better? How likely is it that the set-top box manufacturer would see the warning, as long as things were running smoothly?

The only thing guaranteed to get users' attention is failure. One interesting approach would be a three-month period of increasingly likely failure (HTTP 403 or 410 or 444 or whatever). Start with a 1% chance of failure. Double that a few days later. And so on. Fail enough to get users' attention, so that they visit your website, see the warning, and react -- but not enough (initially) to cause irreparable harm.
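As a quick check on how such a schedule plays out, here is a small sketch that assumes a 1% starting rate and spreads the doublings evenly over 90 days (the function names and the even spacing are assumptions):

```python
import math


def doublings_to_certain_failure(start_percent=1.0):
    """Number of doublings needed for the failure rate to reach at least 100%."""
    return math.ceil(math.log2(100.0 / start_percent))


def failure_schedule(period_days=90, start_percent=1.0):
    """Return (day, percent) pairs spreading the doublings evenly over the period."""
    steps = doublings_to_certain_failure(start_percent)
    interval = period_days / steps
    return [(round(i * interval), min(start_percent * 2 ** i, 100.0)) for i in range(steps + 1)]
```

Starting from 1%, seven doublings are enough to reach 100%, so stretching the ramp over three months means doubling roughly every 13 days. Doubling every three days would reach 100% in about three weeks.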

Thoughts?


You could have requested money from that manufacturer for the service of keeping the server running, even if just by returning an empty json.


Not everyone is interested in money.


You should have gouged the crap out of that set-top box manufacturer. :D

Nice write up, a lot of things I never really thought about like the increase in traffic after killing it.


It sounds like you had an obviously valuable service that if you had sold, could potentially pay for itself, why was the instinct not to create a paid version of the API?

If you can't find enough people to pay for it, then shut it down, but if you can, then you have a solid business. Of course, you'd still have to take down the public API (since it would be no longer free, and grew unsustainable), but...


Why does everything have to be monetized? It seems pretty clear from the article that they spun up a test project until they didn't want to, and they provided a whole two weeks more lead-time than they needed to on taking it down.

I'm not sure why there's this expectation that someone should do a bunch more work for something they wanted to kill off, just for the sake of monetization.


> people just expect you to invest your own time and money to solve their problem, and for you to do it straight away, when it’s convenient for them, and of course without being compensated for it. No matter that they used a free service to begin with, without giving any notice beforehand, or that you have a daytime job and other involvements.

THIS. When you make something free you build up a horde of angry, insatiable, irrational bottom feeders who have no respect for the work you do and feel overwhelming entitlement to something free.


You should have let the set-top box manufacturers run their own servers for your domain. Then they could find out how much it cost you.


In my experience, people/companies will pay and in fact, want to, once they rely on a service enough.

People generally get that payment gives them some semblance of rights (whereas consuming for free does not).

If you're running a free service - make payment an option. It really does benefit some class of users.


Taking money gives you all sorts of unpleasant responsibilities to deal with. That's not something everyone wants, especially not if you've got a day job you're not planning on leaving.



