Cost of serving billions of images per month (medium.com/p)
301 points by ghoshbishakh on May 4, 2019 | 135 comments



I like Unsplash. They and Pixabay are what I use for apps I have developed in the past.

But honestly, this floors me. I looked at their costs, and with some developer muscle you could find savings such as:

- Move Fastly to Cloudflare. They don't expressly say what the cost is for that. But moving to CF would eliminate it.

- Move Heroku to Digital Ocean. It's not difficult to create a fully redundant solution.

- Move from Imgix to Golang microservices that handle the image resizing, and use something like BelugaCDN for the CDN. Beluga is $5k a month for a PB. (Or some other cheaper CDN if you don't like Beluga, but damn... Imgix pricing.)
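
The Beluga figure above works out to roughly half a cent per gigabyte. A quick sketch of the per-GB arithmetic (assuming decimal units, 1 PB = 1,000,000 GB, which is how CDNs typically bill):

```python
# Rough $/GB rate from a monthly CDN bill and traffic volume.
# Uses the $5k/month for 1 PB figure quoted in the comment above.
def cost_per_gb(monthly_dollars: float, petabytes: float) -> float:
    return monthly_dollars / (petabytes * 1_000_000)

beluga = cost_per_gb(5_000, 1)
print(f"${beluga:.4f}/GB")  # → $0.0050/GB
```

Plug in any other vendor's quote to compare rates on the same footing.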

I'm pretty sure that a savvy CTO could save at least $50k a month with a well-designed project that does this over many months, achieves the same result, and keeps the redundancy the small team needs.

I do realise, however, why the team has done this. In the same position (with very few resources) I would probably have done the same. But damn, when the cost of a service gets up to a year's salary for a good developer ($120k), it's time to seek alternatives.


$50k/month comes out to $600k/year, which is the cost of around 2-3 engineers counting overhead.

To manage infrastructure at that scale themselves they'd need someone who knows DevOps. To write services at that scale they'd need someone who knows backend engineering and microservices. Then they'd need 24/7 on-call rotations for when things break. And of course, like any rewrite, it seems like a "couple month" project until you get into the details and it becomes a two-year project. Soon you've got a full 6+ person team and are paying them a lot more than $600k/year.


$600k/year for 2-3 is only real in the Silicon Valley bubble. You could get just as talented a team from a Nordic country, for example, and only pay $120k-ish.


Realistically the costs for the company are higher. At least in the Netherlands, the total cost for the employer is 2-3x the wage because of social security, health insurance, pension, and housing/office costs.


I've been an employer in NL for close to 30 years, and the 2-3x fully loaded figure is not supported by any data that I'm aware of. 1.5x or so is more like it.


Not even close to true. Especially when comparing California with EU countries, even high tax rate ones. The difference between net salary and what company pays is actually very similar.


I have experience with Belgium; the 2-3x ratio is only true when considering the net wage of the employee.


> 600k year for 2-3 is only real in the silicon valley bubble.

Nope. You're forgetting the cost to the company is the burdened labor cost (https://en.wikipedia.org/wiki/Labor_burden), not just wages. Payroll taxes, benefits, office space and equipment, etc. can add another 30-40% to the cost of wages.
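
To make the burden concrete, here is the 30-40% overhead applied to a hypothetical salary (the $150k figure is illustrative, not from the post):

```python
# Burdened labor cost: wages plus 30-40% overhead for payroll taxes,
# benefits, office space and equipment. Salary figure is hypothetical.
base_salary = 150_000

for burden in (0.30, 0.40):
    loaded = base_salary * (1 + burden)
    print(f"{burden:.0%} burden → ${loaded:,.0f}")
# → 30% burden → $195,000
# → 40% burden → $210,000
```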


I'm not forgetting labor cost. I would actually say payroll tax, benefits, and office space are even higher in California than in any European country.


Can confirm. In fact, you can get talented people starting at 70k. Hire some remotes. Most local companies can only offer non-inspiring typical enterprise work, and that's why you can get people at such low cost.


600k a year gets you at least five good developers in Berlin.


And 10-12 in Poznan or Warsaw.


And some projects really are just simple. Scale makes little difference to their service: it's basic OLTP + search with a resizing image proxy.

This doesn't have any SLAs, nor is it critical software. 24/7 on-call is completely unnecessary.


Don't underestimate the massive CPU power needed to scale images at their request rate.


We run an ad platform that resizes, crops to the important section, transforms with text, and more with billions of requests per day. It's very cacheable, horizontally scalable, and CPU is cheap.

The main scaling challenge here is serving and bandwidth, both of which are handled by their CDN. I'd expect five-figure costs for that, but the rest of the platform shouldn't be very expensive.


I wonder if they could use an S3 bucket with a Lambda to resize. I'm sure their traffic is long-tail, so certain images likely get far more traffic than others. At a previous company, where we also pushed a lot of Imgix traffic, we did something like this with a Go service, but now I'd just offload to S3 and use Lambda to avoid DevOps.


I've done exactly this, but not at any scale. If they are specializing in this with low margins, AWS is not the answer: the compute cost would be relatively cheap, and possibly even the storage cost, but I would think the bandwidth costs would be expensive.


A friend of mine runs a site which uses AWS S3 and Lambda resizing, but caching is via BunnyCDN. His costs are low after moving from AWS CloudFront, and he's serving ~400m images a day. (This is run by a company of 2 staff.)

Just because everything is in AWS doesn’t force you to use all of AWS services.


Do you happen to have a link?


hey man, nice seeing you here.


Why would it depend on request rate rather than upload rate? Just pre-scale several variants on upload. Then run a MapReduce job every now and then and delete the variants that haven't been touched for more than, e.g., a month. Scale those on demand; store recent and frequently accessed ones pre-scaled. This is literally a few days of work.
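
The scheme described above can be sketched in a few lines. This is a minimal in-memory model (all names hypothetical; a real system would use object storage and a scheduled job, and `resize` stands in for an actual image scaler such as a libvips wrapper):

```python
import time

# Pre-generate a fixed set of widths on upload, record last-access time
# per variant, and let a periodic job evict variants untouched longer
# than a TTL. Cold sizes are scaled on demand and then kept.
VARIANT_WIDTHS = (320, 640, 1280, 2560)
TTL = 30 * 24 * 3600  # ~a month, as the comment suggests

store = {}  # (image_id, width) -> {"data": ..., "last_access": ts}

def on_upload(image_id, original, resize, now=None):
    now = time.time() if now is None else now
    for w in VARIANT_WIDTHS:
        store[(image_id, w)] = {"data": resize(original, w), "last_access": now}

def fetch(image_id, width, original, resize, now=None):
    now = time.time() if now is None else now
    key = (image_id, width)
    if key not in store:  # cold variant: scale on demand, then keep it
        store[key] = {"data": resize(original, width), "last_access": now}
    store[key]["last_access"] = now
    return store[key]["data"]

def evict_stale(now=None):
    now = time.time() if now is None else now
    for key in [k for k, v in store.items() if now - v["last_access"] > TTL]:
        del store[key]
```

The point of the sketch is the policy, not the plumbing: upload rate drives resizing work, request rate only touches timestamps.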


It's much nicer to be able to resize on demand. Many CDNs and image/file hosting services offer transform abilities via URL which makes it easy to get exactly what you need and optimize for every site and device.

But since images are immutable, this is a perfect use-case for lambda functions fronted by a CDN.
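
The URL-transform pattern mentioned above is simple to illustrate. The parameter names here follow imgix's rendering API (`w`, `h`, `fit`, `auto`); the host is made up:

```python
from urllib.parse import urlencode

# Build an on-demand transform URL: the image service interprets the
# query parameters and renders (and caches) that exact variant.
def transform_url(base: str, **params) -> str:
    return f"{base}?{urlencode(sorted(params.items()))}"

url = transform_url("https://images.example.com/photo.jpg",
                    w=640, h=360, fit="crop", auto="format")
print(url)
# → https://images.example.com/photo.jpg?auto=format&fit=crop&h=360&w=640
```

Sorting the parameters keeps URLs canonical, which improves CDN cache hit rates for the same variant requested by different pages.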


I bet there's a very long tail, i.e. the majority of images are rarely accessed, if ever.


Maybe some caching would help.


I just had a look at your b2b site

Do you actually run pay-per-action campaigns, or do you provide a dashboard for the customer data ?


We started as a native ad network and have pivoted to B2B with our own proprietary B2B data, which powers a platform to run campaigns for marketers and intel for sales teams. We also have a legacy ad network business for publishers.


They say they spend $12,700/month on Fastly. I guess I don't know offhand the difference between Fastly and Imgix (which they also say is their CDN), but the levels of bandwidth they're talking about are not going to be free at Cloudflare.

We serve significantly more bandwidth and requests than their 2016 numbers at under $15k compared to their $17.5k, and we're not particularly lean on infra costs, being a .NET shop with Windows servers and MS SQL on Azure. I could still bring our monthly bill a lot closer to $10k if I can find time for some projects.

But the reality is that even thousands of dollars per month saved can take a good while to pay back the cost of development, or more so the opportunity cost.


> the levels of bandwidth they're talking about are not going to be free at Cloudflare

Correct. Cloudflare's terms state that you can't serve a disproportionate/substantial amount of non-HTML content, so they would need to sign up for Enterprise with a full contract to get image serving via CF.

Enterprise is generally $5k+/month, but CF contracts can offer a subset of Enterprise features from $500-$5k/month, so I would guess ~$3k/month just to serve multiple petabytes of images. This doesn't eliminate the bandwidth costs, so CF may not be the single solution there.


They specifically say they use Imgix's CDN for the images, which can be confirmed by looking at the response headers.

Fastly is used for their actual website, judging by the ASN info. Meaning they wouldn't be serving petabytes of images from CloudFlare, just HTML/CSS/JS, which should be perfectly fine.


Resizing images in Golang is not as trivial as you might imagine at first. You need to be able to handle all sorts of formats and color spaces. You'd be surprised by the kind of weird garbage your users will upload. Then you need to use SIMD, not pure Golang solutions, or performance will suffer. So you end up adopting a wrapper around libvips or similar, at which point you will start to ponder if you should have stuck with C/C++ in the first place. (It all depends on how much of Go's features you use or if it's just a nicer, safer C for you.)


What's the business model here?

We would never run with such expensive options for this functionality, especially if it was a free/donations powered service. This company raised $7.5M though: https://techcrunch.com/2018/02/15/unsplash-simple-token-seri...


As far as I can tell they're selling data. Their API requires hotlinking so they can track everything. They know exactly how often each picture is viewed and downloaded. Unsplash knows which images are successful and which ones are duds. I'm not sure who they're selling to, but I'm sure lots of folks would be interested.


They raised money from the crypto-coin outfit run by the former Fab.com CEO. At Fab, and at his previous company, it seems the modus operandi was to raise millions and then pocket as much as possible before the shit hit the fan.


I ordered so much stuff from Fab back in the day! I thank the investors for their largess


They also have extensive partnerships with certain brands — their homepage currently serves ads disguised as content from a handful of companies.

(I say this as a user of and contributor to Unsplash)


So ads? I thought it would be something a little more inspired.


Yes. Of the first 7 images 3 are adverts. Then you can scroll down a long way on mobile without any more ads.


That's pretty clever, actually. If you're just browsing, you're not going to hover over every image to see that it's sponsored, and a subtle product placement within the image might not be enough for your brain to recognize it as an ad. Not bad...


So that's acceptable? Not having any easily visible marks? I'm surprised to see this sentiment on HN which is so averse to ads of any kind.


Acceptable? That's a frame of reference. Is it widely practiced on the internet, on every social channel? Oh, yeah, it's everywhere. They are not doing anything different, but they are executing the ads rather well.

I am curious how far they are going to grow before they turn the ad machine full throttle, and whether or not that will have a huge adverse effect.

We all kind of know that Instagram is a giant online store for everything, at least it has been for a few years now. Unsplash contributors seem to think that free will last forever. It'd be very interesting to see if they stay or leave.


They are probably banking on the Pinterest model. Eventually you'll see a picture of a couch and a link to buy that couch from a nearby store. Very profitable business at scale, even if individual conversions are small.


Given the cryptocurrency involvement, I'd guess some sort of fraud? Probably simply "take money from investors for as long as possible", which doesn't even require a cryptocurrency.


So they must have some form of business model, right? All I can see on their website is that everything is for free, even their API which can be used for free without any limits or restrictions. And given the already huge Imgix costs, abusing or over-using their API to request and resize images would directly increase their overall costs. According to Crunchbase (https://www.crunchbase.com/organization/unsplash#section-fun...) they raised $10m last year, but they must be churning through $3m+ per year on staffing and hosting costs.

I use and love the service but I can't see how it can last for more than a few more years.


"He said it’s too soon to know exactly what that [business] model will be, but it will involve blockchain technology and cryptocurrency" [1]

[1] https://techcrunch.com/2018/02/15/unsplash-simple-token-seri...


Is there a way I can short this company?


I think the way to do this for VC companies is to sell them expensive stuff until they go out of business. This post and many others like it seem to indicate it works well.


My uneducated guess would be that they have no publicly traded stock, as they have not done an IPO. So there's no public stock to short against.


This reminds me of SoundCloud a few years ago, when they were giving away API access for free with unlimited plays, then came close to bankruptcy and pretty much shut down all API plays. I ended up shutting down my 3rd-party app then; not sure what their API limits are now. Twitter recently made huge changes to their API limits too, which impacted a lot of 3rd-party apps.

I hope Unsplash doesn't end up similar to those.


On the note of Twitter's restriction of API usage, Jack talked about it recently on the podcast Tales from the Crypt [0]. He doesn't give any specifics but attributes Twitter's early struggles with uptime to unrestricted API usage [1].

[0] https://podcasts.apple.com/us/podcast/tales-from-the-crypt/i...

[1] https://overcast.fm/+KiHpBJHGo/15:30


That seems silly to me. They could have easily served their api from another server.


I imagine the problem wasn't API servers but data access.


I meant both api servers and database replica.


Jack has proven himself a liar many times with his false promises about taking care of the abuse problem on his platform, so I'd take anything he says with a pinch of salt.

Let's also not forget that they have a vendetta against third-party clients (so they can push their own client, with the bullshit algorithmic timeline and ads), so it wouldn't surprise me if he says that just as an excuse. In reality, their web access was just as unrestricted as their API, so it should share some of the blame too.


> He doesn't give any specifics but attributes Twitters early struggle with uptime to unrestricted API usage.

Of course, that was also critical to Twitter's success.


Wow, these are some serious expenses!

I run a service that does ~30 million requests a month, and I only spend $15 on it. It's an apples-to-oranges comparison, but looking at their cost breakdown, I'm sure a couple of engineers with a couple of weeks could dramatically reduce their costs.

For me, the biggest cost savings were in browser caching. With `immutable` and far-future cache expiration dates, WebP images, etc., I could pull off the 30m (and growing) stats without the server load going above 0.40.
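
The caching setup described here boils down to a couple of response headers. A minimal sketch (assuming content-hashed or otherwise immutable image URLs, so a year-long lifetime is safe):

```python
# Far-future, immutable caching: the browser never revalidates the
# image for a year, and `immutable` suppresses revalidation on reload.
ONE_YEAR = 31_536_000  # seconds; the conventional max for max-age

def cache_headers() -> dict:
    return {
        "Cache-Control": f"public, max-age={ONE_YEAR}, immutable",
        "Vary": "Accept",  # lets the server send WebP to clients that accept it
    }

print(cache_headers()["Cache-Control"])
# → public, max-age=31536000, immutable
```

`Vary: Accept` is what makes the WebP-vs-JPEG negotiation cache-safe at the CDN layer.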


That's tempting, but without knowing the exact details of the processing involved, you're probably misjudging how easy it would be to optimize their costs.

It could be true, but it would mean engineering time not dedicated to features, and likely more risk and more maintenance. It's often hard to admit, but costs are not that easy to assess and the trade-offs are tricky.


If you're able to, could you share what service you run?


The numbers being thrown around on this post are staggering. Not the bandwidth, storage, or number of images (all pretty average for a small image host) but the casual tone: they're willing to throw money at services rather than engineer some quite simple problems.

Seriously, I know I'm not in their shoes, but I know what they do, I know the popular options in these areas, and I know they are literally throwing cash away by not trying to in-house any of this.

The worst thing is they seem proud of their decisions; that their current situation has justified $100K/month in expenses. I'm not the only person in here who thinks they could be delivering the same (even better) service on a fraction of their current budget, even after hiring. I think they should be ashamed.


Doing a cost-benefit analysis and deciding to use a service instead of hiring and paying humans is a perfectly rational decision. I've downvoted because I see a lot of vitriol like this, and it's exhausting to argue each time that these decisions are carefully considered and work well for the people making them.

This is no different from saying people should be ashamed because they pay so much for food, when it's much cheaper (almost free!) to grow it yourself in a pot.


Being able to come to the decision that in-housing isn't good value is of course possible. But they do it for everything. And they've been exhaustive enough in their breakdown here that it's obvious to any experienced developer that they're spaffing money away.

They're not in an all-or-nothing situation. There are so many individual elements that they could improve efficiency on and they're just not.

And I think cooking your own food vs. going to a restaurant is a better analogy (because it focuses on a service provision, not resource gathering). They're currently eating out seven days a week.


Even if they are currently eating out, seven days a week, that was a decision to focus entirely on the product without any distractions.

https://medium.com/unsplash/scaling-unsplash-with-a-small-te...

It's absolutely nothing for them to be ashamed of. We can all disagree with the decision and say that we wouldn't do the same, but we shouldn't call it a bad or shameful decision.


I don't understand your stance in the context of HN. We're here to make money, celebrate people who can make money from the things they build, the services they provide... And most importantly to learn from it when it goes right and learn from the mistakes when it doesn't.

Unsplash is eating out seven days a week. But they don't have anything to show for it. They're burning through cash. The product is basic. There's no magic, nothing clever. It just is.

They're saying everything is great! I think it's a shambles: mediocre both at making money and as an example of engineering. Why is anybody here celebrating or defending this result?

I maintain what I said. Unsplash should be ashamed. They shouldn't trot out numbers like this and label them a success. It's a record of successive failures.


Actually they stopped using Keen since it didn't work out well for them. They do build some stuff in-house, just prefer not to.


I hear you, but everything is a trade off. You and I would optimize the expenses more diligently, but they have chosen to focus 110% on the product.

Could they do both? Yes. Do they want to? No. The goal, imho, of a startup is doing what you love on your own terms. They clearly love the product and have optimized accordingly.


> The numbers being thrown around on this post are staggering. Not the bandwidth, storage or or the number of images ... but the casual tone that they're willing to throw money at services rather than engineering ...

What I found interesting is that just a sliver of the budget is spent on software rather than hardware. I don't think this is uncommon; few teams are willing to pay for software they can get for free or make themselves. It's a bit surprising (but explainable) that commodity hardware everyone has access to dominates the budget.


The curious thing is that they don't pay the photographers. That kills the stock photo business model for everyone else: you can't take nice pictures and expect that someday someone will pay you for the un-watermarked versions, now that Unsplash has taken that market away.

So money is now going to these CDN hosts and other software suppliers, spent like water, with a business plan that is not in the article. I guess the likes of Adobe can bung them a few million a year, as the service is worth that to them. Same with Squarespace: they can partner and pay a few million to keep Unsplash's lights on.

So this has some interesting negative effects on local communities, where people once had their own websites and their own local photography businesses. Although Unsplash might not directly be reaming out every small town in the world, making it impossible for anyone to make a living from stock photography, Squarespace kind of is. Web pages have made themselves complicated today: once you needed Notepad and an FTP program; nowadays you need a team of developers to spend months putting 'hello world' online. So people now go with Squarespace and other highly marketed services, which in turn rely on Unsplash.

The images initially look very impressive with Unsplash, but I was once really impressed by a Pret A Manger cafe. Then I realised there were many Pret A Manger cafes in many towns and they were all the same inside. Quality stuff, but identikit. At the same time, each outlet was not exactly identical: there could be a different number of chairs, and the counters could be laid out differently. (Readers outside the UK could swap in Starbucks for Pret A Manger.)

I see Unsplash as a photo version of a Pret A Manger cafe. After a while the images are all very much the same. This is good and bad. Bad in that so much of the internet becomes predictable and generic. Good in that it becomes quite easy to do better. Much as you can do better than Pret A Manger at making a sandwich just by making your own, so it is with Unsplash: you can get far better photos for your project by taking your own photos and doing your own editing.

I would not bet against the success of Unsplash any more than I would bet against the success of Pret A Manger. However, I will be making my own bread or visiting independent cafes, oh, and taking my own photos rather than using 'free' formulaic stock photos.

Enough of the analogies; a critique of Unsplash and what they are not doing. If you look at a Squarespace site with an Unsplash banner, the image will be several thousand pixels on each side. This is good but bad: good in that native phone-camera-resolution images are nice to see, bad for bandwidth.

This bandwidth is mitigated by CDNs. The CDNs do the different resolutions, but it is quite dated tech being used to do this. I think that for most website owners, e.g. one's small business, there is a performance benefit in using one's own hosting and running Google PageSpeed to optimise and serve the images. This can work with the low-bandwidth browser flag and produce responsive source sets.

This means putting the original image dimensions in the HTML for PageSpeed to write out the srcset images, putting in a picture element, and so forth. The images also get sent as WebP when needed.
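
The kind of markup described here is mechanical to generate. A sketch that emits a srcset with a fallback (the `?w=` URL scheme and the widths are hypothetical; a tool like PageSpeed writes equivalent output automatically):

```python
# Generate responsive-image markup: one srcset entry per pre-sized
# variant, with the largest as the fallback src.
def srcset_markup(base: str, widths: list, alt: str = "") -> str:
    srcset = ", ".join(f"{base}?w={w} {w}w" for w in widths)
    return (f'<img src="{base}?w={widths[-1]}" '
            f'srcset="{srcset}" sizes="100vw" alt="{alt}">')

print(srcset_markup("/img/banner.jpg", [480, 960, 1920]))
```

The browser then picks the smallest variant that covers the layout width, which is the bandwidth win the comment is after.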

Now if I did want to show some thousand-pixel images, I would want to do that with OpenSeadragon. This takes the deep-zoom aspect to a whole new level, and it works fine for a small website without a complex CDN.

Now there is no reason why I could not 'steal' a few Unsplash images and serve them my own special way, with Google PageSpeed doing all the work on my own server, but I am not going to be running generic stock photos anyway.

Google's PageSpeed falls between the gaps of large compartmentalised teams, so nobody is using it. Why would you, when, for the same reason nobody ever got fired for buying IBM, nobody gets fired for spending a fortune on a CDN? However, if Squarespace and other Unsplash clients decide they can offer a better, faster product due to improved tech for serving images, then they can start doing their own stock photography.


They are a VC backed high growth startup. Their focus should be on whatever they've decided is the success metric for their startup.

Many in-house projects end up costing 10x the cost of an external vendor. That's not something you read about often on HN, but it's super common.


I'm curious what their price-per-GB is from Fastly/Imgix ... the monthly total doesn't really tell me the full story.

At the last place I worked I was in charge of a team that ran our image service so I have a very solid understanding of the infrastructure around it and what resources we had in place to develop/monitor/support/etc. It wouldn't be hard to do a calculation and see if they are using money inefficiently.

I will say that most people don't really have any idea of the cost of simply pushing bits. The majority of what I did was video distribution which gets into eye-watering costs pretty quickly. Without knowing a bit more detail on the precise volumes they are storing/moving I don't think it is fair to criticize.


I’m stunned that a business of this size would still be using Heroku.

Heroku's sweet spot (to me at least) is for anyone with fewer than 3-4 servers. After that, it just becomes so expensive there's no rationalization in the world that makes sense.


Seeing how Heroku is a relatively minor cost item, they should probably focus on the bigger cost items first before thinking about getting away from Heroku.


Moreover what if Heroku goes out of business? Then they’re f’d. Hosting on k8s, using Terraform or whatever would give them a lot of automation and provider independence.


I'm guessing I'm being greyed because I'm suggesting Heroku might go out of business? OK, let me explain. That is one scenario; it's unlikely, but there are other similar scenarios that pose risk when you're tied to one provider's stack: DDoS, ToS issues, simply outgrowing them, Heroku themselves having ToS issues with their cloud. There are almost weekly posts on HN of "this major service is down" and "that major service is down".

I understand a scrappy startup needing to focus on growth and using the most convenient tools, but once you've been around a bit, it's time to think of the boring stuff, like what happens if things go wrong. It might be technical, security-related, business-related or political. Eliminating SPOFs is wise.

Happy for you to downvote this, but if you do, please drop a sentence to say why I'm wrong. I'm happy to be proved wrong; it helps me learn.


Downvoter (of your previous comment) here.

Your comment is not only bad, but outright dangerous. This is a company that is burning through hundreds of thousands of dollars on hosting and staff with no business model yet. The primary risk to this company is not their host going away; the primary risk to this company is that they won't figure out how to make money. Worry about optimizing your serving infrastructure after you have a business that you know is worth saving.


Cool, thanks, it's nice to hear the story behind the downvote. I agree you'd need to be flying their plane, so to speak, to know the reasoning behind their business decision. I still forget there are businesses that burn through that much cash looking for product-market fit; it's not the European way, so I often don't correlate spending $100k/m on hosting with fighting to stay alive. Anyway, I was sort of making a general point about lock-in. It may not apply to this company.


I didn't downvote, but...

"I understand a scrappy startup needing to focus on growth and using the most convenient tools"

Well, there's your answer. As a scrappy startup, you can't afford to plan migrations you aren't planning to execute. It's that simple. Heroku is not going away (they're owned by salesforce) and they have a good reputation for uptime.

The "what if" worst case scenarios have to be weighed against other business concerns. And this company appears to have done a good job of navigating such things.


Heroku mandates the use of 12-factor apps, which makes it much easier to migrate. They support Docker as well. Pretty much every app I write starts on Heroku and migrates later if the client/traffic needs it, and it's always been seamless. Most apps have dev/test on Heroku with staging and production on AWS, and because of the 12-factor principles, which also work on Beanstalk/ECS/normal servers, it's never a problem.
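
The portability this comment describes mostly comes from 12-factor principle III: store config in the environment. A minimal sketch (`DATABASE_URL` and `PORT` are the env vars Heroku itself injects; the defaults here are made up):

```python
import os

# 12-factor config: the same build runs on Heroku, Beanstalk, or a
# plain VM because nothing platform-specific is baked into the code.
def load_config(env=os.environ) -> dict:
    return {
        "database_url": env.get("DATABASE_URL", "sqlite:///dev.db"),
        "port": int(env.get("PORT", "8000")),
    }

cfg = load_config({"DATABASE_URL": "postgres://db.internal/app", "PORT": "5000"})
print(cfg)
```

Passing the environment as a parameter also makes the config loader trivially testable.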


Yep. Or bake your app into a docker image using https://buildpacks.io/docs/using-pack/building-app/ and deploy to any k8saaS such as GKE, EKS, or AKS.


Well, let's see, where to start. Terraform is not magic. All of the providers are specific to a platform.


Can we please, please, please kill the myth that Terraform in any way helps in any way with provider independence?


I find it much easier to maintain Terraform code that allows me to use multiple cloud providers than to do the same without Terraform.

You seem to have some insight to share here though, could you expound?

*edit typo


How would Terraform help you move between cloud providers when all of the providers are specific to the cloud platform?

If you’re a small startup, the least of your business risks are one of the major cloud providers shutting down.


> How would Terraform help you move between cloud providers when all of the providers are specific to the cloud platform

You don't move. You load balance across the clouds prior to the disaster.

You have your stuff work on both. I mentioned K8s because you could set up a managed cluster of that on a few cloud providers, and most of your TF will be the same in terms of setting up k8s, with some differences on how you set up those clouds.

This might be overkill for many people though so see below...

> If you’re a small startup, the least of your business risks are one of the major cloud providers shutting down.

I agree, a major cloud provider won't shut down...

... but they might shut YOU down.

Why? Billing Issues / TOS / 'Suspicious Activity' [0] / etc.

Now, do you mitigate for that? Not necessarily, but it's worth considering whether you need to.

At the preparedness extreme you have a probe that detects the problem and flicks you over to cloud 2. Or a load balancer as I mention above (which would probe). That's probably too much for a scrappy startup.

But a middle ground is having tech that is easy to move. It doesn't have to be k8s/TF; maybe it's a bash script you run on a new Debian VM or whatever. Then you phone one of your awesome developers at 3am and tell them to migrate to AWS or whatever, and because it's easy they'll figure it out as they go, with most stuff running by 3:30am and everything dandy by 5am.
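
The probe-and-failover idea above reduces to a tiny policy. A sketch with the health check injected, so the logic is testable without real network calls (provider names and the check are hypothetical):

```python
# Try providers in priority order and serve from the first one whose
# health check passes; the check itself (an HTTP probe, a ping, etc.)
# is passed in as a callable.
def pick_provider(providers, is_healthy):
    for name in providers:
        if is_healthy(name):
            return name
    raise RuntimeError("no healthy provider")

status = {"heroku": False, "aws": True}  # pretend the primary is down
print(pick_provider(["heroku", "aws"], lambda p: status.get(p, False)))
# → aws
```

In practice this decision usually lives in a load balancer or DNS failover record rather than application code, but the policy is the same.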

The other extreme is that you are tied in heavily to specific stacks from specific providers, and it takes X hours/days to get back online again.

I'm not recommending anyone do anything here, but I am saying: consider the black swan events. You might consider them and say, no, I want my devs adding feature X so we can sell more. Fine, but I think when you can spend $100k a month on cloud, you can probably afford to think about it a bit.

[0] Source: one of the major cloud providers cut all our services for 12 hours due to "suspicious activity". Turned out later it was due to a reused IP we were given from a pool that someone else f'd with. They gave us some credits to be nice afterwards.


> You don't move. You load balance across the clouds prior to the disaster.

Again, this is a small struggling company. Do you really think they should be spending resources on a backup plan just in case AWS has a multi-AZ or multi-region outage? Is that really their largest business risk?

Also do they really want to go from the simplicity of Heroku all the way to k8s?

The only things that a company at this level needs to be concerned about are reducing burn rate, finding a way to better monetize, and getting another round of funding.

But I seriously doubt that a company spending $100K a month would be on the free support plan and not have a Business or Enterprise support plan, where they'd have someone to call at AWS with a much smaller SLA than 12 hours. We are a small company and I can just open a support ticket and get someone on the phone/chat immediately.

And if you are a small startup, you aren't just using AWS with a few VMs. You're probably also using a lot of other managed services that aren't VM-based. If you are hosting everything yourself, you might as well be at a colo. If you are using your cloud provider as an overpriced data center hosting VMs, you're probably doing it wrong.


Unless you use cloud-specific services, I still don’t understand why it’s so hard to move between providers without k8s/Terraform.


If you just want compute/storage then it's not, but the value in cloud providers is in using all the cloud services.

Otherwise there are far cheaper server providers that can get you monster machines and unmetered bandwidth.


I'm pretty sure that worrying about your hosting provider going out of business is one of the least important things to think about when you have a startup that's still trying to build its business model.

And seriously, not everything has to be hosted on k8s, not everyone has to use Terraform; it's ok running your application on Heroku, DO or some other hosting. If Heroku goes out of business, you'll have a few months to prepare a migration plan. It's ok to handle it then; right now there are probably more important things for them to do.


$99K/month in hosting costs and no visible source of revenue...

I think this baby is ready for an IPO!


The losses are not large enough for an IPO. They need to scale at least 1000x to be IPO ready.


Not until they have a few more rounds of financing, with each round increasing their valuation until they unicorn (verb).

Then the baby is ready for IPO.


I'm a little surprised as to how you guys afford to give such a quality service away for free. Really cool of you guys to share this data though. Wishing you guys continued success!


It’s interesting to see how so many people here focus on specific line items on their costs. What it comes down to is management focus. It’s a lot easier to switch a CDN than to lay off two engineers as needs change.

I do wonder why they don’t charge for the API at all, as even currency APIs charge, and that data is effectively public knowledge.

Very interesting though as this is some serious scale indeed...


Everyone here seems to be screaming about how high their Heroku bill is, but I’m more interested in how they keep their New Relic bill so low. I find that New Relic are fine early on, and then the costs explode as you scale up and there’s no way to justify spending huge amounts of money on their platform.


10(+) billion images served monthly for $100k comes out to about 1/1000th of a cent per image. That's not bad, but I think Netflix pays around 2c per gigabyte; this works out to roughly 10c per gigabyte (I'm assuming 100 KB per image).
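For anyone checking the math, here's the back-of-the-envelope calculation behind those figures (100 KB per image is my assumption, not a published number):

```python
# Sanity check of the per-image and per-gigabyte cost estimates above.
monthly_cost = 100_000           # dollars per month
images = 10_000_000_000          # 10 billion images served per month
bytes_per_image = 100_000        # assumed ~100 KB average per image

cost_per_image = monthly_cost / images                   # dollars per image
total_gb = images * bytes_per_image / 1_000_000_000      # decimal gigabytes
cost_per_gb = monthly_cost / total_gb                    # dollars per GB

print(f"${cost_per_image:.6f} per image")  # $0.000010, i.e. 1/1000th of a cent
print(f"${cost_per_gb:.2f} per GB")        # $0.10 per GB
```

Halve the assumed image size and the per-GB figure doubles, so the comparison to Netflix is very sensitive to that guess.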


Netflix sends their own cache machines to tens, if not hundreds, of ISPs and interchanges all over the world, each of which have multiple terabytes of storage and act as cache nodes. I don’t think Unsplash is going to do that.


Now I'm interested to see what their business model is. It seems like they have a lot of money to throw at services instead of developing it themselves, so I'm interested in their ROI or opportunity cost decisions now.


Holy cloud. Well I guess the silver lining is they're leaving lots of room for cost improvements...


Does anyone know their business model?


That's a great blog post. Very transparent. I'm especially glad he's willingly giving up cheaper alternatives in exchange for getting more value from his services.

I just would have liked some more details on what it means technically... :-)


Well, it's transparent about their costs, but I would have liked a little transparency about how they actually make money


Very true but I suppose we can't have it all, plus maybe it's sensitive information.


I'm surprised that at that scale you are still not building some core functions internally.

Though it makes sense: buy over build is almost always the best choice early stage. It's hard to slow down product velocity to fix things that work fine when costs start to add up. (Also it's easy to say "do you want to save 3k a month or deliver features x, y, and z" and let costs slide.)


Is there any particular reason they are still using Heroku instead of something like DO or AWS+S3? They can probably save a lot of money that way. Unless Heroku is offering them some good deal?


They tell you in the blog post.

"We continue to use Heroku as our main web platform. Despite its premium cost over AWS, Azure, and Google Cloud, Heroku’s built-in deployment and configuration tools allow our team to move faster, more confidently, and more reliably.

As we’ve detailed previously, the alternatives would undoubtably be cheaper on paper. But in reality, the increased simplicity and freedom offered by Heroku for a small, product-focused team is a major cost savings advantage."


I’m far from a DevOps guy, but deploying to AWS using their native tools or to Azure using Microsoft’s tools is dead simple.


Yeah something doesn't add up.

I'll believe Heroku is easier, but can it be that much easier for people spending piles of cash at their scale?


I work for a company that runs multiple major websites pulling in tens of millions of users a month. We run on Heroku. We made that decision when we didn't have a single user. Right now we're a team with 25 engineers.

Every time we need to re-negotiate our contract with Heroku we ask ourselves whether we should move away from Heroku or not. So far, the answer has always been no. Complexity would go up and the time spent on DevOps would go up. Right now, any idiot can do a deployment, or a rollback. Most of the things we need are taken care of for us by Heroku and it doesn't cost us any effort. We use Heroku Postgres for our database and we get point-in-time restore for free, without doing anything. We can upgrade our database or create a new follower in minutes. It just requires a couple of clicks. On top of that, all the monitoring is taken care of by Heroku. It just gives us a nice dashboard.

As for costs, on a negotiated Enterprise contract, it's not that bad. If we'd move to AWS or DO for example, we might save a little bit, but the extra engineering resources would quickly make those savings useless. Even if it's a bit more expensive to run on Heroku, it's still cheaper than cobbling together all of those features on another platform.

We've made a couple of small engineering investments to get features that Heroku doesn't offer. For example, we built a simple auto-scaler for dynos that consume from RabbitMQ. This helps a bit to keep our costs down and respond to massive spikes quickly.

We run our web servers, database, caching, data pipeline etc on Heroku. However, we are not _that_ dependent on Heroku. We use Docker for deployments, so we can easily reproduce what's happening and it reduces our vendor lock-in.

I hope this explains a little why some people might choose to stick to Heroku. It works for us, it might not work for you or anyone else.


> We've made a couple of small engineering investments to get features that Heroku doesn't offer. For example, we built a simple auto-scaler for dynos that consume from RabbitMQ. This helps a bit to keep our costs down and respond to massive spikes quickly. We run our web servers, database, caching, data pipeline etc on Heroku. However, we are not _that_ dependent on Heroku. We use Docker for deployments, so we can easily reproduce what's happening and it reduces our vendor lock-in.

I missed this part. If you are okay with using managed services, why are you doing so much “undifferentiated heavy lifting”? Queue-based autoscaling, managed RabbitMQ messaging (Amazon MQ), and, without knowing what your data pipeline consists of, probably that too could all be done with managed services.


Don't imagine anything fancy here. This is roughly 100 lines of Python code that scales based on some thresholds. It took a day or two to develop, test and deploy.

However, we recently found a managed service to do this for us so we'll be switching to that as soon as we can.

If we'd have to spend more engineering resources on our infrastructure than the occasional simple tool, then we'd consider switching away from Heroku. The benefits still outweigh the potential engineering effort of not running on Heroku.
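For the curious, the core of such a tool really can be tiny. This is a hypothetical sketch of the threshold logic only; the thresholds and names are made up, and a real version would read queue depth from RabbitMQ's management API and set dyno counts through Heroku's Platform API:

```python
# Map queue depth to a desired worker (dyno) count using fixed thresholds.
# Threshold values are illustrative; tune them against real traffic.
THRESHOLDS = [  # (minimum queue depth, workers), in ascending order
    (0, 1),
    (100, 2),
    (1_000, 5),
    (10_000, 10),
]

def desired_workers(queue_depth: int) -> int:
    """Return the worker count for the highest threshold reached."""
    workers = THRESHOLDS[0][1]
    for minimum_depth, count in THRESHOLDS:
        if queue_depth >= minimum_depth:
            workers = count
    return workers
```

A loop would then poll the queue every few seconds and only call the scaling API when `desired_workers` actually changes, to avoid flapping.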


> We use Heroku Postgres for our database and we get point-in-time restore for free, without doing anything

For the record, you get this with Azure too. Not sure about AWS and GCP, but I would have assumed the same.


It isn't an individual feature that makes us stay with Heroku. It's all of them in a single platform with a unified interface that is very easy to use that makes it very attractive.


I do both at work, small sites on Heroku and big / compliance-required ones on AWS, and the Heroku ones don’t need a devops team, while the AWS deployments do. Given the cost of building the devops teams and the time they spend, it is a big factor. I prefer to default to Heroku unless there’s a clear cost-benefit analysis that includes human and opportunity costs.


If it is a simple deployment that you could do with Heroku, it would be just as simple with Elastic Beanstalk. You just give it your zipped artifacts.

If later on you need to do something more complex you have all kinds of extension points.

If you want something cleaner and closer to a traditional deployment pipeline, such as a GitHub -> CodeBuild -> CodeCommit/Lambda flow, there are the CodeStar templates.

We are a small company without a DevOps team. The leads all know how to set up a from-scratch system on AWS.


I’ve done both, and many projects start on Heroku then move to AWS once they hit scale. And every single time I keep cursing at how many hidden man-hours are spent on AWS maintenance. If you’ve tracked the time spent on AWS that wouldn’t be spent on Heroku, and plotted that against the cost of your leads’ and devops team’s time, and your opportunity costs, and you still feel it’s worth it, by all means go straight to AWS.

The point is that costs on Heroku and AWS are two intersecting lines with different slopes and y-axis starting points. The dollar cost is the basic dimension, but once you add team hourly rate and opportunity cost you’ll see that Heroku is cheaper than AWS until you hit X000 req/sec or users. That number varies from team to team, but it’s higher than you think.


I’m mostly a developer but I am also the person they trot out as the representative of our (nonexistent) infrastructure team to clients and lead most of the “cloud native” initiatives.

We do a lot more with AWS’s managed services that would go beyond what Heroku could do.

That being said, if we were just a simple database+website, how would AWS maintenance be any more than going through the VPC creation wizard one time and using Elastic Beanstalk for deployments? I’m very much an advocate of managed services, so it’s not that I’m anti Heroku - that would be hypocritical if I’m saying use EB - but how is it easier?

With Elastic Beanstalk, if later on you do need to add complexity, you easily can through startup scripts and .ebextensions (CloudFormation).


To get what Heroku gives you, you'd need to set up EB, RDS (with Performance Insights), CloudWatch (for logs and metrics), and CodePipeline (I don't remember if EB supports GitHub pushes). If you use the heroku run command, you'll need to set up a jump box with SSH keys properly configured inside your VPC with DB access.

All this literally takes one click with Heroku.


Even though it’s not considered best practice to tie your RDS with your EB stack, part of the wizard for EB is setting up a database and it will create your RDS cluster and the security groups for you. It also automatically sets up your CloudWatch logs. EB can use Github.

Again, I’m far from a DevOps expert, but I cringe at how much EB (and Heroku) does that’s magic, and I would much rather use CodeStar with the templates that create a standard CodeBuild/CodeDeploy/CodePipeline/CloudFormation setup, but I am comparing like for like.

I’ve never deployed anything to Azure but I have used VSTS (aka Azure Devops) with various combinations of hosted builds, on site builds with agents on the build servers and deployment servers and that’s even easier than AWS’s offerings. Even if the GUI setup would make a real Devops person cringe.

Yes I realize Heroku for all intents and purposes is just another managed service that sits on top of AWS and that you can even buy Heroku services from AWS Marketplace for consolidated billing (https://aws.amazon.com/marketplace/seller-profile?id=0112b5d...)

I’m also not arguing not to use Heroku just because it costs more. We always choose a managed service over having to manage things ourselves, even going as far as preferring Fargate even though regular EC2-based ECS would be cheaper.

But, I haven’t seen anything that Heroku gives you that couldn’t be duplicated with EB or if you need a more traditional approach a CodeStar generated template.


Curious, anyone who knows what the business model is?

The service seems pretty cool :)


How do they monetize? I can't find the monetization...


Maybe API partners


I work with these guys since they are customers of Stream. Super solid team, really respect how they have such focus in their execution.


It would be nice if each category lined up in this graph: https://cdn-images-1.medium.com/max/2400/1*Bvw2zcdE146WaXB7i... ... kinda hard to read as-is.


Looking at that image fullscreen, the bars seem to be moving to me. Interesting effect.


Visualization fail, though.


I've been using Unsplash for a while. They have the best quality free-to-use pictures on the internet.

As someone who appreciates creative marketing, I like how they generate millions of backlinks by asking you to link back to their site and credit the author.

In a few years they became one of the most popular sites in their niche, thanks largely to their creative growth approach.


How does unsplash make money?


TL;DR: $98K/mo. A year ago they were 10x smaller and cost only $17K.


Sounds like they were focused on that 10x growth :D


First time hearing about unsplash, thank you!!! This is awesome for us OSS devs.


It's cool that they're sharing this information, but using Heroku at this scale is wasteful and fiscally irresponsible.


> We continue to use Heroku as our main web platform. Despite its premium cost over AWS, Azure, and Google Cloud, Heroku’s built-in deployment and configuration tools allow our team to move faster, more confidently, and more reliably.

This is just bullshit, really. I'd buy this excuse from a service like Snapchat/Facebook in their early years, but Unsplash is nothing that changes dramatically all the time.

Every single point in their analysis just shows how they like to waste money. Imgix as the main image CDN? Start serving images using thumbor and a cheap CDN like BunnyCDN or KeyCDN.

They could easily save 60% on their monthly bill. That would enable them to hire 1 or 2 engineers capable of maintaining a few dozen servers on AWS, DO or GCP.
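The thumbor-style approach hinges on signing resize URLs so the origin only renders sizes you actually issued, while the cheap CDN caches each variant. A rough sketch of that scheme in Python (the path layout and key handling here are illustrative, not thumbor's exact format):

```python
import base64
import hashlib
import hmac

SECRET_KEY = b"change-me"  # shared between the URL generator and the image server

def sign_path(path: str) -> str:
    """Sign a resize path like '300x200/photo.jpg' with HMAC-SHA1."""
    digest = hmac.new(SECRET_KEY, path.encode(), hashlib.sha1).digest()
    return base64.urlsafe_b64encode(digest).decode()

def signed_url(path: str) -> str:
    """Build the public URL: signature first, then the resize path."""
    return f"/{sign_path(path)}/{path}"

def verify(signature: str, path: str) -> bool:
    # Constant-time comparison; reject any size/path we never signed.
    return hmac.compare_digest(signature, sign_path(path))
```

Without the signature step, anyone could hammer the resize origin with arbitrary dimensions and blow out the CDN cache, which is part of what a managed service like Imgix handles for you.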


>They could easily save 60% on their monthly bill. That would enable them to hire 1 or 2 engineers capable of maintaining a few dozen servers on AWS, DO or GCP.

So blow up their currently working infra to save 40k to reallocate that 40k to build a replacement? How does that make sense? This idea seems penny wise, pound foolish - run the numbers yourself.


1 or 2 engineers is not enough to fully resource a 24x7x365 on-call roster. Sure, $42k/month seems like a lot of money, but proposing to hire 2 engineers to replace that seems naive to me. If you need to build/run/maintain a service that doesn't stay dark all weekend if something breaks at 6pm on a Friday, you need more than two people who're prepared to answer their phone on weekends.


> They could easily save 60% on their monthly bill. That would enable them to hire 1 or 2 engineers capable of maintaining a few dozen servers on AWS, DO or GCP.

If they reduce their bill and then hire engineers to do the same job they've gained nothing other than an increased headcount for no discernible benefit.


If they're not really adding features then its the same staff. Not every change needs a new developer.


I think what the repliers missed from the OC is that spending that extra money on DevOps people would then allow even more savings after consolidating from Heroku down to AWS/GCP/DO, etc.



