What does Unsplash cost? (crew.co)
120 points by snake_case on April 18, 2016 | 36 comments



I applaud the scale and reach that Unsplash has achieved. It's a great resource. But at a certain point, cost savings do matter and it makes sense to make a change.

At this scale Heroku is always going to be more expensive than AWS or Digital Ocean. What happens when the Heroku bill hits $10K/mo?

Once they hit the next S3 bandwidth tier (350TB), Imgix becomes roughly 50% more expensive than just putting images on S3.

Everyone has their own heuristic for these decisions, but once you get in the realm of "1 or 2 people's entire salaries of savings" that's when I start really thinking hard about changing infrastructure providers.

> $18k is a lot of money to spend each month. Understanding the scale of Unsplash though can help explain the costs.

But does it justify the cost? Are you able to attribute new business to Unsplash? Does it bring in enough money to cover the $18K AND the salaries of the people who maintain it AND the opportunity cost of what they could be working on?


Hey Callmeed,

Appreciate your thoughts.

This is stuff we've absolutely thought hard about. Believe me when I say that I would love to cut the costs by switching providers for certain things, but the tradeoffs at our current size aren't worth it in our opinion.

I'm working on another post which outlines the tradeoffs of the different services and why we chose the ones that we use. That's probably the missing piece here — we've thrown out a lot of numbers but didn't give the reasons behind it (we wanted to keep the article focused and short).

Re: justifying the cost, we absolutely can. We've thought about all of the things that we spend time on and we wouldn't do something unless we thought it was the best use of our resources.


Can you share which imgix features you are using? We were using imgix for resizing only. That was not a good use of money.


So the resizing alone makes Imgix very valuable to us. We're in an interesting situation though, where we have relatively few `master` images, but hundreds of renders per image, which are then seen millions of times each month. That's a perfect fit with Imgix's pricing model.

We tried Imgix on a few other products where we had tens of thousands of images being uploaded each month and only seen a couple hundred times per image. That became prohibitively expensive — which is probably a similar situation that you ended up in (high number of `master` images to render ratio).
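A rough sketch of why the master-to-view ratio matters, assuming pricing has a fixed per-master component plus a per-GB bandwidth component. The fee, image size, and view counts below are hypothetical placeholders for illustration, not Imgix's actual prices or anyone's real traffic:

```python
def cost_per_view(masters, views_per_master, fee_per_master,
                  gb_per_view, price_per_gb):
    """Average cost of serving one image view.

    The per-master fee is spread across every view of that master,
    so it shrinks toward zero as views per master grow.
    """
    total_views = masters * views_per_master
    master_fees = masters * fee_per_master
    bandwidth_cost = total_views * gb_per_view * price_per_gb
    return (master_fees + bandwidth_cost) / total_views


# Hypothetical inputs, chosen only to show the shape of the curve.
FEE_PER_MASTER = 0.003   # assumed monthly fee per master image
GB_PER_VIEW = 0.0002     # assumed ~200 KB per rendered image
PRICE_PER_GB = 0.08      # published per-GB price mentioned in this thread

# Few masters, each seen a million times.
few_masters = cost_per_view(10_000, 1_000_000, FEE_PER_MASTER,
                            GB_PER_VIEW, PRICE_PER_GB)
# Many masters, each seen a couple hundred times.
many_masters = cost_per_view(50_000, 200, FEE_PER_MASTER,
                             GB_PER_VIEW, PRICE_PER_GB)

print(f"few masters, many views: {few_masters:.8f} per view")
print(f"many masters, few views: {many_masters:.8f} per view")
```

With these made-up inputs the per-master fee is negligible in the first case and roughly doubles the per-view cost in the second; the real gap depends entirely on the actual fee and traffic.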

In addition to realtime resizing, we use:

- face detection
- typesetting
- overlays
- cropping/point of interest cropping
- color palette
- exif/image metadata
- client hints
- automatic content negotiation

Pretty much everything except their watermarking endpoint haha ;)


Out of curiosity, why are you seeing hundreds of renders per image? I'd think it'd be significantly fewer than that, unless you've got a lot of stuff going on in the background that isn't really obvious--it sounds like you're sometimes doing some significant editing beyond what the photographer has already done.


That's a lot of stuff! I certainly couldn't make a good argument for writing a replacement for all those features. Thanks!


We had a pretty high bill with Imgix just from resizing, and I agree, it makes sense to switch (we wrote https://github.com/humanmade/node-tachyon for use on AWS Lambda instead). However, some of the Imgix features are pretty killer if you want to do advanced handling there.


I think this post is turning out to be quite a nice piece of marketing for Imgix. After reading about it I signed up immediately and already entered my CC details.

For me the killer feature is being able to crop images such that a detected face within the image is centered.


I've been using Thumbor[1] on an EC2 instance and once OpenCV is installed the face & feature detection has been great. Combined with a similar style of "put the transforms in the URL" on demand method of serving images it's been awesome.

[1]: http://thumbor.org/
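For anyone unfamiliar with the "transforms in the URL" style, here is a minimal sketch of building an unsigned Thumbor URL. The server address and image path are hypothetical, and the `unsafe` prefix only works if the server is configured to accept unsigned URLs; a production setup would normally use signed URLs instead:

```python
def thumbor_unsafe_url(server, image_path, width, height, smart=True):
    """Build an unsigned Thumbor URL of the form
    <server>/unsafe/<width>x<height>/[smart/]<image_path>.

    "smart" asks Thumbor to pick the crop using its detectors
    (for example face detection, when OpenCV is installed).
    """
    parts = ["unsafe", f"{width}x{height}"]
    if smart:
        parts.append("smart")
    parts.append(image_path)
    return server.rstrip("/") + "/" + "/".join(parts)


# Hypothetical server and image, for illustration only.
url = thumbor_unsafe_url("http://localhost:8888", "example.com/photo.jpg", 300, 200)
print(url)  # http://localhost:8888/unsafe/300x200/smart/example.com/photo.jpg
```

Image paths containing query strings or unusual characters may need URL-encoding before being appended.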


Couple things:

1. At pretty much any scale, AWS / GCE are less expensive than Heroku, but they also come with a non-zero operational cost. Say, switching to Google App Engine (the cheapest, from my understanding) would save them 75% of their bill (just back-of-the-napkin math). That's not much compared to the cost of having someone migrate the platform over (multiple work-weeks).

2. S3 != Imgix. Imgix is a CDN + Image resizing service. At a couple employers ago we tried to run our own image resizing service in house, and it was a royal pain in the ass. I can go into this if you're interested.

The other thing is S3. My current employer uses S3 pretty heavily, and S3 is really not meant for end-user data access, it seems. It suffers from lots of latency spikes and a non-zero error rate. Slow and unreliable sites make your users go away.


> S3 != Imgix

Ok, well, S3 + CloudFront just about is. Yes, I understand Imgix has other features like on-demand resizing. But it looks like Unsplash is a Rails app, in which case a library like Refile [0] can handle that.

[0] https://github.com/refile/refile


Imgix isn't just Imagemagick. Good read here: https://news.ycombinator.com/item?id=9963680


I wouldn't mind hearing about the pain points you encountered when hosting your own images...


So at a cost of $18k per month you are getting 30M pages served, 140M API calls, 2.2M background jobs and 143TB of bandwidth. That sounds like a lot of bang for your buck.

When deciding on making any changes you have to consider the cost/benefit. Say an engineer costs you $10k for a month (fully loaded cost). For them to spend a month on reducing costs then what kind of reduction makes it worthwhile? For me, I would want it to take a year or less to pay back. So for me, a developer would need to reduce the monthly bill by $1k in order to justify the effort of a month of their time. Maybe they have ideas to easily do that, or maybe not.

Over time, as the bill keeps rising, you eventually reach a point where the savings become greater than the developer cost and so you go ahead and do it. So just work out your numbers (engineer cost, minimum payback period) and the decision becomes easy.
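The payback reasoning above can be sketched in a few lines. The function names are illustrative, and the inputs are the example figures from this comment ($10k fully loaded engineer-month, one month of work, one-year payback):

```python
def required_monthly_savings(engineer_monthly_cost, months_of_work,
                             max_payback_months):
    """Monthly saving needed for the work to pay for itself
    within the chosen payback window."""
    total_cost = engineer_monthly_cost * months_of_work
    return total_cost / max_payback_months


def payback_months(engineer_monthly_cost, months_of_work, monthly_savings):
    """How many months of savings it takes to recover the engineering cost."""
    return engineer_monthly_cost * months_of_work / monthly_savings


needed = required_monthly_savings(10_000, 1, 12)
print(round(needed, 2))                     # 833.33
print(payback_months(10_000, 1, 1_000))     # 10.0
```

The exact threshold is about $833/month; rounding up to $1k, as above, pays back in 10 months and leaves some margin.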


Bang on right. That's exactly how we're looking at it.

One thing we also consider as well is the overhead of adding another person to our team. We're learning a lot as we go (we've never built anything like this), so we have to be very careful that each teammate we bring on is aligned on vision, has the tools and resources they need to be successful and make smart autonomous decisions, and can fit into the current team without disrupting too much of the other teammates. That means that we can't just double our team size overnight and stick two new people on optimizations and cost reduction, even if it financially and procedurally made sense.


It makes sense to weigh up your options. I'll propose one you may want to consider.

By hosting on Heroku (or any other cloud hosting provider) you're saving on devops, but you're also paying over the odds for your hosting. However, if you had an option that retained most of the devops simplicity of Heroku, but also cut costs, evaluating this would make sense.

I would suggest that this option exists, and that option is application containers (Docker, etc...). If you can build your infrastructure around application containers, not only do you have the option of using dedicated hosting when you have access to the processing capacity to do so (and thereby save money), but you also have option to scale into the cloud if/when the local capacity is exceeded. A number of the cloud hosts support Docker, including but not limited to OpenShift:

https://blog.openshift.com/openshift-v3-platform-combines-do...

Furthermore, it's a change you can make gradually. Developers can start using application containers as development environments, and you can roll them out more broadly once the implementation issues have been ironed out.

Does this sound like something you would consider?

EDIT: Worth noting that Heroku also supports deploying using Docker containers:

https://devcenter.heroku.com/articles/docker


For those wondering how they can justify these costs, a newer buzzword marketing trend is 'side-project marketing' or 'tool marketing'. Make a useful side project or tool that vaguely relates to your business which drives a lot of traffic, and some of that traffic/goodwill will spillover to your site.


The first thing I asked myself was "How do they make money?" since there are no ads. I found this on Quora and it made a lot of sense: https://www.quora.com/How-does-unsplash-com-make-money


The biggest chunk is the bandwidth charges from imgix. They do appear to be giving you a break on their published pricing, but not a huge one ($0.075/GB vs $0.08/GB). The CDN they are using appears to be Fastly, which also has a published price of $0.08/GB. So, there doesn't appear to be any overzealous markup on imgix's part.

At your scale, there are CDN providers that can get down to $0.05/GB or lower. That would cut your bill by $3500/month...roughly 20% lower than current. I assume moving your legacy CDN stuff at the same time might add to that savings.
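A back-of-the-envelope check of that estimate, assuming the 143TB bandwidth figure quoted elsewhere in this thread all goes through the CDN, decimal terabytes (1 TB = 1,000 GB), and the roughly $18k total monthly bill:

```python
GB_PER_TB = 1_000  # decimal terabytes; an assumption


def monthly_cdn_cost(bandwidth_tb, price_per_gb):
    """Monthly bandwidth cost at a flat per-GB price."""
    return bandwidth_tb * GB_PER_TB * price_per_gb


bandwidth_tb = 143       # bandwidth figure quoted in this thread
total_bill = 18_000      # approximate total monthly spend

current = monthly_cdn_cost(bandwidth_tb, 0.075)
alternative = monthly_cdn_cost(bandwidth_tb, 0.05)
savings = current - alternative

print(round(current))                        # 10725
print(round(alternative))                    # 7150
print(round(savings))                        # 3575
print(round(100 * savings / total_bill, 1))  # 19.9
```

Flat per-GB pricing is a simplification; tiered pricing or regional rates would change the numbers.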

Not sure how tightly bound imgix is to fastly, if fastly has some feature that's not available elsewhere, or if imgix would even be willing to entertain support for alternative CDNs.

Edit: Also curious if you've ever done any analysis to see if bots are using a significant portion of your bandwidth. High res images sounds like a popular target for leechy bots...maybe some savings to be had in detecting/blocking that?


I don't work for them, but I've had nothing but good experiences with CDN77. You can contact them for enterprise-level service, and they'll be happy to make an arrangement with you as a growing company even though the enterprise level 'technically' starts at 500TB per month.

132.2 TB of bandwidth through them is 1,712.196 per month---6 pops EU/US.

Is it as good as Cloudfront? No. CF has 32+ PoPs.

But you really need to question "do I need < 5ms latency to everywhere on the planet including Asia (huge costs) and Australia (insane costs) for my free service?"

I hope you think about this seriously.

Though if you think you need a service comparable to CloudFront, contact Highwinds or even Akamai. They're much easier to negotiate with than CloudFront, and you will save money.

---

Aside: I run a media group with massive bandwidth requirements (1PB+ per site) so I've churned through quite a few "enterprise level" CDNs while building up traffic over the past 10 years.


Any CDN that doesn't have Australian POPs is tier 2 or 3 at best, and isn't worth using.

You can't ignore an entire affluent continent and expect to be taken seriously. Australian bandwidth has dropped massively in cost since you last took a look at it, so there are no excuses.


CDN77 has Australian PoPs, they're just not included in the publicized 'high bandwidth plan'. High bandwidth sites have different needs from the rest.

For instance, our network serves 45% US/Canada traffic, 10% UK traffic, 15% German traffic, and 20% Korean.

Australian? That's 1% of 1%. We get more traffic from New Zealand than Australia.


CDN77 has a POP in Sydney.


This is fascinating.

Are there more of these types of blog posts?

I always wonder how much companies / startups are spending for their infrastructure.


For another detailed breakdown of web app cost, see the thread for Cushion: https://news.ycombinator.com/item?id=10875879


This is great. I see/hear too much of this:

"Well, you can run some kind of little site/app for free or around $5-10 month. So, I imagine a much larger and more serious app would be like... $50/month? At most?"

It seems right. I don't blame them. We know it's not true.


This is akin to the fact that non-technical people (and, hell, even technical people) have no conception of the costs of software development.

"My nephew made a Tetris game in a week, so you should be able to make a billing system in... I dunno, two weeks? It doesn't even need sound effects!"


Technical people? Hell, even us blokes writing the software have no clue. I have 20 years in this, but guesstimating costs and timelines? Nope. I'm a little better, but I mean, graduating from t-ball to little league.

The only time I hit it on the head, is when I'm doing a project, the type of which, I've done before. And I tend to avoid those.


I still use the rule that my mum taught me: Take your best guess at how long you'll take to do the job, and double it. That's the number you tell your team. Double that, and that's the number you tell the client.


Is it me or have they forgotten to mention how much they earn? How in the hell would they earn 18k a month?


They've written previously about how they consider Unsplash a valuable side project: http://blog.crew.co/how-side-projects-saved-our-startup/ From the wording in that blogpost, I would guess that they consider it part of their marketing spend.


The article is about costs. Unsplash is made by https://crew.co/ and featured in their "labs" section, so I assume the project is paid for by other projects they do.


The site is a marketing tool for their other projects; they talk about it here: http://qz.com/281725/how-side-projects-saved-our-startup/


Interesting that they pay only $1000/m for Keen.IO. The statistics that are posted seem to indicate they would go far over the 15mln events per month that Keen charges $1000/m for.


Great post. I wish more companies would be equally transparent about their infrastructure details. I think the lack of cost optimization is underrated though.

The problem is that as you scale, the lack of optimization scales with you. Ultimately, a late optimization and migration affects more users, involves more complexity, etc.

I can't wait for the next post to hear about the details of the choices that have been made.


This price breakdown, particularly the CDN costs, reminds me of how fascinating it is that Cloudflare can compete in content delivery at scale without billing by bandwidth.



