The interesting thing is "serverless" batch processing. Amazon has essentially reinvented the 1970s mainframe batch-processing business model, complete with charging for CPU usage, and made it readily available to any developer anywhere.
Almost: does Lambda track resource usage down to CPU cycles, heap usage, and network bandwidth? I've contemplated adding scripting-language support to my BaaS project with this level of accounting, but from what I've determined, it would require creating a new language, a new runtime for an existing language, or using Lua. Lua's runtime is simple and easier to modify from what I can tell. Unfortunately, no one's writing web apps in Lua, aside from the OpenResty guys.
The goal being the utility model of computing: pay for exactly the cycles you use, no more, no less. Off-premises, on demand. Like tap water.
They do, yes. Billing is rounded up to the nearest 100ms and scaled by the chosen RAM capacity: https://aws.amazon.com/lambda/pricing/ Note that this doesn't measure actual RAM use, although they do display it after each Lambda run.
Data traffic is already measured and priced accordingly in AWS so that's not something new for Lambda.
That's not exactly what marktt was asking. For example, calling sleep (explicitly, or implicitly while waiting on network input) would not incur much CPU cost under marktt's accounting method, but it does incur a wall-clock charge under Lambda.
(It's totally understandable and fair that it does; it's just different than what he asked.)
One could think of Lambda as charging for (A * network bandwidth + B * RAM usage * wall-clock time + C * CPU time + D * disk I/O), where C and D are both zero.
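For concreteness, a rough sketch of that billing formula in Python, using the published per-request and per-GB-second rates at the time (assumptions; check the pricing page linked above for current numbers):

    import math

    PRICE_PER_REQUEST = 0.20 / 1_000_000  # $0.20 per 1M requests (assumed rate)
    PRICE_PER_GB_SECOND = 0.00001667      # $ per GB-second of compute (assumed rate)

    def lambda_cost(invocations, avg_duration_ms, memory_mb):
        """Estimate a monthly Lambda bill: requests + RAM * wall-clock time."""
        billed_ms = math.ceil(avg_duration_ms / 100) * 100  # rounded UP to 100ms
        gb_seconds = invocations * (billed_ms / 1000) * (memory_mb / 1024)
        return invocations * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND

    # e.g. 30k invocations averaging 250ms on a 128MB function:
    print(f"${lambda_cost(30_000, 250, 128):.4f}")  # -> $0.0248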
From a capacity-planning perspective, your goal should be 100% CPU utilization per box. Allocating excessive, unutilized wall-clock time is wasted capacity. The same applies to RAM/heap per call. These are the same considerations that played out in client/server vs. mainframes for so many years.
Mainframe budgeting of cycles, memory, and I/O was highly effective at efficiently utilizing resources. It's a model of computation that has mostly disappeared but is still relevant. When Google App Engine first came out, I had hoped it would use this model, but it went the containerization route instead.
Lambda does not bill on these, nor does it provide these metrics out of the box. This is something that third parties can provide, however; it's something we collect with IOpipe [1] (disclaimer: I'm CTO & founder).
That's exactly what we offer [1]: serverless, in-process, hosted SQLite served by Apache and mod_lua. We have application-level caching, so for high-read, low/medium-write applications, scale is no problem; you're mostly served from Redis in that case. Applications that do a lot of inserts/updates aren't ideal in our case. SQLite's WAL can handle a lot, but our service hasn't been stressed that way yet.
Why SQLite? It's really an awesome database, and by using it in-process from Apache we can achieve truly massive multi-tenancy at really low cost.
We also have static file hosting; we're going for hosting of single-page apps, but the database API is available over CORS from any domain.
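The service itself runs on mod_lua, but here's a minimal Python sketch of the read pattern described above (Redis in front of an in-process SQLite opened in WAL mode); all names are illustrative, not our actual code:

    import json
    import sqlite3

    import redis  # third-party client: pip install redis

    cache = redis.Redis()
    db = sqlite3.connect("tenant.db")
    db.execute("PRAGMA journal_mode=WAL")  # readers don't block the single writer

    def get_rows(tenant, table):
        """Read-through cache: serve hot reads from Redis, fall back to SQLite."""
        key = f"{tenant}:{table}"
        hit = cache.get(key)
        if hit is not None:
            return json.loads(hit)
        # Illustrative query only; a real service must whitelist table names.
        rows = db.execute(f"SELECT * FROM {table}").fetchall()
        cache.set(key, json.dumps(rows), ex=60)  # short TTL; writes should invalidate
        return rows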
Do you allow any other server-side scripting? For instance, I'm building something that may fit this pretty well, but I have some PHP that geocodes addresses submitted without lat/long coordinates before the row is written to the db. And my next step is some backend scripting to pull automated feeds into the database on some interval (likely daily) for the web interface to consume.
I'd like to eventually, but not at this time (see my Lua comment above). You can of course process the data offline and use our database API to update the table on a schedule. I realize that's not optimal for what you're asking, though.
Or, you know, code an AWS Lambda to call our service, get your data, process it, then post it back. :)
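Something like this minimal handler sketch, say (the endpoint URL and the "processing" step are placeholders):

    import urllib.request

    # Placeholder endpoint; substitute the actual database API URL for your table.
    API_URL = "https://example.invalid/api/tables/mytable"

    def handler(event, context):
        """Fetch the data, transform it, and POST the result back."""
        with urllib.request.urlopen(API_URL) as resp:
            raw = resp.read()
        processed = raw.upper()  # stand-in for real processing
        req = urllib.request.Request(API_URL, data=processed, method="POST")
        with urllib.request.urlopen(req) as resp:
            return {"status": resp.status}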
Apologies for spamming this thread; I thought of this after the edit window expired.
Thanks for checking; true, there's no info on pricing. We just launched a few weeks ago and haven't seen much interest yet, so we're just offering the rate-limited free tier for now.
Out of the gate, our users get a subdomain on our site. But to serve web apps from a custom domain (e.g. www.mysite.com), or to access the database API over CORS from a different domain, our tentative plan is to charge a small monthly fee (less than $10 a month) and remove the rate limit.
Thanks for mentioning those things; they help guide the roadmap. Right now we're focusing on performance, then a SQL engine, then auth tooling. Could you expand more on spam? Like DDoS and rate limiting?
You must be an engineer yourself ;) using the 4X rule. Haha.
We've built a tool called panic (https://github.com/gundb/panic-server) to test stuff like this; however, you're correct, we haven't finished integrating all the pieces.
Yeah, shoot me a message (mark AT gunDB DOT io) now, and I'll send you a ping when we're testing that stuff. :) Thanks for the feedback!
I've been using S3 for serving so much of my web work that I can't believe it has only been 3 years since I learned how to do static page serving on S3.
Do you use any particular framework to generate the static pages, like Jekyll or Middleman?
Slightly OT: if you are less of a Ruby guy, what are the go-to frameworks in other languages? Especially when you are not looking for a blog CMS (e.g. Pelican or Hugo).
I use Middleman almost exclusively. It gives me almost the simplicity of Jekyll but with all the flexibility of writing and including Ruby whenever I feel like it (which can be bad...). I've written things that are almost akin to a small read-only Rails app except the data is stored as YAML files.
If I had to move from Ruby, I think I'd give Lektor (made by the creator of the Python Flask web framework) a deep look. Besides being a static-site framework, its creator has endeavored to build a client-friendly GUI, making it a potential way to build static sites that can be maintained by laypersons. It has a very opinionated structure, which caused me to rethink how I arrange files in my other static projects: https://www.getlektor.com/
21 cents isn't the true cost, but cool nonetheless. He's on the free tier available to new signups for up to a year. For that matter, he could have run it on a t2.micro and it would still have cost pennies. Still, a neat setup. (Edit: Lambda's free tier isn't time-limited, as noted in other comments below.)
Lambda requires a lot of boilerplate and configuration to get things set up. Even when you use Serverless/Apex/Gordon, there's quite a bit of configuration to do. I found ClaudiaJS (https://claudiajs.com/) much quicker to get up and running, and it sets up the API gateways for you based on your API signatures.
A 35MB JAR deployment has gotta be painful (zip, upload, deploy). It helps to have a local dev and testing setup from the get-go. Once you get past all that, working on your codebase can be fun again.
>The Lambda free tier includes 1M free requests per month and 400,000 GB-seconds of compute time per month. The memory size you choose for your Lambda functions determines how long they can run in the free tier. The Lambda free tier does not automatically expire at the end of your 12 month AWS Free Tier term, but is available to both existing and new AWS customers indefinitely.
The free tier is actually big enough to run a single 128MB calculation for 37 days each month.
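The arithmetic, for anyone checking: 400,000 GB-seconds divided by the 0.125 GB allocated:

    free_gb_seconds = 400_000
    memory_gb = 128 / 1024                  # 0.125 GB

    seconds = free_gb_seconds / memory_gb   # 3,200,000 seconds
    print(seconds / 86_400)                 # ~37.04 days of continuous compute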
Yes, but it doesn't scale up easily and it doesn't scale down at all. If you are doing exactly that amount of traffic on a predictable basis, the fixed server makes sense.
Regarding JAR file size: you could use the minimizeJar flag of maven-shade-plugin and add only the needed AWS SDK artifacts as dependencies. You could also use https://github.com/lambadaframework/lambadaframework, the Java AWS Lambda framework I created.
This is pretty neat. Stuff like this is slowly creeping in and is going to cause dramatic decreases in costs, not just for startups but for enterprises alike.
Similarly, we ran a stress test where we saved 100M+ records a day (~100GB) for about $10 a day (all costs: machines, disk, backup).
Not their core business? Their version wasn't profitable enough?
Hadn't really read about Parse before. It seems it was primarily aimed at mobile developers. Did it have significant use cases beyond that? In comparison, AWS Lambda, Google Cloud Functions, and Azure Functions seem more generically applicable.
This is a great writeup and really demonstrates how "the cloud is the computer".
As many comments point out, there are other ways to do this.
But doing it entirely with the AWS primitives of Lambda, S3, and automatic triggers is something very novel and almost certainly the best way to use "the cloud".
The title is actually understating things a bit. This is $0.21 not just for ~30k page views of a static file, but also for a fairly involved data-processing pipeline that feeds into the generation of that static file.
Not sure why this even needs a backend, though; as the post above you mentions, this could all be done in the frontend. It doesn't seem like very complicated math.
Then again, $0.21 is not very much, but it could be free...
It should be possible to scrape the data with AWS Lambda, push the results to GitHub, and let the browser deal with the data. All for free.
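The push step could be a sketch like this against GitHub's contents API (repo, path, and token are placeholders; the content must be base64-encoded, and updating an existing file requires its current blob SHA):

    import base64
    import json
    import urllib.request

    URL = "https://api.github.com/repos/OWNER/REPO/contents/data.json"  # placeholder
    TOKEN = "..."  # personal access token (placeholder)

    def push(content, sha=None):
        """Create or update a file via the GitHub contents API (PUT)."""
        body = {"message": "update scraped data",
                "content": base64.b64encode(content).decode()}
        if sha:  # required when updating an existing file
            body["sha"] = sha
        req = urllib.request.Request(
            URL,
            data=json.dumps(body).encode(),
            method="PUT",
            headers={"Authorization": f"token {TOKEN}"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)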
Is such use of GitHub in compliance with their terms of service?
I've been thinking about using such public Git providers to store small amounts of data, possibly encrypted, for example important documents that I don't want to lose. It seems it'd be OK as long as you manually create the account.
Not really, although you'd probably need to use quite a bit of traffic before it becomes a problem. Your example of personal-use-only seems fine.
There was a post here a while ago (that I can't find anymore) about the devs of a package manager of sorts who were kindly asked by GitHub to do something about their excessive data usage. So it's probably not a good idea to build a company on it.
"Data is typically made available at different days/times throughout the week by different external sources, so each Collector is triggered by a CloudWatch cron job." Also it's in Java, but the author mentions at the bottom that python would be better if you need low latency.
Run the Python cron job every five minutes; it checks the condition of all data sources. You don't need a server to do this; you can run it from your house!
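A minimal sketch of that poller (the source list and freshness check are placeholders), with the crontab line as a comment:

    # crontab: */5 * * * * /usr/bin/python3 /home/me/check_sources.py
    import urllib.request

    SOURCES = ["https://example.invalid/feed1",
               "https://example.invalid/feed2"]  # placeholder data sources

    def last_modified(url):
        """Cheap freshness probe: HEAD the source and read Last-Modified."""
        req = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(req) as resp:
            return resp.headers.get("Last-Modified")

    for url in SOURCES:
        # A real job would diff this against a stored timestamp and
        # regenerate/upload the HTML only when a source has changed.
        print(url, last_modified(url))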
Maybe, but I guess I don't see why you're so worried about it. It works and it costs less to run than it would cost in electricity if he were running it from his house.
He doesn't have a "fully fledged server". He's sharing some servers with tens of thousands of other people. It's literally less hardware investment than an RPi, let alone a laptop, and it's priced accordingly. And it runs even when his home computer isn't powered on or online.
Then you'd have to pay for a server to run the cron jobs, which would be more expensive. You could run it from home, but then you'd need to manage uptime yourself.
A simple cron job doesn't require managing uptime; it just needs to run frequently enough and then upload the generated HTML to GitHub. I'm sure he uses his laptop at least once a day. This is crazy!!!
I don't know about OP, but I know that I often don't use my (personal) laptop once a day. I think I went a month earlier this year where I only touched it twice, and that was to move it out of the way so I could write a letter at that desk. Setting up something like this that's so inexpensive is a reasonable selling point for hobby projects that fit the model AWS has been designed for.