- Cheap VPS for hosting, Postgres for the DB, Nginx as a reverse proxy, Redis for everything else, including caching.
- Deploy Python projects by packaging them as a zipapp with shiv, then use Fabric to SSH in and perform whatever migrations are needed. No, not even Ansible. (See the deploy sketch after this list.)
- Build, lint, format and test, like everything automatable, are driven by pydoit (dodo.py sketch below). If you are solo, you don't need a CI service: your laptop is the CI machine.
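Here is a minimal sketch of that deploy flow, assuming Fabric 2.x, a `myapp` console script that handles its own migrations, and made-up paths and host names:

```python
# fabfile.py -- deploy sketch; the host, paths and the "myapp" entry point
# (including its "migrate" subcommand) are placeholders.
# Run with: fab -H user@server deploy
from fabric import task  # Fabric 2.x

@task
def deploy(c):
    # Build the single-file zipapp locally with shiv.
    c.local("shiv -c myapp -o dist/myapp.pyz .")
    # Ship it to the server.
    c.put("dist/myapp.pyz", "/srv/myapp/myapp.pyz")
    # Apply migrations, then restart the service.
    c.run("/srv/myapp/myapp.pyz migrate")
    c.run("sudo systemctl restart myapp")
```

And the pydoit side is just a dodo.py with one function per task; the actual commands (black, ruff, pytest) are examples, swap in whatever you use:

```python
# dodo.py -- pydoit sketch; run with `doit` or `doit build`.
def task_format():
    return {"actions": ["black ."]}

def task_lint():
    return {"actions": ["ruff check ."]}

def task_test():
    return {"actions": ["pytest -x"]}

def task_build():
    # Only build the zipapp once the checks pass.
    return {
        "actions": ["shiv -c myapp -o dist/myapp.pyz ."],
        "task_dep": ["format", "lint", "test"],
    }
```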
Scaling is always the same story:
1 - Start with the cheapest VPS possible. This forces you to code under some constraints. Nothing crazy, just enough to keep you honest.
2 - Once you see your load average rising, just move the DB to a second cheap VPS. You just tripled the load you can sustain. Not doubled, tripled. Most solo projects stop here, actually.
3 - Wait again, and collect some perf data when the site starts to feel slow. Check for slow queries, hot code paths, etc. Optimize those, add some caching, set up task queues. Use cron to purge or regenerate caches if you need to smooth out the curve (sketch after this list). Maybe slap Varnish in front for extra pep, or just Cloudflare.
4 - Now you have what looks like real load, and a more realistic code base, so when you peak again, request more resources for your server, or migrate to a bigger one. You can scale vertically to crazy heights nowadays. Really, really crazy: terabytes of RAM, 64-core servers, etc. At that scale, your service should be generating enough money to pay for it ten times over anyway. But you probably won't need to. Even cheap servers are beasts; look at the current Leaseweb offers: https://www.leaseweb.com/dedicated-servers#NL
To put it in context, MySpace used to serve all its users from only 2 servers, until they reached 500,000 accounts. Only then did it become too much. And that was with the hardware (and prices!) of the 2000s.
5 - You will probably never reach this point. This is where Kubernetes, load balancers, sharding, etc. start to become interesting.
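The cron-driven cache regeneration from step 3 can be a tiny script. A sketch using redis-py, where the key name, the TTL and the `render_homepage()` helper are all made up:

```python
# warm_cache.py -- regenerate an expensive page ahead of time.
# Example crontab entry: */10 * * * * /usr/bin/python3 /srv/myapp/warm_cache.py
import redis

r = redis.Redis()  # defaults to localhost:6379

def render_homepage() -> str:
    # Placeholder for the expensive queries / template rendering you precompute.
    return "<html>...</html>"

def warm() -> None:
    html = render_homepage()
    # Overwrite the cached copy; the TTL outlives the cron interval, so
    # visitors always hit the cache and never trigger the slow path.
    r.set("cache:homepage", html, ex=15 * 60)

if __name__ == "__main__":
    warm()
```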
How to preserve your data:
- RAID
- Dump the DB with a _randomized_ cron (sketch below)
- rsync the dump and all assets. You can do it to another server, or just to your laptop at the beginning.
You can get fancy with database replication if you want, or use backups that stream in real time.
But there is one trick: not all data is equal. Identify the data in your DB that you can't afford to lose, and make sure it is saved separately, very regularly, and with priority.
For plenty of data, a hole in it is something most people won't care about. E.g., if 1 tweet out of a million from 10 years ago is missing, do you think it affects the service?
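A sketch of that randomized dump + rsync; the DB name, paths, backup host and the "critical" table are placeholders:

```python
# backup.py -- run from cron; the random sleep spreads the load so the dump
# doesn't land at the same minute every day.
import random
import subprocess
import time
from datetime import date

time.sleep(random.randint(0, 3600))

today = date.today().isoformat()

# Full dump, compressed.
subprocess.run(f"pg_dump myapp | gzip > /var/backups/myapp-{today}.sql.gz",
               shell=True, check=True)

# Separate dump of the one table you really cannot afford to lose,
# so it can also be scheduled more often than the full dump.
subprocess.run(f"pg_dump myapp -t payments | gzip > /var/backups/payments-{today}.sql.gz",
               shell=True, check=True)

# Ship the dumps and the assets off the machine (another server, or your laptop).
subprocess.run(["rsync", "-az", "/var/backups/", "/srv/myapp/media",
                "backup-host:/backups/myapp/"], check=True)
```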
Monitoring:
- Sentry
That's all. It's free (or cheap), and for a small service, you don't need real time. It's OK to be down a few hours once a month for most services at first. I have a service with 700k unique users a day, and it still goes down sometimes. It's a blip.
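Wiring Sentry in is a couple of lines; the DSN below is the placeholder from the Sentry docs, and the sample rate is just an example:

```python
import sentry_sdk

sentry_sdk.init(
    dsn="https://examplePublicKey@o0.ingest.sentry.io/0",  # your project's DSN
    traces_sample_rate=0.1,  # sample 10% of transactions for performance data
)
```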
Sometimes, log into the servers, run htop and check what's up. You can install OpenTelemetry later if you really need to, but for now, even the HN hug of death is not going to kill you.
Summary:
- Modern software and hardware are amazing. Max them out. Horizontal scaling is hard, and expensive.
- You don't need a perfect service. Unless you are handling patient cancer data, that is. Don't worry about perfect uptime, zero data loss, etc. If you are a solo dev, the cost of that is huge. Just get it 97% right.
- What worked 20 years ago still works today, and will likely still be there tomorrow. And you can move from that to the cloud later; the reverse is not as nice.