Reclaiming the lost art of Linux server administration (pietrorea.com)
602 points by prea on Jan 28, 2022 | 468 comments



I have over 20 years of Linux/FreeBSD sysadmin experience, ranging from universities to major Silicon Valley companies, in both cloud and on-prem environments.

When it comes to companies I mostly support cloud these days, but when it comes to me and my family, I accept every downside and host almost all of our digital lives in a 42U rack in a gutted closet in our house, with static IPs and business fiber.

I know where our data lives and no one can access it without a warrant and my explicit knowledge. I also save myself several hundred dollars a month in third-party cloud provider fees to host the same services, and I can reboot, upgrade, or repair anything whenever I want, with in general no more maintenance than cloud servers. I also never end up with exciting bills when experiments are forgotten about.

You pretty much get all the pros and cons of home ownership. For me it is mostly pros. Also keeps me dogfooding all the same practices I recommend to my clients.


I remember how surprised people were when I demoed a $200/month bare metal server outperforming, by a huge margin, the RDS MySQL instance they were paying something upwards of $16k/month for.

IIRC we ended up using it as a disposable replica for some non-real time but heavy operations.


It's 2022 and we're about to rediscover something we've known for 40 years already: mainframes are freaking expensive.


It's 2022 and we're about to rediscover something we've known for 40 years already: mainframes have freaking performance.


>mainframes have freaking performance

Yes and no, it really depends on the workload. But freaking reliability, that's for sure.


Reliability + security. Performance is poor; they were designed for batch.


>Performance is poor; they were designed for batch.

That's once again an oversimplification; performance is excellent for the intended workloads.


You misread something: it's 2022 and we're discovering that mainframes can be cheaper than an AWS cloud machine park ;)


I thought AWS was the mainframe in the parent comment?!


Please educate yourself on what a mainframe is.


Of course AWS is the mainframe in this analogy! It's a huge throughput- and reliability-optimized, overhyped, crazy expensive server that you hire someone else to run for you. It's designed the same way, with dedicated storage, compute, and backplane servers. Even the discourse is the same as in the 80s: "why do we have to rent a $$$ monster when a $ commodity server outperforms it". How this is not obvious is beyond me.


I briefly encountered mainframes in the 90s. I thought it was a perfect analogy to AWS, where the "intelligence" is in the center and we're just piping stuff over the network for AWS to compute. I don't understand how self-hosted/self-provisioned servers are supposedly comparable to a mainframe; quite the contrary.

> It's 2022 and we're about to rediscover something we know for 40 years already: mainframes are freaking expensive.

My interpretation: AWS is freaking expensive, AWS is in the center performing all computations; AWS is the mainframe we can well do without.


It's the wrong analogy, but I also misunderstood you. Sorry for not understanding your comment, but not sorry for using the term mainframe wrong ;)


Here's what the bare metal server didn't come with:

API access for managing configuration, version updates/rollbacks, and ACL.

A solution for unlimited scheduled snapshots without affecting performance.

Close-to-immediate replacement with an identical setup within seconds of a failure.

API-managed VPC/VPN built in.

No underlying OS management.

(Probably forgot a few...) I get that going bare metal is a good solution for some, but comparing costs this way without a lot of caveats is meaningless.


> Here's what the bare metal server didn't come with:

[bunch of stuff I don't need]

Exactly. Imagine paying for all that when all you need is bare metal.

Now imagine paying for all that just because you've read on the Internet that it's best practice and that's what the big guys do.

Way back the best practice was what Microsoft, Oracle or Cisco wanted you to buy. Now it's what Amazon wants you to buy.

Buy what you need.


Your point is well made... The other thing you need to think about is that all those extra services can come with a reliability cost if your provider is not 100%. Most outages we've encountered have been because the HA infrastructure has decided to have problems rather than the underlying hardware or actual service we need to run.

Having all that "best practice" service is great if it works well, but when it becomes a checkbox on the purchase order then it can cause far more problems than it solves.

I have found that a lot of the push to outsource hosting is simply an attempt to deflect responsibility for problems rather than in expectation of actually providing more reliability.


It’s great if you need that reliability in practice - and not just that you think you need it.

For every company or startup that thinks they need 100% uptime, the reality is that they can not only get away with much less, but in practice will end up with much less anyway, because the effort and moving parts (load balancing, distributed database, etc.) will typically result in something failing even if the underlying hardware is indeed 100% up. And, somewhat surprisingly to quite a few people here, they manage to survive and thrive despite that (the recent AWS outages took out a lot of services and products, and they still seem to be around somehow).


best practice

Ugh. I hate that phrase. The translation into plain English is almost always "What I read in some blog" or "Because I want to" or "It's what our sales rep told us." Even from C-levels who should know better.


I generally use it as shorthand for, “when and not if we call the vendors with this configuration, they cannot pass the buck and say it isn’t a configuration they support”. “Best practice” as an endorsed configuration blessed by the vendor support engineering organization is my go-to sign-off to get quick cooperation from the vendors.


It should mean something. There are things that are just good ideas you probably shouldn’t stray from. But the moment a concept like this materializes it gets hijacked by marketers and celebrity developers and distorted into oblivion.


> Exactly. Imagine paying for all that when all you need is bare metal.

Yes.

The opposite is also true though: Imagine not wanting to pay for that and needing it!

There's a reason why most homes connect to the power utility companies. Yes, we can run generators ourselves. Does it make sense to do that? Not usually.

Same thing with this server. If it makes sense for your use-case, outstanding. In many cases, people are better off offloading this to another company and focusing on their strengths.


>"Yes, we can run generators ourselves. Does it make sense to do that? Not usually"

The minute a gens/solar/wind/batteries combo becomes less expensive than the public utility, I'll switch. For now it makes no financial sense.

With the cloud it is the other way around. My dedicated servers running my software kick the shit out of AWS performance-wise, for a fraction of the price. And no, I do not spend my days "managing" them. I can order a new dedicated server and the same shell script will reinstall all prerequisites, restore data from backup, and have it running in a few minutes (plus however long it takes to import the database from backup). Where needed I also have up-to-date standby servers.

Other than running this script to test restoration once a month, my management overhead is zero.
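
For the curious, a minimal sketch of that kind of reinstall-and-restore script (the package list, backup remote, and service name here are all invented):

  #!/bin/sh
  set -e
  # install everything the app needs on a freshly provisioned box
  apt-get update && apt-get install -y nginx postgresql
  # pull the latest database dump from off-site backup storage and restore it
  # (assumes a pg_dump -C style dump that recreates its own database)
  rclone copy remote:backups/latest.sql.gz /tmp/
  gunzip -c /tmp/latest.sql.gz | sudo -u postgres psql postgres
  # start the application service
  systemctl enable --now myapp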


Exactly! Who needs a kitchen in their home, it's better to just order in all the time. All that complexity of cooking daily, preparing and storing food, cleaning dishes, etc, etc. I mean, who's going to even do the cooking!

See... anyone can pick <random thing>, and describe how people don't do it. Of course, you cite a generator, others use solar cells, right?


Yeah imagine cutting your own hair and diagnosing your health problems. Growing your own food and making your own chips.

It looks sensible or nonsensical based on who you are and what your needs are. Not everybody needs cloud; not everybody can self-host.


> bunch of stuff I don't need

And that's fine. People should make choices like that. I was objecting to saying people paid X for RDS and others paid Y for one bare metal server. Those are not in the same category, there's no point comparing those prices without a lot more information.


I am buying IaaS; it is so nice to use a VPS with snapshots every 4 hours that I don't have to think about.

I don't care where those snapshots are stored and how much space they take. In case I need to restore, my IaaS provider gives me a 2-click option: one click to restore and a 2nd to confirm. I sit and watch the progress. I also don't care about hardware replacement and anything connected to that. I have to do VPS OS updates, but that is it.

I do my own data backups on a different VPS, of course, just in case my provider has some issue, but from a convenience perspective that IaaS solution is delivering more than I would ask for.


Snapshots every 4 hours? That doesn't sound impressive at all. In 2022 that's laptop tier capability.


Well my dev laptop could run 1000x more traffic than what I need on my servers.

But of course I am running build tools and a bunch of other dev tools locally that are not even installed on our servers.


I'm not talking about traffic; I'm just talking about expectations as far as software capabilities go.


What on earth are you running on your laptop?


Windows 11 with Edge. Nothing else /s

But in all reality, laptops are really, really powerful these days.


I use customer cast-offs and even employee cast-offs. My desktop was cast off by a customer because Win 10 ground to a halt. I slapped an ancient Nvidia card in to get a couple of screens and popped in a cheap SSD. According to dmidecode it is a Lenovo from 01/06/2012.

My laptop at the moment is an HPE something ... HP 250 G6 Notebook PC according to dmidecode. I used to have a five-year-old Dell 17" i7-based beast but it ... broke. I whipped out the SSD, shrank the root fs a bit (with gparted), and used a Clonezilla disc and an external USB link to get my system from the SSD onto a Samsung EVO M.2 thingie. This laptop was an employee cast-off.

As is probably apparent from the above, I use Linux. I have some decent apps at my disposal but in general I don't need much hardware. Decent here means: RAM >= 8GB and SSD storage are key. I don't play games much. I've always specified 17" screens in the past for my laptops, but now that I have to use a 15" jobbie, that's becoming less of a hard requirement.

I am a Managing Director (small business - IT consultancy) but I do spend rather a lot of my time doing sysadmin and network admin stuff. I also do business apps. I once wrote a Finite Capacity Plan for a factory in Excel with rather a lot of VBA. Before you take the piss, bear in mind I used the term finite and not infinite. That meant that quite a few people had employment in Plymouth (Devon not MA) in the 1990s. I won't bore you with my more modern failures 8)

Anyway as you say, modern laptops are phenomenally powerful. Mine have wobbly windows 8) I can't be arsed with MS Windows anymore for my own gear - it gets in the way. Me and the wife said goodbye at Windows 7.


I think I found a simpatico friend!


> Buy what you need


> [bunch of stuff I don't need]

They tend to be the things that you don't need until you absolutely need them right now (or yesterday).


I'm too busy to tend to the care and feeding of a database server. I'll gladly pay amazon to do it for me.


Typical statement from someone who doesn't actually pay the bill (a.k.a. a software engineer in most companies).

It’s easy to trivialize $20-30k a month when it’s someone else’s money and it’s less work for you.


When considering the cost of having a secure server room, a commercial internet line, electricity, cooling and hardware - plus the cost of my time (or someone else's on the team), AWS gets more and more attractive. The business side paying the bill agrees fully.

People on my team would rather be developing software than doing sysadmin stuff.


There is a continuum between "I run servers in my basement" and "I pay mid 5 figures monthly to have autoscaled autofailover hotstandby 1s-interval backed up cluster across three AZs".

I use AWS, DO and Hetzner (bare metal) for (different) cases where each makes sense.

In the past few years, maintenance burden and uptime between AWS and Hetzner hosted stuff was comparable, with the cost being an order of magnitude less for Hetzner (as an added benefit, that machine is more beefy but I wasn't even comparing that).

> People on my team would rather be developing software than doing sysadmin stuff.

I am not surprised software developers would prefer doing software development :-)


My point is that $20-30k/month for an AWS database can actually be a bargain.


> The business side paying the bill agrees fully.

No, they often don’t understand the alternatives available because the wolves are guarding the henhouse (software devs who don’t want to work with the pace of on-prem IT).

> People on my team would rather be developing software than doing sysadmin stuff

No shit, that’s why software devs aren’t sysadmins.


Sure, and you could buy a DigitalOcean droplet or Amazon Lightsail instance for $5/month. Both include 1TB of free data transfer.


Of course there are a lot of benefits of using hosted databases. I like hosted databases and use them for both work and personal projects.

What I have a problem with is:

- the premium over bare metal is just silly

- maximum vertical scaling being a rather small fraction of what you could get with bare metal

- when you pay for a hot standby you can't use it as a read only replica (true for AWS and GCP, idk about Azure and others)


Readable standby instances are actually something AWS just recently added: https://aws.amazon.com/blogs/database/readable-standby-insta...

Though it seems to require you to have 3 instances rather than just letting you read from the standby... I don't quite get the rationale for that.


Even though a hot standby is technically the exact same thing as a read-only replica, it solves 2 very different problems (uptime vs scaling). Personally, I never want read-only queries going to my hot standby; otherwise I risk the secondary not being able to handle the load in case of failover. Probably a reason like that with AWS as well, as they are selling a "hot standby" product to ensure uptime for your RDS.


> when you pay for a hot standby you can't use it as a read only replica (true for AWS

I'm not sure what you mean here. At least for MySQL you can have an instance configured as replica + read-only and used for reads. Aurora makes that automatic / transparent too with a separate read endpoint.


A hot standby is not a read replica. It's a set of servers running in a different Availability zone mirroring current prod, that is configured to automatically fail over to if the primary is offline. It's been a few years since I personally set this up in AWS, but at the time, those servers were completely unavailable to me, and basically doubled the cost of my production servers.

The fact that a hot standby is usually in some sort of read-replica state prior to failing over is a technical detail that AWS sort of tries to abstract away I think.


They recently announced readable standby instances.

https://aws.amazon.com/blogs/database/readable-standby-insta...


I'm not sure, I guess it depends what you're looking for. I'm a hobbyist, with no real professional experience in server admin, so I am probably missing some important things. But most of this can be replicated with ZFS and FreeBSD jails (or on Linux, Btrfs and LXC containers).

>> 1. A solution for unlimited scheduled snapshots without affecting performance.

You can very comfortably have instant and virtually unlimited snapshots with zfs/jails (only occupying space when files change). Very easy to automate with cron and a shellscript.
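
A minimal sketch of that cron + shell script combination (the pool and dataset names here are made up):

  #!/bin/sh
  # take a timestamped recursive snapshot of the jails dataset
  zfs snapshot -r "tank/jails@auto-$(date +%Y%m%d-%H%M)"

  # roll back to a snapshot later with (-r also discards newer snapshots):
  #   zfs rollback -r tank/jails@auto-20220128-0300

  # and in root's crontab, run it hourly:
  #   0 * * * * /usr/local/sbin/zfs-snapshot.sh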

>> 2. API access for managing configuration, version updates/rollbacks, and ACL.

>> 3. Close to immediate replacement of identical setup within seconds of failure.

There are a lot of choices for configuration management (SaltStack, Chef, Ansible, ...). I run a shell script in a cron job that takes temporary snapshots of the jail's filesystem, copies them to a directory, and makes an off-site backup. A rollback is as simple as stopping the server, renaming a directory, and restarting it. It's probably more than a couple of seconds, but not by much. I think I'm uncomfortable exposing an API with root access to my systems to the internet, but I'm not sure how these systems work. I don't think it would be hard to set it up with Flask if you wanted it, though.
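
Roughly, the snapshot-copy-and-ship part of that cron job looks like this (the dataset, mountpoint, and backup host are made up):

  #!/bin/sh
  # temporary snapshot of the jail's filesystem
  zfs snapshot zroot/jails/web@nightly
  # copy the frozen view off-site (ZFS exposes snapshots under .zfs/snapshot/)
  rsync -a /jails/web/.zfs/snapshot/nightly/ backuphost:/srv/backups/web/
  # drop the temporary snapshot
  zfs destroy zroot/jails/web@nightly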

>> 4. No underlying OS management.

I don't know what this is, but I'm curious and looking it up :D.

In most of the posts I'm reading here, people have really beefy rigs. But you could do this on the cheap with a 2000s era laptop if you wanted (that was my first server).


> You can very comfortably have instant and virtually unlimited snapshots with zfs/jails

Yes, but that both requires manual scripting and remains local to the server. Compare to scheduled RDS backups, which go to S3 with all its consistency guarantees.

> There is a lot of choices for configuration management (saltstack, chef, ansible, ..)

Sure, those are an improvement over doing things manually. But for recovery they can only do so much. Basically, think about how fast you can restore service if your rack goes up in flames.

> I don't know what this is, but I'm curious and looking it up :D

It means - who deals with kernel, SSL, storage, etc. updates, who updates the firmware, who monitors SMART alerts. How much time do you spend on that machine which is not 100% related to the database behaviour.

I wasn't recommending everyone use RDS. If your use case is ok with a laptop-level reliability, go for it! You simply can't compare the cost of RDS to a monthly cost of a colo server - they're massively different things.


Thank you very much for this reply! Those are all very good points. You're right, this is a service, and not having to worry about hardware or scripts at all is valuable when you have a million other things to manage.


Everything has a cost. The cost of the cloud is that you’re out of control, the cloud provider owns your stuff, and a single vulnerability can make half the Earth get pwned. Whoops.

EDIT: I do use cloud services for some stuff. My point isn’t being anti-cloud, just that nothing is perfect.


True, and also it costs more money.


Some of these strike me as things where the software _could_ exist, but it's either not FOSS, or I haven't heard of it yet.

For instance, there are LVM2 snapshots. Maybe those do affect performance. If the cost difference is big enough though, couldn't you just account for that in the budget?

I agree that literal "bare metal" sucks, but self-hosted with cloud characteristics (containers, virtualization) is not totally obsolete.
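
For reference, the LVM2 snapshots mentioned above are essentially one-liners (the volume group, LV names, and sizes here are made up):

  # create a 5G copy-on-write snapshot of an existing logical volume
  lvcreate --size 5G --snapshot --name data-snap /dev/vg0/data

  # roll back by merging the snapshot into its origin
  # (deferred until the origin is next activated if it's currently in use)
  lvconvert --merge vg0/data-snap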


That's the point, duh.

When you're using the cloud you're paying someone else a very good margin to do all of those things for you.

If you do them yourself you can save quite a lot. Hardware is surprisingly cheap these days, even with the chip shortage factored in, if compared to cloud offerings.


A lot of these aren’t as important for something that’s fairly static and can’t even be autoscaled or resized live (excluding Aurora - I’m talking about standard RDS).

No access to the underlying OS can actually be a problem. I had a situation where following a DB running out of disk space it ended up stuck in “modifying” for 12 hours, presumably until an AWS operator manually fixed it. Being able to SSH in to fix it ourselves would’ve been much quicker.


Don’t forget all of the outages associated with all of that crap you don’t need.


Most selfhosted distros have some/all of these features baked in: yunohost, freedombox or libreserver come to mind (note the latter is maintained only by a single person).

"Identical setup within seconds of failure" being the exception as you need to deploy from backup which can take minutes/hours (depending on backup size) even if you have the spare hardware. Fortunately, that's the least-needed feature in your list.


> Here's what the bare metal server didn't come with:

It's called software, you put that on top of a server, it's not some kind of magic.


You can get nearly all of that on your bare metal if you use something like OpenStack. I know a couple of people who run VMware on a home lab as well.


That should cost more but not 80x more.

Sure, you're paying the cost of some DBAs and SREs with that price. Still, it seems way above what it actually costs them.


Also physical security, power, physical rack cost, networking, salaries to manage everything, and all those other hidden costs


There are no "hidden costs". They're well known.

Factor them all in, and "the cloud" is orders of magnitude more expensive than bare metal. I've seen people pay $100k/month for services which could run on a $140/month bare metal failover pair.

Meanwhile, people still spend enormous resources managing "the cloud", writing code to deploy to the cloud, dealing with edge cases in the cloud.

There are no savings, time wise, management wise, or money wise, with the cloud.

You are paying for ignorance.


I was watching a video by Neil Patel and he pays $130k a month to AWS for one website, UberSuggest. Now that is a guy with money to burn. He could put that on bare metal and use the savings to buy a hotel and food for the homeless.


Hidden costs? I've been in companies with their own datacenters from tiny to enormous.

The costs were very clearly spelled out and always lower than SaaS.

Why do people here think that cloud companies are charities? The cost of their hardware leasing, personnel, electricity, building management, insurance... it is all paid by the customer.

Plus a beefy profit margin. All paid by you.


Using cloud-managed backups can lull you into a false sense of security. I've heard enough horror stories of the snapshot/restore mechanism failing for some reason, especially in certain edge cases when you use huge amounts of data. What then? All your cloud marketing magic becomes a liability.


Fully agree with this. In my free time I would fully go for bare metal, but if I have to save a company money by taking on all the headaches that AWS solves, then I just say goodbye.


It also comes with 100x bandwidth pricing.


LOL. Priceless. Having these skills is very valuable. Us old farts used to do a lot with what today would be called "bootstrapped". Scarcity is no longer a "problem", except that it is. Scarcity keeps you sharp, efficient, on the edge - where you need to be. It's also - cheaper.


Who would have guessed having local, low latency, high iops drives would be better than VM using iSCSI-attached drive for storage, right? ;-)


Tbf, iSCSI, and especially its coming replacement NVMe-oF, can be made quite fast. The problem is that in the gp2/gp3 and GCE PD cases it's probably not native storage but another layer of software-based distributed store, which makes stuff slow/expensive.


But also less prone to single drive failures. And easily auto expanding storage sizes. Everything is a trade off.


I actually don't suggest that people get off the cloud. My gripe is with the devil-may-care attitude toward costs. I am still very cost-conscious about it (even though most businesses aren't). Newer engineers just don't think of it as a factor. Push a button, get a load balancer. It all adds up.

Your AWS bill is directly proportional to the number of developers, because they all spin up cloud toys they are fond of.


Except people end up running systems on them that have their own replication built in, like Kafka, CockroachDB, etc. In all those cases you'd be better off with your node just hard-crashing if a disk goes bad rather than paying the latency penalty that inevitably follows the failover.


As a young person I agree with you.


I’ve picked up a second hand Dell R720 last year. Best purchase in years. It cost me €1500 for 48 cores and 192GB RAM.


I want one. What’s it done to your electricity bill, though?


If you want to use 4Kn drives, SSDs (w/ TRIM), or to easily switch between RAID and HBA modes, I would recommend an R730 with an H730 controller instead. They're available on eBay with 128GB of RAM for just over $1,000 USD. If you don't want those features, an R720 is a good choice, and around half the price.

Idle power consumption of an R730 with no drives is typically about 85W, though higher-end CPUs drive that up a bit. The older R720s are a bit higher in the 100W range. When the extended warranties expire, high-spec 2nd hand equipment floods the market.


Yes, an R730 would have been better. I learned that after the fact.


Not criticizing at all. The R720s are still excellent servers. You can add a different RAID/HBA controller into an R720, flash LSI firmware onto the one you've got, or just put a bit more effort into ensuring you buy compatible drives. SAS SSDs just a few years old didn't even have TRIM, for example.


Yeah, I believe it's possible to fit the PERC controller from the R730 into the R720 so that the throughput goes up to 12 Gb/s, but I just couldn't be bothered.


Just checked one of our R620s: dual E5-2690 v2 @ 3.00GHz, 256GB RAM.

20% load (9 VMs with 11GHz usage), 196W.

Historical trends for the last week:

  Avg: 203 W | 693 BTU/hr
  Max: 279 W | 952 BTU/hr
  Min: 150 W | 512 BTU/hr

Historical peak: 465 W

Cumulative reading since Mon Jul 13 11:32:33 2020: 2723.893 kWh


R820s idle at ~200W and consume ~500W at 100% CPU load on all 80 virtual cores.


So, if we say it draws around 350W, that makes it around $32/mo in power (rough calculation based on a 12.5-cent US kWh rate), right? Not bad for all that beef.
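
For anyone checking the arithmetic (same assumed rates as above):

  # 350 W continuous over a 30-day month at $0.125/kWh:
  # 0.35 kW * 24 h * 30 days = 252 kWh; 252 kWh * $0.125 ≈ $31.50
  echo '0.35 * 24 * 30 * 0.125' | bc -l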


And for Western European readers, as a comparison...

UK before Apr 2022 = £50.96 ($68 USD) at the £0.2022/kWh price cap rate

UK from Apr 2022 = £76.16 ($102 USD) at the likely new £0.3022/kWh price cap rate

Thus the reason r/homelab users in Europe tend not to have Craigslist blade servers of any sort!


Yeah, but for the $16k a month they have someone they can sue for damages and costs if there is a serious issue. You're more likely to win a payout against a massive blue chip than against a teeny-weeny IT outfit or your own sysadmins.


Just out of curiosity, what did you get out of it? I mean, if they pick it, the responsibility moves from AWS onto you, but do you also get paid some of that money after saving them money? The way I see it, the $16k/month doesn't only pay for the hardware, but also to keep headaches away and to have someone to blame.


IIRC I got to move some heavy analytics queries (rollups), which could not be run on a read-only replica, without hearing the word budget once. The main DB remained on RDS.


Not a fair comparison and you know it. Now add to the $200/month bare metal server the yearly salary of the 3 admins you need to manage it: one as backup, one for daytime, one for the night, plus a weekend rotation. Add to the admin salaries social security, insurance, and a margin of safety in case one is unavailable due to sickness.


> Now add to the $200 month bare metal server, the yearly salary of the 3 admins you need to manage it.

That's the line the clouds, MSPs, and service contract providers sell you on, but it's never true. When you need help you're getting some minimum wage slave who hired on 6 months ago, and is struggling to juggle the dozens of other clients competing with you for his time, too. When you don't need help, they get to take your money for doing nothing.

Anything your provider can do for you, you can do at less cost. Cut out the middle man and you're far better off.


> Cut out the middle man and you're far better off.

All that knowledge is now in-house, and you have access to it at a moment’s notice.


Around here, you'd spread the work of one new server to folks you've already got or spread the cost of new hires to a large number of servers.

I think it's also worth considering that many outfits wouldn't get good value from the 24/365 coverage you propose and don't care to pay for it.


Or you'd hire an MSP. Having a full time employee is a greater level of service than anyone here is getting from a cloud provider. Lots of folks can get by with an admin who stops in once or twice a week, or remotes in and takes care of the daily tasks on largely a part-time basis.


I think this is valid in many cases. A decent sysadmin in the US is (speculating) 200K USD; a good one in Europe is 120K+ euro. You can find cheaper, but they probably can't explain a UDP datagram or know how ECMP over BGP works. It's the understanding of all the underlying tech which makes them worth their weight in gold.


No sysadmin in Europe is being paid 120k euro. It's a learned trade and you'd be lucky to pull 70k.


The calculation from the management side of 120K Euro is correct. And you are also correct that they are lucky in Europe to get paid 70K.

To pay them around 70K, you need to budget for 120K. It will go into social security, taxes, training, management overhead, the HR team, and other cost centers like a margin to account for sickness, absenteeism, people who are trying to "find themselves", personal crises, etc...


The sysadmin you need to manage your cloud infrastructure isn't going to be any cheaper, and if you want high-reliability 24/7, you still need your own staff on call 24/7, so you still need multiple sysadmins.

If some cloud setup can halve your sysadmin labor, then a company with 20 sysadmins can go to 10 sysadmins, but a company with 1, 2 or 3 sysadmins still needs all of them.


I can't tell you how ECMP over BGP works and I was an SRE at Google. Your idea of what's needed to run Linux servers is well over the top.


Cloud services don't need management or specialized workers?


Everything but the most mission critical applications can get away without any of that. I get your point though.


Why would you need 3 admins watching it 24/7?

____________

That's how I've seen it working in a datacenter:

the cheapest junior admins ($450/month) took the night shifts,

and if something broke, they called an engineer

The cloud is no magic place. You need the same people for your cloud infrastructure. Sure you don‘t have a bare metal server failing but you can replace that with a lot of AWS infrastructure glue which could fail too.


> a $200/month bare metal server

For 1 bare metal server? You do know these things run on electricity, right?



For €400/mo all in, split between my friend and me, I've got an air-conditioned office at the local business park with the back room stuffed with ~10 servers, redundant internet links with BGP announcements of our /24 so failover is transparent, and all of that is a 10-minute walk from home. Granted, the setup is more than a bit YOLO; the servers are in two piles rather than a proper rack and the fiber to the servers is just flapping in the breeze from the AC, but we've been able to fix literally everything that went wrong within 30 minutes, except for a power outage ~a year ago when somebody got a bit overzealous with a backhoe.

The main difference between what we've got and a proper setup is on the order of about €5k of fixed costs, all of which have a lifespan of ~decades. The savings from owning your own hardware might shock you.

Also, if you think that the "professional" hosts do any better, consider that that Hetzner box you linked is a desktop machine shoved onto a rack shelf.


> Also, if you think that the "professional" hosts do any better, consider that that Hetzner box you linked is a desktop machine shoved onto a rack shelf.

This.

People think "professional" and act like it hides the fact that it's all the same people running stuff on a shoestring budget.

That $50/mo boxes at Hetzner tend to stay up for years is a testament to the fact that there isn't any black magic or huge cost involved in keeping them running.


I'm in a similar boat. I had a server in a closet but switched to a NAS a few years ago and haven't looked back. I do run into people who think a NAS is old hat, that everything should be backed up to a cloud somewhere. These are the people who think they are watching 4K because they select that option on YouTube. I want to watch 4K without buffering, without artifacts. A NAS lets me do that. And when I go away for a few days or weeks, everything is turned off. No data leaks possible when the "spinning rust" isn't spinning.


I also prefer 480p with a good screenplay to a 4K 60fps turd from the past 15 years. I still manage to keep a hundred movies or so below a TB.


Yes, so much this. And 4K, but at what bitrate? 1080p, but at what bitrate?


Same here. There is, at least for me, one big serious downside: what happens when I die.

My wife is quite concerned about being left with a web of home automation, hosted email, etc. I understand her and I am trying to find a way out.

My current idea is to document how to de-automate the home and how to deal with email and the fiber access (the main things to worry about).

Any ideas are very much welcome


Am at the same point. Have been thinking about a variety of options:

1. Bidirectional pacts with a self-hosting friend.

2. Creating a network and culture of self-hosting techies who develop the skills to assist (money and documentation required here, but more resilient than no. 1).

3. Investing in your kids.

Technically option 3 should be the best since it also engineers around generations (options 1 and 2 would roughly be locked to a narrower age group). But it can also be a double-edged sword: what if they don't like tech? Or what if you overdo it trying to make them interested in tech and self-hosting and it backfires?

So yeah, no real solution yet. But I’d subscribe to that newsletter if there was one


1 and 2 are interesting but I have some doubts about the feasibility in my region. It is already quite hard to find people interested in automation with Home Assistant, so having someone who would be willing to understand the setup and scripts of someone else would be really tricky.

3 could have been a solution, but my kids are not interested in development (if they were, it would have been my first solution).


Wow that's something I never really considered before. I think the documentation would be a good first start.

Maybe some kind of script(s) that could be run that just do all the de-automation?


My plan is to have technical documentation on how to tear down the "smart" installations and go back to "normal" stuff.


Yes but you have talent and a lifetime of experience, plus space for a noisy 42u rack full of servers, but not everybody does...


People who have been woodworking for decades can own very expensive tools so that they can create very complicated things.

People who are experts in cars can own very expensive cars and tools to tune them.

People who have been working in music can have very expensive instruments and expensive headphones, microphones, sequencers, etc.

We seem to be looking down on experienced "computer experts" and wanting to take their tools away. It's been grinding my gears lately.


I don't think that's the point the commenter was making. The analogous situation would be if someone posted that they made their kitchen table from scratch, and the commenter said that it's great but not everyone has a lathe in their basement, so good that places like Ikea exist as well.


Bruh, nobody's looking down on people running their own metal.

Server hardware is fun, but it's not trivial to manage, buy, or run.

So when someone talks about how they've managed servers for 2 decades, own a house where they can install a 42U rack, and how much better it is than a hosted solution, a lot of people rightly point out that this is hardly feasible for most people.


I started self-hosting when I was 12 with whatever last-gen hardware at Goodwill no one wanted.


People don't need a 42U rack to put a server in their closet though. You buy some surplus server on ebay and throw it on a shelf, no big deal.


It's really not that easy. There are a lot of considerations to make: which rack size and model you want, finding someone who will ship that brick to you. Not trivial.


Well it is, to get started. If your needs are "mild to moderate", you can buy a brand new Intel NUC or a Mac mini for <$500 if space is a priority, or go on eBay and buy a beater laptop (I found 5 advertised for <$100 within a minute of searching).


Yes it's trivial. Just go on ebay and buy something. Finding someone? Literally ebay.com. Found.

Also I said _not_ buy a rack, just put it on a shelf.


Got an old laptop laying around?


Not sure why you would need heaps of servers these days given everything is x86 and VM tech is robust and free


GP is upfront about the tradeoffs.

> You pretty much get all the pros and cons of home ownership.

And that's the heart of the matter. Everyone here is arguing in circles based on their feelings about cloud vs their feelings on bare metal (and from what I can tell, pride in their own abilities), but at the end of the day it's a cost-benefit tradeoff. Everyone picks that tradeoff for themselves. As my immigrant parents are getting older, I'm thinking of getting them a business internet line and a SIP phone in their house so that if they need me in an emergency (health or otherwise) they can reach me quickly/reliably. It's something I'm weighing based on the cost of the service/infrastructure/maintenance against my parents' technical and social capabilities (which are limited as older immigrants from a non-Western country.)

It's all about tradeoffs. As computing professionals we have the knowledge and the skills to uniquely take advantage of these tradeoffs in our personal lives. Much like I know quite a few cabinetmakers who do actually make furniture for their house with their skills, their tools, and the shop they work in. That doesn't stop most people from buying furniture from Ikea or most businesses from using cloud hosting.


You don't need a 42u rack. You can run a cluster of Raspberry Pi's hidden in your basement ceiling rafters like me.


Short throw from that to this classic:

> <erno> hm. I've lost a machine.. literally _lost_. it responds to ping, it works completely, I just can't figure out where in my apartment it is.

http://www.bash.org/?5273



Ha. Classic.


Racks are not that expensive, and are a very good way to keep things connected, powered, accessible and tidy.

Heaps of Pis in rafters will quickly turn into a cable spaghetti hell tied into ugly knots.


I agree, when you get to a larger scale. 3-5 RPi's won't make too much spaghetti.


Even for that I'd get a 8-12U box and connect everything using cable organizers and a patch panel.


What makes you think cable management for a bunch of Raspberry Pi's is any more difficult than cable management for a large rack?


Sounds like a new category of specialist rack is needed.

Introducing the... "RafterRack™" ;)


What if you live in a van, a studio, or an apartment? You can't just fit a rack anywhere.


They never suggested otherwise. They simply shared their current setup.


And a tolerant wife/spouse.

My closet of RPi's are quiet.


yes


I can do the same thing with the 15x8x20cm Synology sitting above my bathroom (don't ask, that's where the internet enters the house :P).

Your main point still stands though, not everyone can or wants to do that.


I have done pretty much the same as you, except that instead of an on-premises 42U rack of stuff, a lot of my personal and family infrastructure lives in a single beefy 1U server (32 cores, 512GB of RAM, set up as a hypervisor) in colocation at a major IX point.

Then there's a set of about six or seven $4-per-month KVM VMs at geographically distributed offsite 3rd-party hosting companies for things like secondary authoritative nameservers and such.

And an offsite backup system that mirrors everything on the system in colocation to a medium sized disk array that lives in the corner of a closet in a family member's basement.


Awesome! Kudos for owning your own data and infra!

I don't think systems administration is a lost art; people just wanted to do things more easily with less money and staff, and no one can blame them or the industry as a whole. The art is still right there; Linux hasn't changed that much in 20+ years, and I still have my first discs. It also takes some experience to "feel" your way through systems administration problems, working from the bottom up of course; feeling like it's a network problem still has a high rate of success when troubleshooting for me!

I try everyday to teach some of these skills to people I work with, I just call it devops, or SRE, or some cyber cloud support position someone makes up, they are all systems admins/engineers to me still. I enjoy watching people learn and apply that knowledge to future problems, getting to the root cause or close to it, and the satisfaction that comes from fixing the issue from start to finish on your own, looking things up is not cheating in systems administration!


Reading this as my home WireGuard host has just gone mysteriously unresponsive for the second time in a month[0], rendering all my home-hosted services inaccessible - you're right about the pros _and_ the cons. I'm not shying away from this self-hosted adventure, but I'm also realistic about the downsides.

[0] https://raspberrypi.stackexchange.com/questions/135610/conne...


The TL;DR of [0], for visibility:

  [  150.076220] WARNING: CPU: 0 PID: 0 at net/sched/sch_generic.c:468 dev_watchdog+0x308/0x30c
  [  150.076255] NETDEV WATCHDOG: eth0 (bcmgenet): transmit queue 1 timed out
Essentially Ethernet chucks a wobbly until reboot and it's only possible to connect via Wi-Fi.


I assume/hope you have good, tested, off-site backups of the important data & config…

I run my stuff from home too, though it is smaller scale than yours currently. Off-site & soft-offline backups are on encrypted volumes on servers & VMs elsewhere.


The backup server in my home is the offsite backup. The primary data lives in the datacenter.


Until recently my home server was not comparable in many respects to the outside machines (it was picked for cheap storage, not CPU/RAM considerations, going with the "redundant array of inexpensive backup locations" method), but I do currently have a viable dedicated server that might see some live workloads moved to it.


This sounds very interesting. Do you have a blog describing your experience? Just curious. Thanks for sharing~~


No but I founded and help run https://hashbang.sh where we give out free shell services and mentor people in security, privacy, and digital sovereignty.


Curious that the source mentions the MIT license at the top, but at the bottom it says "2. Don't use our resources for closed source projects";

was that intentional? My current understanding of the MIT license is "do anything, I don't care, just don't blame me".

If I were you and wanted to keep the "don't use our resources" part, I would set it as AGPL or even SSPL if you are brave enough.


There's a difference between what you are allowed to do with the code, which is what the MIT license covers, and what you are allowed to do with a specific hosted platform on which the code is running, which is what that term is part of.


Oh. Do... you need to specify another license for that? This is interesting.


It's not a license, it's rule #2, where "This _network_ has three rules: [...]"


You have a 42U rack to accommodate your family's home-computing needs? How large is your family?


The lower half of it houses a kitchenette.


You can also pick and choose how far you go with self-management. To me, as much fun as it is, I can't be bothered to run a physical server in my home (not because I can't, but because I won't). That's why my solution is to lease a bare metal server at Hetzner, pay EUR 38/month, and host all my personal projects on it. This way I get a reliable network and operation for my server and I still hold reasonable control over my data and what I run.


I self-host my stuff as well but with a 2012 Mac Mini running Proxmox. I use it as a means to learn sysadmin things.

I get my static IP by using the smallest VPS from Vultr with a WireGuard tunnel forwarding http/s traffic to a Docker container running Nginx Proxy Manager.

For those wanting to learn, I highly recommend joining r/homelab and r/selfhosted. Those communities have a lot in common and you can learn a lot.


Why not simply use DynDNS pointing to your local home server?

Most modern routers have an option to integrate with dynamic DNS providers.


I'm where I am today because of my love for Linux. In the 90s my go-to distribution was Slackware. I had it spread across several dozen floppy disks.

Knowing how to keep your server running and understanding internals is a great skill, but that doesn't mean that progress should stop.

Standalone servers are great, but this greatness comes at a price. It takes time to maintain a server; it takes time to configure additional services. But at the same time they bring you joy (and frustrations) and much more knowledge and a deeper understanding of what goes on under the hood.


Woah, 42u, that's quite something! How many of those slots are you using? What kind of cooling do you have to have for that closet, and how is the volume for the rest of the house?


Not OP but you find a lot of these kinds of setups on Reddit. Check out /r/homelab if you’re interested. Even crazier is /r/homedatacenter where you’ll find dudes with full tape libraries and everything. Super interesting to browse through.


Thanks for sharing this! I tend to forget but there are people who love servers.


FWIW, I recently purchased maybe 21U worth of servers for less than $2000. Mostly IBM 2U servers (all dual 6-core/12-thread Xeons, most for spare parts), NAS (enough for triple >12TB redundancy), and LTO-4 loaders and drives, to go in a rack I picked up for free locally which also came with Cisco switches.

I'm gonna run my private cloud merging 3 different un-backed-up physical computers, and migrate services off Google.

That's my second free 42U rack, but the other was mostly used as shelf space. I've also got a third rack rusting in my backyard, which I bought locally for $200, originally intended to run my former employer's test infra, and brought back home after they laid us off.


All of them, but to be fair the bottom half of the rack is mostly multimedia gear that serves the whole house and a 6U gaming PC.

I have venting into my ceiling which connects outside, but I am mostly using consumer gear like Raspberry Pis and Intel NUCs, and the rack fully fills the doorway, so volume and heat are not a real issue.


42U rack?? What do you have in it? I have a 2U self-built SAN and a 2U Dell server. The SAN has 24TB on it and can scale up to 96TB. At most I'd need a backup of that and a replica of the other server for the most ideal purposes at home (which are obscene for home use), so seriously, what could you possibly need beyond a good virtualization server and lots of storage?


I'm curious how much you're paying for business fiber. Do you have IP transit in your home, or is it just a business connection?


CenturyLink gigabit business fiber is about $140/mo, and a block of 8 static IPv4s is $25/mo.


100Mb FiOS Business is on the order of $70.


$300/mo for 300/300 with 5 clean static IPs suitable for nameservers, mail servers, etc.


You can do an amazing amount of work on a raspberry pi these days. I have a couple of R620s that were running several services. I lost a drive and one went down so I stood up a Pi4 (8GB) to quickly get things back up, and I was amazed that it handled a huge piece of the load, much more than I expected. Definitely worth a try for somebody just starting out.


> I know where our data lives and no one can access it without a warrant and my explicit knowledge

As far as you know?

Your data is exposed to The Internet so someone could be accessing it.


You forget the wonders of TLS when you control both ends.


I'm not talking about people snooping your data, I'm talking about a configuration error or exploit which gives people access to your information.

Maybe I am naive, but I think the chances of this happening on a cloud based service like Dropbox are going to be a lot lower.


It used to be a running joke that another large commercial database had just been found unsecured on AWS! It hasn't been as common in the last year or so, which is a good sign!


Oh, here we go! Just a week ago, thousands of unsecured databases on AWS!

https://infosecwriteups.com/how-i-discovered-thousands-of-op...


> when it comes to me and my family I

Who will keep maintaining your infra when you die, say you get hit by the infamous bus later today?


I can't answer for OP. But I don't think that the situation would be any better if the services are hosted in the cloud. Stuff needs to be maintained either way.


Not full services like Gmail, pCloud, etc... Non-tech people can keep using those services and can access tech support even if OP is no longer available.


42U for a home rack seems excessive (unless you run all 3U-4U servers).


Just curious, what data do you worry about a warrant for? Asking to protect my own data.


With a lawful warrant they can look at whatever they wish, with my explicit consent and knowledge, to make sure they do not exceed the law, because I have heard enough first- and second-hand horror stories to know how abusive investigators can be, particularly from ex-investigator friends. I work with a lot of people in the security industry, particularly those in countries or organizations that might not take kindly to those activities. I have a similar responsibility as a journalist to protect my sources, vulnerabilities still under embargo, etc.

I however mostly detest how almost every third party SaaS sells out our metrics and data for profit, long term consequences be damned. The less I empower profit maximizing machine learning to manipulate me and my family the better.


42u? That's quite a family!


Bonus: the servers keep the house warmer in the winter :D


What do you do for off-site backups?


I threw a NAS in a friend's house that is configured to boot up and reverse-SSH-tunnel to my rack, where it then accepts scheduled encrypted duplicity backups.

I do not even need to trust my friend, as duplicity encrypts all data against a YubiKey-held PGP keychain before it leaves.

The backup NAS could phone home from anywhere with internet access.
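
Roughly what that looks like in practice, for anyone copying the idea (the key ID, paths, and tunnel port here are made up):

  # on the friend's NAS: keep a reverse SSH tunnel open back to the rack
  autossh -M 0 -N -R 2222:localhost:22 tunnel@rack.example.com

  # on the rack: push an encrypted backup to the NAS through that tunnel
  duplicity --encrypt-key 0xDEADBEEF /srv/data \
      sftp://backup@localhost:2222//volume1/backups/data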


This is very slick. Apparently, I need to make some techie friends. Tarsnap is pretty affordable for off-site backups though!


The friend doesn't need to be techy, just have a house.


Well, a house and some kind of NAS or at least a computer that is always on. But I take your point. This idea is brilliant though!!


I expect one buys the NAS themselves and pays for the electricity.


What services would you be paying “several hundred a month” for - in what currency? In US dollars my personal cloud services don’t run more than $20 a month.


A very weird thread that degenerated into: "PaaS vs self-hosted/self-owned hardware".

I'm pretty sure most people sysadmin'ing their Linux servers are actually doing it with rented dedicated servers. TFA btw specifically mentions: "don't manage physical hardware". Big companies like Hetzner and OVH have hundreds of thousands of servers and they're not the only players in that space.

They don't take care of "everything" but they take care of hardware failure, redundant power sources, Internet connectivity, etc.

Just to give an idea: 200 EUR/month gets you a 3rd-gen EPYC (Milan) with shitloads of cores, shitloads of ECC RAM, and fat bandwidth.

And even then, it's not "dedicated server vs the cloud": you can very well have a dedicated server and slap a CDN like Cloudflare on your webapp. It's not as if Cloudflare was somehow only available to people using an "entire cloud stack" (whatever that means). It's the same for cloud storage / cloud backups, etc.

I guess my point is: being a sysadmin for your own server(s) doesn't imply owning your own hardware and it doesn't imply either "using zero cloud services".


I generally agree. I have a cheap AWS Lightsail VPS (mainly for email hosting, since my ISP blocks port 25 because I'm a "consumer" and they want to protect the world from consumer spammers), but also for flexibility. I like that the Internet is not at my doorstep (no open ports at home). So: cheap VPS, WireGuard, and my home machines to serve whatever I want. I don't pay extra if I use a ton of CPU or disk IO, for example.

Here is my WireGuard server (cheap VPS) and client (my home servers) config:

  #
  # Client (the actual self-hosted local server)
  #
  [Interface]
  ## This desktop/client's private key ##
  PrivateKey = redacted
  ## Client IP address ##
  Address = 10.10.123.2/24

  [Peer]
  ## Ubuntu 20.04 server public key ##
  PublicKey = redacted
  ## Set ACL ##
  AllowedIPs = 0.0.0.0/0
  ## Your Ubuntu 20.04 LTS server's public IPv4/IPv6 address and port ##
  Endpoint = redacted:12345
  ## Keep connection alive ##
  PersistentKeepalive = 15

  #
  # Server (in the WireGuard context, exposed to the Internet)
  #
  [Interface]
  ## My VPN server private IP address ##
  Address = 10.10.123.1/24
  ## My VPN server port ##
  ListenPort = 12345
  ## VPN server's private key, i.e. /etc/wireguard/privatekey ##
  PrivateKey = redacted
  PostUp = iptables -i eth0 -t nat -A PREROUTING -p tcp --dport 80 -j DNAT --to-destination 10.10.123.2   # Add lines for more ports if desired
  PostDown = iptables -i eth0 -t nat -D PREROUTING -p tcp --dport 80 -j DNAT --to-destination 10.10.123.2

  [Peer]
  ## Desktop/client VPN public key ##
  PublicKey = redacted
  ## Client VPN IP address (note the /32 subnet) ##
  AllowedIPs = 10.10.123.2/32
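
To bring the tunnel up (assuming each side's config is saved as /etc/wireguard/wg0.conf), roughly:

  # on both ends
  sudo wg-quick up wg0
  sudo systemctl enable wg-quick@wg0      # bring the tunnel up at boot

  # on the VPS only: IP forwarding must be enabled for the DNAT rules to pass traffic on
  echo 'net.ipv4.ip_forward = 1' | sudo tee /etc/sysctl.d/99-wireguard.conf
  sudo sysctl --system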


This is exactly the solution I landed on as well. It has served me well for a few years now :).


> it doesn't imply either "using zero cloud services"

Enter big-three cloud egress pricing. Designed to make sure that you have to go all-in.


And a good reason to stay out.


How good is their 50 euro package? I wanna try it just for fun and dicking around.


Hetzner is pretty good in general. $70/month gets you a 16 vcpu VM with 2TB free egress/month. Their additional egress pricing is peanuts compared to Google. Google, AWS etc egress is so overpriced - as if we are still living in mid 2000s.


20 TB free egress, not 2TB, and additional TB is 1€/TB


Sounds amazing, thank you guys!


Yep. You can even set up K8s clusters. There is a project called 'kube-hetzner' on GitHub with which you can easily set up a cluster. There are even easier tutorials:

https://community.hetzner.com/tutorials/k3s-glusterfs-loadba...


for fun try https://www.hetzner.com/sb

Their prices have spiked because of electricity costs; the entry level used to be around 20 euros. They come with 1Gbit.


When my SaaS app started scaling, I saw how badly cloud can be priced if you have even slightly unusual use-cases. It occurred to me that instead of spending ~$600/mo on GCP, I can invest in a $3000 PowerEdge server with much better hardware, run it out of my home office, and it pays for itself in less than a year.

Running your own server is an investment that doesn't make sense for everyone. If you can pull it off, it is better than you might imagine. Being in full control--the master of your own destiny--is so liberating and empowering. It feels like the difference between constantly ordering Lyft/Uber/riding with friends vs. owning your own car.

Not to mention, again, my hardware resources are so much better. This one server can run multiple profitable SaaS apps / businesses and still have room for experimental projects and market tests. Couldn't be happier with my decision to get off the cloud.


> I can invest in a $3000 PowerEdge server with much better hardware

And when some component of the server fails, your app is unavailable until you can repair it. So you need another server for redundancy. And a load balancer. And a UPS. And a second internet connection.

If your app is at all critical, you need to replicate all of this at a disaster recovery site. And buy/run/administer DR software.

And hardware has a limited lifespan, so the $3000 was never a one-time investment.

I think there is often still a case to be made for self-hosting but the numbers are not as rosy as they seem at first glance.


So, just buy another and leave it as a hot (or cold) standby in a different data-center. Or use AWS as the DR site and spin it up only if the local HW fails.

This sounds expensive if you're talking one server vs. a year of AWS charges, but it is a tiny bump if it turns out you need to buy a dozen servers to replace a large AWS bill.

Plus, I think most people underestimate how reliable server grade hardware is. Most of it gets retired because it's functionally obsolete, not because a power supply/whatever fails. Which brings up the point that the vast majority of failures with server grade hardware are on replaceable components like power supplies, disks, SFPs, etc. Three or four years out, those parts are available on the secondary markets, frequently for pocket change.


> Or use AWS as the DR site an spin it up only if the local HW fails.

Yep. This seems like the obvious setup to me:

1) make the usual case as economical as possible (and ownership and the associated control will probably help here, unless you have to lease the expertise too)

2) outsource the exceptional case (ownership is less likely to matter here, and will matter for less time even if it does)


This is exactly what many of the really big enterprises do. At their scale the cloud becomes laughable. But as a backup, it's the clear way to go. Primary on prem with automatic failover to the cloud.

I worked with a large customer to help build this for them. OpenShift running on-prem, but they had some failover equipment that would Ansible them an OpenShift cluster on AWS. Depending on the nature of the failure it did take a little time to fail over, but 15 to 30 mins of downtime in the event of a catastrophic failure is often worth it to save hundreds of millions per year.


Yeah. We run servers into the ground where I work. We have around 20 of them. Average age is around 11 years old. Oldest is around 18.


Holy crap. I work in a relatively young sector and we retire them no later than 5 years. Do you mind sharing the type of business you work for?


It's a smallish, family-run company. We don't have any particular sector, and historically have done both business and consumer software. Currently we're primarily working on a game - where many of us definitely feel like a fish out of water - but we have a few old websites and apps that pay the bills.

Everything has been bootstrapped going back to around 1999.


> most people underestimate how reliable server grade hardware is

This. And there are millions of dollars of cloud marketing materials and programs that are at least partly to blame.


The original intended use for AWS was to be additional capacity for existing non cloud setups during spikes. Things like investment firms and algorithmic trading setups that cost money during downtime.

They found a larger market in being the full infrastructure, so they started down the road to cloud appliances making it easy for any average Joe to spin up infra.

This was in the EC2 classic era, before VPC was a thing. So your active/passive setup description is actually using AWS appropriately based on its original ideas.


You need to replicate in Cloud too, most people tend not to because they think the cloud is magic, but it's computers and computers can fail- even if they're someone else's.

Also "if some component fails or the app is critical" has a lot of nuance, I agree with your sentiment but you should know:

1) Component failures in hardware are much rarer than you think

2) Component failures in hardware can be mitigated (dead RAM, dead PSU, dead hard disk, even dead CPUs in some cases: all mitigated). The only true failure of a machine is an unmitigated failure due to not configuring memory mirroring or something, or a motherboard failure (which is extremely uncommon).

3) The next step after "single server" isn't "build a datacenter", it's buying a couple more servers and renting half a rack from your local datacenter, they'll have redundant power, redundant cooling and redundant networking. They'll even help you get set up if it's 2-3 machines with their own hardware techs.

I do this last one at a larger scale at Bahnhof.

Also, $3000 will get you about 3-5 years out of hardware, at which point, yeah, you should think about upgrading, if for no other reason than it's going to be slower.


I don't know what they call that logical fallacy cloud fanatics use when they say "if blah blah just build your own datacenter".


> And when some component of the server fails, your app is unavailable until you can repair it.

So you have some downtime. Big deal. If this happens once every few years and you need a day to repair it, your uptime is still better than AWS.

Not everyone hosts a realtime API that millions of users depend on every second of the day.


I am not the guy you replied to, but I also self host my web apps. I think every project is different and not all projects demand near 100% uptime. I certainly strive for HA for my projects but at the appropriate budget and my users understand.

If you are trying to go commercial you might have a different attitude but for those of us who do this mostly for fun and for some donations on the side, over complicating our setups to ensure we add a 10th of a percent to our uptime stats just isn't worth it.


This is an important point. My customers don't love outages (who does?) but I've had them and it doesn't really hurt that badly. My products aren't that critical. They're understanding as long as you communicate.


Plus they still happen on AWS (or other critical bits like GitHub) so you’re not immune anyway


github took down our eng dept probably half a dozen times last year. I don't even see how they hit 3 nines last year. It was insane. Even before that we had to switch off Azure. That was a sad joke of a service. They must be running their entire ops out of a log cabin connected with the bare minimum of cabling and bandwidth necessary.

That's also the thing the serverless/API-all-the-things/cloud promoters don't get. This interconnected web of services is incredibly fragile. It's to the point that every day is a new failure. One day github is down. Next day your CI process breaks because Docker shits the bed. Next day your E2E service is hosed. Following day you hit an API limit and need to go dump more money into the firepit. Everything is broken all the time.


>And when some component of the server fails, your app is unavailable until you can repair it. So you need another server for redundancy. And a load balancer. And a UPS. And a second internet connection.

Most applications don't actually have a four-9s uptime requirement. Like, how many otherwise healthy businesses closed up shop during the cloud provider outages we've seen in the last year because they didn't have their stuff implemented and deployed such that it would remain fully functional when those issues happened?


There's a big difference in service criticality between your business website and your NAS full of pirated tentacle hentai. Cases like the latter can accept extended outages, and are very cost-effectively served by home-based infra.


> it pays for itself in less than a year.

https://news.ycombinator.com/item?id=13198157

On one meeting we had a typical discussion with ops guys:

- "why wouldn't we optimise our hardware utilisation by doing things a, b, and c."

- "hardware is crap cheap these days. If you need more capacity, just throw more servers at that"

- "is $24k a month in new servers crap cheap by your measure?"

- "compared to how much money these servers will make in the same month, it is crap cheap. It is just a little less than the annual cost of a mid-tier software dev in our Russian office. We attribute only a 12% increase in our revenue to algorithmic improvements and almost 80% to the extra traffic we handle. A new server pays for itself the same month, while you and the other devs pay off only after 2 years"


I’ve found this to be an unsuccessful approach in practice.

Performance is a complex, many-faceted thing. It has hidden costs that are hard to quantify.

Customers leave in disgust because the site is slow.

No amount of “throwing more cores at it” will help if there’s a single threaded bottleneck somewhere.

Superlinear algorithms will get progressively worse, easily outpacing processor speed improvements. Notably this is a recent thing — single-threaded throughput was improving exponentially for decades, so many admins internalised the concept that simply moving an app with a "merely quadratic" scaling problem to new hardware will always fix the problem. Now… this does nothing.

I’ve turned up at many sites as a consultant at eyewatering daily rates to fix slow apps. Invariably they were missing trivial things like database indexes or caching. Not Redis or anything fancy like that! Just cache control headers on static content.

Invariably, doing the right thing from the beginning would have been cheaper.

Listen to Casey explain it: https://youtu.be/pgoetgxecw8

You need to have efficiency in your heart and soul or you can’t honestly call yourself an engineer.

Learn your craft properly so you can do more with less — including less developer time!


Does it have a backup schedule (and did you prove your restore process works)? Is it replicated to another physically-offsite location? Do you have to manage your own security keys? Load balancing? Multi region availability? How do you admin it remotely? Firewalled? Notifications of emergency situations like low disk space, outages, over-utilization of bandwidth, memory leakage, SMART warnings, etc.? What's your version upgrade strategy? What's your OS upgrade strategy? Failover? IPv6? VPN access? DMZ?

Basically, I think cloud provides a loooooot of details that you have to now take on yourself if you self-host (at least if you want to do it "legitimately and professionally" as a reliable service). It's not clearly a win-win.

That all said, I recently canceled my cloud9 dev account at amazon because the resources I needed were getting too expensive, and am self-hosting my new dev env in a VM and accessing it from anywhere via Tailscale, so that's been nice.


> Does it have a backup schedule (and did you prove your restore process works)? Is it replicated to another physically-offsite location? Do you have to manage your own security keys? Load balancing? Multi region availability? How do you admin it remotely? Firewalled? Notifications of emergency situations like low disk space, downages, over-utilization of bandwidth, memory leakage, SMART warnings, etc.? What's your version upgrade strategy? What's your OS upgrade strategy? Failover? IPv6? VPN access? DMZ?

So yes, for those of us who have done systems administration as a lifestyle/career, yeah, you do all of those things and it's part of the fun. I started doing OS upgrades, monitoring, firewalls, and home backups of my own Linux servers some time in high school. Over-utilization of bandwidth isn't really a "problem" unless you're doing something weird like streaming video; a 1Gbps circuit can support thousands upon thousands of requests per second.


I'm just curious, because to me it seems a little bit unrealistic.

How do you handle traffic spikes, especially from the networking point of view? What kind of connection do you have? How do you make your service fast for all customers around the world (assuming you have a successful SaaS)? How do you prevent a local blackout from taking down your service? Where do you store your backups, in case your building gets flooded or your machine blows up? What would you do in case a malicious process takes over the machine? These are some things that are managed in a cloud environment.

I understand investing in a datacenter rack where you own your hardware, if you have the skills, but running it in a home office cannot support a successful business nowadays IMO.


I've been doing my own hosting for 20 years, this just doesn't happen often enough to concern myself with it.

You need to disassociate yourself from the start-up mindset when you DIY a side project app or site. Having said that, there are ways to cache and improve your write performance and maintain HA on a budget. The only thing that's hard to replicate in self-hosting is a high performance global presence.


CenturyLink provides me gigabit internet on a business connection. I get traffic spikes of ~100 rps and it's no problem. Could probably easily handle another order of magnitude or two. Local blackouts are mitigated with a UPS https://en.wikipedia.org/wiki/Uninterruptible_power_supply

To be fair, I'm not 100% off the cloud. Backups are on an hourly snapshot thru Restic https://restic.net/ and stored in Google Cloud Storage off-prem in case of catastrophes. Also, my Postgres database is hosted in Cloud SQL because frankly I'm not feeling experienced enough to try hosting a database myself right now.

It's really not as unrealistic as most people seem to think. People have been building online businesses for years without the cloud. Believing it's suddenly not possible is just their marketing going to work for them making them new customers imo.
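
For anyone curious, the Restic side of that is pleasantly small. A rough sketch; the bucket name, paths, credential file and retention numbers are placeholders, not my exact setup:

  # One-time setup: point restic at a GCS bucket
  export GOOGLE_PROJECT_ID=my-project
  export GOOGLE_APPLICATION_CREDENTIALS=/root/gcs-key.json
  export RESTIC_PASSWORD_FILE=/root/.restic-pass   # repository encryption password
  restic -r gs:my-backup-bucket:/ init

  # Hourly cron: snapshot, then trim old snapshots
  restic -r gs:my-backup-bucket:/ backup /srv/data /etc
  restic -r gs:my-backup-bucket:/ forget --prune --keep-hourly 24 --keep-daily 14 --keep-weekly 8

  # Periodically prove the restore path actually works
  restic -r gs:my-backup-bucket:/ restore latest --target /tmp/restore-test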


> How do you handle traffic spikes, especially from the networking point of view?

I don't know about GP but managing your own server doesn't mean you cannot use a CDN with your webapp.


A CDN wouldn't be enough if the traffic spike involves writes.


Oh no a website went down! Oh, wait, that's not an emergency. Where did the idea that every site and service needs five nines availability come from? A side project goes down or is read only for a few hours. Who gives a shit? It's not an insulin pump or nuclear control rods. It's ok for people to be mildly inconvenienced for a short period of time.


"Always-on culture"

I guess people who use Twitter don't like seeing bad things written about them on Twitter. Though the solution is to disown Twitter and all who use it.


In my first job as a system tech for an IT company in 2014 we had a backup process run at 17:30, and whichever admin left last would take the backup HDD home with them, lol. It worked! There was also onsite redundancy with replicated Windows servers in an office across the street, which was enough. Simpler times, even just 8 years ago!


Which is ok if there isn’t a local disaster which wipes out your office and your friends?


If there's a local disaster that wipes out your office and your friends and all you can think about is data then you should ask a professional to screen you for sociopathy and whether you have any treatable mental illness.

No shame in it, I myself have one, but you shouldn't be concerned with this scenario even if it happens.

Life is too short to waste on ruining your life over planning on how to ruin your life even further after a rare life ruining event has occurred. I doubt you'd be working, or having any real coherent thoughts, for many years in that scenario.


> If there's a local disaster that wipes out your office and your friends and all you can think about is data then you should ask a professional to screen you for sociopathy and whether you have any treatable mental illness.

So you'd lose your friends and potentially your livelihood and any ability to financially support yourself, or any surviving friends or family, just because you don't want to use cloud backups or a professional backup service with the correct contingencies in place?

One could easily imagine a less dramatic scenario where your office is close to your home: you go home with your backups in your backpack; however, a flood hits your town and you accidentally leave your backpack behind at evacuation time. Your office and home are flooded, and your backups and actual data set are destroyed. No one dies, but you've lost your company and not just physical assets. Seems a little unnecessary?

I actually have lived in towns where floods, lightning storms and tornadoes have caused situations where this scenario could've easily played out.

I also feel the need to point out there's no need to suggest people need a psychiatrist when planning for a worst-case scenario; some people are just prepared for the worst, whether it's just how they're born or they're paid/trained to do so.


Born, raised, and never moved out of tornado alley. I work for a major university; our university-wide "off-site" backups are a datacenter across the street, except for critical student information that is encrypted and backed up to the cloud. We understand fatalism, with tornado sirens at 10am on the first Tuesday of every month.

What I'm against isn't the cloud existing, it's your 24x7 uptime guarantee during the equivalent of Hurricane Katrina at the cost of over engineering.

If you're an engineer in that scenario, you're either running, helping clear debris, or giving first aid/supplies. Your computer is nonsense at that point and yes it's a sign of a problem if you are clinging onto your devices in such a disaster scenario.

EDIT: Just re-read, I might have gotten you confused with someone on a different thread entirely about Kubernetes... Maybe I typed on the wrong tab, oops. Your comment is about data not uptime.


How many times have you known this to happen?

And how are you currently preparing your company against a solar flare?


I've definitely worked for companies where off-site backups are protected from the impacts of solar flares.

> How many times have you known this to happen?

That's irrelevant: if it can happen, it will happen, and it's too late once you've lost your data. Maybe for some businesses, data isn't important, but I don't know of many?


Do you have a static IP?

I have a homelab too but getting “enterprise grade” service from comcast seems to be my biggest barrier to scaling without leaning on aws.


Hey, you actually have a few options, notably, doing nothing!

Comcast doesn't actually change your public IP address between DHCP renewals and thus it's effectively static. The only time that it'll change is when the modem is powered off for an amount of time, or the upstream DOCSIS concentrator is powered off for maintenance or otherwise.


So: arbitrarily and without warning.


Sure; however, I've had the same IP on Comcast residential for a little over a decade. There are many endeavors that could tolerate that low a frequency of IP changes.


I would be more worried about violating the ISP's terms of service. Running a business based on that seems pretty precarious.


What’s the worst that can realistically happen?


If you don't have another provider option, you stand to lose Internet as a utility for your residence, which many would consider a pretty poor situation.


Has that ever happened in practice? I find it very hard to believe that a company normally so desperate for new business as to spam people with sales calls and lock people into contracts (to make sure they can't leave once the service ends up being sub-par) would suddenly start saying no to people's money and cutting them off based on a technicality.


I wrote a script that basically hits an offsite DNS with my latest IP. It's worked quite well. I think in the past 4 years I have had Comcast, they haven't changed my IP once.


Dynamic DNS (dDNS) works here[0]. You have free services like no-ip, and most paid domain registrars also support this. I know both Namecheap and AWS Route 53 support it if you want it on your own domain. Essentially, it's a cron job curling an update endpoint with an API key from the host machine; that's it. Works great in my experience.
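
A minimal version of that cron job might look like this (the update URL and token are placeholders for whatever your registrar or dDNS provider gives you):

  #!/bin/sh
  # ddns-update.sh -- run from cron, e.g.: */5 * * * * /usr/local/bin/ddns-update.sh
  IP=$(curl -s https://api.ipify.org)   # current public IPv4
  # Only hit the provider when the address actually changed
  if [ "$IP" != "$(cat /var/tmp/last_ip 2>/dev/null)" ]; then
      curl -s "https://dyndns.example.com/update?hostname=home.example.com&token=REDACTED&ip=${IP}"
      echo "$IP" > /var/tmp/last_ip
  fi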


Keep in mind you will have a small downtime until the new IP is registered. Also the cache TTL of your domain will be very low, so your site will have a small loading time penalty from time to time.


I use this. It's fabulous.


Rent a $5 VPS, run a VPN tunnel from your lab to that box, and run a reverse proxy on it. You'll get some additional latency, but that's about it.


Caution: you may end up with your packets blackholing on the way for unknown reasons after temporary loss of connectivity.

I think it might have something to do with the NAT forgetting about my UDP "connection", but haven't found the culprit yet.


Could be because you’re using a terrible ISP-provided router. Switch that to bridge mode and connect the server directly (if necessary, use the server as a router with a second network card for your LAN) or use a proper, enterprise-grade router (OpenWRT would work as well as long as the underlying hardware is performant enough).


> Switch that to bridge mode

Sadly, not possible on a shared residential connection. I tried the setup where the server was also the router, and that was much less convenient in case of failures than having a dedicated router.

But yes, the abundance of routers is definitely a problem here, with an OpenWRT and a FritzBox in the mix. Perhaps I should try to get IPv6 running and forget about NAT woes.


For a while with my home lab I cronned a Python script to look up my external IP and update my domain in Route53 (and Dyn before that). Worked out pretty well; I think it only updated once or twice in two years.


I have a static IP address / gigabit thru centurylink


When I read small/medium business administrators' horror stories on reddit and other forums, I think IaaS is a godsend.

Especially when someone else is making the money decisions. Of course, in the horror stories admins have to run their servers in broom closets or in sheds because the business owner is too cheap to get a proper space for something the whole company runs on.


Personally I wouldn't want to become an auto mechanic just to drive my own car for my business, but you do you. (Makes sense if you have a fleet of vehicles, but for one car?)


Why can't people just pick the right tool for the job? The truth behind these managed services is that, for the correct usecases, they are VERY cheap. And for the wrong usecases, they are RIDICULOUSLY expensive.

Most businesses have nightly cronjobs generating some kind of report that is then emailed to stakeholders. Why on Earth would you run a dedicated Linux box for that anymore? Glue a nightly trigger to AWS Lambda, send the report via AWS SES, and it's free. Literally, because it fits quite easily within the free plan. No $5/month VPS box, no patching, no firewalling, no phone calls from execs at 6 AM wondering why their report isn't in their inbox and you track it down to the server being down when the cronjob was supposed to fire.
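
The glue really is small. A rough sketch with the AWS CLI; the function name, account ID, region and schedule are placeholders, and the function itself would call SES to send the report:

  # Nightly EventBridge rule (06:00 UTC)
  aws events put-rule --name nightly-report \
      --schedule-expression "cron(0 6 * * ? *)"

  # Allow EventBridge to invoke the Lambda function
  aws lambda add-permission --function-name nightly-report-fn \
      --statement-id nightly-report-rule --action lambda:InvokeFunction \
      --principal events.amazonaws.com \
      --source-arn arn:aws:events:us-east-1:123456789012:rule/nightly-report

  # Point the rule at the function
  aws events put-targets --rule nightly-report \
      --targets 'Id=1,Arn=arn:aws:lambda:us-east-1:123456789012:function:nightly-report-fn'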

With that said, if you come to me and tell me what you want to add a new business feature to stream video for our customers off AWS, I'll first ask you why didn't you tell me you won the lottery, then I'll berate you for being stupid enough to spend your lottery winnings on the AWS bill.

Pick the right tool for the job.


This is the real truth. People complain about certain services and their costs (Lambda being one I've heard), but I have a full side project that runs on Lambda with extremely bursty traffic and it couldn't be more perfect. If I had sustained activity I might look into something else, but it really does come down to picking the right tool for the job.


> Glue a nightly trigger to AWS Lambda

You are forgetting the future cost of losing knowledge and control over your infrastructure.


agreed.


> Pick the right tool for the job.

This is such a tired expression. It basically means nothing in the industry, and exactly because of comments like yours.

Exactly who are you to say what my infrastructure desires are? Software is personal and people ignore this completely.


Some people just don't know any better, or have starlight in their eyes, and they manage to convince the business they need it. No brakes, no caution, no reflection on whether they really need it. They just want it because it is fashionable. Then when they eventually realize they are deep in a hole, or that it was the incorrect solution to their problem, it becomes everyone else's fault (AWS's fault, the cloud computing industry, blah blah blah) and they write blog posts about how terrible everything is, all while making businesses and other people trust us less and less each year. But that one clown not "using the right tool for the job" will never admit it, or they disappear, or move on to the next fad (he's probably a crypto bro now after implementing a piss-poor chatbot somewhere).

Measure twice, cut once. Fail fast is a load of nonsense and burns a lot of money for no good reason.


Time is money, and the more time I spend on infrastructure, the less time I spend on product. And thus is born the incredible demand for infrastructure as a service.

Thankfully one person's cloud is another person's on prem infrastructure so sysadmin skills will always be in demand.

From my perspective in enterprise computing, I now see people taking 2 paths. One where they become super deep sysadmins and work on infra teams supporting large scale deployments (cloud or not) and the other being folks who write code and think of infra as an abstraction upon which they can request services for their code.

Both are noble paths and I just hope folks find the path which brings them the most joy.


I find it very similar to how an understanding of OS and hardware fundamentals can make one a much better software engineer: knowing how infrastructure in the cloud or on servers is set up helps you make better design decisions.

At least in my experience, my hobby of maintaining my own home server helped out immensely in my path in the industry due to knowing what tools are available when working on multi-faceted software designs.


It does, but you don't wanna have to deal with it constantly, if you want to be working on a lot of feature work as an application developer.


Definitely agree with you on that. Making use of layers of abstraction and delegation is absolutely necessary when working on more and more impactful work.


This is a straw-man argument. Of course if the cloud could save me time and energy, I would use it, but it doesn't. In my experience, you spend just as much time in the long run tweaking/configuring the AWS console as you do simply running bash scripts on a bare metal server. That's why "AWS consultant" roles exist as a full-time job. The cloud does NOT save time for me, and it's far worse in many ways: opaque, more expensive, and can lull you into a false sense of security (what do you do, for example, if your cloud provider's backups just fail one day?).


If you look at it from a business perspective, an AWS consultant is as opaque as any sysadmin with bare metal servers.

You don't even know if the sysadmin is doing any backups at all.

From the perspective of a person who doesn't know anything about administering systems, tweaking stuff in AWS is, I would say, a lot easier than setting up a server properly.

So people who know nothing about administering systems pay more because they don't have the knowledge.

If you have the knowledge then yes, it is cheaper to run your own server, but what is obvious or easy for one person is not really true for someone else.


That's when it clicked for me: comparing my hourly salary rate vs. the cost of running these services "in the cloud." Entirely eliminating "system administration" from my duties was absolutely a net win for me and our team.


> Entirely eliminating "system administration" from my duties...

... and adding "cloud administration".

What is it with people doing completely one-sided analysis even when they experience the thing for themselves? Is cloud administration less time consuming than system administration? That's not my experience, so I'm quite interested in how it got so.


Can be. Paying for SaaS offerings like GSuite or O365 is a great deal for, say, 100 seats, instead of paying someone to administer on-prem email. "Cloud administration" can be more work, less work, or about the same work as classic systems administration. That's why carefully running the numbers first is necessary.


Oh, sure. Offshoring email is much easier than running it yourself.

The same isn't true for less standard kinds of service. The more standardized something is, the easier it is to decide what to hire, troubleshoot, and learn to configure your options. The less standardized it is, the harder all of those things become. VMs are very standard; email servers are less so, but not by a huge margin. Web-accessible disk space and on-demand interpreters are completely non-standardized and a hell to do anything with.

Also, some services do need more upkeep than others. Email is one extreme that requires constant care, file storage and web servers demand much less attention.


Too many of these people can't/won't/don't see past cPanel.


> Is cloud administration less time consuming than system administration?

Infinitely, and if you look at it from a startup lens it only makes sense. One needs to point only at the recent log4j incident. This is obviously a gigantic black swan event, but even just ongoing security patching at the OS level can be a full-time gig. There is absolutely no substitution for being able to ship code to a platform that just runs it and scales it for you.

Andy Jassy had a great slide a few years back at re:Invent, when talking about Lambda -- "in the future, 100% of the code that you write will be business logic". If you really think about that: how many times have you had to write some kind of database sharding logic, or cache invalidation, or maintaining encrypted environment variables, whatever. That idea that you can toss that -- and what that gives to teams, not having to spend massive timesinks and budgets and hiring and all of that on -- effectively -- solved problems, you really start to understand how you can move faster.


> in the future, 100% of the code that you write will be business logic

The present reality of Lambda is quite different though. Even though the code of the function itself is more or less "business logic" (although this is a really meaningless term when we're talking about known programming languages and computers), the scaffolding around it with Terraform/CloudFormation/Serverless/etc. is substantial, riddled with quirks and is really time-consuming to figure out and update. I don't think I spend less time on this accidental complexity now when we have most of our logic in Lambda, compared to the times when we were just running Flask apps in a VM.

This is not to mention how hard one has to fight to overcome the limitations of the runtime, e.g. adding some "warmers" scripts to reduce cold-start latency (no, provisioned concurrency doesn't help and is ridiculously expensive). And then comes the bill after you accidentally created invocation loop between two functions.


Of course -- you're still writing code to delete a key in Elasticache within your lambda. You're writing yaml code for deployments. Hence the "in the future" portion of this slide.

The scale-to-X and scale-to-zero features of Lambda, along with the guaranteed interface to your lambda with predictable input and output requirements, is incredibly empowering for an engineering team. I can absolutely guarantee that we have spent far, far, far less time maintaining our infrastructure than what we would need to be doing if we had a big-buncha-EC2 setup.

Imagine that the environment issues get taken care of, because Amazon has teams and teams and teams of engineers who are working on just that. Cloudflare has the zero-cold-start isolates. All these platforms are heavily invested in making your development and deployment experience as easy as it can be. Concentrate on writing your code, and you'll reap the benefits.


> ongoing security patching at the OS level can be a full-time gig

That's just a cronjob. I know some people don't like doing it that way, but that's on them. I've seen this work for years in production with minimal trouble.


It’s not even a cron job. It’s a single apt install.
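
On Debian/Ubuntu, for example, that install is roughly the following (a sketch; package names differ on other distros):

  # Install and enable automatic security updates
  sudo apt-get install unattended-upgrades
  sudo dpkg-reconfigure --priority=low unattended-upgrades

  # Policy lives in /etc/apt/apt.conf.d/50unattended-upgrades
  # (which origins to patch, whether automatic reboots are allowed, etc.)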


log4j was different because each proprietary vendor has their own prepackaged version of it for some reason. Even SAN software or VMware.

Still, the mitigation was like 4 hours to hunt and disable, and 2 more hours when a full patch came around. Not too difficult.

It was only time consuming for those who worked in Cloud providers where all this crap is centralized and understaffed. In real world scenarios there were tons of teams completely unaffected.


> in the future, 100% of the code that you write will be business logic

How much business logic is there across all businesses?

I'm aware of a joke at Google about how it could reduce 90% of the employees without any impact on the business.


The remaining useful 10% probably only does the "mundane" server administration that nobody wants to do!


> How much business logic is there across all businesses?

Immensely more than tech logic, that's for sure.

Just implementing and updating something like a single country's tax code produces more lines of code than the entire Linux kernel, that devs like to brag about :-)


> but even just ongoing security patching at the OS level can be a full-time gig

Oh boy, if that's what your server admin guys told you before you went full cloud then I'm sorry, I've got some bad news for you.


Take a look at the January security patches from Microsoft, where domain controllers had a bunch of issues and people had to uninstall the patch.


Yeah, all those people saying that server management takes little time (me included) are explicitly not talking about Windows.

If you want to offshore all of your Windows servers, all I can say is "go for it!"


Counterintuitively, engineers that run their own servers and infra tend to gain a deeper understanding of what it takes to provide an actual running service to end users. And therefore they write better software, or at least there is better teamwork with "devops" infra folks.

This is of course the highly subjective opinion of a greybeard unixadmin.


I'd say that running infrastructure in the cloud still requires the same deep understanding of what's going on under the hood as running your on-prem infra. A lot of annoying things are taken out: some stuff patches automatically, some other things have an easier updating procedure (and obviously the "physical" aspect is taken care of).. but you still only get the basic elements for a proper infrastructure. Your servers need an environment set up, you need to build a network, add load balancing and replication, monitoring etc. etc..

You can provision some of these things from cloud providers, but your infra is going to go to shit unless you actually understand what they're really providing you and how to use it. If the only thing you can do is upload a docker image to a cloud provider and click the "create server" button, then that's not really infra work at all. It's like Wix for sysadmins.


It's also a competitive advantage. Look at Backblaze, their business model simply wouldn't be possible on a cloud provider.


I'm curious how you completely eliminated it. I've heard the claim from others, but it hasn't matched my experience at several companies over the past 13 years (the first place I worked at using AWS was in 2009).

While cloud may have a lot of advantages, I don't think it's trivial to run or manage. The AWS dashboard is simply overwhelming. Trying to decide between different, but overlapping services is time consuming. And while you can rebuild somewhat easily, you're also almost certainly going to have to do that as you learn about Amazon's little quirks. Your general RDBMS experience doesn't map to DynamoDB very well and you'll be in for a rough time when you learn you can't just add a new index or whatever.

Then you have all these provider-specific APIs. My experience with both Amazon and Google is that their services will return errors that they claim should be impossible, so you get to have fun debugging that in a service that you don't manage. Your application will invariably add a bunch of handling for exceptional cases and accumulate your best guess about how this impossible situation came about.

Then you have the constantly shifting devops orchestration tooling space and "best practices". I've lost track of the number of times I've needed to pull my Terraform state file and manually edit this gigantic JSON file because some plugin updated an internal struct in an incompatible way.

I'm sure there are people that get an environment up and running in one of the IaaS platforms using just the web console. I've never seen it managed that way at any company I've been at. Instead, the devs own the IaC stuff. It's certainly easier to be in multiple regions that way, but I have a hard time believing any time or money is really saved. Sure, no dedicated ops people, but now your expensive devs have to deal with it and probably be on call. Moreover, all that archaic time-sucking Unix admin knowledge acquisition everyone is worried about is just replaced by time-sucking knowledge acquisition of a proprietary service and all its quirks.

Maybe we're talking about different levels of "cloud"? I can buy that Heroku is easier than AWS.


To me, system administration is much simpler than cloud administration. Especially for security; I'm never confident cloud applications are fully secure.


That's true! And for everyone reading, here is a startup idea: analyze a cloud infrastructure and report who has access to what, and combine that with a simple DSL so I can say that IAM user x is only allowed read-only access to some service. Find ways to break those configured security settings.

If, on a test run or during onboarding, this application finds just one security hole (e.g. a public S3 bucket), you will have the customer for life, because they're afraid they will make a mistake again.


My brother is a CPA running a practice solely focused on auditing and optimizing cloud expense.

The amount of money set aflame is astounding.

You don’t go cloud to save money. You go cloud to get flexible and reduce capital expense. It’s like leasing a building vs buying. More about tax and accounting.


So why are there so many job adverts looking for people with "cloud experience"? If there were no system administration for the user to do, that wouldn't be necessary; they could be looking just for people with experience of running the actual software their job is all about, on top of the "no administration needed" infrastructure.


But the cloud is... full of crap that wastes a ton of my time. A simple-ass server saves a lot of time.

The cloud is also too damn expensive.

People are getting ripped off big time and don't want to be embarrassed by the truth, plain and simple.


Full disclaimer, I'm very much not a sysadmin or devops guy.

However, every team I've been on recently has spent a lot of time struggling with gluing their AWS stuff together, diagnosing bugs etc. It didn't seem to save a heck of a lot of time at all.

I couldn't figure out AWS. But I could figure out how to host sites on a linux VPS.

So what's the story here - is serverless something that only makes sense at a certain scale? Because with tools like Caddy the 'old fashioned' way of doing things seems really, really easy.


> is serverless something that only makes sense at a certain scale?

Other way around. With enough scale you should be able to make hosting your own datacenter work.

The problem is that the people you hire tend to go off buying too much Enterprise-class shit and Empire building, and the whole thing winds up costing 10 times as much as it should, because they want stuff to play with, resume padding, and to share risk with the vendor and have them to blame.

Only thing Amazon did to build out their internal IT ops exceptionally cheaply and eventually sell it as the AWS cloud service was to focus on "frugality" and fire anyone who said expensive words like "SAN". And they were ordered in no uncertain terms to get out of the way of software development and weren't allowed to block changes the way that ITIL and CRBs used to.

I didn't realize how difficult that would be to replicate anywhere else and foolishly sold all my AMZN stock options thinking that AWS would quickly get out competed by everyone being able to replicate it by just focusing on cheap horizontal scalability.

These days there is some more inherent stickiness to it all since at small scales you can be geographically replicated fairly easily (although lots of people still run in a single region / single AZ -- which indicates that a lot of businesses can tolerate outages so that level of complexity or cost isn't necessary -- but in any head-to-head comparison the "but what if we got our shit together and got geographically distributed?" objection would be raised).


Serverless was always around and always had a place. It used to be a shell and sftp account on a shared Linux server where you could upload your html and cgi scripts or php files into your own Apache virtual host and get billed per hit on your web site or bandwidth usage. It now just has a lot of tooling around it with a big learning curve and vendor lock in, but at least you get your hits and bandwidth for cheaper than you used to.


A lot of it is lack of awareness of things like Caddy or any other tools that simplify the process.

I did not know about it until I googled it right now. I have spent days/even two weeks figuring out how to set up Nginx and for all I know I did it terribly wrong. I paired it with other tools that I do not even remember. But I would be starting from scratch again if I needed to set another one up.

So a lot might come down to that. I was on a team that transitioned from an owned server to cloud after one of the test servers went down one day and, after a week of trying, nobody knew how to fix it. We realized at that point that if a server caused a production error, we were utterly screwed, as someone who had left had set it up and nobody had a clue where to begin fixing it beyond reading endless tutorials and whatever came up in Google searches.

The server infrastructure was cobbled together in the first place and for a period was theoretically maintained by people who didn't even know the names of all the parts.

At least with cloud, there is an answer of sorts that can be had from the support team.


Even with Caddy, there are so many rabbit holes to get down into. My current one is rootless. I feel like I'm in a completely different world compared to rootful. Learned a ton, though.


Caddyserver is an Apache/Nginx killer, and in a couple of years we will talk about Caddy as if it had always been the default ("why did we keep fighting with Apache/Nginx all those years, silly us"). Seriously. It's just a completely different way to think about web servers and automation. I'm just amazed it took all these years to emerge.


I love Apache, do not love Nginx, and won't be looking at Caddy (it's written in Golang). Apache (and even Nginx) makes it easy to set up a reverse proxy and can simultaneously serve static content and handle the HTTPS certificates, as well as do a few dozen other things like load balancing, rate limiting, etc.


I'm just a "user" aka sysadmin who sets up stuff, so I can't really comment on the Golang. Even just the fully automated LE cert magic in Caddy is a win in an average on-premises environment, where TLS certs are still a big pain and a massive manual-work drain. And the JSON API automation is a big plus also. At least for our use cases, Caddy resolves one of the biggest issues.
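
For anyone who hasn't tried it, the "cert magic" really is this small; a sketch, with the domain and backend port as placeholders:

  # Public HTTPS for example.com, proxied to a local app;
  # the certificate is obtained and renewed automatically.
  caddy reverse-proxy --from example.com --to 127.0.0.1:8080

  # The equivalent Caddyfile is just:
  #   example.com {
  #       reverse_proxy 127.0.0.1:8080
  #   }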


>> (it's written in Golang)

That means no memory related security bugs, which is a huge plus.


Not really. The JVM is similar in this regard, and one of the things we wrote in our basic Java course was code that produces a memory leak. And when you have a memory leak you may also end up with other fun things, like applications behaving unexpectedly in low-memory conditions.


That's a really weak argument. You can build memory leaks in every language and at some point your application will have a problem.


Java has a garbage collector; you shouldn't have memory leaks unless you are just incorrectly allocating memory, which isn't technically a leak (a leak refers to allocated memory with no reference). Without a reference, the garbage collector would pick it up.


I use serverless for internal systems.

- Small API that is used by a couple people everyday? Lambda.

- Need to store some data? DynamoDB.

- Need to store some files? S3.

- Cron? Step Functions.

- Need services to communicate with each other? SQS.

Those end up being either free or very cheap.

If you have high traffic, serverless is actually really expensive. It is only worth it if you have high scale but unpredictable / bursty traffic.

> However, every team I've been on recently has spent a lot of time struggling with gluing their AWS stuff together, diagnosing bugs etc. It didn't seem to save a heck of a lot of time at all.

I understand AWS so others don't need to, and I'm making the best money of my life. All 3 big cloud providers have really terrible developer experience and I feel really sorry for folks who just want to get their shit done.

I used to think some other player would come in and offer something much easier and simpler to take over this space, but I'm not really seeing any serious contender. Just low code / no code platforms and managed k8s stuff.


I've used serverless on small personal projects where paying for a VPS or EC2 instance would be cost prohibitive, e.g. would I really want to pay $5/mo for a small throwaway weekend project? Probably not.

But what if the cost is $.0001 per request? It becomes a very convenient way to make all of my personal projects permanently accessible by hosting on S3 + Lambda.

Even in large workloads it makes sense. Much of AWS is migrating from instances to AWS Lambda. There are some workloads where persistent instances make sense, but a lot of common use cases are perfect for Lambda or similar serverless technologies.


From my experience I would never recommend giving up control of your servers to some third party. Weeks wasted waiting for useless support teams to get back to you on something you could have fixed in 10 minutes if you had root. Opaque configuration issues you can't debug without said useless support team. Needing permission and approval for every little thing on prod. If I was ever at a high level in a company I'd never go farther than some AWS load balancing or whatever on cloud instances we still have root on.


Yes, the absolute opaqueness of so-called serverless is a huge hit to productivity.

Numerous times there's something weird going on and you're stuck trying to guess and retry based on largely useless logs until it somehow works better but you never really know what the root cause truly was.

Meanwhile on my own server I'll ssh in and have complete visibility, I can trace and dump network traffic, bpftrace userspace and kernel code, attach debuggers, there's no limit to visibility.

Yes lambda/serverless saves you a day or three in initial setup but you'll pay that time back with 10-100x interest as soon as you need to debug anything.


> If I was ever at a high level in a company I'd never go farther than some AWS load balancing or whatever on cloud instances we still have root on.

Your competitors would salivate at this statement, fyi. Speed is a competitive advantage. AWS is not "let's rent a big ball of EC2 servers and call it a day", and anyone who treats it like that is going to get eaten alive. If you have not looked at -- for example -- Dynamo, you should. If you have not looked at SQS, you should. The ability to have predictable, scalable services for your engineers to use and deploy against is like dumping kerosene onto a fire, it unlocks abilities and velocity that more traditional software dev shops just can't compete against.


Our customer (300k+ employees) switched to AWS a couple of years ago and I simply hate it so much. The “predictable, scalable” service we’re using, MSK, is a pain to develop against. Log files are all over the place, thanks to microservices and because no one, myself included, has a clue on how to make them manageable again. AWS’s graphical user interface is a constantly changing mess. I hate clicking my way through a GUI just so I can download individual log files manually.

I wonder how you folks manage to work with AWS and not hate it.


This is just a bad setup.

Ship logs somewhere else; clicking in the GUI outside of testing things out is a mistake; use Terraform, CDK, or CloudFormation.

I've managed very large fleets of servers and it's dead simple if you have a good setup.


Blame the folks demonizing/shaming having "pet" servers and pushing immutable infrastructure. Linux server administration is quite enjoyable, and with how well apps these days can scale vertically, it really takes a special kind of workload to need (and actually saturate) fleets of servers.


In my experience Pet servers are a good starting point (you really should _graduate_ from Pet servers into all the various immutable/cattle stuff), but it can quickly require discipline from the Admins.

They can't be doing one-off undocumented config, package, and network/firewall changes which make it impossible to setup another server reliably. At $company I moved us to Terraform+Packer (to get them used to immutable deploys, but still just an EC2 instance) then Pulumi+Docker+Fargate so we could fix our deployment velocity. The CTO was constantly afraid everything would break; mostly cause it actually would break all the time. Now basically anyone can deploy even if they're not a SysAdmin.

That's not to say you can't automate a Pet Server, but it's a lot more likely for someone to "just once" make some changes and now you don't trust your automation. In our case we had SaltStack and we were blocked by the CTO from running it unless it was off-hours/weekend.


The "missing magic" I'd like to see is a script that knows I started with Debian, installed these packages, and derives the setup script to reproduce this server, including my configuration changes. That way I can make the changes in an environment that works for me and have technology do the boring part.
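
You can get surprisingly close with standard tools, if "close enough" is acceptable. A sketch, assuming Debian/Ubuntu and that tracking /etc in git via etckeeper works for you:

  # Packages you explicitly installed (as opposed to pulled-in dependencies)
  apt-mark showmanual > manual-packages.txt

  # Track every config change under /etc in git from now on
  sudo apt-get install etckeeper   # auto-commits on every apt run

  # On a fresh box: replay the package set, then apply the /etc history
  xargs -a manual-packages.txt sudo apt-get install -y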


Do you have any tips on auto-installing an OS on a server/desktop?

I feel like I'm completely missing something. I have searched around and I have some solutions, but in the back of my head, I suspect some people have something else.

The part before Ansible or Puppet kicks in.


1) Rocky Linux

2) Debian


NixOS.


I wouldn't recommend NixOS on servers. It is unstable and extremely difficult to use. It breaks way too much. I could have done so much more with the time I used to try and make NixOS work for me. The cost of getting it to work is not worth its benefits to me.

When I wanted to set up NixOS, I wanted to use it to manage a bunch of NixOS virtual machines (so, nixops), and then run a couple of services. Here are a few points from my anecdotal experience:

1. Learn an entirely new language that has absolutely horrendous documentation. It used to be (I've been told it got better but I abandoned Nix before this) extremely difficult to debug as well, since oftentimes you would get an error in some library file when the error was obviously in your own file.

2. Try packaging your own stuff. Again, barely any documentation. One time, I wrote a simple Rails application in a day. It took me more than three days to figure out how to deploy it, and that involved just figuring it out. Rails adds a PID file to the directory with your Ruby code in it, but that directory in a Nix derivation (a fancy name, which makes it harder, for something like a package, from my understanding) is immutable. Good luck packaging complicated projects that someone else hasn't packaged yet.

3. Nixops was so incredibly outdated that it caused my server to say it was using an "insecure" library from Python 2.7 and my auto updates started failing. On paper, nixops looked nice (manage NixOS virtual machines/VPSes with Nix). But it broke my updates. I eventually wrote my own replacement for Nixops, which worked, but I still don't trust the rest of NixOS to not break. The tool might have some merit on a different Linux distribution, though. (I'd also note that I believe that nixops fixed their insecurity bug, but there were a myriad of other issues I don't remember that I had with nixops)

4. Setting up pre-packaged services sounds easy. They were, somewhat. But permission errors were pervasive. SystemD tempfiles were finnicky at best and gave path traversal errors that were extremely hard to debug.

Eventually, I got a new server, installed Arch on it, and use docker-compose for everything. It takes me maybe 20 minutes to set up a new service on a good day, instead of four hours of Googling for some obscure error from NixOS. While not perfectly "reproducible," neither was NixOS, because I could simply not trust a virtual machine on nixops to boot successfully—there would always be more required even if it worked once.

I might be open to NixOS on a desktop, though.


Have you tried GuixSD?

I run NixOS, but I recognize that user experiences might be very different depending on the packages you use. NixPkgs is huge, enormous. Some stuff is really well maintained, whereas other stuff isn't.

Furthermore, declarative systems make a difficulty tradeoff. Lots of stuff is much easier, but problems are sometimes harder to debug as there's an extra abstraction layer.

IMHO, these days the best distributions out there are either very simple imperative ones with binaries (Arch, Alpine, etc.) or Nix-like (NixOS and GuixSD). Getting experience in those two extremes is really valuable.


I find people don't know how amazing an "immutable" server fleet is until they've experienced it.

It was so trivial to terminate and restart dozens of servers at any given time; unless there was a mistake in the cloud-init, we could bootstrap our entire infrastructure from scratch within an hour.

It was amazing, never had to deal with something missing on a server or a config being wrong in a special case. Dozens of hosts just purring along with 0 downtime, since the moment anything became unhealthy, replacement hosts would start auto-booting and the old instance would be terminated.
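
The cloud-init part is less exotic than it sounds; user-data can be a plain shell script, so a stripped-down sketch of a self-configuring host looks something like this (repo URL, package list and service names are placeholders):

  #!/bin/bash
  # cloud-init user-data: runs once on first boot of a fresh instance
  set -euo pipefail

  apt-get update && apt-get install -y git nginx

  # Pull this host's config/app from version control and install it
  git clone https://git.example.com/infra/webapp.git /opt/webapp
  install -m 0644 /opt/webapp/nginx.conf /etc/nginx/sites-enabled/default

  systemctl enable --now nginx
  # If anything above fails, the instance is replaced rather than fixed by hand.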


"It was so trivial to terminate and restart dozens of servers at any given time since unless there was a mistake in the cloud-init, we could bootstrap our entire infrastructure from scratch within an hour."

Got any tips in this regard?

Ansible, Puppet, Python, Terraform, OpenStack, Pulumi spring to mind. Still need to cover a PXE server. (Maybe with Ansible.)


We use Salt Masterless with Packer to build AMIs. Terraform defines our infra. No matter what happens, we ensure any service can boot by itself with 0 intervention.


Sounds like you need a new CTO.


As it turns out, a lead developer can't unilaterally change the CTO. Not sure how it works for you. I can control tech, direction, etc., or move on to another job.

I chose to work with the CTO/team to figure out a solution everyone could live with. I even chose a more annoying solution (Packer) initially, just to make sure people felt comfortable and to avoid changing things any more than I had to.


I have to give you kudos for being more polite and reserved in your reply than I would ever be able to be.


I work in IT Operations at a big IT house. 100% local gov customers. We fully manage around 5000 pet servers. ~30 sysadmins, and some of us do architecture design as well. There's also a separate networking team of about 10 network specialists.

Senior sysadmins are really hard to come by today, not to mention someone who also wants to do architecture.

My hunch is that the 5000 on-prem pet servers are not going away any day soon, because a massive amount of them are legacy systems that will take a long time to migrate to the cloud, if ever. Also, the work stress is just ridiculous. So much stuff to do, even with automation. The only reason I still do this is that I like the "old school" tech stack vs. cloud IaaS/PaaS alternatives.


>>Senior sysadmins are really hard to come by today, not to mention someone who wants to do architecture also.

I am not so sure... I am a well-seasoned sysadmin; I've been doing servers, networks, and architecture. I consider myself a solid Linux/network expert and have managed datacenters. When I look for a new/more exciting job, or for a pay raise, all I see are "cloud, AWS, devops". I never see "old school" sysadmin jobs, e.g., as you say: "we have a room full of Linux boxes and we manage them with Ansible/scripts/etc., but we design and maintain them ourselves; come join our team".


> When I look for a new/more exciting job, or for a pay raise, all I see are "cloud, AWS, devops". I never see "old school" sysadmin jobs

Because the old school systems just chug along with a minimum of administration, while those newfangled cloud thingamajigs need tons of adminning...?

I mean, that's one interpretation of your (admittedly anecdotal -- but that's what I see too, and lots of anecdotes add up to) data. A valid one, AFAICS.


You should learn some AWS and use it as a trojan horse to get those jobs. As a former old school sysadmin I really took to it because it's so modular and feels like Unix's "collection of simple tools piped together". Plus, any old school admin is a treasure on any cloud team because the need to drop to shell is unavoidable. People think they don't need sysadmins on cloud teams but behind every good cloud team are a few good sysadmins.


I want to use AWS without a credit card, because I don't want to make a mistake and end up with $80K in debt. I will never forget that particular HN post.


> Blame the folks demonizing/shaming having "pet" servers and pushing immutable infrastructure. Linux server administration is quite enjoyable, and with how well apps these days can scale vertically, it really takes a special kind of workload to need (and actually saturate) fleets of servers.

You don't need pet servers. Puppet or Ansible make your baremetal cattle.


I think most folks would argue "cattle" means using imaging to manage/replace your fleet. Using something like Puppet or Ansible against a fresh install implies a level of individualism towards each system, as they "may" have minute differences based on when Puppet/Ansible ran, even if they're part of a dynamic inventory of some sort.


I'm not following, sorry.

This is cattle:

* PXE boot server(s)

* Image contains masterless puppet bootstrap.

* Server(s) asks git - "give me the bootstrap for my mac address"

* Server(s) gets a list of classes to apply.

* Server(s) applies classes.

Done.
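
If it helps to see it, the masterless bootstrap baked into the image can be a very small script along these lines (a sketch; the interface name, git URL and repo layout are assumptions, not the setup described above):

  #!/bin/sh
  # first boot: look up this machine's role by MAC address and apply it
  mac=$(cat /sys/class/net/eth0/address)
  git clone https://git.example.com/infra/roles.git /srv/roles
  puppet apply --modulepath=/srv/roles/modules "/srv/roles/nodes/${mac}.pp"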


What kind of software do you use to configure PXE servers? Or is there anything else that I'm missing?

I'm looking for a solution to make unattended installation possible.

Edit: You can use Ansible to make a PXE server. (Chicken-or-egg thing.)


There's a single pxeboot server that bootstraps everything. It sends back a single image common among all baremetal servers regardless of what asks for it. The actual configuration is done via a git repo that contains a mapping between mac address of a server and puppet classes.


I disagree. The 'cattle' vs 'pet' distinction is about replicability and replaceability. If you can spin up a new baremetal server quickly and easily in the configuration you need, then you're doing it right. The technology you choose is not relevant.


I've never seen anyone demonize or shame having pet servers, if anything people on tech news sites write about their pet servers constantly (understandably, it's fun!) Just like you're not going to make all your furniture by hand when you start a business but instead just buy whatever works from Ikea, likewise you make a more conscious decision to build or buy (as TFA touched on) based on the constraints of the business. And sometimes a business, for compliance reasons for example, may choose to keep their servers in-house, in which case you could potentially be a sysadmin.


This works fine from the perspective of the admin, but think from the perspective of a business owner hiring admins. What happens when you have employee turnover? You don't want all of the knowledge needed to deploy your critical IT infrastructure stored in one guy's head or personal notebook. Maybe your admins were all disciplined enough to make sure they never made a single change without documenting it and kept extremely detailed, easy to use runbooks, but maybe they didn't. You don't want to get Dennis Nedry'd.


Treat your people better and you may not lose them all.


Having scaled up various business initiatives, and working through countless scaling issues, I would recommend managed services like anyone else with experience...

However! When I spin up my own side projects, it is sooo much easier to just go into the command line and spin something up directly --- it does make me wonder whether some small amount of expertise can really change things. By the time you're orchestrating AWS services, Docker containers, Kubernetes and more --- would it have been so bad to run a 10-line bash script on a few cheap VMs to set yourself up?

Even typing that, I realize how much time managed services saves you when you need it. Change management is really what those services offer you - even if a momentary setup is easier by hand.


I totally agree. I recently set up a service using docker, terraform, and AWS Fargate. It was interesting, but everything felt like such an abstraction. Firing up a VM and running the app would have taken me as little as 10 minutes vs. a multi-day research project. Or using ansible would have taken maybe a couple hours.


As a solo bootstrapped SaaS founder who literally relies on my app staying online to pay my rent and food, I chose NOT to use the cloud, even though I have 6 figures in cloud credits. I use the credit to spin up a single beefy baremetal instance and manage my own services. It's because I realized the cloud was based on a false promise which seemed possible a few years ago, but I now realized is impossible. There is no such thing as a "fully managed" deployment where you don't have to think about servers. See also: self-driving cars -- very easy to build a toy demo case, fails catastrophically in the real world. The cloud promised to save us time on managing, deploying and scaling servers; except in my experience: it doesn't. You spend just as much time, if not more time, dealing with the scaling issues of managed databases and "app engines" as you do SSH'ing into your server. Except it's worse because you lose touch with reality and have a very poor understanding of what your code is actually doing.

Last week my choice was vindicated: I ran into a critical hardware issue on my linux instance which required a complete OS reinstallation. Wiped my server clean, and was back up and running in an hour. I feel much more secure in the fact that I KNOW I can spin up a completely functional version of my app on any Linux server in the world in less than an hour, rather than relying on opaque cloud backup/load balancers/serverless configs which could fail in unexpected ways, and are usually locked in to a particular vendor. As for a few hours downtime here and there, my business is designed to handle it.


You really tell this story well. I could never voice my frustrations with cloud that I've had diving back into code lately. But you nail it - it's a time suck that really is a different Cloud DevOps (tm) skill set. Literal moment of personal clarity for me rn (however obvious this may be for the outside world and how dumb it might make me look). Seriously, thank you!


I used to reach for shell scripts to configure servers, then Puppet, then Salt, and then finally to Ansible. Configuring servers declaratively is such a massive improvement over shell scripts. The fact that Ansible is agentless is also very nice and works very well for when you only have a handful of servers.

Only thing I dislike is YML, which I think is yucky!
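
To make the agentless point concrete: there is nothing to install on the targets beyond SSH and Python, so config changes are just invocations like these (inventory and playbook names are placeholders):

  # ad-hoc: make sure nginx is installed on every host in the inventory
  ansible all -i inventory.ini -m apt -a "name=nginx state=present" --become

  # playbook dry-run: show what would change without changing anything
  ansible-playbook -i inventory.ini site.yml --check --diff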


We took the same path, using config management tools to automate our deployments. But after a while, we realized that the servers only existed to run apps, and those apps could be declaratively described as containers and the whole thing pushed to Kubernetes.

That was our 'perfect world'. Reality was different and we still have a lot of servers running stuff, but what we did push into K8s really reduced our operations workload and we're pretty happy about that.


It's a leaky abstraction, though. The problem is that many systems people that were raised only on these abstractions lack the depth to understand what's under the hood when those other layers do unexpected things.


I don't think there is much abstraction in an apt-get command and waiting for exit 0.


YML is the exact reason I don't use Ansible anymore. Writing complex YML for simple tasks is a time burner.

After I discovered https://efs2.sh I switched over to this simple config management solution, which simply executes commands and scripts over ssh. It is so much simpler and faster (both in regards to creation and execution) than Ansible.


Out of curiosity, have you tried tools like Pulumi? I've never used it but as a longtime Ansible user it's something that has my interest.


Pulumi isn't for your own infrastructure, but for cloud providers. I haven't searched for a cloud-provider-less Pulumi/Terraform alternative yet.


A simple LAMP stack is a good place to start, and a single droplet from DO or similar is plenty to play around with and stand up a real internet-facing system. Bonus points for starting vanilla and installing & configuring the AMP portion from scratch, which is an excellent primer on the basics of system config & admin.

Get your hands a little dirtier installing a lightweight desktop environment like LXDE, programming language of choice & an IDE. Install VNC and you then have a cloud desktop you can code in from anywhere at the same time that it runs your personal website.

Cost: ~$5 per month and a bunch of good experience. Or just do it once as an exercise and cancel after a month.
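
If you want a concrete starting point for the "AMP from scratch" part, on a Debian/Ubuntu droplet it is roughly the following (a sketch, not a hardened production setup):

  sudo apt update
  sudo apt install -y apache2 mariadb-server php libapache2-mod-php php-mysql
  sudo systemctl enable --now apache2 mariadb
  sudo mysql_secure_installation                               # interactive DB hardening
  echo '<?php phpinfo();' | sudo tee /var/www/html/info.php    # quick smoke test; remove afterwards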


Oracle Cloud's free tier is equivalent to DO's $5 droplets (they also have a second free tier offering with ARM CPUs, with up to 4 vCPUs / 24 GB memory for free(!), but it's in high demand and difficult to get a slot).


I'm afraid to touch anything Oracle for fear that a fleet of SUVs filled with teams of software auditors & a mobile law firm will show up on my doorstep with a $5M invoice and a lien on my house.


Honestly, after discovering NixOS I have a newfound joy in administering Linux servers. It's easy and painless, everything is declarative and versioned, and new machines can be set up for new projects or scaling in a matter of minutes.

This "cattle not pets" mentality doesn't make sense for everything and is highly inefficient if the OS itself seamlessly supports immutable workloads and configuration.


When it comes to server admin, nothing is painless. You just may not know where you're bleeding yet.


Fine, “considerably less painful” then.

The worst thing I’ve had to deal with recently is debugging some faulty RAM sticks and NVMe failure. Obviously hardware quirks are still at play and there’s not much that can be done there, but in terms of making life easier on the software side, NixOS and reproducible config definitely helps over traditional distros.


> It's easy and painless

It was probably the most difficult thing I've tried, unsuccessfully, to use on my desktop. I imagine learning how to use Vim/Emacs as a complete beginner would probably be several orders of magnitude easier than learning the Nix DSL and the Nix way of doing things. And from what I've read about the experience of other people who do use NixOS and talk about both the good AND the bad, using it seems like an unhealthy relationship.

Not to mention that the Nix package manager feels slow as hell and reminds me of my unpleasant hours spent using rpm and dnf.


I hear you, Nix definitely does feel slow, and I must say over the few years I've been using it, it has definitely slowed down even more.

In terms of the DSL, I'm really surprised that people find it as much of a problem as they do. When I first tried Nix I moved my first nonprod server over to it that very afternoon. It helped to approach the syntax at first as "hmm, this is a bit like JSON" and to worry about things like lazy evaluation later on. I guess being familiar with Jsonnet beforehand helped too.

My first few servers I got by with just a basic understanding of the language and copying/pasting examples from the website and using https://search.nixos.org/options

Using it on a desktop, now that is a whole other experience. I personally quite enjoy it for my desktop, but there is definitely more of a learning curve, and you might not find the benefits of Nix worth it anyway since your desktop is an always-changing environment.


> I hear you, Nix definitely does feel slow, and I must say over the few years I've been using it, it has definitely slowed down even more.

I don't know about others but once you become used to package managers like pacman and apk, it's a jarring experience to use apt and dnf, especially dnf. And Nix feels just as slow, if not slower, than all of them.

I have to download hundreds of MBs of metadata before I can install a package with dnf, and fastestmirror=true has been essentially useless throughout my years of using dnf. The end result is me downloading hundreds of megs at speeds of less than 1 MB/s.

> In terms of the DSL, I’m really surprised that people find it a problem as much as they do. When I first tried Nix I moved my first nonprod server over to it that very afternoon.

I find it much easier to use a combination of Ansible, Python/shell scripts, and dotfiles on multiple git forges. This setup may not be as declarative as NixOS, but it is much easier to grok and use. I have no motivation to learn an obscure DSL to manage my system.


"cattle not pets" doesn't mean "use cloud images for everything." It means to do essentially what you're saying you do with nix. Hosts should be easy to make and easy to drop.


As someone who has to deal with legacy services from time to time I couldn't agree more. In certain industries, legacy software just doesn't scale at all.


Not quite the angle the author was getting at, but I have noticed at $dayjob that staff who are able to do some incredibly complex automation against Linux-based stacks, containers, etc. get quite lost when something low level isn't working right. Gaps in understanding of OS-level troubleshooting and concepts get them stuck.

You're wise to keep staff around who understand the low level stuff, in addition to the shiny new abstraction based tools.


Regular SA and DBA jobs will be almost completely gone within a decade or so. Same as there are hardly any auto mechanics anymore because nobody can fix any of the new cars but the manufacturer.

You'll only find those jobs at one of the handful of cloud companies. Nobody will know how to do anything for themselves anymore and all this experience and knowledge will be lost.

There are no more actual administrators. Just users paying rent.


I've been hearing this for the past 20 years. And now my sysadmin skills are more and more in demand. For the past 5 years or so I started making more money than a dev because of supply and demand.

Rent to AWS actually drives demand up quite a lot since the bills are huge and very few people understand what is under the hood and how it can be optimized.

I doubt very much things will change in the near future. In the far one... who knows.

Edit: car mechanics with their own shop make significantly more money than me and it seems to only get better for them as cars become more complex.


> Rent to AWS actually drives demand up quite a lot since the bills are huge and very few people understand what is under the hood and how it can be optimized.

A few years ago I participated in a Splunk deployment, and the cloud solution utterly dwarfed an in-house enterprise solution in terms of cost. Even if cost were irrelevant, certain sectors (financial institutions) are going to have a difficult time pivoting to a cloud-based solution and relinquishing control over the underlying infrastructure.


The low-level stuff won't magically disappear; you'll still need someone who can debug the kernel or whatever is under the hood when shit blows up in everyone's face.


Out of curiosity, how do you find old-school sysadmin gigs? I find that everything nowadays requires knowing the specifics of a particular cloud and their managed services as opposed to raw Linux or networking knowledge.


There are few to no outright old-school sysadmin gigs. You have to know the cloud providers' specifics. Also everything about containers. And Go.

But once you get through the door at a new job, the massive difference between somebody who understands what is under the hood and somebody who just got an AWS certification with no prior knowledge becomes apparent to all. Quick position and paycheck raises follow.


Hey I'm super interested in such sysadmin work if you could describe it here or share it on another channel (email in bio). Many thanks.


What exactly are you interested in finding out? What I do every day? How you can find such work? Or?


Yes, basically if I could peek into your activity for a day it would be awesome (I'm not asking that, although I would like it :P :)). I'm just curious of what a day's work looks like.


A typical day is actually mostly working around corporate "security" limitations that prevent everybody from doing much, for fear they will do harm.

The last few days:

- We have a commercial SSH bastion that requires a direct SSH login and then inputting an IP address, so no DNS is available for SSHing. I wrote a bash autocomplete script that resolves DNS host entries and uses expect to automatically input the IP into the bastion. This is especially important to be able to run Ansible.

- Our GitLab has its SSH filtered because "security", so you can only use a security token over HTTP. This prevents us from being able to use git submodules, as the token would be committed in clear text to the parent repo. So I wrote a makefile to compose the project, basically a poor man's git submodules.

You might find all the above trivial, and it is. But it keeps the work flowing, and devops folks more junior than I have no clue about makefiles, expect, etc.
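
To give a flavour of the first trick, the core of such a helper can be only a few lines; a hypothetical re-creation (the bastion host and its prompt text are guesses, not the actual setup described above):

  #!/bin/bash
  # usage: bastion-ssh <hostname> -- resolve locally, let expect type the IP at the bastion
  ip=$(getent hosts "$1" | awk '{ print $1 }')
  expect -c "
    spawn ssh bastion.example.com
    expect \"IP address:\"
    send \"$ip\r\"
    interact
  "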


Thank you! Yes, that was my suspicion as well.


They are called DevOps now, and management expects them to do everything that isn't pure development, including classical IT, and also to jump in and fix development if the respective dev is on leave.

Yes, I know that isn't what DevOps is supposed to be, but we all know how Agile turned out, management has a magic touch to distort such concepts.


>Same as there are hardly any auto mechanics anymore because nobody can fix any of the new cars but the manufacturer.

wait, what? definitely not in eastern eu

it seems like there's one mechanic every few km

but maybe that's because the average car is relatively old


I don't know, abstraction is the name of the game and it makes my job 1000x easier. I have multiple servers running in my house that host everything from Plex to small little apps I've written, it all runs in containers and I couldn't be happier. Is being able to set up a WordPress site with a script really something we should strive for?

I've always been a fan of "standing on the shoulders of giants" and it's served me very well to have this mindset. I'm fine to dive deep when I have to, but diving deep just to dive deep.... not so much.

Semi-recently I had need of a simple blog for a friends/family thing, I spun up a wordpress and mysql container and was done. Over a decade ago I used to setup and manage wordpress installs but it's not a skill I need.

I find this article a little odd since they talk about server admin but then also about writing a setup script for your server, which is more in the "cattle" category for me and less in the "pet" category that I would consider "server administration".


‘Cattle’ always seems to imply spreading your infrastructure across multiple cloud services, and having infrastructure management tools etc.

I sometimes wonder whether we need another metaphor, something like a dairy cow, where you only have one, but when it fails you can shoot it and plug in another very quickly and simply (e.g. using a script).


> Compare this reality with cloud services. Building on top of them often feels like quicksand — they morph under you, quickly deprecate earlier versions and sometimes shut down entirely.

This rings true to me, on Azure anyway. Like the rest of tech, you gotta keep up on the hamster wheel! Example: they canned Azure Container Services because of k8s; just imagine if you had tightly integrated with that and now you need to rewrite.

Also not mentioned in the article is cost. Hetzner is loved on HN for this reason.

That said, k8s is probably a stable and competitive enough platform that it makes a good tradeoff, and by using it you invest in ops skills rather than sysadmin skills specifically. I believe k8s skills will be long-lasting and less faddish than proprietary vendor cloud skills.


Since the post pretty much says “go out and do it!”

Does anyone have a good source of learning that is comprehensive and practical? I’m talking about a good guided book/tutorial on how to administer a server properly and what things one should know how to fix, not just how to set up Wordpress.


When I learned this stuff I started with The Debian Administrator's Handbook (https://debian-handbook.info) and an O'Reilly book called Unix Power Tools. Since then I've read handbooks for whatever servers/frameworks/apps I needed to use. There was never one single source. I've also spent lots of time googling error messages and reading Stack Overflow/Server Fault and other sites.


This is a good start: https://www.ansiblefordevops.com/


I usually provide this as an intro:

http://www.linuxcommand.org/tlcl.php/

From there picking up configuration management should be pretty straightforward.


-"As for scripting, commit to getting good at Bash."

That advice can cause substantial headache on Ubuntu/Debian, where the Almquist shell is /bin/sh. This does not implement much of bash and will fail spectacularly on the simplest of scripts. This is also an issue on systems using Busybox.

A useful approach to scripting is to grasp the POSIX shell first, then facets of bash and Korn as they are needed.
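
A tiny illustration of why the distinction matters: the following two very common bashisms break under dash, the /bin/sh on Debian and Ubuntu (the error messages in the comments are approximate):

  #!/bin/sh
  names=(alice bob)            # dash: Syntax error: "(" unexpected
  [[ -n "$1" ]] && echo "hi"   # dash: [[: not found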

-"As a practical goal, you should be able to recreate your host with a single Bash script."

This already exists as a portable package:

https://relax-and-recover.org/

-"For my default database, I picked MySQL."

SQLite appears to have a better SQL implementation, and is far easier for quickly creating a schema (a set of tables and indexes).


> That advice can cause substantial headache on Ubuntu/Debian, where the Almquist shell is /bin/sh. This does not implement much of bash and will fail spectacularly on the simplest of scripts. This is also an issue on systems using Busybox.

At least for Debian and Ubuntu, that's why we start bash scripts with #!/bin/bash, of course.

Your point is valid for Busybox, though.


> that's why we start bash scripts with #!/bin/bash

That will also fail spectacularly, as bash does not behave the same when called as /bin/bash as it does when it is /bin/sh.

I have principally noticed that aliases are not expanded in scripts unless a shopt is issued, which violates POSIX.

Forcing POSIXLY_CORRECT might also help.


I would assume if you put /bin/bash as your shebang that you're expecting to get bash-isms. I think the problem you're complaining about (which is a real one) is people putting /bin/sh and expecting bashisms. Debuntu being problematic here is more a side effect of bad practice.


Bash has a POSIX mode.

Knowing when to switch into and out of this mode, and what impact it has, is a more advanced subject that should not burden those learning the Bourne family.

It is better to start with Almquist, or another pure POSIX implementation, with documentation specific to standard adherence.

More advanced shell features should wait.


I have mixed feelings. On paper I agree with you, people should start with POSIX shell. In practice I'm not sure how relevant that is anymore. I'm not really convinced bash should be the default thing people learn, but I think there's a decent argument that people should just start off with the shell they're going to actually use. You should, however, be aware that there is a distinction, and if you're learning/writing bash and not POSIX shell you should specify /bin/bash not /bin/sh. But you don't necessarily need to know all the nuances of how bash differs unless you have a need to write POSIX compliant shell.


For me, I just want a shell that works.

Without a nuanced understanding of standards, extensions, and platform availability, new bash users will get large amounts of shell usage that doesn't work.

To avoid that frustration, learn POSIX. That works everywhere that matters.


Not sure what you are saying, bash behaves as bash when invoked as /bin/bash, and Bourne-shell-ish when invoked as /bin/sh. Lots more detail in the man page.

I've never seen use of aliases in a bash script...? They are generally for CLI convenience.


Alias expansion within scripts is mandated by POSIX.

When bash is not in POSIX mode, it violates the standard.

  $ ll /bin/sh
  lrwxrwxrwx. 1 root root 4 Nov 24 08:40 /bin/sh -> bash

  $ cat s1
  #!/bin/sh
  alias p=printf
  p hello\\n

  $ cat s2
  #!/bin/bash
  alias p=printf
  p world\\n

  $ ./s1
  hello

  $ ./s2
  ./s2: line 3: p: command not found


Sure but someone putting #!/bin/bash at the top of their script, written for non-POSIX bash, won't have that issue...


That's very nice, it's POSIX-compliant when invoked as #!/bin/sh and sane when invoked as #!/bin/bash — exactly what I'd want.


If you want a portable script, then you don't want that behavior.


Then I wouldn't put #!/bin/bash at the top.


Your problem there is Ubuntu/Debian.

A bash script that runs as /bin/sh might not run as /bin/bash, until you set POSIXLY_CORRECT.

Easiest solution there is to move it to a better shell, usually a Korn variant.


Why would anyone want to target bash specifically which doesn't exist in all systems instead of just sticking to what's implemented in /bin/sh?


Because you're not going to have a great time with /bin/sh (i.e. dash or the like) if you want to do anything more than very, very basic scripts.


Relying on bash is a recipe for non-portability.


If you're not publishing your scripts and you're running your own infrastructure, you probably don't care about portability at all.


None of the BSDs use bash in their base. Apple recently switched from bash to zsh. OpenBSD uses a descendent of pdksh.

Another major user of a pdksh descendent is Android (mksh), with a truly massive install base.

Some of the bash problem, besides portability, is GPLv3. That was a major factor for Apple. I don't want my script portability linked to corporate patent issues. For this and other reasons, I don't use bash-specific features, ever.


Yeah I've written and maintained scripts for the past 10 years that have to run on stock Solaris, AIX and *BSD. When they started out they had to work on the Solaris 9 /bin/sh, which is particularly elderly.

You should feel free to continue to write portable scripts.

That is horrible advice for anyone who is just starting out though and they should stick with #!/bin/bash and not worry about it until they actually want to consider platforms other than Linux. Portability is the wrong thing for people who are learning to focus on.

It is also bad advice for most system administrators at most businesses, since they won't have those alternative *nixes either, and it's a best practice to minimize how many different operating systems you have to care about.


As we've already covered, Ubuntu is your downfall.

Almquist is in Ubuntu for two reasons: speed and standards compliance.

These may not be important to you, but they are to a great many people, and the horrible advice in this case is to disregard these factors.


#!/bin/bash

People who start out with linux should use that. Most companies, particularly HNey startups, should use that and nothing bad will happen.

Most people don't have to care about portability, shouldn't use /bin/sh, and shouldn't learn about Almquist. Your advice does no favors to the bulk of people who just need to get shit done.

If they do wind up working in an environment where it is important, they can teach themselves the differences at that point, or else ask someone like us to review their code (and there's an automated linter out there).

You aren't the Main Character, most people don't need to care about the same things that you do.


You don't really know exactly what you'll get with /bin/sh - you might get bash trying to behave like sh, you might get dash. At least with /bin/bash you're hopefully getting bash. Now you just have to wonder what version...


> That advice can cause substantial headache on Ubuntu/Debian, where the Almquist shell is /bin/sh. This does not implement much of bash and will fail spectacularly on the simplest of scripts.

That's not really a problem as long as you use #!/bin/bash shebang, and there is nothing wrong in doing that.


Unless bash lives in /usr/local/bin/bash


and then you use "#!/usr/bin/env bash"


#!/usr/bin/env bash


I settled on using /bin/sh for portability. If there is something that can't be done with sh but can be done with bash, then it means a Python script is better anyway. I don't want to deal with bashisms, ksh, and the different Debian/Ubuntu/Red Hat takes on bash.

It's frustrating that most Google search results and shell script answers on SO almost always treat bash and sh as interchangeable.


Yes, well if a script I wrote has somehow ended up on a machine without bash, then I'd be more worried about other assumptions the script makes.


That's missing the point.

Some servers I know don't have vim. The Traefik docker image runs ash, not bash. The Tomcat image doesn't have vim. Etc. /bin/sh is there. No worry about assumptions. No bashisms, no fish, no zsh.


That's fine. My scripts tend to run on a single machine.. otherwise, probably the same/similar Linux distro.

So for me, if there's not even bash then I've also surely not accounted for other peculiarities on the system.


> That advice can cause substantial headache on Ubuntu/Debian, where the Almquist shell is /bin/sh.

#!/bin/bash

There, I fixed your "what shell is /bin/sh" problem.


Unless you actually look at real Linux deployments, which are:

  #!/bin/mksh
Android doesn't allow GPL code in userland, and the installed base is massive.


> Android doesn't allow GPL code in userland, and the installed base is massive.

You aren't administering Android devices.

Stop obsessing about writing portable scripts. Write scripts for the targets that you are going to run them on.


I run Lineage, and use a number of scripts.

Stop ignoring standards. They exist for important reasons. Invent your existentialism in some other realm.


> I run Lineage, and use a number of scripts.

There are people who dig trenches with spoons. It is not efficient and no one sane would be accounting for them.

> Stop ignoring standards. They exist for important reasons.

No.

> Invent your existentialism in some other realm.

I do. My target is Linux servers. Look at the topic: "Reclaiming the lost art of Linux server administration", not "Finding another inefficient way to administer a phone".


> Stop ignoring standards.

That ship sailed when they froze that standard in time, circa 2008.

People will always choose convenience over correctness. Fighting against that is fighting against human nature and it's a battle lost before it's begun.


> -"As a practical goal, you should be able to recreate your host with a single Bash script."

I disagree with this. A single bash script configuring an entire host can be overly complex and very difficult to follow. As someone who has created complex bash scripts, I can say this becomes very time-consuming and prevents you from making many changes without significant effort. I'd suggest familiarizing yourself with tools like cloud-init and Ansible.


"Relax and Recover" (rear) is a useful tool, written specifically in the bash dialect, that will write a compressed clone of your system to a bootable USB flash drive, network share (NFS/SMB), and other targets.

Booting with rear, a (somewhat "klunky") interface will allow you to restore your captured backup, apparently pristine.

It is similar in effect to the HP-UX Ignite tool that I used on that platform.

Oracle has published a number of articles and blogs on or relating to it, and it is well worth study. The invocation of tar that preserves the SELinux contexts is particularly worth capturing.
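
For reference, a typical ReaR setup is only a few lines of configuration plus two commands; a sketch, with the NFS target as a placeholder:

  # /etc/rear/local.conf
  OUTPUT=ISO
  BACKUP=NETFS
  BACKUP_URL=nfs://backup.example.com/exports/rear

  # on the live system:
  #   rear -v mkbackup     # build rescue media and back up the system
  # after booting the rescue media:
  #   rear recover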


Yes, but capturing something is far different than being able to reproduce a system across multiple machines.


My take on that was that host creation should be simple enough to require only bash.


I do run my home server, but it is definitely an investment, and for some use cases it might not be worth it

The hardware is the cheapest part; then you have to pay for electricity, manage backups, fix RAID problems, and have good internet. You have to pay attention to how the server is doing. And if you're serving a business, you have to be available to debug any issue, investing a lot of time you could actually be spending on the project.

But definitely most devs should have a small home server for trying unimportant things. Nothing complicated; just keep the standard hardware config. There are second-hand servers available for $50. Install some Linux and have it running 24/7. It's quite fun experimenting and hosting simple things.


I'm running rke2 (Rancher's k8s solution) on my server.

This means I can run my own servers and the only thing they do is run rke2.

I can take out a node and upgrade the base OS without issues or anything.

And I still get all the benefits of a high-quality cluster (k8s).

I love it.

And yes, in my opinion it's easier and more streamlined to install storage software (OpenEBS) on my rke2 cluster and back up those persistent volumes than to do backups of my hard drives.

And my expectation is that, while it already works very, very well, it will only get more stable and easier.
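
For anyone curious what "the only thing they do is run rke2" looks like in practice, the upstream quick start for a single server node is essentially two commands (a sketch; adding agent nodes and OpenEBS is a separate step):

  curl -sfL https://get.rke2.io | sudo sh -
  sudo systemctl enable --now rke2-server.service
  # the admin kubeconfig ends up at /etc/rancher/rke2/rke2.yaml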


I have been running all my sites on a VPS exclusively since about 2004. I might not be a server architect but I like the idea of managing the server myself.


We have returned to "You never get fired for buying IBM", but with various cloud services and layers of abstraction, profit taking, obfuscation, and performance penalties. I am too old to care. I run my family servers to keep up my skills and earn pocket money from time to time. I never learned to become a salesperson for GSuite, Azure, AWS, Cloudflare, Docker, whatever. Don't really care. I skip the majority of the blog spam articles posted to HN because many push the same nonsense. Massive industry deskilling and a race to the bottom. Steer clear, kids, and look for a good trade.


Having started using UNIX back in the day when we only had thin terminals, using telnet and X Windows to access our accounts, I find the cloudification kind of ironic.

We are back in timesharing days, only using SSH and Web instead.

I guess I can at least switch my cloud shell colours to green to feel at home.


I wouldn't call it a "lost art". Author says: "One of the skills I wish I'd learned earlier in my career is basic Linux server administration".

There are plenty of books around. And there are literally thousands of people worldwide practicing this "lost" art daily.

Starting from small corps up to the major cloud providers. (Someone has to support those computers running the "serverless" things.)

My word of advice: start with the "philosophy". One program doing only one task but extremely well, "everything is a file" etc.

Understand why people are unhappy with systemd. :-) Find out how kernel schedulers impact databases' IO. Write a boring program in C - a network server which forks on accept4. Dip your toe in Perl 5 - there is lots of it in *nix and BSD, and it's still the most stable and efficient way of writing a CGI script... Find out why ksh is faster than bash.

It is a truly exciting world, and the best news is that it fits like a glove with the modern world of JS, async programming, etc.

I wouldn't call it "lost" - it is just dozen of levels of abstractions down, efficient, boring and complex. But powerful and unforgiving to typos :-)

I am glad someone actually is reading about all that.


Reading the comments, they seem sadly fairly black and white.

Either you need to have pets on tin, or cattle via cloud, but that never was the case. I worked at a hosting company around 2007 that was an early IaaS provider. We PXE-booted Xen nodes that automatically connected to our management layer, allowing customers to provision virtual machines. Most of our own fleet was cattle well before this was meme-worthy.

Today, you could bootstrap a k8s cluster with almost no effort on tin. You'll quickly have autoscaling cattle and a distributed cron. Sure, you'd probably pet etcd and maybe the API servers. Running a database, API, and small management layer is well within the responsibilities of a professional system administrator. If this is beyond your org's/team's capabilities, you probably should use the cloud provider.

P.S. Not having a team that can run production services without outsourcing the database is fine. We all have different specialisms.

The storage layer is a bit more complex if you want to roll PVC.

You shouldn't bootstrap a $1m team to defeat a $500k cloud bill.


I have a clear statement in my profile on LinkedIn: I am a Unix/Linux sysadmin, not a "DevOps" or a "Cloud/AWS/Azure engineer". I only do traditional Unix/Linux sysadmin work.

There is not a day that goes by without a recruiter telling me "we are urgently looking for an experienced Linux sysadmin. Are you interested?"


> I am a Unix/Linux sysadmin, not a "DevOps" or a "Cloud/AWS/Azure engineer". I am only doing traditional Unix/Linux sysadmin stuff.

I will steal this.

As for the term "DevOps", I am never sure what people mean when they use it. You seem to be using in contrast to traditional linux sysadmin. What exactly does DevOps mean in your definition?


DevOps originally meant a method. It is not a position or a job. But that doesn't stop companies from saying they are looking for a "DevOps engineer" or similar. The definition is vague at best.

Some seem to think a "DevOps" is a developer who knows how to administer servers (or vice versa), as a modern term for a general IT person who can do anything, from programming to firewall administration to repairing the printer.

Another definition is more specific; DevOps in this case means working with CI/CD tools, programming "infrastructure as code" (Terraform, Ansible, etc.), and doing all things "agile". This job is mostly cloud-focused.


A proper engineer should learn some GNU/Linux server administration, if anything to understand all the artificial limitations that modern cloud services inflict on their customers.

Or the absurd prices for stuff that basically does not make sense.

Or for practices that would otherwise be absolutely unlawful but that people let be because cloud providers are just too big to fight.


I agree with the post. If you want to gently start with reclaiming Linux admin skills and move towards self hosting, you might be interested in my book https://deploymentfromscratch.com/. I focus on long-lasting skills rather than tools of the week.


My setup:

For business related services I use root servers hosted by e.g. Hetzner. I don't want to deal with hardware maintenance nor the 24/7 power bill.

For private stuff (pictures, videos, movies) I have a cheap old desktop machine at home with lots of storage running Ubuntu. Easy to administer, and I can switch it off if not needed. Data is mirrored and snapshotted.

For long-term backup I encrypt my data and upload it to Amazon Glacier Deep Archive (around $1/TB per month!).

That said the cloud in general is great and you can do some things today for cheap that weren't possible for most companies 10 years ago. For some use cases it's the best choice.

In general, a lot of workloads can be served orders of magnitude cheaper than 10 years ago.
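
As an illustration of the Deep Archive workflow, one way to do it is to encrypt locally and stream straight into S3 with the archival storage class; a sketch, with the bucket, path and GPG recipient as placeholders:

  tar -cz /srv/photos \
    | gpg --encrypt --recipient backup@example.com \
    | aws s3 cp - "s3://my-backups/photos-$(date +%F).tar.gz.gpg" \
        --storage-class DEEP_ARCHIVE --expected-size 500000000000   # rough size hint for the multipart upload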


How about security?

Any good resources/practices on making your server safe? And maybe not those kernel-level tricks.

Also, automated deployment:

so I can commit and it'll be deployed on the server.

I thought about using GitHub Actions, so when I push, the server receives an HTTP ping, clones the repo, and sets up the app.
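
One common low-tech pattern for that last idea: keep a small deploy script on the server and have the GitHub Action (or a tiny webhook listener) trigger it over SSH. A sketch; the paths, build step and unit name are placeholders:

  #!/bin/bash
  # /usr/local/bin/deploy-myapp -- idempotent redeploy of the app
  set -euo pipefail
  cd /srv/myapp
  git pull --ff-only
  ./build.sh
  sudo systemctl restart myapp.service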


I was a great UNIX/IRIX/Linux sysadmin. Fantastic. Until I got to a corporation with a structure so large that I wasn't even allowed near the VMware vSphere or the physical hardware itself (servers locked in the basement); I was only allowed to install RHEL from images onto other images and keep them patched with Satellite and Ansible. Disconnected from the front end (VMware) and the real, tangible back end (physical hardware), I found myself hating the work and I stopped doing it. I found it quite insulting to be "just the Linux guy" cog in the machine.


Then become the VMware guy (or whatever tech you see coming at you to take over your world). This industry changes quickly; expecting to be a skilled sysadmin on the same platform for more than a decade is a career killer.


Also, this is the real "multi-cloud" people tend to ignore. All the clouds can run all the popular Linux distros, so if your target is one of those, you can run your app anywhere without a lot of hassle.


wrt https://gist.github.com/pietrorea/9081e2810c20337c6ea85350a3... :

Don't use "here documents" or "here strings" for passwords. Even in bash versions as recent as 2020, they create a temporary file within `/tmp/` with the secret inside. If the timing is unlucky, it will get written to disk and therefore leave permanent traces even after reboot. Only shredding will securely delete the data.


"I picked nginx as my default webserver, although I hear Apache is also good."

In my opinion: when you have a choice, get to know all the options (within reason). I have Apache as my default, purely because nginx didn't exist for many years. When nginx turned up, I gave it a while to calm down and now I deploy it quite often. I deploy something like 75% Apache and 25% nginx.

I tend toward Apache from inertia, but I quite like the clean, easy setup for a simplish site with nginx - this is with Debian/Ubuntu-style defaults, which do not favour nginx.


I still do not understand how anyone can become a software engineer and not stop to learn how an operating system or network works. I would have gone crazy if I'd never learned how network protocols work, or how web servers work, or virtual memory, or schedulers, security models, etc.

It's like manufacturing tires without knowing how an engine works. Don't you want to know how torque and horsepower affect acceleration and velocity? How else will you know what forces will be applied to the tires and thus how to design for said forces?


Devs don't want to know. In the spirit of "DevOps" we needed to give devs full admin access to production servers. I have seen staff engineers do the most mind-boggling things on these servers that cause issues.


I don't think that the core of the article is about pros&cons of the managed/unmanaged/virtualized/dedicated server/service approach, but about "why it would be a good idea to have your own dedicated or virtualized server (at least for a while), which is to assimilate know-how" (which can then be used in more abstract setups).

The total flexibility of such a server (compared to un/managed services) is a (great) bonus (not only at the beginning).


Tragic and comic at the same time. In half a generation we will have lost all of these technological skills. There is a shortage... "The world is facing a shortage of nuclear specialists because of a lack of training programmes and students to replace those about to retire, according to Nobel Peace Prize holder and former director-general of the International Atomic Energy Agency, Dr Mohamed ElBaradei." Apparently the same thing is happening in aviation.


I would argue you don't even need to put that much effort into learning bash scripting; you can totally get away with knowing systemd, journalctl, nginx, apt, ssh and docker and how to run them through bash.

Everything else is per-software config files and running commands from the software's setup documentation.

Plus, I would run a server with a DE simply because I want to be able to look into databases with a GUI and edit config files with a nice text editor.
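
For what it's worth, the day-to-day loop with just that toolset really is a handful of commands, roughly:

  sudo apt update && sudo apt upgrade            # keep the host patched
  systemctl status nginx                         # is the service up?
  journalctl -u nginx --since "1 hour ago"       # what did it log?
  sudo nginx -t && sudo systemctl reload nginx   # validate config, then reload
  docker compose up -d && docker ps              # run and inspect the app containers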


> knowing systemd, journalctl, nginx, apt, ssh and docker and how to run them through bash.

Or, the way things are going, systemd, systemd[1], systemd[2], systemd[3], systemd[4] and systemd[5].

[1] https://www.freedesktop.org/software/systemd/man/journalctl.... [2] https://www.freedesktop.org/software/systemd/man/systemd-jou... [3] https://www.freedesktop.org/software/systemd/man/systemd-mac... [4] https://www.freedesktop.org/software/systemd/man/systemd-log... [5] https://www.freedesktop.org/software/systemd/man/systemd-nsp...


Remember that time where the server's command line felt like a mysterious adventure not unlike delving into a dungeon with just a torch and a short sword?


that is beautifully said


According to popular HN opinion, nobody runs servers anymore. Hipster coders leveraging AWS techno BS are never going to realize they are absolutely clueless and burning money by the ton. Same as in the gaming industry. Why bother starting from scratch. We want MVP right NOW, so just slap a bunch of random libraries together. People are going to buy faster GPUs anyway, right, and we'll fix issues later.


I know how to set up a basic VPS with firewalls, nginx, etc., but I'm scared to death to do that for a production environment available online.


It's harder if you didn't learn back in the 90s or early 2000s. Getting hit by a bad actor was considerably less likely so we were able to learn from our mistakes without a significant cost.

There's much less of a margin for error now.


I've been using Debian for 20 years and Arch for 5. Lately I've been trying to reconcile the traditional skills I have with more kubernetes oriented things. I've got a vague plan to write a helm plugin to talk to systemd and deploy everything with podman. After seeing the appetite for both approaches lately, I'm inclined to actually do this.


When reporting to one of our clients we do admin for, we found that it was about 3 times cheaper to self-host their services (webapps, sites, email, fileserver, backup) than it was to migrate to the cloud. Although they initially went down this road because of compliance requirements from some banking clients, they're pretty happy saving heaps on infra.


> As a practical goal, you should be able to recreate your host with a single Bash script.

I think this is ideal, but I've yet to be able to do this or see a solid example.

At my old job I had to do exactly this, and it was really hard to get things right.

I'm much more seasoned now, but I still don't think I could do it lol


I've done this with Nixos. Bash and most other tools are too brittle to get the system back to the same exact state.


There are plenty of us who have been around since the late 90's and early 2000's for whom this is all old hat, and it's all the newfangled stuff that's hard. Why should I use a *aaS for one part of my stack when I can just yum install it?


Just as I learned the basics of setting up and dealing with Linux servers, we're in the cloud where they're basically irrelevant.


Me: I've been using Linux for a total of 14 years.

Interviewer: That's nice, but how much AWS experience? :(


It is partly for this reason that I’m funding an invite-only community shell/pubnix system.


can I get in? I have fond memories of the old SDF shell accounts


it's true; now that we have managed services i'm honestly not sure how we spend our days. i should take an art history class.


why is this not required at interview for sysadmins?

it would up the status of the industry overnight if everyone was at this level...



