Debian used to be a necessity. The informational complexity of an OS installation was tamed by the skill of the package maintainers. You would rely on them to make smart decisions, which would be codified in the debian/ directory alongside the original source, and in the dpkg sources themselves.
All that has fallen by the wayside nowadays. The information about my site installation is all in a set of ansible playbooks. They could blast their way through an installation of any modern Linux OS and the result would look like a horrendous mess to a sysadmin from the 90s, but who cares? It’s still as reproducible as the carefully tended garden that was a Debian installation of the old days, but with the advantage that whenever the install drifts into instability you can just nuke everything and rebuild from scratch.
In fact, it’s all so automated that if you aren’t regularly blasting everything away and reinstalling, you’re doing it wrong.
No longer do I have a gigabyte or more of backups representing my ossified Debian install as a fallback. Instead I have a few kB of YAML and ansible to build everything.
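For a sense of what "a few kB of YAML" means in practice, here is a minimal sketch of the kind of playbook I mean (the package list, file names, and handler are invented for illustration, not lifted from a real site config):

    # site.yml -- hypothetical sketch, not an actual site playbook
    - hosts: all
      become: true
      tasks:
        - name: install the handful of packages I actually care about
          apt:
            name: [git, tmux, lxc, prometheus-node-exporter]
            state: present
            update_cache: true
        - name: drop in my sshd config
          copy:
            src: files/sshd_config
            dest: /etc/ssh/sshd_config
          notify: restart ssh
      handlers:
        - name: restart ssh
          service:
            name: ssh
            state: restarted

A couple of files like that, plus an inventory, and the whole box can be rebuilt from bare metal with a single ansible-playbook run.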
Case in point. Debian 10 ships with LXC 3. If I want LXC 4 I just script the build and installation. The script is what needs backing up. The OS is disposable. In the past I might have taken care to ./configure to install in /opt but why bother? It’s more work (LD_LIBRARY_PATH needs setting), so just blast it into /usr/bin and don’t lose sleep over treading on Debian’s carefully pedicured toes. If it doesn’t work out, reimage and try something else. It’s wonderfully transient.
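To be concrete, the throwaway script I have in mind looks roughly like this; treat it as a sketch, since the exact tag and dependency list here are illustrative rather than a tested recipe:

    #!/bin/sh -e
    # Hypothetical sketch: build LXC 4 from source on Debian 10 and drop it
    # straight into /usr. The OS is disposable, so no /opt ceremony.
    apt-get update
    apt-get install -y build-essential git automake libtool pkg-config \
        libcap-dev libseccomp-dev libapparmor-dev
    git clone --branch lxc-4.0.0 https://github.com/lxc/lxc.git
    cd lxc
    ./autogen.sh
    ./configure --prefix=/usr --sysconfdir=/etc --localstatedir=/var
    make -j"$(nproc)"
    make install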
I miss the old days, but I embrace the controlled chaos, volatility, and dynamism of the future. I can do what I want instead of what, in the past, was carefully curated for me by expert Debian maintainers. A bazaar to their cathedral, sort of.
I think a lot of devs and organizations these days take the "cattle" idea too far, to the point they're like industrial factory farms where all the animals are sickly, and the pools of effluent occasionally break out of containment and kill everything in the local watershed. Yeah, cattle, sometimes servers die and you replace them. But these organizations get too accustomed to just re-launching a whole new autoscaling group because the old one went bad for some reason that no one really wants to figure out. I get the impression they don't realize how unusual it is for a quarter of the cattle to die in a week. Meanwhile, my 100 or so "pets" are managed using an appropriate level of automation and monitoring, and I get a random bad-hardware incident about every 6 months. Compare that to monthly whole-cluster outages from the full-best-practices all-cattle people.
Well at a low level we all know that AWS works akin to a Rube Goldberg machine, where a failure in some forcibly dogfooded service means even their own fucking status page can’t be updated, not to mention the dozens of seemingly unrelated services that suddenly don’t work.
HN is weirdly hostile towards modern distributed systems and DevOps ideologies. The number of people voicing incorrect opinions in K8s threads is just staggering for a technically inclined crowd.
I have great respect for the project, and IMHO you missed the headline feature: Debian’s Free Software Guidelines and the commitment to true freedom in main are an important contribution to the ethos of a universal operating system.
And you’re right about Debian Security Advisories. They are an important part of a stable base OS for anything internet facing. (The days of local users attacking each other via vulnerabilities seem less relevant in 2020.)
Debian adds negative value on security issues. The Debian ssh key vulnerability was among the worst vulnerabilities ever seen in general-purpose computing, and also a predictable result of Debian policies which remain in place to this day. It will happen again sooner or later.
The ssh key vulnerability was as much a fault of the upstream as it was Debian: the Debian maintainer who made the patch explicitly asked the OpenSSL mailing list if it was okay, and they indicated at least acquiescence in the change.
Given that upstream OpenSSL itself would have the Heartbleed bug revealed a decade later, with all the revelations of its source code quality made more notable as a result, I'm willing to place more blame on OpenSSL than Debian here.
> The ssh key vulnerability was as much a fault of the upstream as it was Debian: the Debian maintainer who made the patch explicitly asked the OpenSSL mailing list if it was okay, and they indicated at least acquiescence in the change.
IIRC they asked a list that was not the main project list and not intended for security-critical questions. Code that actually goes in to OpenSSL gets a lot more scrutiny from the OpenSSL side, and other big-name distributions either have dedicated security teams that review changes to security-critical packages, or don't modify upstream sources for them. Debian is both unusually aggressive in its patching (not just for OpenSSL; look at e.g. the cdrecord drama) and unusually lax in its security review.
> Given that upstream OpenSSL itself would have the Heartbleed bug revealed a decade later, with all the revelations of its source code quality made more notable as a result, I'm willing to place more blame on OpenSSL than Debian here.
Heartbleed was a much more "normal"/defensible bug IMO. Buffer overflows are a normal/expected part of using memory-unsafe languages; every major C/C++ project has had them. Not using random numbers to generate your cryptographic key is just hilariously awful.
> Debian ssh key vulnerability was among the worst vulnerabilities ever seen in general-purpose computing
interesting, I wasn't aware of the history of this vulnerability. For anyone else curious, here's an analysis of what happened:
https://research.swtch.com/openssl
Yeah, the biggest problem I have with Debian is that they tend to take popcon statistics as evidence.
No corporate installation ever adds popcon. Nobody sensitive to their privacy adds popcon. It's a small, self-selected group, and massively overrepresents desktops and laptops at the cost of servers.
That’s the main problem with telemetry, and the reverse is also true: software that has it enabled by default will mostly get data from noobs, because more technical users will disable it. (Looking at Firefox.)
If your systems are cattle [0], your case holds. If not, your method won’t work in the long term.
Also, preseeding can do much more than ansible/salt/puppet. It can produce 100% reproducible systems in under 5 minutes.
Case in point:
I manage a cluster and our deployments are managed by XCAT [1]. I just enter the IP addresses, the OS I want (CentOS or Debian), and some small details; that takes 5 minutes. Then, with three commands, I power up the machines and in 15 minutes ~200 servers are installed the way I want, with the OS I want, identically and with no risk of drifting into anything.
The magic? XCAT generates Kickstart/Preseed files and, instead of pushing a minion and blasting through the installation, it configures the machines properly. Some compilation and other setup is done with off-the-shelf scripts we have written over the years. It’s more stable and practical than trying to stabilize a salt recipe/pillar set or an ansible playbook.
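The "three commands" boil down to something like the following xCAT invocations (the node group and osimage names here are made up, not our real ones):

    nodeset compute osimage=debian10-x86_64-install-compute  # stage the generated preseed/kickstart
    rsetboot compute net                                      # network-boot on the next power cycle
    rpower compute boot                                       # power-cycle; the installs kick off

Everything after that is driven by the generated installer config, so there is nothing to babysit.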
I only re-install a server if its disks are shot or it somehow corrupts itself (which happens once a year per ~500 servers due to some hw glitch or something).
The new ways are cool, but they don't replace the old methods. You're free to choose the one you like most.
When your user data / logs / spools / db files are on a filer or confined to one local place that’s easily backed up and redeployed (I tend to use /data), there’s no reason to have ‘pets’ any more.
What’s the non-cattle use case you are referring to? Systems that can’t be wiped because of important persistent user processes? That’s the only volatile state I can think of that would be lost by reimaging.
In normal enterprise operations, that’s easily attainable. I presume your servers are not under extreme load and that they mount a remote storage system.
In our system, there are tens of storage servers under heavy load (while they are redundant, they are generally actively load-balancing) and more than 700 nodes which are cattle but under max load 24/7/365. The whole thing basically has no time or space to breathe.
While losing nodes doesn’t do any harm, we lose some processes and hence time, and we don’t want that.
Even if we can restore something in under 15 minutes, avoiding that lets us complete a lot more jobs. We don’t want to rewind and re-execute a multi-day, multi-node job just because something decided to go south randomly.
Our servers are rarely rebooted and most reboots mean we have a hardware problem.
Hmm. You can install a basic userland with a custom kernel, with no particular dependency on a distribution (Let's use openSUSE today!) in just a few KB of YAML?
The point is that the distro is a commodity. Yes, you need one somewhere, but it doesn't really matter which particular distro it is, which packaging format/repos they use...
So why did you just happen to pick one of the largest, with lots of manpower, large repos, lots of software support, etc., and not, for example, RebeccaBlackOS (yes, that exists, and no, it is not what you think)?
There has been quite a bit of standardization and improved compatibility, so you can choose relatively freely between the "big" distros. But that doesn’t make them a commodity; rather, it means all (five or so) of them are good.
People tend to go for the big names everyone has heard of because they're the big names everyone has heard of. A no-name distro is probably fine too though.
This seems to be leveraged on top of a distribution, and IMO is a good way to do things. But you’re definitely relying on the work that the distro people do.
Debian itself does support automated installation/deployment. You can run the ordinary debian-installer while "preseeding" answers to every question it would normally ask during the install. It's not clear what your scenario is adding there.
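For anyone who hasn’t seen one, a preseed file is just a list of debconf answers fed to debian-installer, along these lines (the values are generic examples, not a recommended config):

    d-i debian-installer/locale string en_US.UTF-8
    d-i keyboard-configuration/xkb-keymap select us
    d-i netcfg/choose_interface select auto
    d-i mirror/http/hostname string deb.debian.org
    d-i mirror/http/directory string /debian
    d-i partman-auto/method string lvm
    d-i partman-auto/choose_recipe select atomic
    d-i pkgsel/include string openssh-server
    d-i grub-installer/only_debian boolean true

You point the installer at the file with something like preseed/url= on the kernel command line and it answers every prompt unattended.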
FWIW, most stuff that you download from upstream sources will install under /usr/local/ by default, which is standards-compliant for your use case. (You don't generally need to use the /opt/ hierarchy.) Overriding that default and putting stuff in /usr/ is just breaking the OS for no apparent reason.
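In other words, for a typical autotools project the vanilla upstream dance already stays out of dpkg’s way:

    ./configure          # autotools defaults to --prefix=/usr/local
    make
    sudo make install    # binaries land in /usr/local/bin, not /usr/bin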
Preseeding is a beautifully built technology but it isn’t very relevant today.
It helps answer questions like partitioning and encryption that are hard to handle on a running OS, but really everything else can be done once the OS is installed and running. The development cycle of creating a preseed config that actually does everything you want is painfully slow compared to writing shell scripts / playbooks / cookbooks for a running system.
> Preseeding is a beautifully built technology but it isn’t very relevant today
I think preseeding is still relevant with the advent of container / immutable operating systems such as CoreOS, and perhaps Nix too. The technology has changed and overlaps with configuration management tooling, but it only handles a small part of a server’s lifecycle, leaving room for a proper CM tool.
When you’re at the end of the week and you know what you want to do, the two are equivalent.
When it’s Wednesday afternoon and you are still riffing on ideas for how to configure a new service, the preseed edit-test-edit cycle is on the order of minutes, versus seconds for a script run via ssh on a stable running system.
That makes a huge difference to productivity, for me.
I generally do service configuration at the post-install stage, and if I already have a working configuration I just pull it from our central storage. Or I write a small script and add it to XCAT’s post-boot steps to run the commands and configure that stuff.
I configure the service by hand on one server, polish it, capture the file (or the steps), and roll it out.
So the preseed file stays very stable; we only change a line or two over the years.
Thanks for the answer BTW.
Edit: It's late here. There's packet loss between my brain and fingers.