> Bare metal management really feels like an unsolved problem. Whilst everybody working with cloud environments is whisked away by the latest shiny tools like Docker and Ansible, those of us working with bare metal are still trying to find a way to keep machines up and running with an OS that doesn't get corrupted from unexpected poweroffs or permanently cut itself off from the network because of a bad config.
It is and isn't solved. It usually takes a lot of work or custom scripts. One of the best options is the Nerves Project, which is what I use for IoT deployments [0] and even simple cloud deployments.
Nerves is set up to run Elixir/Erlang, but it's really just a wrapper around Buildroot, and Elixir can start programs in any language you like with some work. One of the core authors wrote a tool called `fwup` for doing immutable updates on Linux [1]. The ability to do an A/B update and have the device automatically roll back if the update fails is crucial.
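To make the A/B idea concrete, here's a toy model in Python. This is only a sketch of the general technique, not how `fwup` or Nerves actually implement it: the `Device` class, the slot names, and the `healthy` callback are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Device:
    """Toy A/B device: two firmware slots, one active at a time (hypothetical model)."""
    active: str = "a"
    validated: bool = True
    slots: Dict[str, Optional[str]] = field(
        default_factory=lambda: {"a": "v1", "b": None}
    )

    def inactive(self) -> str:
        # The slot that is not currently booted.
        return "b" if self.active == "a" else "a"

    def apply_update(self, image: str) -> None:
        # Write the new image to the inactive slot only; the running
        # slot is never modified, so it stays available as a fallback.
        target = self.inactive()
        self.slots[target] = image
        self.active = target
        self.validated = False  # must prove itself on the next boot

    def boot(self, healthy: Callable[[str], bool]) -> str:
        # If the freshly written slot fails its health check,
        # switch back to the previous, known-good slot.
        if not self.validated:
            if not healthy(self.slots[self.active]):
                self.active = self.inactive()
            self.validated = True
        return self.slots[self.active]


device = Device()
device.apply_update("v2")
print(device.boot(lambda image: True))    # new image passes, stays on v2
device.apply_update("v3-broken")
print(device.boot(lambda image: False))   # new image fails, falls back to v2
```

The point is that an update only ever touches the slot you aren't running from, so a bad image costs you a reboot rather than a truck roll.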
A year or so ago they changed the default boot process so that networking and remote connections keep working even if the main application crashes. Surprisingly, it's all done using Erlang tooling, AFAICT. There are still rough edges, like limited IPv6 support. You can also still get devices failing from dead SD cards, even if your system boots from a read-only partition: a power outage during a write to any partition can effectively destroy the card. So skip the SD cards.
[0]: https://www.nerves-project.org/
[1]: https://github.com/fwup-home/fwup