Bakeware – Compile Elixir applications into single executable binaries (github.com/spawnfest)
182 points by tempodox on Sept 17, 2020 | 50 comments



As a more or less elixir beginner, elixir/mix releases remain confusing to me.

First of all, it took me way longer than it should have to figure out how to set the entrypoint of my application.

More importantly the resulting releases are kinda confusing, at least to me. Why does it need to contain 17 (13 .exe + 4 .bat) executables? Why are there 55 c-header files in the directory of the erts? Why are there so many configuration files?

I'm sure that they are there for a good reason, but the current solution is daunting, such that it lowers my enjoyment of the otherwise very elegant language.

IMO there should be a release approach that is very simple + compact, and extendable. This would allow people like me to hit the compile button and distribute my application in very manageable parts. Through further configuration it should be possible to end up with the very adjustable release that currently exists.

This project seems to go in the direction I would like. I hope the Windows release comes soonish so that I can try it.


The thing to remember about Elixir/Erlang is that it gives you a lot of deployment options. There's this mix of concerns with deployment tooling to both support all of those options and be simple at the same time, which is tough to achieve.

Just as an example, with many Java App Servers you deploy your packaged java code and it automatically handles seamless deployment across the cluster with zero downtime. And there are lots of different server options.

With languages using K8s and Docker, you're deploying a configured container across a system that will help manage the same thing.

Distillery has been the standard for deployments with Elixir for a while now and supports everything. Releases are slowly rolling this functionality into the core, but don't currently support all of the options that you get with Distillery.

But with the BEAM you've got clustering built in. You've got hot reloading options which literally deploy your new code in the middle of running code without so much as dropping a socket connection. You get the ability to roll that back too.

It's a little more complicated because it comes with a lot of very unique options that are built in instead of delegated to another system.


My experience is that preparing a release is much more straightforward since Elixir 1.9, which added release support to the Elixir standard distribution. That said I tend to treat a release as a black box, I haven't tried to understand what all those files are and I just interact with bin/myapp.

However I still have two complaints.

First, the system on which the release is built and the system where it's going to run need to be similar (same version of the same Linux distribution). In practice this means that these days we build releases in containers matching the target host.

Second, releases can't run mix commands so we have to write boilerplate code specifically for releases for things like database migrations, etc. It would be nice if writing app management commands could be a bit more DRY.

I could be wrong but at first glance it doesn't seem that Bakeware addresses these issues.


You can actually get away with quite a lot in terms of mismatching platforms. We target an embedded system (Xilinx Zynq). We cross compile the Erlang+Elixir system into the Linux platform build. Then for the application code we just build the release using no_erts. We're building that application on X86, just making sure that the Erlang/OTP/Elixir versions match and it works.

We have some ports that are separately cross-compiled and provided on the target.
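
For anyone on plain `mix release` rather than Distillery, a rough sketch of the same idea is below. It assumes the `include_erts` release option is the equivalent of the "no_erts" setting mentioned above; the app and release names are made up for illustration.

```elixir
# mix.exs (sketch; :myapp and the release name are hypothetical)
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [
      app: :myapp,
      version: "0.1.0",
      elixir: "~> 1.9",
      deps: [],
      releases: [
        myapp: [
          # Do not bundle the build machine's ERTS; the target is expected
          # to provide its own (cross-compiled) Erlang runtime.
          include_erts: false
        ]
      ]
    ]
  end
end
```

With ERTS left out, the Erlang/OTP version on the target still has to match the one used at build time, as described above.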


+1 on the issue of Mix commands. We also put in boilerplate code for database migrations, but eventually I decided this was stupid. Now I keep the source on the servers and just run Mix commands from there.


For Ecto, you can just run a function:

  # Locate the migrations shipped in the application's priv directory
  migrations_path = Application.app_dir(:myapp, "priv/repo/migrations")
  Ecto.Migrator.run(MyApp.Repo, migrations_path, :up, all: true)
Since migrations are idempotent, and I always want to run up to the latest migration, I just include this in lib/myapp.ex in its start function. That way any time a new release rolls out, the first node with the new release automatically runs the migrations. (We always make sure our migrations are backwards compatible for at least X number of versions, which means never deleting tables or columns so that existing nodes don't break)


But then you need to install Erlang/Elixir on the server, which kind of defeats the purpose of releases, doesn't it? Or have you abandoned releases altogether?


I actually find it really good that mix commands are not included. Those are Makefile commands, I wouldn't expect them on production.

It's funny that the case highlighted is the one about migrations, which indeed shouldn't be located or connected to the app in the first place. There are a lot of SQL migration tools available, but even simple bash scripts are enough.


Releases should be able to use Mix commands. You might need to adjust your mix deps to include Mix/Ecto-Migrator libraries for your build. Otherwise you can do what people use for sqlite3 migrations in Nerves [1]. Just call it via the library functions instead of via the command line like in `migrate_repo!/1`.

1: https://embedded-elixir.com/post/2017-09-22-using-ecto-and-s...
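
A minimal sketch of what "call it via the library functions" can look like in a release, assuming an app named `:myapp` whose repos are listed under the `:ecto_repos` config key and a recent `ecto_sql` that provides `Ecto.Migrator.with_repo/2`. The module name `MyApp.Release` is only a convention, not something the release tooling requires.

```elixir
defmodule MyApp.Release do
  # Hypothetical app name; adjust to the real OTP application.
  @app :myapp

  # Runs all pending migrations for every configured repo.
  def migrate do
    # Load the app's config without starting its supervision tree.
    Application.load(@app)

    for repo <- Application.fetch_env!(@app, :ecto_repos) do
      # with_repo/2 starts the repo temporarily and stops it afterwards.
      {:ok, _result, _apps} =
        Ecto.Migrator.with_repo(repo, &Ecto.Migrator.run(&1, :up, all: true))
    end
  end
end
```

On the server this would be run without Mix through the release script, something like `bin/myapp eval "MyApp.Release.migrate()"`.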


> Why are there 55 c-header files in the directory of the erts?

That shouldn't be the case anymore. It was a bug in rebar3 at one point and I think `mix release` may have copied said bug. If 1.11 still includes those header files then an issue should be opened.


First, answers to your questions, and then a general addressing of your idea:

1. Having an “entrypoint to your application” isn’t the right way to think about an Erlang release.

An Erlang release is like a virtual appliance: an OS (ERTS) with a set of “service” packages installed in it (your apps and libraries.)

And, just like when assembling a VM instance using Terraform or the like, you can set up/install multiple root-level applications/services within that VM. Like, say, a LAMP stack. Different root-level services, none installed as a dependency of the other; all just running as siblings and configured to talk to one-another; all in one VM image. The VM image, so-composed, would be a “release” of a single networked system component. Not a single service, but a single set of services, deployed and upgraded atomically, along with the virtual-machine OS substrate they run on. A “node”, in Erlang parlance.

So, then, how do you set up an entrypoint for such an OS service? On Linux, you’d use systemd service-units. Each systemd service has its own entrypoint, configured by the unit file. Equivalently, each Erlang app has its own entrypoint, configured by the .app manifest. (Which is, in Elixir, generated from the Mix project file, which is why the application `mod` directive ends up there in Elixir.)
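
As a rough illustration of that `mod` directive (all names here are hypothetical), the Mix project file declares the callback module and Mix writes it into the generated .app manifest:

```elixir
# mix.exs
defmodule MyApp.MixProject do
  use Mix.Project

  def project do
    [app: :myapp, version: "0.1.0", elixir: "~> 1.9", deps: []]
  end

  # Compiled into the generated myapp.app manifest.
  def application do
    [
      extra_applications: [:logger],
      # The per-app "entrypoint": the application controller calls
      # MyApp.Application.start/2 when this app is started.
      mod: {MyApp.Application, []}
    ]
  end
end

# lib/myapp/application.ex
defmodule MyApp.Application do
  use Application

  @impl true
  def start(_type, _args) do
    # An empty supervision tree is enough to show the shape of the callback.
    Supervisor.start_link([], strategy: :one_for_one, name: MyApp.Supervisor)
  end
end
```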

2. Those config files are the config files for the “OS” (ERTS), not for your app. They’re things that — in a different “multitenant” abstract-machine runtime, e.g. Smalltalk — would be hiding within the “image” that the emulator works with. Things that in a regular Linux VM, would be in /etc of the VM’s virtual block-device image.

Why are they there? Because Erlang is not designed under the expectation of heavy dev-ops collaboration. Releases may very well be created by an “upstream” of devs, and then thrown over a very tall wall to a hapless operations staff. The operations staff then has to deal with deploying this thing, where the only things they can tweak to get a release to work on a particular system, are those very config files. If they were inside the image, they’d have to ask the devs to burn them a new release with the fixes in place. As it stands, they can just tweak the deploy themselves.

3. And that’s also why there’s so many executables: a good few of them are different (static-compiled) emulators for the different operational deployment scenarios that won’t be known at build time, e.g. single-core vs. multi-core, where all this detail is abstracted away by detection-steps run just before emulator boot-time by those batch files. (Make no mistake: for any “runtime” package you might install — e.g. the JVM, the CLR, etc. — you get a similar menagerie of executables, just hidden somewhere out-of-sight.)

Oh, and some of them (e.g. EPMD) are just ERTS “daemons” that run as sub-processes of the emulator, rather than “in” the emulator. Since there are several variant emulators shipped in the release, burning this code into each of them and running it with fork(2), would result in more bulk to the release than just factoring it into a separate executable would. (And besides, Windows doesn’t fork(2).)

And also also, a more Erlang-y reason: isolating this code into separate processes, means you can validate it solely in terms of its failure-state IPC behaviour, rather than needing to take into account its failure-state emulator memory-state behaviour. It’s the same reason Erlang encourages the use of port processes over NIFs. It’s the same reason microkernels exist. Isolating failure, so things can crash hard, without the important things crashing.

4. The C header files are something you’ll see with any runtime that both ‘vendors’ the emulator itself; and ships a compiler accessible at runtime; and where that compiler supports FFI/building runtime extensions. In Erlang, you can run relups against a deployed release, that will install new Erlang applications into that release. If those Erlang apps contain native C code that needs to be compiled, the header files need to come from somewhere.

If they came from the host, they’d not be guaranteed to be compatible with the destination. Even if you wanted to set up some sort of cross-compilation toolchain matching the target, “the target” is a moving target, because relups can boot into a new version of the emulator; and because ops staff might independently upgrade/downgrade between relups (think “rolling upgrade failure”), meaning that any one of the set of so-far deployed copies of ERTS/BEAM might be the one running on any given node.

An Erlang node is a stateful, living system. Imagine it like a Windows virtual appliance that’s created as a series of Windows Deployment update files by a dev team; but where any given installation’s ops staff may-or-may-not choose to apply any given update pack. On such an appliance, the OS version isn’t really under the dev team’s control. Despite having atomic upgrades using atomic whole-release patches, it’s still not “immutable infrastructure” in the sense of e.g. a Docker image, where the whole image gets swapped out. And, as such, any given instance of the appliance can’t really be predicted in advance by the dev team, to have a particular OS version running on it. Rather, if the dev team wants generality, they have to build updates for multiple possible “base” versions of the OS; and then the update install system needs to interrogate/verify/select a matching update for the OS version that turns out to be running. And if they want specificity (e.g. to deliver a hot-fix update to a specific client), then they need to find out right before building the update, what OS version their appliance is currently running.

You can’t make the fully-general update-distribution problem any easier; but you can partially automate the hotfix-build-discovery problem. Just set up the virtual appliance so any running prod instance can be interrogated by your dev toolchain, whereupon it will deliver to your toolchain a tiny little cross-compilation toolchain (i.e. C header files et al) precisely matching the running instance.

Which is... precisely what ERTS does. Relups are weird.

—————

Even what I said before (an Erlang release being a VM) is a bad abstraction — an Erlang release is an atomic patch of a VM, that the VM itself can then switch to. Like a base-image in CoreOS... but where the VM can switch to it without needing to reboot. That has a lot of complications.

Some languages (e.g. Go, Rust) are “closed-world”: they assume that, within some boundary (in Rust, a “crate”), everything will become fixed at compile-time, with nothing further able to change or intercede at runtime. These language compilers can thus execute Whole Program Optimizations.

Other languages (e.g. Java) are “open-world”: they assume that code can be loaded at runtime, right into the middle of any boundary you might draw; and, therefore, optimizations can only occur at the level of the code-unit (e.g. module, class), guaranteeing that all replacements will at least happen atomically at the level of the code-unit.

And then there’s Erlang, which takes “open-world” to a whole different level.

What you’re basically imagining here, is a version of ERTS that takes a “closed-world” assumption. No relups, no runtime module loading, maybe even burning the whole system into a single BEAM file with WPO. This would disable much of what makes Erlang, Erlang — but it would be possible. It’s just not possible to build this on top of the current OTP version of ERTS, since the open-world/closed-world assumption of a runtime is baked into basically every implementation decision of a VM and runtime at a deep level. You’d need to write your own (much less complex!) VM and runtime.


Thank you for the very detailed response, I'll definitely read it more thoroughly once I have more time.

Yes, what I imagine would probably be a 'closed-world' that goes against most of what makes Erlang special.

It's not that I don't appreciate ERTS. Like you said, ERTS is like an alien technology. I'm just some tinkerer that wants to harvest a small portion of its power.


4. The compiler should not be included and neither should the headers. Unless a user explicitly wants to include the compiler and also have a C compiler on the node in order to compile stuff on the target node there is no reason to have the header files there and relups work fine without them.

If Elixir 1.11 'mix release' still includes all of the `erts` directory then an issue should be opened.


Yeah, it doesn’t include it all the time. It’s an option, though, and one that the OP might have accidentally been using. (I think in relx there’s an explicit `include_src` flag you pass to enable it.) I was just justifying the presence of the option. It’s not always, or even usually necessary.


Not a big deal but, it was actually a bug in relx that caused it to be included (accidentally copying the erts dir from the symlink/copy in the _build/<profile>/rel/<relname>/ dir instead of from the unpacked tarball generated by systools when doing the final assembly), not `include_src`, that option only applies to the Erlang applications and not the runtime.


Thank you for your detailed message. I haven't understood everything you wrote but it seems that basically all that complexity comes from supporting hot code reloading.

Given that a lot of (most?) Elixir devs don't use hot reloading, having a less powerful but simpler deployment alternative seems like a good idea.


Re: hot-code reloading:

Erlang is in the weird situation of its main code contributor, Ericsson, having a use-case for it that very few other people/organizations have: “immortal” nodes running code approximately forever, on telecom control-plane boxen.

That means hot-code reloading (or at least doing something to let the node receive upgrades while keeping live sockets open) is a high priority for them, and so the OTP (Open Telecom Platform) Erlang RunTime System is always going to be designed around that goal.

We third-party Erlang/Elixir developers, are all essentially using a recovered alien artifact. It was built for people very different from us, for purposes foreign to us. As it happens, though, it’s very advanced alien technology, such that it’s often able to solve our problems very well, despite not being designed to solve those problems. Our use-cases are very often strict subsets of those of our Swedish telecom overlords.

And, given that that’s the case, there’s not been much motivation to build a separate ERTS to serve us mere mortals deploying merely-mortal Erlang nodes. We don’t really need different ERTS, just less ERTS.

I could see someone maintaining a patcher program that takes the OTP repo, and strips out all the stuff us mortals don’t need. But that wouldn’t really be an alternative ERTS. It’d be the “XPLite”-ization of OTP.

And really, if you try to think of who would lead an effort to develop an alternative “closed-world”-design ERTS, you come to the realization that ERTS and actors and hot code-reloading all sort of occupy the same area of use-case space, such that AFAIK there’s no real use-case that absolutely needs most of what Erlang has, but absolutely doesn’t need hot upgrades. If you try to think of a scenario where there’d be absolutely no use for hot upgrades, that’s also probably a situation where Erlang/Elixir, or the actor model itself, would likely be a bad fit. Unikernels (Erlang-on-Xen)? Nah, hot upgrades make sense there. Long-running embedded microcontrollers (Nerves)? Nope, hot upgrades make sense there too.

Really, if you want “Elixir but closed-world”, the place I’d expect to get it is if someone wanted to reimplement Elixir-the-language on top of an entirely different runtime, e.g. the JVM. Elixir-on-JVM would imply Elixir-on-GraalVM, which is just such a closed-world runtime.


It has many files because it brings in Erlang and its executables. Then Elixir and its executables. And then it adds some scripts to invoke those. It may make more sense if you think of releases more as a bundle that includes all the tools that you have and may need in prod.

Theoretically speaking you shouldn’t need to worry about any scripts except the ones you have in bin/.


Looks great!

Did not look at it in detail, but I'm wondering about portability, eg how this works with system libraries, like OpenSSL. Is the resulting binary portable to systems with different C libraries than the ones the OTP system was linked against originally?


I love this idea.

I've been using Elixir for years, on and off. The syntax is approachable, the actor model is solid and easy to reason about with many threads running. The Phoenix ecosystem is fantastic to work with, particularly now with LiveView making quick web UIs so effortless to create.

My main practical problem with Elixir over the years has been handing the tools I've created to others who might find them useful. Bakeware looks like the right way to proceed.


Has anyone used elixir for building command line tools? It seems like this would be very useful for distributing them.


The readme file in the repo mentions a ~500ms start up time. Maybe it's just me, but I wouldn't want to wait half a second for a tool to start running. For context most popular Unix tools start up in 1-2ms.


On my PC:

  time node -e "console.log('a')" # 0.377 total
  time ruby -e "p 'aaa'" # 0.084 total
  time python3 -c "print('a')" # 0.020 total
  time elixir -e "IO.puts 'a'" # 0.117 total
  time _build/prod/rel/bakeware/simple_script Hi # 0.205 total without zstd, 0.212 with
Another question is which language loads modules the fastest (afaik Ruby is kind of slow when it comes to `require`).


> Another question is which language loads modules the fastest (afaik Ruby is kind of slow when it comes to `require`).

I think both Python and Ruby loosely follow a similar model. The difference is that IIRC Python caches bytecode files, while Ruby compiles them from scratch every time. Not sure how big of a performance impact that has, especially for simple cases like this.

And just to toot my own horn, using Inko (https://inko-lang.org/), I get:

    inko /tmp/test.inko # 0.470 total
Note that this invokes the horribly slow Ruby compiler currently in use. If you invoke the VM directly on the compiled bytecode, it only takes about 10 milliseconds to run. Using the `master` branch:

    vm/target/release/ivm /home/yorickpeterse/.cache/inko/bytecode/release/43/38021ddc9a3449ace13288a2fac894d1d3e2aaa.ibi # 0.008.8 total
In this case all modules (including the standard library) are included in the bytecode file, meaning no disk IO is necessary, and modules can be parsed in parallel.

This is something dynamic languages like Ruby and Python can't do, as loading modules is something that can be done anywhere at any time, so you have to process them one by one.


Perl's startup time is pretty good.


I wouldn't use Elixir for short lived cli commands because of startup time. For that I'd probably turn to Go or Rust.

However for distributing long running server apps--where Elixir shines--I'm happy with this start-up time.


I have, and I blogged about it here:

https://zorbash.com/post/building-command-line-applications-...

I'll give bakeware a go and probably update the post with the results.


I've used it to write CLI tools for myself and found it to be great for medium to large tools. For small tools it's still much faster and easier to grab Ruby (assuming a quick bash script isn't appropriate).

But yeah distribution is mostly a pain. I use the escript method but it's far from perfect. I'm excited about Bakeware and plan to try it.


I've always felt like the ideal way to use Elixir for command line tooling would be to have some way to let a pipe feed into a new BEAM process from an already running application.


In .NET we have a very similar tool for creating such packed binaries, dotnet-warp. I used it in one of my projects and quite liked it, since it's also quite easy to cross compile (cross-pack?) for the 3 major operating systems.

I like the general idea: you're independent of the system-wide framework version and it still has this "one-click" install procedure (dropping the binary in your path). However, I guess this is also the negative part. Users don't expect that a single binary extracts itself to somewhere, so uninstalling the binary leaves traces on the system.

Definitely looking forward to trying it out for Elixir, wondering how fast the Erlang/Elixir startup really is.


Single-file, platform-specific executables, framework independence and dependency trimming are all available in .NET Core 3.0:

https://www.hanselman.com/blog/MakingATinyNETCore30EntirelyS...

https://docs.microsoft.com/en-us/dotnet/core/whats-new/dotne...


Complete n00b question -- you can do that?

The last time I touched anything .NET was about a decade ago. My somewhat old-school superiors were unimpressed with the ability to come up with a plain .exe at the end of the process. My lack of familiarity with how Visual Studio had evolved certainly played into it; the IDE made me feel like a chimp dropped into an airliner cockpit. I had the worst time trying to figure out how to turn off the "of course you have an enterprise server dedicated to delivering upgrades!" setting. None of it seemed to, uh, scale down for our piddly purposes.


.NET Core is a ground-up rewrite of the frameworks and tools. It's completely decoupled from Visual Studio. This would get you started on a toy project (one source file, one project file) now:

    dotnet new console --output sample1
    dotnet publish --self-contained -r linux-x64 sample1
https://docs.microsoft.com/en-us/dotnet/core/get-started


Is there anything like this in Java?

I am aware of launch4j, but curious if there is a single step approach for all 3 platforms.


Yes, since Java 9: it is called "jlink" and it's shipped with the OpenJDK. It assumes you have modularized jars.

An alternative is GraalVM, whose native-image tool can compile the code ahead of time into a native executable or a shared library. If you work with web services (especially Java EE), you might also want to have a look at Quarkus, which also uses GraalVM to create single binary executables.


jlink/jpackage just isn't the same as the other things. It doesn't produce a single executable like .NET Core or Bakeware, which embed the VM into the binary. In .NET's case it self-extracts on first run.

With jlink/jpackage you either ship an installer or a zip file.


From the page: ~0.5s startup times or better on our computers


The lack of single binary output will be fixed in dotnet 5.


After using Go, I always felt the need for something like this in the Elixir world.


makeself (https://makeself.io/) is a generic way to do this for Linux that works for any Erlang/Elixir release -- meaning no use of mix or rebar3: https://gist.github.com/tsloughter/d62aad6d67b263e69275376b1...

Maybe something as simple is possible for Windows as well?


Wish something like this existed for Java


I believe GraalVM[0] makes your wish come true. It allows you to compile your Java code into a single binary and offers other features which make Java a feasible solution for CLI tools.

[0] https://www.graalvm.org/


There is: you can use jlink/jpackage or GraalVM compiler to produce compiled executables


does it support hot code deployment?


Can vanilla Elixir releases, for that matter? Not from what I've seen - would love to know if I'm wrong.


I believe so.

I think this is what he was referring to: https://blog.appsignal.com/2018/10/16/elixir-alchemy-hot-cod...


Vanilla elixir releases do not support hot code reloading. Distillery releases do, however.


you can't have your cake and eat it too


Well in Common Lisp we can. We can build a self-contained binary, run it, start a Swank server, connect to it from home and code against the running program. We can even install new libraries without stopping anything.

Not saying it's great for software updates, but it definitely is possible, and helps (like, update users' config etc).


That's not hot code reloading in the Erlang sense. Hot code reloading is having a NamedModule be replaced by NamedModule on the fly and have all of the processes executing functions in that module's (global) namespace respect the new binary code in accordance with the well-defined hot code reloading API that Erlang has established, followed by kicking out the old namespace when its dependent processes have drained out of it.

This is not trivial considering that different processes may be holding onto that code for different lifetimes, and all must be managed sanely.
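
To make the module-replacement part concrete, here is a small script-level sketch. It is not a release upgrade (no relups or appups involved); it only shows a running process picking up a newly loaded version of a module through fully qualified calls. `HotSwapDemo` and `Greeter` are made-up names, and recompiling from a string stands in for loading a new .beam file.

```elixir
# Silence the "redefining module" warning for this demo.
Code.compiler_options(ignore_module_conflict: true)

defmodule HotSwapDemo do
  # Source for a hypothetical Greeter module that reports version `vsn`.
  def source(vsn) do
    """
    defmodule Greeter do
      def loop do
        receive do
          {:version, from} ->
            # Fully qualified calls dispatch to the newest loaded code.
            send(from, {:version, Greeter.version()})
            Greeter.loop()
        end
      end

      def version, do: #{vsn}
    end
    """
  end

  # Asks the looping process which version it currently sees.
  def ask(pid) do
    send(pid, {:version, self()})

    receive do
      {:version, v} -> v
    after
      1_000 -> :timeout
    end
  end
end

Code.compile_string(HotSwapDemo.source(1))
pid = spawn(Greeter, :loop, [])
before_reload = HotSwapDemo.ask(pid)

# Loading version 2 turns version 1 into "old" code; the process keeps running.
Code.compile_string(HotSwapDemo.source(2))
after_reload = HotSwapDemo.ask(pid)

IO.inspect({before_reload, after_reload, Process.alive?(pid)})
```

The VM keeps at most two versions of a module (current and old). A process still executing the old version when yet another version is loaded gets killed, which is the lifetime management problem described above.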



