robinhoodexe's comments | Hacker News

Is tuning the TCP buffer size, for instance, worth it?


In my experience running big servers, tuning TCP buffers is definitely worth it, because different kinds of servers have different needs. It doesn't often work miracles, but tuning buffers is low cost, so the potential for a small positive impact is often worth the time to try.

If your servers communicate at high datarates with a handful of other servers, some of which are far away, but all of which have large potential throughput, you want big buffers. Big buffers allow you to have a large amount of data in flight to remote systems, which lets you maintain throughput regardless of where your servers are. You'd know to look at making buffers bigger if your throughput to far away servers is poor.

If you're providing large numbers of large downloads to public clients that are worldwide from servers in the US only, you probably want smaller buffers. Larger buffers would help with throughput to far away clients, but slow, far away clients will use a lot of buffer space and limit your concurrency. Clients that disappear mid download will tie up buffers until the connection is torn down and it's nice if that's less memory for each instance. You'd know to look at making buffers smaller if you're using more memory than you think is appropriate for network buffers... a prereq is monitoring memory use by type.

If you're serving dynamic web pages, you want your TCP buffers to be at least as big as your largest page, so that your dynamic generation never has to block for a slow client. You'd know to look at this if you see a lot of servers blocked on sending to clients, and/or if you see divergent server-measured response times for things that should be consistent. This is one case where getting buffer sizes right can enable miracles; Apache pre-fork+mod_php can scale amazingly well or amazingly poorly. It scales well when you can use an accept filter so Apache doesn't get a socket until the request is ready to be read, and PHP/Apache can send the whole response to the TCP buffer without waiting, then close the socket, letting the kernel deal with it from there. Keep-alive and TLS make this a bit harder, but the basic idea of having enough room to buffer the whole page still fits.
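
For reference, the usual knobs on Linux are the socket buffer sysctls - the values below are purely illustrative, not recommendations:

  # maximum socket buffer sizes the kernel will allow (bytes)
  sysctl -w net.core.rmem_max=67108864
  sysctl -w net.core.wmem_max=67108864
  # TCP autotuning range: min / default / max (bytes)
  sysctl -w net.ipv4.tcp_rmem="4096 131072 67108864"
  sysctl -w net.ipv4.tcp_wmem="4096 16384 67108864"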


It depends. At home - probably not. On a fleet of 2000 machines where you want to keep network utilisation close to 100% with maximal throughput, and suboptimal settings translate to a non-trivial cost in $ - yes.


TCP parameters are a classic example of where an autotuner might bite you in the ass...

Imagine your tuner keeps making the congestion control more aggressive, filling network links up to 99.99% to get more data through...

But then any other users of the network see super high latency and packet loss and fail because the tuner isn't aware of anything it isn't specifically measuring - and it's just been told to make this one application run as fast as possible.


It literally covers this exact scenario in the readme and explains how it prevents that.


Worked with Weka to maximize NFSv4/Mellanox throughput, and tuning was absolutely required to hit our targets.


It depends mostly on the bandwidth-delay product and the packet loss you expect on each connection. There is a vast difference between a local interactive SSH session and downloading a large VM image from across an ocean.
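
A rough worked example (numbers purely illustrative):

  BDP = bandwidth x round-trip time
  1 Gbit/s x 100 ms RTT = 10^9 bit/s x 0.1 s = 10^8 bits ≈ 12.5 MB in flight
  1 Gbit/s x   1 ms RTT ≈ 125 KB in flight

So the transoceanic bulk transfer needs roughly a hundred times more buffer than the local session to keep the pipe full.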


There's not much cost to doing it, so yes.


If you're on 10 or 100 gig, it's almost required to get close to line speed performance.


A similar tool is detect-secrets[1].

[1] https://github.com/Yelp/detect-secrets
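
Typical usage, per its README (double-check the flags against the version you install):

  # record the secrets currently in the repo as a baseline
  detect-secrets scan > .secrets.baseline
  # interactively review/triage what was found
  detect-secrets audit .secrets.baseline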


Also similar: Pillager (or Gitleaks) is worth having on the sanity checklist.

https://github.com/brittonhayes/pillager

https://terminaltrove.com/pillager/ <-- TerminalTrove is worth regularly checking.

    powerful rules functionality to recursively search directories for sensitive information in files.

    At its core, Pillager is designed to assist you in determining if a system is affected by common sources of credential leakage as documented by the MITRE ATT&CK framework.

Good for catching those "Oops, I deployed the company password list again" SNAFUs.


Nice, I like some of the concepts.


We have been using SeaweedFS for about 3 months and so far it's pretty solid. The integration with Kubernetes is nice, and the maintainer is very active on Slack and on GitHub issues.


Cool! IIRC authentication was a bit tricky with SeaweedFS. Did you run into any issues there?


Sounds like nix with devenv[1] would also solve this problem.

[1] https://devenv.sh/


Having tried asdf - among other tools - for a while, dropping it for nix+flakes+direnv was great.

Devenv seems nice (in fact it’s how I started down this path) but I haven’t found anything it does for me that I can’t get out of flakes - so far.


I'm considering doing a pilot (~5 devs out of 120) using nix to manage dependencies and build containers at $DAYJOB, and here I think devenv is nice as a "one package" solution, plus it has an active community for support.


95% of the difficulty I have at $DAYJOB is nix installation and dealing with enterprise certificate/auth crud…sadly devenv doesn’t help much there.

Our pilot is quite a bit larger. Sticking to plainer flakes has made it easier for folks to self-service for now but we do intend to re-evaluate devenv.

Same username on twitter if you’re interested in chatting.


Yeah… but then you get nix’s problems.

- steep steep learning curve, so your team is split between those who can understand it and those who have to blindly follow checklists and ask for help when something breaks

- it doesn’t play well on macOS


How doesn't it play well on macOS? I've been using Nix Home Manager + Nix Darwin as my package manager, and direnv + Nix Shell for developer environments, and haven't had any problems (yet). Is there something I should be aware of?

Agree about the learning curve, but I am going to be onboarding my coworkers onto Nix just for developer environments over the next few months; I feel the curve is not quite as steep for that limited use case.


There are some really annoying edge cases I've found once Xcode gets involved, and graphical apps are a bit hit and miss (Wireshark failed pretty completely for me a few weeks back). I'd still call it a major improvement over the alternatives.

Re: onboarding - I’m doing the same thing at a somewhat larger scale. Same username on twitter if you want to start a support group. :)


I'll take you up on that, will follow you on twitter (@mg0rn)


Every time macOS has updated, I've had to reinstall Nix, which I guess is in part a consequence of Nix not supporting single-user installs on macOS.


I have not had that problem. The only thing I have to do when I update Nix after a macOS update is to move /etc/shells to something like /etc/shells.old.


What exactly are the problems with using it on macOS? So far (on my, admittedly, short nix journey), I've not encountered any issues that weren't fixable with 5 minutes of googling (even as a beginner).


In my experience, nix-darwin and home manager are a little awkward to install together (you’ll want both) and central management is…tricky. All of these being large company problems, mind - I can’t rely on the median user googling to fix issues, they expect (reasonably) an actively supported platform.

I’ve also had issues with GUIs and Xcode as noted in other comments but I don’t mind that - those are much more of a solved problem than, say, keeping seven different JDKs around.


That’s every solution, honestly - you’re just choosing what % of users will need help, and frankly asdf has a lot of edges. Nix’s self-contained declarative stuff is a pain to learn, certainly, when used to brew install $whatever - but it’s far easier to support.

Also it plays really nicely on macOS unless you’re trying to share nix config across macOS and Linux which…just fork and move on, it’s not worth it. :)


Nix for developer environments and building containers.

I'm wondering if it's worth it to introduce to the rest of the company. We're pretty comfortable building/"maintaining" ~400 container images, and it's relatively fast (~3-5 min build time if no packages are changed), but there are a lot of shared dependencies between all these container images. Using nix to get actually reproducible AND minimal container images, with a shared cache of Linux and language-specific packages (dotnet, node, python and R), would bring in a ton of efficiency, as well as a very consistent development environment. But I won't force all the developers to learn nix, so the complexity should optimally be absorbed into some higher level of abstraction, like an internal CLI tool.

I'm aware that the caching of dependencies can be improved, as well as creating more minimal container images, but it's tricky with R and Python in particular, and then I figured why not just go balls-deep on nix, which actually solves these issues, albeit at a cost of complexity.
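
For what it's worth, a minimal sketch of the dockerTools route (the name and package choices here are just placeholders, not our actual setup):

  # default.nix - build with `nix-build`, then `docker load < result`
  { pkgs ? import <nixpkgs> { } }:
  pkgs.dockerTools.buildLayeredImage {
    name = "example/python-service";
    tag = "latest";
    contents = [ pkgs.python311 pkgs.cacert ];
    config.Cmd = [ "${pkgs.python311}/bin/python3" "-m" "http.server" ];
  }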


Can you recommend any resources on nix?


Hey! I love Nix, and I've been using it as my daily driver for more than a year. There are a lot of people putting a lot of energy into documenting and explaining, but the current recommendation is _suffer_.

For Docker, you could start with this HN thread [1]; for NixOS and flakes there is a video series and a git repo [2] I used at the beginning, which I liked.

I wanted something a bit more 'complete', so if you like you can read through my nix repo [3].

I have built Poetry (Python) and Go applications (that I'll push to nixpkgs at some point) and GCP images for VMs, so if you need a reference for that, just ask :)

[1] https://news.ycombinator.com/item?id=39720007

[2] https://github.com/MatthiasBenaets/nixos-config

[3] https://github.com/francocalvo/nixos-eris


Out of curiosity, how would you run Python or R workloads in kubernetes without a distro or shell?


Python is more capable than a shell, so an intruder will use it first.



Python needs an ld.so and libc (minimally) but not a shell or other external utilities. Shebang scripts are loaded by ld.so, not the shell.


Shebang scripts are supported directly by the kernel via the exec family of system calls, so ld.so shouldn't be involved.

https://github.com/torvalds/linux/blob/master/fs/Kconfig.bin...

  config BINFMT_SCRIPT
          tristate "Kernel support for scripts starting with #!"
          default y
          help
            Say Y here if you want to execute interpreted scripts starting with
            #! followed by the path to an interpreter.


Python with batteries included, doesn't that mean exploit tools included?

No personal attack intended, I am wondering this about my own embedded product which contains Python.


Yeah, I remember reading the code in the kernel that handles shebang a long time ago. ld.so is not involved.


And just as we're about to migrate 4 Kubernetes clusters with a total of ~4k pods, Terraform in GitHub Actions on self-hosted runners and Argo CD is failing.


Oh, that sucks. There are always going to be those who will say that it's the price you pay for using GitHub, but locally hosted VCS and CI/CD systems have issues as well.

External dependencies are always a problem, but do you have the capacity and resources required to manage those dependencies internally? Most don't, and will still get a better product/service by using an external service.


The rate of outages on GitHub over the last few years has been orders of magnitude higher than anything I've encountered on a locally hosted VCS.

Local also means you can orchestrate maintenance windows to avoid outages at critical phases.


> Most don't

Define "most". There is a surprisingly high number of small/mid-sized companies which have dedicated people for this kind of thing.


Perhaps "many" would have been better. If you only count companies that view themselves as IT companies, then the number of companies with a self-hosted/managed solution grows, but if you include everything, then I'd guess that more than 50% don't run these services internally. If you count every small company with one or two developers who double as the IT staff, the numbers add up pretty quickly.


That's where I feel like it's actually pretty nice to not have CI tied to your source code. It's probably more expensive to use Travis/Circle but at least you don't have a single point of failure for deploys.


Doesn't this give you 2x SPOF? Or can I use my local copy of the source to kick off Travis/Circle?


Ideally you don't ever rely on CI-specific automation tooling to actually accomplish anything, and instead just use it as a dumb "not my dev machine" to execute workflows.

You should always engineer things so you can fall back to something akin to:

  ./scripts/deploy_the_things
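
A minimal sketch of what that might look like (script names are hypothetical):

  #!/usr/bin/env bash
  # scripts/deploy_the_things - CI just calls this, nothing more
  set -euo pipefail
  ENV="${1:-staging}"
  ./scripts/build_images            # build and tag container images
  ./scripts/push_images             # push them to the registry
  ./scripts/apply_manifests "$ENV"  # roll out to the target environment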

Ideally backed by a real build system and task engine ala Bazel, Gradle, whatever else floats your boat.

It also means you are free to move between different runners/CI providers and just have to work out how to re-plumb secrets/whatever.

GH Actions/friends really provide minimal value; the fact that they have convinced everyone that encoding task graphs into franken-YAML-bash is somehow good is one of the more egregious lies sold to the development community at large.


Well, git operations might not be impacted on GitHub; the usual GitHub outages tend to be in Actions/the site, not the actual git operations. It doesn't seem like you can use a local copy, but you can use GitBucket/GitLab.


Looks somewhat similar to the Talos Linux project[1].

[1] https://www.talos.dev/


Substantially different.

Talos is a Linux operating system distribution tailored for running Kubernetes and container workloads. It runs runc, containerd, and other binaries, which spawn containers which themselves run at the same level of virtualization as the kernel. It's hardened for security. Containers that run in privileged mode can make syscalls that affect the host kernel.

gVisor is an OCI runtime implementation - a tool for running containers, typically on Linux - that presents a virtualized Linux kernel surface to the applications and containers it runs. Any syscalls the applications make are intercepted by gVisor. It's not quite the same as hardware virtualization, but it reduces attack surface by preventing containers from making syscalls directly to the host kernel.

---

Addenda:

gVisor is closer in spirit to Firecracker when used with Kata containers. Although the way the two work is very different, both effectively prevent containers from manipulating the Linux OS they run on.

Talos is closer, I think, to Bottlerocket OS, which is a Linux distribution created by Amazon for their container workloads. Bottlerocket is also a hardened, minimal operating system designed for running containers. Both of these are very similar to CoreOS aka Container Linux.


Talos has a gVisor system extension to make it easier for users to adopt:

https://github.com/siderolabs/extensions/tree/main/container...


Can you elaborate on this? We have some 300 internal APIs on a valid domain. We used to use Let's Encrypt, but got rate limited for obvious and fair reasons when we were migrating between clusters. It's a bit better with ZeroSSL, but we still get 429s when cert-manager is issuing a ton of certs at the same time.


Just wanted to clarify that `lcl.host` is a service that only helps with local development, it's not useful (and shouldn't be used) in staging & production environments. For staging & production, we let customers use a public domain they own, or a special use domain (`.local`, `.test`, `.lan` etc).

Here's how the architecture you described works with Anchor: assuming your domain is `mycorp.it`, you can add it to your organization. Then create staging & production environments. This provisions a stand-alone CA per environment, and the CA is name constrained for the environment (e.g. only `*.stg.mycorp.it` in staging). Each of the 300 APIs can be registered as a service: this provisions an intermediate CA per environment that is further name constrained (e.g. `foo-api.stg.mycorp.it` in staging). For each service in each environment you generate a set of API tokens (EAB tokens in ACME parlance) that allows your automation to provision server certs with the ACME client of your choice. edit: in your case, cert-manager would be the acme client delegating to Anchor.
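
On the cert-manager side, the EAB wiring looks roughly like this (the server URL, secret names and solver below are placeholders - use the values from your Anchor environment):

  apiVersion: cert-manager.io/v1
  kind: ClusterIssuer
  metadata:
    name: anchor-staging
  spec:
    acme:
      server: https://acme.example.com/directory   # placeholder directory URL
      privateKeySecretRef:
        name: anchor-acme-account-key
      externalAccountBinding:
        keyID: "<EAB key ID>"
        keySecretRef:
          name: anchor-eab
          key: secret
      solvers:
      - http01:
          ingress:
            class: nginx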


We're currently consolidating all container images to run on Debian-slim instead of a mixture of Debian, Ubuntu and Alpine. Sure, Alpine is small, but with 70% of our 500 container images being used for Python, R or node, the final image is so large (due to libraries/packages) that the difference between Alpine (~30 MB) and Debian-slim (~80 MB) is negligible. We've been experiencing the weird DNS behaviour of Alpine and other issues with musl as well. Debian is rock solid, and upgrading to bookworm from bullseye, and even from buster in many cases, didn't cause any problems at all.

I will admit, though, that Debian-slim still has some non-essential stuff that usually isn't needed at runtime, although a shell is still neat for debugging or local development. This trade-off could be considered a security risk, but it's rather simple to restrict other things at runtime (such as running as a non-privileged, non-root user with all capabilities dropped and a read-only file system except for /tmp).

It’s a balancing act between ease-of-use and security. I don’t think I’d get popular with the developers by forcing them to use “FROM scratch” and let them figure out exactly what their application needs at runtime and what stuff to copy over from a previous build stage.


My biggest beef with the apt/deb-based distros for container images is that apt-get update/install takes frustratingly long, whereas apk installs always tend to be near instant. I wonder what it is in the postinst and related scripts that just takes so long and can't be parallelized!

Most of the reason I switched from Ubuntu -> Arch is that working with alpine allowed me to realize that installing packages and their dependencies doesn't have to take so long.


Have you tried building the images with the eatmydata library?

Essentially, dpkg is hilariously slow due to its approach of issuing fsync calls for every file - the eatmydata wrapper injects a shared library which turns those into no-ops.

I haven't timed it rigorously, but I experienced up to 10x speedup in container builds by just wrapping dpkg with eatmydata.


[Edit: modern debian images have “force-unsafe-io” turned on in /etc/dpkg/ so eatmydata shouldn’t make a difference.]

I’m not the OP, but in an example I just tried it took build-essential from 48s to 39s (timed on the third step, without and then with the call to eatmydata):

  from debian:sid
  run apt update && env DEBIAN_FRONTEND=noninteractive apt-get install --yes eatmydata
  run apt update && eatmydata env DEBIAN_FRONTEND=noninteractive apt-get install --yes build-essential
Can you replicate something faster? Was I doing something wrong?


Yes, the second invocation will have hot caches. Basically the standard is to either test in a clean environment (e.g. immediately after a reboot) or to run the command until the times stabilize, and then take, say, the middle 3 out of 5 measurements.


What is cached though? DNS? IP routes to the Debian mirrors?

I ran it three more times. Plain install was 50s 48s 47s; eatmydata was 46s 46s 46s.

Installing build-essential probably does enough non-dpkg IO to account for the difference (install scripts etc, for example when libc changes and the system scans for running daemons) but it isn’t much and the delta may well be statistically insignificant.


The file system. Drives cache recently read blocks. This isn't something that's necessarily even visible to the OS.


I decided to benchmark these scenarios out of curiosity, because I hadn't heard of eatmydata before or of force-unsafe-io.

- Docker image: `debian:stable-slim`

- `docker system prune -a` ran before each test.

- Packages installed: `build-essential ca-certificates libssl-dev software-properties-common wget curl git`

Results:

Default Debian settings: 48.32 seconds

force-unsafe-io enabled: 55.86 seconds

eatMyData enabled: 35.04 seconds


It's possible that the last time I dealt with this on Debian, force-unsafe-io wasn't set in /etc/dpkg.

I haven't really built debian-based containers in the last two years.


This here. Honestly most orgs with uhh.. Let's say a more mature sense of ROI tradeoffs were doing this from pretty much the very beginning.

Also, Ubuntu 22.04 is only 28.17MB compressed right now so it looks equiv to debian-slim. There are also these new image lines, I can't recall the funky name for them, that are even smaller.


I'm pushing to go back to Debian from Ubuntu. Canonical's making decisions lately that don't appeal to me, and especially on the server I don't see a clear advantage for Ubuntu vs good ol' rock solid Debian.


Host OS vs container image base shouldn't be the same conversation.


I've done this multiple times when a new Debian Stable was released (Bookworm just got released). It's great, until it isn't (i.e. you need more up-to-date software). Although to be fair, you might be comparing Ubuntu LTS with Debian releases, and the minor Ubuntu releases get short support. Ubuntu is akin to a polished Debian Testing snapshot. I don't like them swapping to Snap either (snappy it ain't).


> There are also these new image lines, I can't recall the funky name for them, that are even smaller.

You might be thinking of the chiselled images. An interesting idea but very much incomplete[1].

[1]: https://github.com/canonical/chisel-releases/issues/34


Yes, that's it! I've been working with .NET a lot recently and recalled Microsoft's announcement about them: specialized runtime images for the JRE, .NET, ASP.NET, etc.


why is it so much smaller than debian slim?


Fewer features (including less "we've supported doing that for 20 years and we're not cutting it now"), packages separated into parts (ex. where debian might separate "foo" into "foo" for the main package and "foo-dev" for headers, alpine will also break out "foo-doc" with manpages and such), general emphasis on being small rather than full-featured.


It's not, Debian slim is ~28.8MB compressed.


If you want to compare compressed size, then alpine is 3.25MB


The parent comment was comparing Ubuntu and Debian-slim


In the same boat here as well. Especially when you're talking about container images using JavaScript or other interpreted languages that bundle in a bunch of other dependencies, the developer experience is much better, given that more developers are likely to have experience working in a Debian-based distro than an Alpine-based one.

Especially when you're also developing within the container, having that be unified is absolutely worth it for the convenience, and honestly for security and reliability as well. I realize that a container with less installed on it is inherently more secure, but if the only people who are familiar with the system are a small infrastructure/platform/ops type of team, things are more likely to get missed.


Can you point me on where to look for more details on securing a container? I'm a developer myself, and for me, the main benefit of containers is being able to deploy the app myself easily because I can bundle all the dependencies.

What would you suggest I restrict at runtime and can you point me to a tutorial or an article where I can go have a deeper read on how to do it?


What you want to read is the kubernetes pod security context fields[1].

In your Dockerfile, add a non-root user with UID and GID 1000, then at the end of the Dockerfile, right before the CMD or ENTRYPOINT, switch to that user.

In your kubernetes yaml manifest, you can now set runAsNonRoot to true and runAsUser & runAsGroup to 1000.

Then there are the privileged and allowPrivilegeEscalation fields; these can nearly always be set to false unless you actually need the extra privileges (such as using a GPU) on a shared node.

Then there are seccomp profiles and the system capabilities. If you can run your container as the non-root user you've created and you don't need the extra privileges, then these can safely also be set to the most restrictive values. Non-privileged, non-root is effectively the same as having all capabilities dropped.

The tricky one is the readOnlyRootFilesystem field. This includes /tmp, which is normally a globally writable directory, so the workaround is to make an in-memory volume and mount it at /tmp to make it writable. Likewise, the $HOME/.cache and $HOME/.local directories (for the user you created in your Dockerfile) are often used by third-party packages, so creating mounts there can be useful as well (if for some reason you can't point them to /tmp instead).
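
Putting that together, a rough sketch (image name and UID are placeholders, and the useradd line assumes a Debian-based image):

  # Dockerfile: create and switch to a non-root user
  RUN groupadd -g 1000 app && useradd -m -u 1000 -g 1000 app
  USER 1000:1000

  # Pod spec: lock the container down
  spec:
    securityContext:
      runAsNonRoot: true
      runAsUser: 1000
      runAsGroup: 1000
      seccompProfile:
        type: RuntimeDefault
    containers:
    - name: app
      image: registry.example.com/app:1.2.3
      securityContext:
        privileged: false
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
      volumeMounts:
      - name: tmp
        mountPath: /tmp
    volumes:
    - name: tmp
      emptyDir:
        medium: Memory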

[1] https://kubernetes.io/docs/tasks/configure-pod-container/sec...


> In your Dockerfile, add a non-root user with UID and GID 1000, then in the end of your Dockerfile, right before a CMD or ENTRYPOINT you change to that user.

I'm not certain, but would it be more secure to create a non-root user in the Dockerfile but _not_ use UID 1000, and use volume mounts to grant host disk access instead?


If you want to use the runAsUser and runAsGroup options in Kubernetes, that user needs to exist, hence creating it is needed.

In our case, we have our own base container images that include this user. The base images are either just plain copies of official ones (such as dotnet or curl) or also include stuff that almost everyone needs (such as MSSQL ODBC drivers, libxml2, curl, etc.) in the case of python/R/node.



Basically you need to run as an unprivileged user and with an immutable file system (of course you can have an ephemeral /tmp or a persistent /data, but generally the entire file system should be treated as read-only).


You might want to check out Wolfi and Chainguard Images. Wolfi is a Linux distro that we use to build minimal images that are roughly comparable to Alpine in size, but everything is compiled from source against glibc.

Our images come without a shell or package manager by default, but there are -dev variants that include these.

https://github.com/wolfi-dev/ https://github.com/chainguard-images/images


I was using wolfi for an oss project, but the recent attempts at forcing payments when pinning versions were a huge red flag, as was the messaging around it, and I'll be migrating away from it.

https://www.chainguard.dev/unchained/scaling-chainguard-imag...

Pinning a language version (say python 3.11) isn't an optional thing, it's a best practice, and the notion that it's because of security seems intentionally misleading, as the images should be refreshed in place on the tag along with signatures.


I'm very sorry that we broke things for you.

To be clear, nothing has changed with Wolfi. Wolfi is an open source community project and everything is still available there: https://github.com/wolfi-dev/.

We have made changes to Chainguard Images - our commercial product built on top of Wolfi - which mean you can no longer pull images by tag (other than latest). Chainguard Images are rebuilt every day and have a not inconsiderable maintenance cost (and the money we make here directly helps us support Wolfi).

The easiest way to avoid this is to build the images yourself. You can rebuild identical images to ours using apko and the source files in the images repo, e.g.: https://github.com/chainguard-images/images/blob/main/images... (note you can replace package names with versioned variants). You can also just use a Dockerfile with the wolfi-base image to "apk add" packages. Full details are here: https://www.chainguard.dev/unchained/a-guide-on-how-to-use-c...
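
e.g. something along these lines (the package name is illustrative):

  FROM cgr.dev/chainguard/wolfi-base
  RUN apk add --no-cache python-3.12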

I agree that pinning is a best practice. The above blog explains that you can still do it using a digest, but I accept this isn't the simplest solution.

If I can help any more, please feel free to get in touch - you can find me most places including twitter https://twitter.com/adrianmouat


I sympathize, but if wolfi is oss, please update docs and downloads to use a separate registry than your commercial one for distribution.

Otherwise, frankly, you claimed wolfi is oss, got customers to use your registry, and then bait-and-switched your early adopters.

aka, major version upgrades at random, have fun!


Wolfi packages are served from https://packages.wolfi.dev/os which can be used with apk tools or apko.

The built Chainguard images are all on the cgr.dev registry.

Wolfi is completely OSS.

The policies regarding Chainguard Images have changed over time, so if there are docs that don't properly reflect this, please let me know and I'll get them updated.


Let me rephrase that: wolfi is an oss container image distro without an image - you get to pay us for the image, but here's our tooling, because we like to call it oss, have fun building. Either have those oss images on a non-commercial registry, or let's stop misleading folks about what this is, aka pay-to-play/use oci images. That all wolfi docs / blog posts reference a commercial registry while touting it as oss is the definition of bait and switch.


> the difference between alpine (~30 MB) and debian-slim (~80)

Given that it's a different layer, your container runtime isn't going to redownload the layer anyway, right?


Exactly. Part of the appeal of consolidating all of our container images to use Debian-slim is the ability to optimise the caching of layers, both in our container registry and on our kubernetes cluster's nodes (which can be done in a consistent manner with kube-fledged[1]).

[1] https://github.com/senthilrch/kube-fledged


Thanks for that - that operator sounds extremely useful!


And even if it did, in an ancient data center that only uses gigabit ethernet, that's only a .5s longer download. And even a $4 DigitalOcean server comes with 10GB of storage, so that 50MB is only 1/200th of the instance's store. (I'd also bet that nearly no one uses instances that tiny for durable production work where 50MB is going to make a difference.)


I've run into the same thing for large dev images, but using pure rust often means that musl allows for a single executable and a config file in a from-scratch container for deployment. In cases where a slab or bump allocator is used, musl's deficiencies seem minimized.

That means duplication of musl in lots of containers, but when they are all less than 10MB it's less of an issue. Linking against gnu libraries might get the same executable down to less than 2MB, but you'll add that back and more for even the tiniest gnu base images.


What is pure rust? Rust needs its own libraries, which are a few MB as I recall.


By pure, I just meant no dependencies on c or c++ bindings other than libc. If that is the case, you can do a musl build that has no dynamic dependencies, as all rust dependencies are static. So then your only dependency is the kernel, which is provided via podman/docker. A decent-sized rust program with hundreds of dependencies I can get to compile down to 1.5MB - but that is depending on gnu. So if you had 4 or 5 of those on a node, it might be less data to use one gnu base image that is really small, like rhel micro, and build rust for gnu. But if you have cpu-hungry services like I do, then you usually have only a couple per node, so from-scratch musl can be a bit smaller.
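
Concretely, the kind of workflow I mean (the binary name is a placeholder; adjust the target triple for your arch):

  rustup target add x86_64-unknown-linux-musl
  cargo build --release --target x86_64-unknown-linux-musl

  # Dockerfile
  FROM scratch
  COPY target/x86_64-unknown-linux-musl/release/myapp /myapp
  ENTRYPOINT ["/myapp"]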


> cpu hungry services

Have you benchmarked musl vs glibc in any way? Data I've seen is all over the place and I'm curious about your experience.


Very few system calls in the main loop. Not even close to being io bound, and as I said memory is all preallocated, so allocator efficiency isn't a factor.

Edit: realized I didn't answer your question, I ran flamegraph both ways and it was completely dominated by tight vectorized loops and a couple sadly necessary memory copies.


I made the switch too, around 4ish years ago. It has worked out nicely and I have no intention of moving away from Debian Slim. Everything "just works" and you get to re-use any previous Debian knowledge you may have picked up before using Docker.


Do you have any tips regarding building R-based container images?


R is kinda difficult and I haven't cracked this one. Currently we're using the rocker-based ones[1], but they are based on Ubuntu and include a lot of stuff we don't need at runtime. I'll look into creating a more minimal R base image that's based on Debian-slim.

[1] https://github.com/rocker-org/rocker-versioned2
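
A rough starting point might be something like this (untested sketch; r-base-core pulls in far less than the full rocker stack, though individual R packages may still need extra system libraries):

  FROM debian:bookworm-slim
  RUN apt-get update \
   && apt-get install -y --no-install-recommends r-base-core \
   && rm -rf /var/lib/apt/lists/*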

