I mentioned this in another place on this thread, but a simple AIMD algorithm paired with a token bucket is surprisingly effective at dynamically adjusting to available capacity, even across a fleet of services not sharing state other than the contended resource.
I notice that library acknowledges a limitation of its approach: it keeps a fixed number of buckets and addresses them by hashing, which keeps memory usage bounded but introduces a risk of collisions.
Stochastic Fair Blue shares that problem, so you might find its solution interesting: it rotates its hash at fixed intervals by re-seeding, so that if a responsive flow collides with a non-responsive flow that's being rate-limited, it is at least guaranteed not to collide with it for very long.
Re-seeding introduces a problem similar to the one with Fixed Window Counter rate limiting, where the rate-limit history is essentially forgotten every interval. To solve this, two sets of buckets are fed the input at the same time: the router enforces one of them, while just feeding the other the data and ignoring its output in order to warm up its buckets.
I imagine if I tried to use the library you linked and didn't absolutely need per-user granularity, I'd do something similar: concatenate the user's identifier with a seed value for each of the two bucket sets and rotate the seeds at regular intervals, roughly as sketched below.
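Here's a minimal sketch of that shape in Go, with hypothetical names rather than the linked library's actual API: two fixed-size bucket arrays, the user ID hashed together with each array's seed, only the active array enforced, and the warmed-up array promoted when the seeds rotate.

```go
package ratelimit

import (
	"fmt"
	"hash/fnv"
	"time"
)

// bucket is a stand-in for whatever per-bucket limiter state the real
// library keeps; here it just caps hits between rotations.
type bucket struct{ count, limit int }

func (b *bucket) take() bool {
	if b.count >= b.limit {
		return false
	}
	b.count++
	return true
}

// index hashes a user ID together with a seed onto one of n fixed buckets,
// so changing the seed changes which users collide with each other.
func index(userID string, seed uint64, n int) int {
	h := fnv.New64a()
	fmt.Fprintf(h, "%d:%s", seed, userID)
	return int(h.Sum64() % uint64(n))
}

// Limiter keeps two bucket sets: "active" is enforced, "warming" only
// accumulates state so it is ready to take over at the next rotation.
type Limiter struct {
	active, warming         []bucket
	activeSeed, warmingSeed uint64
}

func (l *Limiter) Allow(userID string) bool {
	l.warming[index(userID, l.warmingSeed, len(l.warming))].take() // result ignored: warm-up only
	return l.active[index(userID, l.activeSeed, len(l.active))].take()
}

// Rotate promotes the warmed-up set and starts warming a fresh one under a new seed.
func (l *Limiter) Rotate(limitPerBucket int) {
	l.active, l.activeSeed = l.warming, l.warmingSeed
	l.warming = make([]bucket, len(l.active))
	for i := range l.warming {
		l.warming[i].limit = limitPerBucket
	}
	l.warmingSeed = uint64(time.Now().UnixNano())
}
```

Call Rotate on a timer; by the time a set becomes active it already carries an interval's worth of history, so you avoid the fixed-window "amnesia" at the boundary.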
I’ve found the AIMD algo (additive increase, multiplicative decrease) paired with a token bucket gives a nice way to have a distributed set of processes adapt to backend capacity without centralized state.
I've also found that AIMD beats a circuit breaker in a lot of circumstances.
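A minimal sketch of that combination in Go, with illustrative names and tuning values rather than any particular library's API: each process owns a token bucket whose refill rate it adjusts itself, increasing it additively while calls succeed and cutting it multiplicatively when the backend pushes back (timeouts, 429s, 503s, etc.).

```go
package ratelimit

import (
	"sync"
	"time"
)

// AIMDBucket is a token bucket whose refill rate is adjusted with AIMD:
// add a little on success, cut multiplicatively when the backend pushes back.
type AIMDBucket struct {
	mu     sync.Mutex
	tokens float64
	last   time.Time

	rate    float64 // tokens per second; the value AIMD adjusts
	burst   float64
	minRate float64
	maxRate float64
	step    float64 // additive increase per success, e.g. 0.5 tokens/s
	backoff float64 // multiplicative decrease factor, e.g. 0.5
}

// Allow refills the bucket at the current rate and tries to take one token.
func (b *AIMDBucket) Allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	b.tokens = min(b.burst, b.tokens+b.rate*now.Sub(b.last).Seconds())
	b.last = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

// OnSuccess nudges the send rate up additively.
func (b *AIMDBucket) OnSuccess() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.rate = min(b.maxRate, b.rate+b.step)
}

// OnBackpressure cuts the send rate multiplicatively (timeout, 429, 503, ...).
func (b *AIMDBucket) OnBackpressure() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.rate = max(b.minRate, b.rate*b.backoff)
}
```

Because every process converges on its own rate purely from the signals the shared backend gives it, the fleet adapts to available capacity without coordinating through anything but the contended resource itself.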
Bazel is a fully reproducible and hermetic build system. A lot of painstaking work goes into making it produce the exact same artifacts build after build, and that provides some interesting properties you can leverage for artifact caching, deployments, and CI/CD.
We very happily run a polyglot monorepo with 5+ languages and multiple architectures, with fully reproducible artifacts and deployment manifests, all deployed to almost all AWS regions on every build. We update tens of thousands of resources in every environment for every build. The fact that Bazel creates reproducible artifacts allows us to manage this seamlessly and reliably. Clean builds take an hour+, but our GH self-hosted runners often take our devs from commit to green build in less than a minute.
The core concept of Bazel is very simple: explicitly declare the inputs you pass to a rule/tool and explicitly declare the outputs it creates. Once that clicks, you're halfway there.
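A rough illustration of why that concept buys so much (this is just the idea behind content-addressed action caching, not Bazel's actual implementation): once an action's inputs and command are fully declared, you can hash them into a key, and any action whose key you've already seen can be served from cache instead of re-run.

```go
package cache

import (
	"crypto/sha256"
	"fmt"
	"os"
	"sort"
)

// ActionKey hashes a command together with the contents of its declared
// input files. Identical declared inputs + command => identical key, so
// previously produced outputs can be reused instead of re-running the tool.
func ActionKey(command string, inputs []string) (string, error) {
	h := sha256.New()
	fmt.Fprintln(h, command)

	sorted := append([]string(nil), inputs...)
	sort.Strings(sorted) // declaration order must not change the key
	for _, path := range sorted {
		data, err := os.ReadFile(path)
		if err != nil {
			return "", err
		}
		fmt.Fprintf(h, "%s:%d\n", path, len(data))
		h.Write(data)
	}
	return fmt.Sprintf("%x", h.Sum(nil)), nil
}
```

Undeclared inputs are exactly what breaks this: anything a tool reads that isn't part of the key makes "same key" stop implying "same output."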
> Bazel is a fully reproducible and hermetic build system.
Yes, and it's very important to note that Bazel does nothing to solve the problem of having a reproducible and hermetic runtime. Even if you think you aren't linking against anything dynamically, you are probably linking against several system libraries that must be present in the same versions to get reproducible and hermetic behavior.
This is solvable with Docker or exceptionally arcane Linux hackery, but it's completely missing from the Bazel messaging and it often leaves people thinking it provides more than it really does.
Bazel is perfectly capable of producing static binaries, if that's what you want. It's not fair to say that it does nothing to solve this problem; it simply doesn't mandate that the artifacts it produces are static.
Most static binaries are not completely statically linked. The system libstdc++ and libc as well as a few other things are almost always linked dynamically.
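An easy way to check a particular binary, sketched in Go with the standard debug/elf package (roughly what `file` or `ldd` would tell you): a dynamically linked executable carries a PT_INTERP program header naming the dynamic loader, and a fully static one doesn't.

```go
// isstatic reports whether an ELF binary asks for a program interpreter
// (the dynamic loader); if it does, it is dynamically linked.
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: isstatic <path-to-binary>")
		os.Exit(2)
	}
	f, err := elf.Open(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	for _, p := range f.Progs {
		if p.Type == elf.PT_INTERP {
			fmt.Println("dynamically linked: needs the dynamic loader")
			return
		}
	}
	fmt.Println("no PT_INTERP: statically linked (or static-PIE)")
}
```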
In particular, it’s almost impossible to get glibc to link statically. You need to switch to musl or another libc that supports it.
Static linking means no dependencies whatsoever, so it only needs a syscall interface in the kernel. If your binary requires libstdc++, it's not static, period.
Go creates static binaries. I was a bit bummed out to learn that Rust doesn't by default, since its binaries link against glibc (and copying a binary to an older distribution fails because its glibc is too old).
Yes and no: it's trivial unless you use one of the most common libraries that depend on C code, e.g. OpenSSL, i.e. pretty much anything that talks to the outside world. Then it's painful. (You mentioned OpenSSL in your comment but somehow I missed that part when I replied. My bad.)
One thing Go has going for it is that they have a more complete standard library so you can do without the pain of cross-compiling—which is what you're doing if most of your libs assume you're in a glibc environment but you really want musl.
I know because I recently tried compiling a Rust application that had to run on an air-gapped older Debian machine with an old glibc, and the easiest solution was to set up a Debian container, download Rust, and compile there, instead of fixing all the cargo deps that didn't like being built with musl.
I just went through this recently (built a Rust binary on my local machine, sent it to another person, they reported GLIBC errors, I had to rebuild with the musl target, and various crates depending on OpenSSL failed) and found that every library that depended on OpenSSL by default also had a `rustls` feature in the crate that disabled OpenSSL and enabled rustls (https://github.com/rustls/rustls) instead.
So I just migrated everything over to that (which consisted of enabling the `rustls` feature for every crate) and made another build with musl. Everything worked perfectly fine, and since it's not a performance-sensitive application, there were basically no drawbacks. The binary became a bit bigger, but still under 4 MB (with other assets baked into it), so it wasn't a big issue.
Other than libc I'm not sure what else would "almost always" be linked. It's going to depend on the language, but for Go/Rust at least static linking is pretty much the default, and lots of people even link to musl.
For other languages I'd say openssl is probably the next most common.
> but for Go/Rust at least static linking is pretty much the default
I don't think that's true for Rust. It defaults to depending on glibc, at least on Linux, and you need to explicitly build with musl to get static binaries.
The platform ABI typically specifies that libatomic must be dynamically linked, since otherwise libdl will misbehave. Lots of other common libraries, including libpthread, libm, libgfortran, and libquadmath, are also frequently linked dynamically (though these days many of them are folded into libc.so for better system performance).
There is no platform ABI mandate nor Linux kernel requirement for userspace applications to be dynamically linked to anything. If you can talk to the kernel, you can do anything those libraries can do.
The Linux kernel forces your application to include the vDSO shared library (by pre-loading it). You can ignore it and talk to the kernel directly, but you can't get the same performance as you would when using that shared library.
On some architectures, atomics are implemented in the kernel via the vDSO shared library (__kernel_cmpxchg), which is exposed to user space through libatomic. You can ignore it, but then you cannot interoperate with other code (anything that uses libatomic.so) without introducing data races that were not present in the shared-library version, since the two sides may implement their atomics differently and thus incorrectly.
Others have mentioned statically linked binaries. I thought I'd mention I actually use Bazel to build most of my docker images, and unlike Docker itself, Bazel rules_docker builds reproducible (i.e., same digest) container images by default.
Having spent a great deal of time getting bazel set up as you describe, I feel you have given readers a misleading impression. Bazel does not come out of the box that way. It uses whatever toolchain is laying around on the host, by default. It builds with system headers and links with system libraries. It looks at your environment variables. You need to do a lot of surgery on the toolchain to make it hermetic and reproducible.
I can't comment on Nix, but I work on a project that uses recursive Makefiles and build containers to make the builds (almost) hermetic.
I dreaded running builds because it took like ten minutes to run `make all`. I spent like four hours writing Bazel build files for it and all of a sudden my clean rebuilds were taking like five minutes and my incremental builds were taking a couple of seconds. It was fantastic.
Ultimately it was rejected because other people on the project hated working with Bazel and weren't familiar enough with it to debug any problems. If a build didn't work, the only solution was to call me, because no one else was used to interpreting the Bazel errors: if a curl command failed in a script in a Dockerfile somewhere, everyone knew what that meant, but I was the only one who would see "IOException fetching... Error 401" and immediately know that they'd provided the wrong password.
I also consulted for another project where people were using containers as VMs, running systemd and copying in updated binaries, because they, too, dreaded rebuilding the containers. Bazel, again, made it so that they could rebuild everything in a matter of seconds. That project is still happily using Bazel last I checked.
Docker is a red herring. The parent's statement can be modified for Docker like this:
> B̶a̶z̶e̶l̶Docker uses whatever toolchain is laying around on the h̶o̶s̶t̶Internet
More precisely, Docker can use exact, well-specified, cryptographically-tamper-proof environments (images); but the only way to actually specify and build such an environment is via shell scripts (in a "Dockerfile"). In practice, such scripts tend to run some other tool to do the actual specification/building, e.g. `make`, `mvn`, `cargo`, `nix`, etc.
If those tools aren't reproducible, then wrapping them in Docker doesn't make them reproducible (sure we can make a snapshot, but that's basically just a dirty cache). In reality, most Dockerfiles seem to run wildly unreproducible commands, e.g. I've encountered Dockerfiles with stuff like `yum install -y pip && pip install foo`.
If those tools are reproducible, then wrapping them in Docker is unnecessary.
Also note that outputting a container image is nothing special; most of those tools can do it (e.g. in a pinch you can have `make` run `tar` and `sha256sum` "manually").
The bazel way to do it is to put the tools in an archive, refer to the archive and its checksum in your workspace, and execute the build in a completely empty sandbox.
Can you share a bit about the completely empty sandbox? Is this a build root with its own user and environment? Or does it build inside the worktree, e.g. a subdirectory? Or can both be done?
The concept of explicitly declared inputs and outputs is awesome. But the closed build environment and the need to define builds down to the compiler level make Bazel complex and break some developer workflows.
For this reason we are building a Bazel competitor with a less restrictive build environment, one that can also be applied to projects with fewer than 1M lines of code.
I usually write out Pacific Time out of concern that not everyone would immediately recognize PT as an acronym. Most scheduling systems use the full acronym.
Top-level directories by product area, then team/project-based subdirectories, most of which would be their own repositories if you weren't doing a monorepo. There are more exceptions to this than can fit in a post, but that is the general structure.
I’m firmly on team monorepo, and I agree great tooling is an absolute requirement. We use Bazel + AWS CodeBuild with local caching. Our average incremental CI build of the monorepo is under 45 seconds; clean builds are 30+ minutes.
The AWS Cognito User Pool Authentication Flow utilizes an augmented PAKE (SRP). I imagine there are a number of major sites that use Cognito along with the SRP auth flows baked into their std libs. I know I've used it a number of times.
I implemented SRP a decade ago. It has issues, and thus a lot of revisions; it also leaks your salt, and you can't use a pepper. There is OPAQUE (note the play on PAKE), but it's new and difficult to search for.