One practice I really like with feature flags is having a flag budget. It builds on the idea of an error budget, where some percentage of errors is okay, but past a certain threshold feature work needs to slow or stop until things are back under control.
Having a limit on in-flight feature flags means you’re incentivised to clean them up when they’re done (or to decide they’re long-running features, etc.), and it can help keep a handle on in-progress work.
But mostly, feature flags are a powerful tool—they can hurt as much as they help if you don’t use them right.
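A flag budget like this can be enforced mechanically, for example as a CI check. Here is a minimal sketch, assuming a hypothetical in-house flag registry in which each flag is marked either as short-lived or as deliberately long-running. The names `Flag` and `check_budget` and the limit of 3 are illustrative assumptions, not taken from any particular platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flag:
    name: str
    # Deliberately permanent flags (e.g. kill switches) do not count against the budget.
    long_running: bool = False


def check_budget(flags: list[Flag], budget: int) -> tuple[bool, list[str]]:
    """Return (within_budget, sorted names of flags counted against the budget)."""
    in_flight = sorted(f.name for f in flags if not f.long_running)
    return len(in_flight) <= budget, in_flight


# Hypothetical registry contents, for illustration only.
flags = [
    Flag("new-checkout"),
    Flag("search-v2"),
    Flag("payments-killswitch", long_running=True),
    Flag("dark-mode"),
    Flag("onboarding-tour"),
]

ok, in_flight = check_budget(flags, budget=3)
if not ok:
    print(f"Over budget: {len(in_flight)} in-flight flags, limit is 3: {in_flight}")
```

Exempting long-running flags explicitly forces the "is this done, or is it permanent?" decision to be written down instead of left implicit.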
That's a great take! Feature flags are just like braces, they help you but you will need to remove them ;) I am currently hacking on a feature that will automatically create a PR for you with the flag removed. One click!
I just finished evaluating a bunch of feature flag platforms for my employer this past week. Self-hosting and having a .NET SDK were important to us. Below are the results.
Best options:
-----------------
Flagsmith: has UI and can be self-hosted for free (BSD-3 license).
GO Feature Flag: Completely free, but has no UI, although feature flags can be defined in JSON/YAML files checked into GitHub.
Flagd: Completely free, but has no UI.
Statsig: Free plan covers our needs for a year or two. The .NET SDK is lacking, but we can use the REST API instead.
Very biased: I work for Eppo (after working for Split), and our solution works great. I don’t want to shill _too_ hard, but I'm happy to answer any questions.
Another thing: Eppo is actually more focused on experiment (A/B testing) analysis, and we let customers use any feature flag system they want. Some clients use an in-house solution, some use Eppo, and many use commercial third-party options. If those systems have issues, our monitoring is often the first to flag it, and those errors are odd, so our clients pretty systematically come to us when that happens. That's a long explanation to say: I’ve seen a lot of how feature flagging systems fail. Several a day, every day, for months. That’s why I can confidently say that Eppo is great and LaunchDarkly is great. Other commercial solutions… I have doubts: they work for basic use cases, but as soon as you have ad blockers, a bad connection, multiple ways to identify users (cookies vs. accounts; same account on mobile vs. desktop browser), or users who know how to edit their cookies, things get bad. In-house solutions (with one exception) also have common oversights.
We talk a lot to experienced engineers who think they can build it (and they can, the key patterns are simple), but tricky cases always pop up. I usually ask a few questions to clarify and help them understand what those gotchas are. Senior tech leads aren’t always… keen on thinking, “That’s too hard for me.” So if they push back even lightly, I tend to recommend that they go that way, and I expect them to come back weeks later asking tough questions. It’s never lost time: they learn a lot about their own architecture, the limits of spinning up new services, passing configuration, design, etc. More importantly, they learn that third parties have relevant experience building this.
do you need a “platform” for feature flags? if you have so many feature flags that you need complex management i think that indicates you need configuration/settings management.
Changing flags on deploy is simple, great when you have a fast pipeline and only a few flags.
But at some point it becomes useful to decouple feature release from code deploy.
And the only way to do that is to be able to change the value of a flag out-of-band of a pipeline.
Then you have the capability to test new code in environments before prod and in small parts of prod—canary releases and so on.
Configuration and settings management overlaps with feature flags, but note that the value often comes more from the ability to test and safely deploy new code into production environments (more of a release flag) than from enabling a feature for a specific user. It just so happens that the use cases and technical implementations overlap so frequently that it’s sometimes less work to use the same system.
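To make "out-of-band of a pipeline" concrete, here is a minimal sketch in which the running code re-reads flag values from a JSON file on every lookup, so changing the file changes behaviour without a deploy. The class `FileFlagStore`, the file name `flags.json`, and the flag `new-checkout` are all hypothetical; a real setup would more likely poll a flag service or cache with a short TTL.

```python
import json
import tempfile
from pathlib import Path


class FileFlagStore:
    """Reads flags from a JSON file on each lookup, so edits apply without a redeploy."""

    def __init__(self, path: Path):
        self.path = path

    def is_enabled(self, name: str, default: bool = False) -> bool:
        try:
            data = json.loads(self.path.read_text())
        except (OSError, json.JSONDecodeError):
            return default  # fail safe: unreadable config falls back to the default
        if not isinstance(data, dict):
            return default
        return bool(data.get(name, default))


def checkout(store: FileFlagStore) -> str:
    # Both code paths are already deployed; the flag decides which one is released.
    return "new checkout" if store.is_enabled("new-checkout") else "old checkout"


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "flags.json"
    path.write_text(json.dumps({"new-checkout": False}))
    store = FileFlagStore(path)

    before = checkout(store)
    # Out-of-band change: someone edits the flag value, no pipeline run involved.
    path.write_text(json.dumps({"new-checkout": True}))
    after = checkout(store)

print(before, "->", after)
```

Falling back to the default when the file is missing or malformed is a design choice in this sketch: a broken flag source should leave the old behaviour in place rather than crash the request.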
I'm pretty happy that as a dev I don't have to keep track of the relationship we have with each customer and craft that if statement accordingly. I just have to make sure the flag works and setting it to the correct value can be up to somebody who is customer-facing.
It's helpful if you have non-devs on the team who nonetheless want to toggle flags. Sometimes UX people use them for A/B testing, for example. Or a manager might want to see how a certain experiment performs and then turn it off if it's not doing well.
The "platform" is really just an easy web GUI with roles & permissions. In the past I've used Unleash, VWO, some Google thing that was sunsetted (of course), maybe some others. They're simple SaaSes but useful in the right teams.
You ship a mobile phone application with 10 million installations. You have a marketing/research department that wants to run A/B tests on their own schedule, gathering analytics and metrics.
Sure, developers need to be involved in setting up SDKs and making it possible for flags to be used. But oftentimes the people setting the flags and the phased rollouts of features are not developers: "4% of users in France, then next week 10% of users in France and Germany, and then 50% of all European players need feature XX".
Platforms might be a hard sell, but many many mobile-application developers end up reinventing these kinda things over time, especially the popular ones.
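A phased rollout like the one described above usually comes down to deterministic bucketing. Here is a minimal sketch, assuming users are identified by a stable string ID and rollout rules are a plain mapping from country code to percentage. The function names, the `feature-xx` flag key, and the rule format are illustrative assumptions, not any vendor's API.

```python
import hashlib


def bucket(user_id: str, flag: str) -> int:
    """Map a user to a stable bucket in [0, 100) for a given flag."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % 100


def is_enabled(user_id: str, country: str, flag: str, rollout: dict[str, int]) -> bool:
    """`rollout` maps a country code to the percentage (0-100) of users who get the flag."""
    percent = rollout.get(country, 0)  # countries without a rule get nothing
    return bucket(user_id, flag) < percent


# Rules a non-developer could edit week by week (hypothetical values from the example).
week1 = {"FR": 4}
week2 = {"FR": 10, "DE": 10}

users = [f"user-{i}" for i in range(10_000)]
fr_week1 = {u for u in users if is_enabled(u, "FR", "feature-xx", week1)}
fr_week2 = {u for u in users if is_enabled(u, "FR", "feature-xx", week2)}

# Everyone enabled in week 1 stays enabled in week 2, because buckets never change.
print(fr_week1 <= fr_week2)
```

Hashing the flag name together with the user ID keeps each user's bucket stable for one flag while avoiding the same users always landing in the early cohort of every flag.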
Config/settings management is often paired with things that require at the very least an app reboot, whereas feature flags are explicitly something that should be capable of changing at will.
Now, could you have real-time config management that doesn't require a re-deploy/reboot of the app? Sure, but the typical 12-factor app can't really avail itself of that without significant rework.
Works great, has every feature we need. We compared it with LaunchDarkly, which was just insanely expensive for what we needed; millions of page hits add up quickly.
We did have to do the work of adding Unleash containers to our infra, but it was not hard at all.
For Rails apps there has been the rollout gem, and we’ve been using LaunchDarkly. System-wide services like LD are nice for control, uniformity, and pan-system availability: being able to manage the rollout of a single FF that can govern multiple microservices at once made managing a larger cross-team feature very simple.
I cannot speak to price / value, but that FF sharing was so useful at simplifying communication between teams and rollouts, which can always get tricky.
Because it is an open standard from the CNCF, supported by various well-known SaaS services (some of which are open and can be self-hosted, such as flagd), and they provide a set of SDKs that are pretty homogeneous, with support for almost every popular technology and programming language.
We’ve gotten great mileage out of Amplitude’s support for feature flags and experiments. Puts our flags where PMs are already used to looking and integrates deeply with the metrics they care about.
It'd be great if they'd stop changing the UI, though. I was in the middle of an incident and couldn't figure out how to configure one of our killswitch flags.