How Dark Deploys Code in 50ms (medium.com/darklang)
269 points by craigkerstiens on July 11, 2019 | 167 comments



It's an interesting concept that flips many things on its head. We're getting some real hard-to-achieve benefits out of the box, while sacrificing a lot of things we used to take for granted.

One thing that I notice is that Dark requires a completely walled-in environment to be able to deliver all of its promises. In reality, any real-life project has to interact with external services, legacy systems, or custom peripherals, which are unlikely to be based on Dark.

How will Dark be able to handle such cases? Current tooling and processes are able to cope with such things through custom, hand-crafted, and, as the article stated, often fragile, systems and processes. They are a pain, but they get the job done. They are very versatile by their nature. While they break more often than we would like, they can be patched up well enough to band-aid over rough patches and adjust to surprising externalities. Dark seems to have one way of doing things, and I am curious how compatible that way is when it comes to interfacing with external systems.


It seems that Dark is designed to allow you to wrap external APIs in easy-to-use interfaces, so it should be at home as part of a larger SOA or ecosystem, of course with the caveat that it's still being developed. See the footnote at https://medium.com/darklang/dark-residency-7dbc878a5f0a .


Yes, that's exactly right. Dark speaks HTTP (and at some point will speak gRPC, Thrift, etc.) so you can access 3rd-party services over HTTP (and wrap them in SDKs in the Dark package manager, same as on npm).

We expect we'll have some built-in support for common things (resizing images and videos, or generating pdfs for instance; we already support static assets on a CDN).


Hey Paul, nice to see you here on HN. Very promising start!

What's the best way to contact you? (I'm $HNusername @ gmail).


Thank you! I'm at https://twitter.com/paulbiggar or paul@darklang.com.


hence the pun "going Dark": you're completely shut into a vendor-specific environment


First I’ve heard of Dark. The blog post is cool, but reading it I felt two main things: (1) having no pull-request reviews would be terrible for a team, and (2) everything being instantly live in prod and automatically versioned etc. sounds like a nightmare once you move beyond a simple program.

Reading on Darklang.com I see they kind of address this. The ecosystem they’re building is training wheels for developers, a way to make “coding 100x easier.”

I can see that working in a sense, and being an entry point that a lot of people use to make neat toys. I could even see it being a gateway that people use to get interested in and learn about programming. But I can’t imagine wanting to build a business with more than one programmer, or at any kind of scale, on a completely black-box system like that.

I will also say, there’s a problem when it comes to entry-level systems, trying to teach people to code:

The struggle with the complexity is actually important. Dark isn’t actually making all the work of building and running a web application go away, it’s abstracting it all into a platform such that you can’t see it.

Suppose a person gets into coding self-taught and learns to work this way. The knowledge isn’t going to be very transferable - i.e., when they look at other languages or systems they’re likely to struggle with problems like “what does prod versus dev mean, I just want my program to run for the world...”

You usually move knowledge between ecosystems by translating “I did it this way in (toolset A), so what’s the parallel in (toolset B)?” The more you have an idea of the underlying principles, the easier this is to figure out.

That said, Dark certainly looks neat, and I imagine the implementation is quite cool.

The only other nit I’d pick is the name. To me, it’s not really a language, but maybe a development environment, or framework. I suppose it has a language in it, which I’d probably call DarkScript or something.


Author and Dark CTO here.

> Reading on Darklang.com I see they kind of address this. The ecosystem they’re building is training wheels for developers, a way to make “coding 100x easier.”

No no no no no. This is not what we're doing. We're building something for experienced software engineers. (New developers will be able to use it too and it should be much much easier for them than learning in other ways, but that's a secondary audience).

> Dark isn’t actually making all the work of building and running a web application go away, it’s abstracting it all into a platform such that you can’t see it.

Some of it is being abstracted, but a lot of it is being properly removed. Servers are abstracted, but almost all of actual deployment is being removed.


How does Dark handle code review?

Even if I was willing to accept the other tradeoffs that come with Dark, I'd want another human to approve my changes before that 50ms deploy to production kicks off.


The model is quite different to how people write code today. Instead of a process that takes code from your machine and sends it to prod (and thus has a lot of risk), you code in a sorta sandbox _in prod_. That is then enabled for users via changing the feature flag setting.

So to do code review, you write the code behind the feature flag, then ask someone for code review. (There's google-docs style collaboration, so they can see your code if you permit it). After code review, you can "merge" by toggling the feature flag, or changing the setting to "let 10% of users see this" or whatever roll-out strategy you want.

In other words, you're shipping instantly, just safely: you're writing code behind a feature flag (but "in prod"), and no users can see it until the flag is flipped.
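Dark hasn't published code samples, so purely as an illustration: the flag-gated "merge" described above behaves like ordinary percentage bucketing, which looks roughly like this in Python (every name here is invented):

    import hashlib

    def new_checkout() -> str:   # stand-in for the code under review
        return "new checkout"

    def old_checkout() -> str:   # stand-in for the current behavior
        return "old checkout"

    def flag_enabled(flag: str, user_id: str, percent: int) -> bool:
        # Hash (flag, user) so each user keeps a stable bucket as the
        # rollout percentage is dialed up from 0 towards 100.
        digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
        return int(digest[:8], 16) % 100 < percent

    def handle_request(user_id: str) -> str:
        # "Merging" after review is just raising `percent`; there is no
        # separate deploy step.
        if flag_enabled("new-checkout", user_id, percent=10):
            return new_checkout()
        return old_checkout()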


So what protects you from accidentally doing “DROP TABLE users” on prod?

In a traditional design, there will be pre-commit code reviews and a staging env with limited access. But it looks like Dark gets rid of all of this?


Good question. The answer to most questions like this is "it'll probably be the same as now". I know it seems like everything is different, but we're not really trying to invent much new stuff, only what's required to take stuff out. Whenever things from today work, we'll keep them. Code review, testing, ways of preventing deleting everything on prod; they're all valuable.


cool- I guess there is some concept of a snapshot/diff to show you the before and after?


Not currently but I'm sure there will be at some point.


I don't think they've detailed that yet, at least not that I've seen.

But it is one of the few steps the Dark deployment process retains (see the diagram near the end of the article), so it seems they will support it (hopefully reasonably well).


Seems easy enough... Write whatever code you like, directly in production, but feature flags will be forced to zero (or to just your own test users) until code review passes...

It requires the language to have a bulletproof feature flag system which can't be bypassed, which it seems is the goal.


Sounds cool, seems like an interesting set of tradeoffs that nobody else has tried yet.

From what I've read so far it seems like Dark is a good fit for prototyping, experiments, and small products and teams that "move fast and break things". Would you consider it a good fit for large teams and applications, where different people own different components and reliability is the most important metric? A lot of the accidental complexity in the "standard issue" continuous deployment process detailed in the post is there to prevent accidents and enforce permissions.


Yeah, all this stuff is actually designed for large teams (though it'll be a while before Dark is production-ready enough for them). CD matters because it allows you to deploy in small chunks, which is much less risk than deploying in big chunks.

The feature flag is a thing that you might not need if you're small, but it completely changes how large teams work. Same with DB migrations: if you have a small amount of data you can just do a schema migration and the locking won't affect you too badly.

We'll have a permissions model at some point that will take care of the ownership problem. We designed Dark to allow extremely fine-grained ownership, for example allowing a contractor access to a single HTTP route or function.

Overall, this is kinda similar to how we went about building CircleCI. We focused on good tooling for companies (eg parallelization and speed) instead of individual projects (eg build matrices).


The way to go is to have a high-trust culture and a comprehensive automated testing system.

You want developers to behave responsibly and to put all development effort into writing good, reliable code in the first place. I personally prefer pair programming over code review. Code reviews have a lot of downsides, but the two most important are that you can't get from commit to prod automatically (as there is another inevitable manual step) and that the reviewer typically doesn't have enough background/time to perform the review in a meaningful way.


Given how tightly coupled Dark is with its IDE and deployment environment, I'm interested in knowing how much VC money they have taken. What if they get acquired or simply shut down?

It looks like there won't be a migration path, and you'll need to rewrite your app from scratch using a different language. Again, some transparency around how much VC money they've taken would be nice.


For there to be 9 of them and 4 open heads, I'm guessing it's probably $10-20 million at this stage.

I think the only way I'd use this is if I had the option to host it myself i.e. the code was all open sauce.


The sustainability of Dark is super important to us. We're not fucking around with customers' livelihoods.

For our early customers, we have a handshake agreement that we won't leave them hung out to dry. If Dark was to shut down, we would 100% make sure they had the time and ability to port their app off. There would be a time cost to them of course, so it's not perfect.

As we get closer to opening up to the public, we'll be doing a bunch of work on the sustainability question so we have good answers to the "acquired or shutdown" problem. We imagine a legal framework to protect customers, but haven't got the details worked out.


A legal framework doesn't help much in the "bankruptcy" case. Unless that legal framework means all IP becomes open source and people can self-host I guess.


> When you write new code in a function or HTTP/event handler, we send a diff of the Abstract Syntax Tree (the representation of your code that our editor and servers use under the hood) to our servers, and then we run that code when requests come in.

You can achieve the same result with PHP and FTP, however once you start adding unit tests and branching, that's when things start to take time.

Long builds aren't necessarily a problem if they give you enough in return.


I am not sure people are getting these results with PHP and FTP. Sure, the build time is zero seconds. But each page load takes 100ms because it's doing the same work over and over for every request. Yes, even if you cache opcodes. We have a legacy PHP app and an "echo hello world" request takes 100ms. An extremely complicated request then takes 102ms to process. It is shockingly slow, but it is not something you will notice unless you have a lot of instrumentation, so you might be led to believe that your performance woes are caused by something else.


You must be doing something wrong. There's a lot of overhead for frameworks like Symfony, yes. But it's possible to write reasonably productive and modern website code where you can get sub 20ms responses (doing auth, fetching from db, processing and rendering html). I know because I've done that, on PHP 5.4.


Full checkout experience here, 18ms average. PHP 7.3, Symfony.


You're exactly right that this style of development would be completely unsafe using PHP+FTP, which is why we had to include all the other stuff: the feature flags, function versioning, DB migrations.


What I was pointing out is that hitting save and getting a result on the live server was achievable in the 90s; there is nothing new about it.

The reason we stopped doing that is because we wanted version control, we wanted to work in teams and we wanted unit and integration tests, which I doubt that you could run within 50ms.

> the feature flags, function versioning

I can easily replicate that with at least early 2000s PHP(would probably work with 90s PHP as well).

> DB migrations

We do have DB migrations in all web frameworks that are worth using.


> The reason we stopped doing that is because we wanted version control, we wanted to work in teams and we wanted unit and integration tests, which I doubt that you could run within 50ms.

A good way of thinking about Dark is to suppose that instead of stopping doing that, we found a way to make it work.

> I can easily replicate that with at least early 2000s PHP(would probably work with 90s PHP as well).

If you could replicate it, you'd do it. Except you need version control, unit tests, etc, etc. So you need tooling that has those things to make the feature flags work.


> A good way of thinking about Dark is to suppose that instead of stopping doing that, we found a way to make it work.

You found a way to run multiple selenium tests in less than 50ms?


You seem to be married to a particular technology and approach: I would imagine Dark deploys a versioned/feature-flagged alternative in 50ms, allows you to run tests on it, and once you are satisfied, you switch to it in "production" (even if it's already "deployed", it was not "activated" — we could argue that they are using "deployed" in a slightly different sense than usual).

Tests wouldn't be Selenium, and Dark can probably deduce which tests are affected by a particular code change (if it's close to purely functional/no side effects), so it does not need all of them run.

I generally like the idea and approach technically, but wouldn't be using them because of the lock-in.

I also worry about feature flags: for longer lived "features" (they happen!), code ends up being more complex and hard to maintain, and when the time finally comes to drop them, it's a mess.


Note that I am also guessing they are transpiling Dark to JS for the frontend, thus allowing them to use a common infrastructure for everything.


I'm getting really tired of learning new languages / stacks every few months. I want one language to write things in. It has to be a pure-OO language, and right now that's ruby. I want to edit that code with one code editor. Syntax awareness is a nice-to-have but not essential. At the moment my editor of choice is Sublime Text 3, but to extend it I need to write python, which is not ideal. When I start writing websites it'll be hyperstack.

Unless your new language can fix all my issues, be open source, and be my One True Human Interface to All Things Machine, I'm going to keep building on ruby, with occasional bash / crystal / rust excursions for speed.

Is your language pure-OO? Do you have gradual typing? Does it run on a common VM? Am I going to have to rewrite everything I'd ordinarily bring in a library for? I can accept every single one of your claims but still not be interested in learning a whole new stack because I can already deliver stuff quickly enough with the one I've been iterating on for a decade. Especially when it could end up orphaned in 3 years if it doesn't get enough adoption.

I'm not saying your holy grail can't be holy enough. But it has to be pretty darn holy.


Syntax errors are the easiest errors to fix and almost never get into production.

Zero downtime deployments are very rare, and add a lot of complexity, so it's a big trade-off compared to just uploading the new code and restarting the service. And you can have the front-end deal with minor hiccups like service restarts, because users will have such issues all the time anyway due to being on mobile networks, trains going through tunnels, etc.

You can do as much testing as you want, both manually and automatically, but you still can't detect all issues as efficiently as thousands of users in production. So just accept that there will be issues, and instead design your pipeline so that those issues can be fixed within minutes.


> Zero downtime deployments are very rare

What do you mean? Rolling deployments are a very basic feature of container orchestrators.


I don't know what he refers to. But there is a specific kind of zero downtime deployment that can restart without losing in-memory state, like Erlang's.

Erlang also provides the "fix in minutes, in production" story he refers to. For most languages, it's very hard to remotely get access to an object, let alone fix things directly.

With Erlang, you can inspect every object in the cluster to understand what's going on for any user, and fix the plane in mid-air instead of rebuilding a plane with the same issue and then going through the whole process to fix it.


You don't even need containers--it's a feature of the load balancers that containers rely on. Start the new container, wait for it to become healthy, add it to the load balancer, remove the old container (optionally letting connections to the old container close first).
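As a sketch of that sequence (the `lb` and instance objects are hypothetical stand-ins for whatever API your load balancer and process manager actually expose):

    import time

    def rolling_replace(lb, old_instance, new_instance, drain_seconds=30):
        new_instance.start()
        while not new_instance.is_healthy():   # wait for health checks to pass
            time.sleep(1)
        lb.add(new_instance)       # new instance starts receiving traffic
        lb.remove(old_instance)    # old instance gets no new connections
        time.sleep(drain_seconds)  # let in-flight connections finish
        old_instance.stop()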


what if it involves db updates?


We have run a lot of DB changes. It requires multiple steps: add a column, migrate data, update code to use the new column and the old column, update code to use only the new column, update the table to drop the old column. Similar for migrating to new DBs, tables, etc. It is rare for us to need to disable writes for a version bump or anything like that.
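That multi-step approach is often called the expand/contract pattern. A minimal sketch against SQLite, using an invented email -> contact_email rename (each numbered step would be a separate release in real life):

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    db.execute("INSERT INTO users (email) VALUES ('a@example.com')")

    # 1. Expand: add the new column (cheap and non-blocking on most databases).
    db.execute("ALTER TABLE users ADD COLUMN contact_email TEXT")

    # 2. Backfill: migrate existing rows (batched in real life to limit locking).
    db.execute("UPDATE users SET contact_email = email"
               " WHERE contact_email IS NULL")

    # 3. Release app code that writes both columns, then a release that reads
    #    and writes only the new column (application deploys, not SQL).

    # 4. Contract: drop the old column once nothing references it.
    #    (DROP COLUMN needs SQLite >= 3.35; other databases have long had it.)
    db.execute("ALTER TABLE users DROP COLUMN email")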


Where I work we had that same thought. Until we stopped doing breaking database changes.


If you aim for zero downtime there will be a point where the complexity works against you. For example: the load balancer goes down, corrupted state due to two different versions running simultaneously, a bottleneck causing congestion, etc.

My favorite "load balancing" for web pages is using "DNS round robin", where you add several A records and the browser automatically tries another IP if the request fails.


> ...the browser automatically tries another IP if the request fails.

The browser tries another IP if it cannot connect to the first one.

If a server throws 500s left and right, DNS RR won't help, whereas a load balancer would remove the offending server from the pool.


The browser does not try another IP if the request fails; it tries the first one only (the DNS server chooses a random order for the response), unless browsers have changed their behavior recently.

DNS round robin is not failover. If one of your four A records points to a dead box, 25% of your customers are SOL.


https://en.wikipedia.org/wiki/Happy_Eyeballs

Browsers even try other IPs and alternate between IPv4 and IPv6 before the connection is established.


I know all about this, but most websites only support v4, and I don’t think this is used in any way for choosing which A record to use.


Most websites don't have redundant servers, but it's still there and used by a lot of big sites.

The whole point of it is to connect to multiple A and AAAA records with a very short delay (250ms in Chrome). The browsers use poll() to get notified about the first established connection and then use it. So if one A/AAAA record's IP is unreachable or slow, it will use another one. There will be no fallout (except maybe for clients that don't use something like Happy Eyeballs, but any browser does) if you have multiple A/AAAA records and one of them is down.


It's supported in IE, Chrome and Firefox; haven't tried Safari and others. They will wait until timeout, then try the next IP. IE will give up if the second IP is down, while Firefox and Chrome will try many IPs.


DNS round robin provides redundancy, not load balancing (though it ipso facto does somewhat evenly distribute requests). You really need an application aware load balancer that can detect http status codes and route around failures at the very least.


Zero downtime deployments are not the same as zero downtime generally.

We deploy to production several times a day. We have an in-house product that our own employees use, and they absolutely love that they can ask for a feature in the morning and get it in a few hours (in the best-case scenario). That wouldn't be at all feasible if all those deployments caused errors for users. Luckily that's not a problem, because these days it's really easy to get zero downtime deployments with all these containers and "serverless" functions and whatnot.


> You can do as much testing as you want

With what Dark is promising, I believe testing would be easier. Deploy to a test environment in 50ms. Then run a suite of tests, then deploy to production in 50ms.

There could be some complementary tech that figures out what tests need to be run based on what you change to get the time down.


They're kind of lying.

You only need pip or any package install if your testing image doesn't pre-install the required packages before running the tests. Which I don't do myself, since I always want to see if my packages work with the latest versions of other packages - and Dark wouldn't be able to handle that anyway.

So my test runs actually complete in about 50 ms if I'm not testing any external API functionality. They will also never finish earlier than 5 minutes if I am, even on Dark - since they're based on external services.


Hrm... we don't deploy code into prod in 50ms, but we certainly deploy to dev/staging/testing in <1s, and promote to prod once we are 100% sure we want to do so. We use the boring, industry-standard tooling and process everyone else uses (Jenkins, k8s, Ansible, nothing bleeding edge). I'm unsure what advantage their product has that makes it stand out among the battle-tested tools already in use. Would someone elaborate?

disclaimer: working for a l2 network provider, and a mistake costs us infinitely more than an "Oh we're sorry!" can fix.


The point is that all that stuff (Jenkins, k8s, Ansible) is accidental complexity that, I presume, someone at your company has to or had to think about. I think Dark's angle is that it's stuff you shouldn't need to think about in order to solve the actual problem, and that removing as much of it as possible is better.


Okay, I mean it's conceptually cool, but it's not like an existing startup is gonna throw away all their code and rewrite it to save on dev-ops time.

I guess the value prop is that if you're starting from scratch, and want to write your whole app in Dark then you can avoid hiring a dev-ops engineer and iterate faster?

Seems like a stiff trade-off. If it's an interpreted language you're coming from, setting up envs can be < 30 secs anyways (and same with prod deploys), so that's faster than you need.

If it's a compiled language you're coming from (e.g. scala/c++) then it's likely Dark may not match your needs.

Plus I guess this assumes every team in your company is using only dark? Like what if it's a SOA and a full environment has a node component?


How coupled is Dark with the Dark language? I understand that the purely functional nature of the language makes its integration with the rest of the platform easier, but the blog post doesn't go into much detail about that. What would it take to integrate Dark with: a) another (existing) purely functional language, and b) a general-purpose language, e.g., Go? Could a code intelligence tool like Sourcegraph help?


Extremely tightly coupled. That's kinda the point: "what can we do if we don't have to couple Dark to anything else?" (eg editors, other languages). I can't imagine how you'd couple Dark to another language, but I also don't see the point?


Thank you. My point was risk mitigation, which I discussed in another reply (which you answered: https://news.ycombinator.com/item?id=20399267 )


Yeah, the bigger picture here is that there's a ton of valid reasons why people can't use Dark right now. Of course there is, we're a small team with an early product. Once we show that this is really a better/faster/easier way to make backends, then we can start working on handling the risks and constraints of more teams.


That makes sense. It does look like you're very close to that first objective though. Congrats! Your blog post very clearly describes what Dark actually is (which was still unclear before I read it).


Twenty years ago, deploying a new version of a web app meant FTP'ing a new php file to the server.


Which wasn't a bad thing after all. I've had more incidents with deployment pipelines than with the actual code.


Mostly because people forget that pipelines are code and need documentation, tests and software engineering practices just like any code.


Well, you're free to deploy like this (but you probably know why it's a very bad idea).


Feel free to explain why


Auditing what was changed and which version is currently in production is a bit harder than it has to be.


Copying of multiple files is not atomic. You need to ensure a file with func1 exists before another file calling func1 gets copied. You can do it atomically by having 2 directories and swapping a symlink, but... that's a scriptable pipeline at that point.
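The symlink swap, sketched in Python (the release path is invented):

    import os

    def atomic_release(release_dir: str, current_link: str = "current") -> None:
        # Build the new symlink off to the side, then rename it over the old
        # one. rename(2) is atomic on POSIX, so a request dispatching through
        # `current` sees either the old tree or the new one, never a mix.
        tmp_link = current_link + ".tmp"
        os.symlink(release_dir, tmp_link)
        os.replace(tmp_link, current_link)

    # Usage: upload everything to releases/v42/, then flip the pointer:
    # atomic_release("releases/v42")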


Yet another reason is that FTP turned out to be inherently unsafe.


SFTP, however, has FTP feature parity and is as secure as SSH (since it is ostensibly SSH).


I've been seeing a lot of talk about darklang and can't find a single example of code. Can someone at least show me what it looks like?


from darklang.com:

"Dark is currently in private alpha."

That's probably the primary reason why there's little to no info.


They are leaving us in the dark, so to speak.


It feels like an ICO launch in that regard. Kind of exciting!


Back in the day (early 00s), you could do something similar with ZOPE: develop on a production server in a private session where you could muck about with your version of the code, and when you were satisfied, push it through for everybody else. This was possible because the database that was holding the users, permissions, ... was also holding the code. It was lightyears ahead of J2EE at the time. (Now I wonder what happened to it)


Former professional Zope guy here! (I made a living building things with it in the early-mid 2000s).

What happened to Zope? Well, Chris McDonough (creator of Pyramid, and Zope veteran) blogged about this in 2011 and from my perspective he got it exactly right. I still think this history is fascinating and I wish more people knew it.

https://web.archive.org/web/20161019153348/http://plope.com/...

After the history lesson, some of his "lessons learned" bullet points seem very apropos in the context of Dark, particularly the first two:

* "Most developers are very, very risk-averse. They like taking small steps, or no steps at all. You have to allow them to consume familiar technologies and allow them to disuse things that get in their way."

* "The allure of a completely integrated, monolithic system that effectively prevents the use of alternate development techniques and technologies eventually wears off. And when it does, it wears off with a vengeance."


Thanks for pointing to that, it's super interesting.

One of the things we designed Dark around was being able to safely and easily continuously deliver language and framework changes. The Zope 3 project was a disaster by the sound of it, and I would hope this will allow us to avoid a similar fate. That's the goal.

Also, we're pretty pro-testing (though also type systems, which help too).


I know that Plone, which is built on Zope, found its place in a lot of universities and governments. So I imagine Zope lives on with necessary support from its big users.


Yup. Plone is still pretty active https://github.com/plone


So... you're 50ms away from a total global outage and/or major feature fault?

Why would I want that? How do I control it for green-blue? For regional isolation? Is there native support for canary releases?


> So... you're 50ms away from a total global outage and/or major feature fault?

And 50ms away from restoring service.

The degree of care that needs to be put into a release is related to the potential damage a bad release can cause and the time to repair.

Reducing time to deploy reduces time to repair for most errors (though certainly not for errors that write bad state: if you write bad data to your database, that's most likely a long repair regardless of how fast the deploy was).

If you're building NES cartridges, you should really get it right the first time, since there's no way to update. If you're making a low cost internet service, for small changes, it's not unreasonable to try things without a lot of testing before hand as long as you commit to detecting and rolling back quickly.


> And 50ms away from restoring service.

Sure, but fast rollbacks are much easier than fast rollouts.

> Reducing time to deploy reduces time to repair for most errors (though certainly not for errors that write bad state: if you write bad data to your database, that's most likely a long repair regardless of how fast the deploy was).

A uniform fast rollout is almost always worse, and a separate problem from data recovery or fast rollbacks. But I agree with this paragraph in principle.

> If you're making a low cost internet service, for small changes, it's not unreasonable to try things without a lot of testing before hand as long as you commit to detecting and rolling back quickly.

When you're doing more than basic presentation of semi-dynamic content, most CSPs make multi-region deployment very affordable and easy. It's generally a failure of early engineering departments or an accumulation of technical debt that makes it otherwise in 2019. Too many folks assume it is something they should do later rather than get right to start.

Canaries and regional redundancy should be part of every product's basic launch checklist, imo. People love to say any kind of software engineering best practice is bad, because they love to pretend software engineering has no value. But for every flash UX trick or tracking feature someone ships providing value, that's completely overshadowed by being down when a user needs you. Reliability not only saves you time at the developer level by making features more predictable, but it means you need to worry slightly less about funnel optimization because your user attrition is lower.


Yeah, because speed of deployment prevents CI/tests/checks /s

(Seriously, how do people come up with such facile counter-arguments?)


How does CI actually prevent these kinds of outages? Did CI save Google from the networking outage or tests prevent a global Cloudflare outage? Do you think those organizations don't spend enough energy to power the average SV commute on test runs per release? Are you seriously going to pretend that those efforts "solve" reliability and stability in any non-trivial system?

It's actually not hard at all to write fast software deployment. Many people do it accidentally. But to tie it directly to editors and stovepipe the infrastructure around testing, that seems to me like a great business model that any rational engineer interested in reliability should immediately reject unless it's coupled with open source, statistically valid canary tooling at the bare minimum.

What you actually want at that degree of universal knife-switching is rollbacks.


> Did CI save Google from the networking outage or tests prevent a global Cloudflare outage?

This is not a fair statement. This is like saying "Did seatbelts save the victims who died in car accidents X/Y/Z?"

Just because your tests/ci/etc. can't catch every single issue that could go wrong doesn't make them useless. Just because people die every single day in car accidents doesn't mean car's safety features are useless.

You're correct in that a CI should not in any way be considered a "perfect solution for outages", but an automated system that tests, deploys, etc. in a predictable and (again) testable manner is at least helpful.


> This is not a fair statement. This is like saying "Did seatbelts save the victims who died in car accidents X/Y/Z?"

Actually, your refutation is precisely what I'm trying to say! The GP implied that CI and UTs fix these problems. They don't. In the same sense that seatbelts and air bags may reduce the chance of a fatal accident, they do not replace safe driving, well-trained drivers and intelligent infrastructure design.

> but an automated system that tests, deploys, etc. in a predictable and (again) testable manner is at least helpful.

I didn't say otherwise. In fact I agreed those measures should be in place. They're just not sufficient. You will still face bad releases even if you do this, and so you shouldn't do fleetwide rollouts unless you just don't care about your system falling apart.

Please don't misrepresent my comment.


>Did CI save Google from the networking outage or tests prevent a global Cloudflare outage?

Yes, multiple times. It just didn't help with each and every outage, which is a different thing.

For the ones not preventable by CI (e.g. a disk was borked, power was down; no integration/programming error that a CI process could possibly spot), the question is moot.

Besides, the concern about outages is totally unrelated to "speed of deployment".

Increased speed of deployment helps testing and iteration (obviously faster deployment speeds also means faster deployment to testing and staging environments, not just to the final server).


> Yes, multiple times. It just didn't help with each and every outage, which is a different thing.

So then maybe a prompt global rollout is a bad idea even if you have good tests and coverage.

> For the ones not preventable by CI (e.g. a disk was borked, power was down; no integration/programming error that a CI process could possibly spot), the question is moot.

I'm here to tell you, as someone who has been at this for a long time and someone who is actually doing SRE at Google at the outer layers: No. It matters. We do canaries because they let you find problems no reasonable person could catch with tests. Tests themselves are only one component of a chain of reliability best practices, starting at coding practice and expressive type systems with static checks and moving outwards to deploying the code carefully and with intentionality.

> Besides, the concern about outages is totally unrelated to "speed of deployment".

This is obviously false. Slower rollouts of new software and canaries are a proven way to reduce the impact of a bad push. Once your system gets modestly complicated, it can be very difficult to spot a bad push with naive unit testing which (let's be real here) often just mocks out dependencies anyways.

> Increased speed of deployment helps testing and iteration (obviously faster deployment speeds also means faster deployment to testing and staging environments, not just to the final server).

Right, but very fast deployment to client machines is not hard. The CD model is just inherently dangerous unless it has automated safeguards against the complexities of even a modestly large production app.

Personally, if your staging environment is CD I think that's great (up until your org gets big enough that a broken build pisses off other devs). But to prod? I will resign before I let you do it. Everyone who does software at scale agrees it's not a good plan to do CD on prod. There are whole books about why you shouldn't do it.


>This is obviously false. Slower rollouts of new software and canaries are a proven way to reduce the impact of a bad push.

Non sequitur. This confuses "speed of deployment" (which is a technical issue) with "slower rollouts" (which is a devops choice).

You can have 50ms (or generally very fast) deployment and still have canaries, slow rollouts, incremental rollouts to smaller audiences, and so on. Those are orthogonal choices.

The slowness of deployment that Dark claims to solve (and which is good for any project to solve) is about the non-essential slowness because of convoluted processes, too many moving parts, accidental complexity, and so on.

>Right, but very fast deployment to client machines is not hard.

You'd be surprised.


> You'd be surprised.

No, I wouldn't.


From what I gather, Dark's deployment "unit" is the entire AST covered by a function, and these deployment units are versioned and promotable via independent criteria.

Perhaps one of the promotion strategies could involve human-controlled criteria.

I would like to understand their plans on this point a bit better as well.

I see immense power in switching a promotion strategy away from large units, like entire containers or versioned artifacts, to fine grained components, like versioned functions.

Do you see any advantages here for improving your platform's availability?


Author here. Was this in response to the article or the title? Because the article goes into great detail about this.

To answer, the whole system is based on feature flags (which are equivalent in this case to green-blue/canary releases). So yes, there is a massive amount of control for this; in fact, that's the whole point of the article.


Feature flags are not canaries. I checked the article twice, even did a find on the text of the view source because sometimes I miss stuff.

You just didn't talk about the actual complexity of deployment. And that's your call, but as you can see from this thread there are people who think CI/CD to prod with no controls is a good call.


(Also an SRE at Google)

From my reading of the article, the way that Dark is intended to work is that the language and environment enforce that changes to code accessible in production cannot be made without putting the change behind a feature flag. Feature flags can be written so that flag flips are rolled out gradually - to whitelisted users, a percentage of users, etc. (I'd be interested in how precisely the set of requests a feature flag applies to can be specified.) Writing a tool to instantly revert the feature flags to their state at a previous timestamp is straightforward in principle (see the sketch below). And presumably there'd be no need to write separate configuration files in a pure Dark environment, so changes to configuration would also be feature flagged. In this case, there would in theory be no need for a separate canary process, since the feature flags themselves can be canaried. I just hope that Dark will encourage teams to canary changes rather than have "easy" defaults of "cowboy deployment to prod".

Dark would probably make it very easy to roll back the simple changes that result in exceptions being thrown on every request. All the most interesting outages that I've seen have involved emergent interactions that put the system into a state unrecoverable without manual intervention. It's unclear how Dark would help with that from my reading.
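A toy sketch of that "revert flags to a previous timestamp" idea, assuming flag changes are kept as an append-only event log (all names invented; this is not Dark's actual mechanism):

    from collections import namedtuple

    FlagEvent = namedtuple("FlagEvent", "timestamp flag value")

    # Append-only log of flag changes: rollback becomes a query, not a deploy.
    LOG = [
        FlagEvent(100, "new-checkout", 0),    # flag created, dark (0% of users)
        FlagEvent(200, "new-checkout", 10),   # canary at 10%
        FlagEvent(300, "new-checkout", 100),  # full rollout
    ]

    def flags_at(timestamp):
        """Reconstruct the complete flag state as of `timestamp`."""
        state = {}
        for event in LOG:
            if event.timestamp > timestamp:
                break
            state[event.flag] = event.value
        return state

    def revert_to(timestamp):
        """Append events restoring every flag to its state at `timestamp`."""
        now = LOG[-1].timestamp + 1
        for flag, value in flags_at(timestamp).items():
            LOG.append(FlagEvent(now, flag, value))

    revert_to(250)                 # back to the 10% canary state
    print(flags_at(float("inf")))  # -> {'new-checkout': 10}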


Thank you! Yes, your understanding is correct and your explanation is lovely. Much appreciated.

A thing to clarify is that you're not really deploying code in 50ms. It's more that all the code you write is in production, behind a feature flag. But you can't change unflagged code and you also can't change the old code; you can only set a flag so that some users see the new code.


Isn't this better than deploying for 15 minutes, having an outage then, and then rolling back?


Not if that 15 minute deploy rolled out over a canary that gives me much less of a chance to actually have outages, yeah.

Knife-switch rollbacks are good though, tbf. I just don't think that's particularly challenging.


How does it compare with hot swapping in other languages and platforms like Erlang, JVM/.NET hotswap technologies? I'm guessing the built-in safety functionality from the feature flagging is one thing?


It's much simpler. Those have complex in-process hot swapping. Dark's "hot swap" is a write to a DB, then new HTTP requests use the new code.

As we grow and start to look at compilation and optimization, I think we'll probably get much fancier, but for now we're very into keeping things super simple.
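A toy sketch of that "deploy is a DB write" model (in Dark the stored value would be the program's AST; here a plain response string stands in, and every name is invented):

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE handlers (route TEXT, version INTEGER, body TEXT)")

    def deploy(route, body):
        # The whole "deploy": insert a new row. No build, no restart.
        row = db.execute("SELECT COALESCE(MAX(version), 0) FROM handlers"
                         " WHERE route = ?", (route,)).fetchone()
        db.execute("INSERT INTO handlers VALUES (?, ?, ?)",
                   (route, row[0] + 1, body))

    def handle(route):
        # Each incoming request reads the newest version.
        row = db.execute("SELECT body FROM handlers WHERE route = ?"
                         " ORDER BY version DESC LIMIT 1", (route,)).fetchone()
        return row[0]

    deploy("/hello", "hi from v1")
    deploy("/hello", "hi from v2")   # the 50ms "deploy"
    print(handle("/hello"))          # -> hi from v2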


I'm pretty interested in this. Signed up to the mail list. Good luck!


Marketing piece with grand plans... not really.

I would argue that this is not such a priority for a software development organization; there are more important things to do.

You got so excited... so fast. I think this community may be doing more harm than good for me lately.


This sounds extremely exciting and it looks like a lot of thought has been put into this. I’m looking forward to trying it out.

A lot of comments here are coming from an anxiety that is caused by many years of dealing with fragile deployment processes, but I like the boldness of Dark to just bypass all that and make the fundamentals safe and solid.


I added the same feature to my Java app server: http://github.com/tinspin/rupy - it's a simpler design that compiles, zips and hot-deploys over HTTP to a classloader.

But it's fast enough as long as you keep each project small.


The problem is the granularity of deployment - the smallest piece you can roll out, these days, is a container with code and dependencies, which has to be built and threaded through your CI/CD stack before it hits prod.

I like to imagine what would need to happen to reduce that granularity to a single function - you change the code, byte-compile it, send that blob to your backend’s hot-code-swap port, and voila - you’re done. A lot has to happen behind the scenes though: a change in dependencies would still require a larger deployment or a much smarter code-swapping procedure, the caller/callee tree would need to be considered, functions need to be pragmatically pure (talking to a DB is fine; being part of an “object” - not really), functions need to be extensively specced, etc.

But that would be my backend holy grail.


What are you looking for, that isn't provided by platforms with hot code-swap (e.g. Erlang) or FaaS systems (e.g. AWS Lambda, OpenFaaS) ?


you're thinking of only surface functions/endpoints, those that are invoked directly from your api route dispatcher. i'm talking about any function in the source code of a large backend with many endpoints.

think of function `+` - i want to be able to deploy a change to that function such that no worker process gets respawned.

also imagine every time you call `+` in your application - it goes through the whole FaaS shenanigans with serializing, sending rpc over wire, receiving the response, deserializing, dealing with throttling, errors, backoffs. all of that is ridiculous for `(+ 1 2)`.

and yes, erlang is a good example of an environment where this could work.


Code deployment is almost always trivial and fast. It's the databases that are slow and very tricky, especially in distributed systems.


Very curious about the data migration piece. It sounds like it's designed to handle mapping a row of type a to a row of type a prime. What about situations like a specific combo of types a and c going to b and two different records of type c, and optionally deleting a record of type d? We occasionally have migrations that are only representable as a function of several entire tables, and which may even leave gaps, e.g. for domain events that affected the system but weren't logged until a later revision.


EDIT: Ha. My apologies as I just realized my stupidity - I was browsing on mobile and it turns out I jumped into the 'What is Dark' post [1] and never finished the original! The mailing list links are in that one - I subscribed. This is a lot more interesting, cheers :)

[1] https://medium.com/darklang/the-design-of-dark-59f5d38e52d2


The section that describes how it does it in 50ms is called "#deployless". That describes the mechanics of the deployment. How we made deployless possible is addressed in the rest of the post after that.

(The mailing list thing is weird, I only see that link at the bottom. Can you point me at the ones you see in the text?)


Very interesting, but I don't like the idea of abandoning VSCode to learn some single-purpose IDE


Dark sounds like it would satisfy my preferences well.

- statically typed, functional language with no null pointers

- safe deploys and rollbacks

- structured editing / version control

- minimal infrastructure management

...etc. A tool that makes writing a correct program, and deploying it without catastrophe or fuss, easy. I look forward to reading more.


Author here - happy to answer any questions!


I see quite some similarities with low-code systems, 4GLs, Smalltalk, etc., so AFAIK the market is there... Please take into account that your target audience probably is not on HN. If I were you, I would target web agencies, which can use the Dark platform as a competitive advantage if your system allows them to build stuff faster... IMO it does not make sense to learn all these new paradigms for a small project; you really need scale for this kind of investment from your customers... Right now it feels like you are building a solution that does not know which specific problem it is solving...


Dark sounds like an interesting concept.

I seriously hope though that you will open source the compiler.

I can't see Dark going anywhere without a guarantee that you can use it without using the provided infrastructure and that your code will work fine even if Dark goes away or you decide to host it yourself.

I certainly won't touch Dark with a 10 foot pole otherwise. And would advise anyone against it strongly.


Other languages have large ecosystems of packages that allow users to do common tasks quickly. I can imagine not having something similar would be a sticking point to adoption; how does Dark picture the Dark ecosystem working and competing with other already well-entrenched ecosystems? Thanks!


There's a couple of ways of handling this. First is that of course we need to create a package manager and ecosystem, and that will take time (we can accelerate it by making a great experience, but haven't gotten there yet).

Dark can talk to other HTTP things, so many of the functions that are done by libraries can be done over HTTP. So you can call other services, even if we don't have an SDK for that yet. And from there it should be easy to package it up so there is an SDK.

Finally, it's a question of what you're doing. There's many tasks where the disadvantage of not having an ecosystem is outweighed by Dark's other advantages. And as we grow the ecosystem, the disadvantages go away, unlike the status quo tooling.


Hi Paul,

How did the decision to create a whole new language take place? Was using existing languages a serious consideration and if so, what were the trade-offs that made you go in this direction?

I'm familiar with the basics of back-ends. Let's say you had a language that compiles each back-end endpoint into a binary executable (each API endpoint is a separate program). All you'd need is versioning via directories, symlinks to the active most-up-to-date version, and a router that looks up the executables in a dir and routes to them.

Making changes would involve changing the code, pressing "deploy" which pushes to github and gets the backend to pull, re-compile the binaries and update symlinks.

Given this setup - what does your approach do that's more than what I've just described? I'm assuming there's some fundamental problem my naive approach does not handle that made you go in the direction that you've just described.


We did of course consider it. I don't think it's possible to use existing languages to do this.

Presumably, we'd have to use Javascript as our basis cause everyone knows Javascript. But then we'd have people wanting "Dark for python" and "Dark for Ruby" and "Dark for rust", etc, which making a new language neatly sidesteps.

So if we built it for javascript, we'd need to decide how to support all of JS's features. How do we support eval? How do we support Typescript and Flow and ReasonML?

Then we'd be stuck with all the JS mistakes: promises and futures and callbacks and object model and null/undefined/0/"" comparison tables.

This list keeps going: it would have to use a parser. It would need to support npm, etc, etc.

I expect once people see Dark that they'll try to build similar features for existing languages, and I hope they succeed! There's some companies that are doing similar things, and I think the scale of their vision is much lower, partially because they are limited by language choice.


Hm, your answer is not what I was expecting :) Some follow up questions if you are up for it:

What are the fundamental features of Dark that are not possible to add to an existing language?

What are the fundamental features of an existing language that are not possible to restrict via enforcing a subset, that are fundamental to Dark?

Is Dark a very domain specific language?

Perhaps these questions will better be answered by a demo that is coming in September.


> What are the fundamental features of Dark that are not possible to add to an existing language?

An obvious example is the structured editor. To do that, we have to have a language in which there is a meaning to every incomplete program. Other languages don't have that - they just have a syntax error.

Our editor shows you what we call "live values" as you type. If you've seen lighttable, it's a similar idea. We save real values from production and use them in a sorta debugger. It's really wild, makes understanding code really easy, and debugging too.

How would we handle that in JS? No idea; it might be possible by hacking into v8. But even then, we actually needed to change the language semantics and require all functions to be marked as "pure" or "impure". And then we discovered there's a 3rd state: "non-side-effecting but we don't have a version of it in the client". And then a 4th state: "not idempotent but we can run it without side-effects".

I doubt there's anything that _couldn't_ be done in JS or ruby or python, but what's the point in fighting that battle? By controlling the entire thing we avoid all this.

Same as why we don't support vim. Could we make all the cool stuff we're doing work in vim? Sure, with a massive massive effort. Blank slates work better for this.


What is Dark’s biggest weakness (excluding the current lack of adoption)?


Wow, what a start :) Probably the biggest one is that you have to leave your editor. If you've spent 20 years carefully configuring your vim or emacs setup, you're not going to be happy to give that up. Don't have a good solution to this right now either.


Over all my years it's become clear to me that this attitude of refusing to give up a specific tool (and it's always something married to plain 1d text) is the biggest thing holding our industry back.

I don't know if you can convince people to do it, but I'm glad somebody's trying.


I think people don't want to leave their editor because navigating and typing code is a large part of programming. If you try some new editor, it's always going to be pretty wonky. For example, I've been using the Arduino IDE recently. I am used to navigating lines with C-n and C-p. It's something I do thousands of times a day. In the Arduino IDE, those keys create a new sketch (something I want to do maybe once a week) and print the current sketch (something I imagine nobody wants to do ever).

The UI is just not thought out at all. I don't think I'm being resistant to change. I think their IDE is just garbage.

Some people have done new IDEs well; VSCode comes to mind. But replicating 40 years of features is not going to happen overnight, and some of those features are actually important. Figuring out which ones is the key. If you don't hit a developer's critical set of features, they aren't going to use your thing.


I'm glad someone's trying also.

And the reason 1d text is hard to get rid of is not merely an editor (ie, an "app") but a galaxy of languages, tools, and operating systems built around 1d character streams. Everything from the serial driver to the style sheet can be manipulated with primitives for mapping, filtering, and composing character streams and operations on them.

There ARE a bunch of benefits to structured code; for example, Docker and npm/pip/etc have recently shown us a hint of what composition of standard components can bring.

What might be missing from structured code is the rest of the primitives on ASTs. Once we have all that we can talk about backporting to emacs :)


All code is structured, otherwise it couldn't be compiled/interpreted. The Go community, for example, doesn't need to store their code in some sort of custom binary format to apply changes: gofix works fine with plain text. Text is just a serialization format for those ASTs, no different than any binary format.


What do you see that we could improve if we only left text behind? I've read the promise many times, but never actually saw any convincing reasons. Text is just another format anyway.


One editor setup cannot satisfy all of the diverse preferences of all "religious" editor users. You have to support significant customisation to "convert" us.

Maybe the editor should be customisable via Dark itself?

Alternatively, allow us to use our editor-of-choice and provide a language server to facilitate the creation of structured editing plugins for Dark.

I would add that most applications, including most non-vi-or-emacs editors and web browsers, do not allow much customisation of how their input is mapped to actions in the software. For someone like me, a person that is a vi die-hard because of its ergonomics, this is absolutely critical. Post-input customisation, no matter how advanced, cannot satisfy this need.


> Maybe the editor should be customisable via Dark itself?

That's the plan!


Haha, figured I’d get in early with a hard question. I signed up a long time ago out of curiosity; it’s been a while since then. Looking forward to trying it out.


I am not able to sign up for the beta with an organization email address. Is this intended?


No it's not! Sorry about that. Email me direct (see my profile) and I'll try to fix it.


Is it possible to write code in an offline environment (e.g. on a plane)?


Yes, we'll have support for that, including handling sync conflicts when you get back online. It's not gonna be a great experience since Dark is really really designed for working on production, but it should work ok.


Very interesting, and it flips most everything standard about CI and CD on its head. I am disappointed the blog post doesn't include any screenshots or actual examples of the Dark code/editor working. It's kind of hard for me to wrap my head around how this works at a technical level just reading about it... I would like to see some concrete examples. I went over to the website to look for a demo/examples but there isn't anything.


As an aside, that big list of tasks and diagram at the end are perfect to show next time a client exclaims about how long it takes to get a simple change into production!


They show very little; I'm anxious to be able to see it.


We're talking about the design of Dark in these posts. We plan to show what Dark looks like in our launch in September.


I think it is fantastic that this kind of innovation is happening. I get sick of waiting for stuff at work, some kind of build or CI thing. npm i, docker build or whatever else. I hope the language experience is good, and it's a reasonable language (static typing!).

It sounds like taking the Go idea of quick builds to the next level.


What are the server/OS/cloud prereqs for Dark?


There are none! We run the infrastructure in the cloud (we run it on GCP using Postgres and k8s, but will likely go multicloud at some point).

Most of the advantages we discuss in this post come from the fact that you're using the Dark editor, language and infra. I don't believe this would be possible with a self-hosted solution.


So anyone using Dark is completely locked-in, to the point that they can't even take their code into an editor other than yours?

It's a proprietary language, that can only be edited in a proprietary cloud-based editor, that only runs on proprietary infrastructure?


Yes, that's correct. There are lots of downsides to this, but you also get a lot of things that we don't think are possible without it.


I understand the concept, but I don't think the downsides can be so easily passed by. Some quick thoughts/questions:

- What happens if Dark-the-company disappears? It doesn't matter how. You're acquired by Google and shut down, your office is hit by a meteorite, whatever. Everyone that's ever built anything using Dark would completely lose it? Every company that's built anything using Dark would have nothing remaining in its place except--at best--some code that can't run anywhere?

- How much venture capital have you taken? Do you expect to take more?

- What if you update the platform and it introduces behavior changes or bugs into some of your users' applications? Can they rollback and stay on the previously-working version of Dark (indefinitely), or will they have to rely on you to resolve the issues?

- The first mission of the company that you list on your Values page is "Democratizing Coding". How does making literally every aspect of Dark completely dependent on your company support that? Will Dark users vote on company decisions? How can a completely centralized system with a for-profit owner promote democratization?


> What if you update the platform and it introduces behavior changes or bugs into some of your users' applications?

They can fix it in 50ms! Don't worry!


These are the downsides I'm talking about. If you're sensitive to them, don't use Dark! That's totally fine with us. We're creating a technology with a ton of advantages that don't exist elsewhere, and that comes with different tradeoffs.

If you're worried about a meteor hitting SF (and I know that's a metaphor, but let me run with it), then your risk profile indicates you won't be using new technology from a new startup (you probably won't even use less risky stuff, like Rust or Elm).

If every technology had exactly the same constraints, then we'd be stuck with those constraints forever. This isn't the _right_ way to do it, it's just our vision of how to solve the problems with coding.


It's a hard sell. Another risk is that, after I've invested a lot in Dark, you increase your prices, and I can't switch because I'm dependent on your product. The switching cost would be rewriting my entire code base (my biggest investment).

I'd like to have the option to reuse my code to mitigate those risks. Two possibilities: a) If the Dark IDE and infrastructure were compatible with a common language, I could use "regular" build and deploy tools if I wanted (I'd weigh the pros and cons vs. your price increase), or b) if the Dark infrastructure had an open source API (not necessarily an open source implementation, but like what SQL is for databases, or the S3 API for object storage), I could implement my own alternative infrastructure or shop for one.

(a) seems difficult technically and (b) means risking commoditization on your end.


> Another risk is that, after I've invested a lot in Dark, you increase your prices, and I can't switch because I'm dependent on your product. The switching cost would be rewriting my entire code base (my biggest investment).

This would also kill the company.

We are actually looking at ways to mitigate this risk, as we do agree that being locked in is a real risk. We also have to provide sustainability to the business, as Dark failing as a business isn't helpful for our customers either. We're thinking about this, but nothing concrete yet.


> If you're worried about a meteor hitting SF (and I know that's a metaphor, but let me run with it), then your risk profile indicates you won't be using new technology from a new startup (you probably won't even use less risky stuff, like Rust or Elm).

no. they're both open source for starters.

furthermore, even if all developers working on these projects vanish from one day to the next, you'll still be able to use the last released version and will have a maintenance window to deprecate the software.

if Dark goes away, you'll be gone on the same date.


You're wrong. Everything is possible as open-source. Often the lock-in is not an option. Not at any price. I hope you realize that soon.


”You’re wrong” is a sure way to get people to just walk away from a discussion.


There's no need for a discussion when stating facts.


Is stating facts not a form of discussion? ;)

I'm an open source fan myself, but given that AWS has a penchant for lifting concepts from open source projects and using it to crush them, and given that AWS is one of the primary targets here, it kinda makes sense to opt for proprietary. Especially at this stage.

Just my two cents.


And that's why Kubernetes is so popular today - nobody likes the AWS lock-in.


I’m sorry, but the fact that AWS is choosing open-source tech (even if it’s to compete with and crush them) is a rather compelling argument for open-source tech rather than against. This lets AWS say that you could still walk out and keep using your code (and data) in another infra, at least in theory.


This is similar to our thinking at the moment.


Yes.


>> Most of the advantages we discuss in this post come from the fact that you're using the Dark editor, language and infra. I don't believe this would be possible with a self-hosted solution.

What makes it impossible specifically?


Is Dark intended to be a Lisp? I'm surprised by the fact that the team didn't include language examples (from what I can gather)


No, it's an ML. (You're right that we haven't published samples, we think people get distracted by syntax and we want to focus on the goals at the moment)


Oh cool! That makes me quite a bit more interested.


Can I take Dark out for a spin?


Right now, we're in private alpha. If you have a product or service that you want to build, fill out https://darklang.com/subscribe and our CEO Ellen will get back to you to discuss and schedule an onboarding. We're not doing tire kicking yet though :)


Entered my email a while ago... just updated the mailchimp profile (daniel@orthly.com) to include my use-case. Let me know! Would love to provide feedback - what you're working on sounds awesome.


One question: Security


That's quite a dark name for such a bright idea.


It sounds like a convoluted version of Geocities.



