Deterministic simulation testing for our entire SaaS (warpstream.com)
203 points by wwilson 10 months ago | 26 comments



This is related to Antithesis. Here is the thread on the original announcement:

https://news.ycombinator.com/item?id=39356920


Off topic: WarpStream's calculator on the pricing page is pretty cool: https://www.warpstream.com/pricing

That breakdown switch is a lovely touch.


It’s nice functionality, but the UI is badly broken for me on mobile.


This is so, so cool. Basically the holy grail for a distributed systems engineer. Like the author, I've also avidly consumed every Jepsen report, but the effort of actually implementing Jepsen tests for my systems always seemed too high.

Very excited to see this technology democratized and made available to more companies!


This is quickly becoming my favorite technical blog. Congrats Richie and Ryan. I didn't fully understand Antithesis the first time I ran into it; now it makes sense.


Hey WarpStream folks… does your blog have an atom/rss feed?



Question from another field that does a lot of simulation: why is deterministic simulation testing, rather than something stochastic, asserted to be the gold standard?


[ I work at Antithesis ]

Concurrent/distributed system bugs can be really finicky because they may depend on subtle timing conditions to manifest. So you might see a bug once, then try to re-run the test using the "same" inputs, and the bug doesn't appear a second time. This might be because e.g. threads aren't scheduled the same way as before, so some 1-microsecond-wide window of vulnerability for a race condition was missed. If you can't reliably reproduce the bug, it's much harder to study and fix.

Determinism lets you perfectly reproduce the bug as many times as you want. Perfectly as in: exactly the same thread and process scheduling, the same memory and disk access times, the same network packet transit times and orderings... the same everything. Then, once you have returned to the bug, you can rewind time to do things like explore counterfactual scenarios by varying the random seed from that moment on.

We do have randomness of course, otherwise it wouldn't be a very good fuzzer. But we save all the seeds, so it's a controlled, reproducible randomness.
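To make the "controlled, reproducible randomness" point concrete, here is a minimal Python sketch. It is a toy, not how Antithesis works: `run_simulation` and its parameters are made up for illustration. Two simulated workers do a non-atomic read-modify-write on a shared counter, and every scheduling decision comes from one seeded RNG, so the interleaving (and any lost update) depends only on the seed.

```python
import random


def run_simulation(seed: int, steps_per_worker: int = 3, workers: int = 2) -> tuple[int, list[int]]:
    """Return (final counter, schedule) for a toy race, fully determined by `seed`."""
    rng = random.Random(seed)  # the only source of nondeterminism in the toy
    counter = 0
    pending_read = {w: None for w in range(workers)}  # value read but not yet written back
    remaining = {w: steps_per_worker for w in range(workers)}
    schedule = []

    while any(remaining.values()):
        runnable = [w for w in range(workers) if remaining[w] > 0]
        w = rng.choice(runnable)  # the "scheduler" picks who runs the next micro-step
        schedule.append(w)
        if pending_read[w] is None:
            pending_read[w] = counter  # micro-step 1: read
        else:
            counter = pending_read[w] + 1  # micro-step 2: write, may clobber another update
            pending_read[w] = None
            remaining[w] -= 1
    return counter, schedule


if __name__ == "__main__":
    expected = 2 * 3  # what the counter would be with no lost updates
    buggy_seeds = [s for s in range(200) if run_simulation(s)[0] < expected]
    seed = buggy_seeds[0]
    # Re-running with the saved seed replays the exact same interleaving.
    print(seed, run_simulation(seed) == run_simulation(seed))
```

The design choice that matters is that nothing outside `rng` influences the schedule. Real systems have many more sources of nondeterminism (clocks, threads, I/O), which is the hard part this toy skips.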


From yet another field where deterministic simulation is often a goal (robotics), the ideal is a simulation test system that is deterministic for a given initialization (e.g. a random seed) so that for an initialization that causes some error to occur, you can reliably reproduce and resolve the error. Of course, you then need to run that system with a range of initializations to have confidence that you didn't just get lucky with the initialization.
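A rough Python sketch of that workflow, with a deliberately trivial stand-in for the system under test (the `simulate` and `sweep` names, the reordering fault, and the probabilities are all hypothetical): sweep a range of seeds, record the ones that fail, then replay a failing seed to get the same failure back.

```python
import random


def simulate(seed: int, n_messages: int = 5, reorder_prob: float = 0.2) -> bool:
    """Return True if the toy run passes, i.e. messages arrive in order for this seed."""
    rng = random.Random(seed)
    in_flight = list(range(n_messages))
    # Hypothetical fault injection: occasionally swap adjacent messages in transit.
    for i in range(n_messages - 1):
        if rng.random() < reorder_prob:
            in_flight[i], in_flight[i + 1] = in_flight[i + 1], in_flight[i]
    # The toy receiver "bug": it silently assumes in-order delivery.
    return in_flight == sorted(in_flight)


def sweep(seeds) -> list[int]:
    """Run one simulation per seed and collect the seeds whose run failed."""
    return [s for s in seeds if not simulate(s)]


failing = sweep(range(100))
# Any seed in `failing` can be replayed later and will fail the same way.
replay_ok = all(not simulate(s) for s in failing)
```

The sweep gives breadth (you did not just get lucky with one initialization), and the saved seeds give depth (each failure can be studied as often as needed).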

In practice, this can be quite hard to do in the presence of uncontrolled non-determinism (e.g. thread/process/GPU scheduling)* and it is often more pragmatic to invest the time in better stochastic testing and logging than deterministic reproduction.

* Yes, these can be made closer to deterministic. But doing so often comes with reduced performance, such that the system you are testing would no longer match the system being deployed, defeating much of the purpose of the test in the first place.


Isn't your footnote (*) exactly what they are doing? And aren't they actually able to simulate faster than wall-clock time?


> Antithesis has created the holy grail for testing distributed systems: a bespoke hypervisor that deterministically simulates an entire set of Docker containers and injects faults, created by the same people who made FoundationDB.

I remember the Antithesis founder was having a hard time explaining what exactly they did.


I remember that too. The ambiguity for me was how their fuzzing could explore an arbitrary state space efficiently enough.

The deterministic hypervisor is 'simple' enough, albeit a pretty heavy engineering lift.


One of the cool tricks we can use is that, since the testing is all fully deterministic, once we find an interesting point in a test run (even if it is “deep” into the run, time-wise), our system can start many new branches of test runs from that moment or from moments just prior. That is much more efficient than re-doing the work to reach that rare interesting moment for each new branch.
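The branching idea can be illustrated with a small Python toy. This is only a sketch under heavy assumptions: the `Sim` class, the `branch` helper, and the random-walk "system" are invented here, and the snapshot is a plain `copy.deepcopy` rather than anything resembling a hypervisor-level snapshot. The point is just that the expensive prefix runs once and each branch continues from the saved state with its own seed.

```python
import copy
import random


class Sim:
    """Toy deterministic simulation: a random walk driven by one seeded RNG."""

    def __init__(self, seed: int):
        self.rng = random.Random(seed)
        self.position = 0
        self.steps = 0

    def step(self) -> None:
        self.position += self.rng.choice((-1, 1))
        self.steps += 1


def branch(snapshot: Sim, branch_seed: int, extra_steps: int) -> Sim:
    """Fork from a snapshot: keep the accumulated state, reseed the randomness."""
    child = copy.deepcopy(snapshot)  # the snapshot itself is never mutated
    child.rng = random.Random(branch_seed)
    for _ in range(extra_steps):
        child.step()
    return child


# Pay for the long prefix once...
base = Sim(seed=42)
for _ in range(1000):
    base.step()
snapshot = copy.deepcopy(base)  # pretend this is the "interesting" moment

# ...then explore several futures from it without re-running the prefix.
branches = [branch(snapshot, s, extra_steps=10) for s in range(5)]
```

Each branch here costs 10 steps instead of 1010, and because every branch is seeded, any one of them can itself be replayed exactly.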


I’m curious if you’re willing and able to share: Are you using FoundationDB as the data store for Antithesis?


We’ll be writing a lot in the near future about how Antithesis works, stay tuned :)


Can’t wait!


This article and previous Antithesis ones mention testing distributed systems and, as someone who works at a company specialized in exactly this, I am excited. However, I wonder if Antithesis could help with nondeterministic failures observed in unit and integration tests I encounter in my Jasmine and TestCafe suites. Most of the time, these are quite hard to reproduce - if at all possible - and a significant portion of failures is caused by genuine application bugs. I wish there was a tool that helped with these.


Slightly tangential, but when I went to go look at pricing information on mobile, the rates were clipped/overflowed out of bounds.


Whoops, what are your device details? I'll take a look!


The "Fetch from follower" button is slightly broken on my Pixel 6.

The breakdown one looks good, but on the follower one the background seems to have a reduced width while the active button still moves the full width.


Hopefully in the year 2300 we will have a good way to test landing pages.


Hopefully a century before that, someone will finally invent a front-end framework with robust text styling!


BTW, why base pricing on r4 instances vs something more cost effective like r5?


I think I just followed the official recommendations I found (which are probably stale now). I'll update it to r5, but it doesn't really matter. The price difference between the two is like 5%, but hardware only ends up representing a tiny fraction of Kafka's cost at scale (the real cost comes from EBS and inter-zone networking).

I could make the hardware free for Kafka in the comparison, and WarpStream would still come out significantly more cost effective. Cloud networking is really expensive.


I've bookmarked it, just because the site is so pretty.



