Step Functions is an "external orchestrator". With DBOS, your orchestration runs in the same process, so you can use normal testing frameworks like pytest for unit tests. It's super easy to test locally.
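To make that concrete, here is a minimal sketch of what that looks like: the workflow is an ordinary Python function, so a plain pytest test can call it directly. It assumes the dbos package's @DBOS.workflow()/@DBOS.step() decorators and that DBOS itself has been configured and launched in a test fixture (the setup API varies by SDK version).

    # app.py -- a toy workflow under test
    from dbos import DBOS

    @DBOS.step()
    def charge_card(amount: int) -> str:
        # Imagine a call to a payment provider here.
        return f"charged {amount}"

    @DBOS.workflow()
    def checkout(amount: int) -> str:
        # Steps are checkpointed in the database, but to the caller
        # this is just a normal function call.
        return charge_card(amount)

    # test_app.py -- ordinary pytest, no external orchestrator to mock
    # (assumes DBOS is initialized and launched in a conftest.py fixture)
    def test_checkout():
        assert checkout(42) == "charged 42"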
We have a time travel debugger that makes it super easy to test workflows. You can set workflows up in a test and then time-travel them, or even time-travel completed workflows in production.
It turns out that DBs were invented to solve hard problems with state management. When people moved away from DBs (at least transactional relational DBs), they had to rediscover all the old problems. Tech is cyclical.
One of the motivations for DBOS is that OSes were designed with orders of magnitude less state to manage than today (e.g. Linux, >30 years ago). What's made to manage tons of state? A DBMS! :)
Recovering an application from failures, especially when updating multiple data sources, once-and-only-once execution, and similar concerns are in the application domain. They have never been handled by relational databases. That is the problem solved by the DBOS Python SDK (and the TypeScript SDK).
It's sort of a combination of both. The library solves those problems by storing specific data in the database and then taking advantage of the database's ACID properties and transactions to provide those guarantees.
Then the DBOS cloud platform optimizes those interactions between the database and code so that you get a superior experience to running locally.
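To make the "storing specific data in the database" part concrete, here is a toy illustration of the general pattern (not the actual DBOS schema or API): each step's output is committed to a table, so a re-run after a crash replays completed steps from the database instead of executing their side effects again.

    # Toy step-checkpointing pattern, using sqlite3 for brevity.
    import json
    import sqlite3

    conn = sqlite3.connect("checkpoints.db")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS step_outputs ("
        " workflow_id TEXT, step_id INTEGER, output TEXT,"
        " PRIMARY KEY (workflow_id, step_id))"
    )

    def run_step(workflow_id, step_id, fn, *args):
        row = conn.execute(
            "SELECT output FROM step_outputs WHERE workflow_id=? AND step_id=?",
            (workflow_id, step_id),
        ).fetchone()
        if row is not None:
            return json.loads(row[0])  # step already ran: replay the recorded result
        result = fn(*args)
        with conn:  # commit the checkpoint atomically
            conn.execute(
                "INSERT INTO step_outputs VALUES (?, ?, ?)",
                (workflow_id, step_id, json.dumps(result)),
            )
        return result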
I find managing AWS API Gateway too cumbersome. We use a proxy Lambda that handles authentication and contains the routing logic to other Lambdas (or to SQS or a state machine).
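A rough sketch of that front-door pattern, with made-up names (USER_SERVICE_FN, ORDERS_QUEUE_URL) and a placeholder auth check; it assumes the Lambda function-URL / HTTP API v2 event shape.

    # Hypothetical "front door" Lambda: authenticate, then fan out to another
    # Lambda or an SQS queue based on the request path.
    import json
    import boto3

    lambda_client = boto3.client("lambda")
    sqs = boto3.client("sqs")

    USER_SERVICE_FN = "user-service"                  # placeholder function name
    ORDERS_QUEUE_URL = "https://sqs.../orders-queue"  # placeholder queue URL

    def is_valid(token):
        # Stand-in for real token validation (e.g. verifying a JWT).
        return token is not None

    def handler(event, context):
        token = event.get("headers", {}).get("authorization")
        if not is_valid(token):
            return {"statusCode": 401, "body": "unauthorized"}

        path = event.get("rawPath", "/")
        if path.startswith("/users"):
            resp = lambda_client.invoke(
                FunctionName=USER_SERVICE_FN,
                Payload=json.dumps(event).encode(),
            )
            return {"statusCode": 200, "body": resp["Payload"].read().decode()}
        if path.startswith("/orders"):
            sqs.send_message(QueueUrl=ORDERS_QUEUE_URL, MessageBody=event.get("body", ""))
            return {"statusCode": 202, "body": "queued"}
        return {"statusCode": 404, "body": "not found"}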
For a one-sided, unbounded distribution when you still want to observe changes without being susceptible to outliers.
If you're monitoring response timings on a server, for example, the median might be very close to 0, and it won't shift unless a majority of the distribution slows down. If you take a winsorised mean, you can clip the uselessly long response times that mess with the mean, but still see if, e.g., 1/3 of your responses are suddenly slower than normal.
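A minimal sketch of that, using scipy.stats.mstats.winsorize and clipping only the upper tail (latencies are bounded below but not above); the numbers are made up.

    import numpy as np
    from scipy.stats.mstats import winsorize

    def upper_winsorized_mean(latencies_ms, upper=0.2):
        # Set the top `upper` fraction of samples to the largest remaining
        # value, then average. Unlike the median, the result still moves if a
        # sizeable minority of requests gets slower.
        return float(winsorize(np.asarray(latencies_ms), limits=(0, upper)).mean())

    latencies = [10, 12, 11, 13, 9, 14, 12, 10, 30000, 45000]
    print(upper_winsorized_mean(latencies))  # ~11.9, vs a raw mean of ~7509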
When processing astrophotos, multiple exposures are “stacked” together. There is a certain amount of noise in each frame - due to electronic noise or simply the random number of photons that strike a pixel in any one exposure - that you want to average out on a per-pixel basis.
However, some frames may contain unwanted outliers: for example, if a satellite briefly passes overhead, it will appear as a very bright streak in only one frame.
By winsorizing, outlying pixel values can be eliminated while still maintaining the same number of samples per pixel as the rest of the stacked image.
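For example, a toy version of winsorised stacking with numpy/scipy, assuming the frames are already registered and stacked into an array of shape (n_frames, height, width):

    import numpy as np
    from scipy.stats.mstats import winsorize

    def winsorized_stack(frames, limit=0.1):
        # Winsorise along the frame axis (axis 0): for each pixel, the most
        # extreme samples across frames are pulled in toward the rest, so a
        # satellite streak present in one frame no longer dominates that pixel.
        clipped = winsorize(frames, limits=(limit, limit), axis=0)
        # Every pixel still averages the same number of samples.
        return np.asarray(clipped.mean(axis=0))

    # e.g. stacked = winsorized_stack(np.stack(list_of_frames), limit=0.1)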
A lot of financial data is of dubious quality. Lots of spurious data points, mostly in the extremes, just aren't real. The median is useful for many things, but sometimes you just have to get rid of the bad data to get a valid mean.
Any of: a low number of data points; wanting a continuous value when your points are integers; slightly different behavior in high-dimensional vector spaces.
I have used YAML for over 15 years now and never had any issues. I am a Rubyist; naturally, that was how I first met YAML, and I have really liked it ever since.
Also, with Home Assistant I was "forced" to write a lot of YAML by hand and never had real issues there either. Maybe it's because Emacs handles writing YAML fine. Anyway, I like YAML.
Hopefully most distros' default Emacs installs do the right thing these days, but I haven't checked, since pretty much everybody already has a setup that works for YAML, given how prevalent it is.
(I like YAML for just-about-human-editable data structures, but then people try to text-template it rather than actually treating it as data, and then I want to cry.)
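The difference in practice, as a small sketch using PyYAML: build the structure as data and let the library serialise it, rather than string-templating YAML text and hoping the indentation and quoting come out right.

    import yaml

    service = {
        "name": "web",
        "replicas": 3,
        "env": {"GREETING": 'hello: "world"'},  # tricky characters get quoted correctly
    }

    # Emits valid YAML; a naive f-string template would mangle the nested quotes.
    print(yaml.safe_dump(service, sort_keys=False))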