More

dangoldin · 2025-10-20T14:28:06 1760970486

I worked at an adtech company where we invested a bit in HA across AZ + regions. Lo and behold there was an AWS outage and we stayed up. Too bad our customers didn't and we still took the revenue hit.

Lesson here is that your approach will depend on your industry and peers. Every market will have their won philosophy and requirements here.

dangoldin · 2025-06-17T02:02:37 1750125757

I'd reach out to your former manager that you got along with and ask for their advice. Seems you had a good rapport and they know the company dynamics.

dangoldin · 2025-01-14T21:02:51 1736888571

I'm sure you've heard of SQLMesh but that seems like a potential fit. Or is it still too heavy handed?

dangoldin · 2025-01-14T02:00:57 1736820057

Problem is that often you also end up relying on GitHub for CI/CD so not as easy of a change. Imagine GH being down and you need to deploy a hotfix. How do you handle that? Especially, if you followed best practices and set up a system where all PRs need to go through code review.

hamandcheese · 2025-01-14T02:13:49 1736820829

Systems like these should have an escape hatch of some sort. The key part is that it needs to be auditable.

Anything you do in CI should be possible outside of CI, at least by some subset of users.

citizenpaul · 2025-01-14T02:56:39 1736823399

I've seen numerous "escape hatches" over the years that actually just turned out to be painted on the wall. If you actually tried to use them. No one ever does though.

I don't think its malice. I just think its pretty uncommon for anyone to intentionally back out of a structural tech decision so it gets forgotten about and remains un-battle tested.. That or the timeline is longer than SaaS has been around.

dangoldin · 2025-01-14T02:46:00 1736822760

Yea - definitely. Just not ideal and something that needs to be built out, tested, etc.

hamandcheese · 2025-01-14T03:22:37 1736824957

Yes, it is easier said than done. At my company we use buildkite and many people wrote scripts that simply fail outside of buildkite.

GitHub actions is even worse, it seems like it was designed from the ground up to create lock in.

Nix helps a bit on the bootstrapping and dependency management problem, but won't save you from writing a script that is too tightly coupled to its runtime environment.

supriyo-biswas · 2025-01-14T06:04:32 1736834672

This is why I personally like to use none of the CI features, and mostly use it like a shell script executor. Images? Stick to OS images only so that you can easily spin them up with `docker run` locally. Artifacts? Read and write them into S3 buckets and avoid the native artifact features.

This is obviously more difficult in the Github actions ecosystem, but I have mostly used Gitlab CI so far. My CI pipelines mostly look like this:

    image: ubuntu:24.04
    before_script:
        - apt-get install ...
    script:
        - ./ci/build-project.sh
    after_script:
        - ./ci/upload-build-artifacts.sh

nosefurhairdo · 2025-01-14T02:47:49 1736822869

I've run into a scenario where one of our rarely used environments needed a hotfix and the GitHub action we used to deploy there was broken. Was easy enough to translate GitHub action steps to shell scripting for a quick, manual deployment.

dangoldin · 2024-12-18T17:11:45 1734541905

Yea - the idea is that Snowflake will generate these after a query runs in order to help you look at multiple runs of the same query. So imagine you run a query that's "select a from b where c = 1" and you want to find all examples of that query running. That's where "query_hash" comes in. But Snowflake also says well what if we let you be generic about the parameters - so "where c=1" and "where c=2" and "where c=300000" all have the same query_parameterized_hash.

That's the intent but turns out it's only doing a very simple hashing and not actually looking at the canonical version of the query. For example it won't treat aliases/renames as the same even though it should. This makes it harder to look at all queries that are in essence doing the same thing.

ryanwaldorf · 2024-12-18T17:40:49 1734543649

Oh that's really interesting! I imagine there could be a reason for it, for instance the data is distributed differently in the micropartitions so different where values could result in different data lookup patterns as you may skip more/less blocks. But overall this makes a lot of sense!

dangoldin · on June 26, 2024

Really cool stuff and a nice introduction but curious how much modern compilers do for you already. Especially if you shift to the JIT world - what ends up being the difference between code where people optimize for this vs write in a style optimized around code readability/reuse/etc.

tux1968 · on June 26, 2024

JIT compilers can't compensate for poorly organized data. Ultimately, understanding these low-level concepts, affect high-level algorithm design and selection.

Watching the Andrew Kelly video mentioned above, really drives home the point that even if your compiler automatically optimizes structure ordering, to minimize padding and alignment issues, it can't fix other higher-level decisions. An example being, using two separate lists of structs to maintain their state data, rather than a single list with each struct having an enum to record its state.

kllrnohj · on June 26, 2024

JIT languages tend to have the worst language-provided locality as they are often accompanied by GCs and lack of value types (there are exceptions to this, but it's broadly the case). And a JIT cannot re-arrange heap memory layout of objects as it must be hot-swappable. This is why despite incredibly huge investments in them such languages just never reach aot performance despite how much theoretical advantage a jit could have.

AOT'd languages could re-arrange a struct for better locality however the majority (if not all) languages rigidly require the fields are laid out in the order defined for various reasons.

gpderetta · on June 27, 2024

> the majority (if not all) languages rigidly require the fields are laid out in the order defined for various reasons.

The as-if rule gives an escape hatch, although in practice it is hard to take advantage of, especially without whole program optimization.

almostgotcaught · on June 27, 2024

I wish people would stop saying this. It's like CS-woo the idea that some magical solution exists that saves you from having to think real hard about the hardware, that magical abstractions save the day.

All of these things boil down to combinatorial optimization problems (bin packing ring a bell?). And there are no widely available compilers or JITs or whatever that bundle ILP solvers. Thus, what you're really getting with every compiler is a heuristic/approximate solution (to many many combinatorial optimization problems). Decide for yourself whether you're comfortable with your code just being approximately good or if you need to actually understand how your system works.

dangoldin · on June 1, 2024

FWIW - Redpanda open sources their core product - https://github.com/redpanda-data/ while WarpStream keeps their core product proprietary - https://github.com/warpstreamlabs

xyzzy_plugh · on June 1, 2024

Unfortunately, neither are Open Source Software. The BSL is not FOSS. They're both proprietary.

dangoldin · on June 1, 2024

Yea - I get that argument but these days it's just hard to do infra as true FOSS with the hyperscalers and current cloud economics. There is a community license and and the code is visible. Not saying it's ideal but Redpanda is further into the open source world than WarpStream.

xyzzy_plugh · on June 1, 2024

Not really? I'm not a stickler on the term "open source" but they're both proprietary at the end of the day. It's a weird nit to pick. Why even bring it up at all, unless you're desperate to defend Redpanda?

I can see the source code of Unreal Engine too. Does that make them "further into the open source world" than WarpStream too?

I don't have a horse in this particular race but WarpStream's blog post is a lot more charitable towards the project in question, and the open source world in general, than Redpanda's.

dangoldin · on April 28, 2024

You don’t need to. dbt/sqlmesh are competitive. I just like the model of sqlmesh over dbt but dbt is much more dominant.

dangoldin · on April 28, 2024

From egress + storage cost standpoint absolutely which ends up being a big factor for these large scale data systems.

There’s a prior discussion on HN about that post: https://news.ycombinator.com/item?id=38118577

And full disclosure but I’m author of both posts - just shifted my writing to be more focused on the company one.

dangoldin · on April 28, 2024

Author here. Basic idea is you want some way of defining metrics. So something like “revenue = sum(sales) - sum(discount)” or “retention = whatever” which need to be generated via SQL at query time vs built in to a table. Then you can have higher confidence multiple access paths all have the same definitions for the metrics.