Page 19 in https://smt.st/SAT_SMT_by_example.pdf shows an example in both Python and SMTLIB. After looking at a guide like TFA this book is a good next step.
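For the flavor of the Python side (not the book's exact example, just a minimal sketch using the z3-solver package):

    from z3 import Int, Solver, sat

    # Declare symbolic integers and state constraints (declarative, not imperative).
    x, y = Int("x"), Int("y")
    s = Solver()
    s.add(x + y == 10, x > y, y > 0)

    if s.check() == sat:   # ask the solver whether a satisfying model exists
        print(s.model())   # e.g. [y = 1, x = 9]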
Matter of fact, my ideal “assistant” is not an assistant. It doesn’t pretend to be a human, it doesn’t even use the word “I”, it just answers my fucking question in the coldest most succinct way possible.
When I buy a refrigerator, if a model has insane power consumption that makes me think about my electric bill, I’d be less inclined to buy it. The refrigerator manufacturer is going to sell less. So in a way, you can say that they’re definitely paying for those wasted resources.
For software companies it’s the same, in a way. They pay whatever the consumers deem appropriate. It’s just that consumers don’t really care for the most part, so the cost is basically $0. RAM is cheap, and if a company prioritizes shipping features over optimizing RAM, that’s almost always going to be the deciding factor.
First of all, RAM is not cheap. Second, the average consumer is not able to install more RAM even if it were. Consumer hardware often has barely enough RAM to run Windows.
Power consumption of newly sold refrigerators is regulated by the U.S. government (and, I assume, other big countries), so I don't think this argument works in your favor.
How expensive is it? Is it expensive enough to suffer a month of feature lag? A year of feature lag? A decade of feature lag? Probably not.
Regardless, we’re talking about the average user. As the article implies, on average, users’ time is in fact cheap compared to the alternative, or maybe the time cost is simply insignificant.
As much as I like super snappy and efficient native apps, we just gotta accept that no sane company is going to invest significant resources in a client that accounts for less than 1% of usage. WhatsApp (and the world) is almost exclusively mobile + web.
So it’s either feature lag, or something like this. And these days most users won’t even feel the 1GB waste.
I think we’re not far away from shipping native compiled to Wasm running on Electron in a Docker container inside a full blown VM with the virtualization software bundled in, all compiled once more to Wasm and running in the browser of your washing machine just to display the time. And honestly when things are cheap, who cares.
Do you think most people have a 128GB laptop? It's more likely to be 8 or 16 GB. The OS is likely taking 4 GB or more, the browser two or more. And if you add any office applications, that's another few GB gone. Then how many Electron monstrosities do you think the user can add on top of that?
Some PM: users are locked in to our ecosystem; they can't load other apps after ours! /s
But for real, the average number of apps people download gets smaller year over year. When the most popular/essential apps take up more RAM, this effect will only be exacerbated. RAM prices have also doubled over the last 3 months, and I expect that to hold true for a couple more years.
It depends what metrics are considered. We can't keep transforming the earth into a wasteland just because the system's narrow reference window disconnects reward from long-term effects.
Asking as someone who is really out of the loop: how much of ML development these days touches these “lower level” parts of the stack? I’d expect that by now most of the work would be high level, and the infra would be mostly commoditized.
> how much of ML development these days touches these “lower level” parts of the stack? I’d expect that by now most of the work would be high level
Every time the high-level architecture of models changes, there are new lower-level optimizations to be done. Even recent releases like GPT-OSS add new areas for improvement, like MXFP4, which require the lower-level parts to be created and optimized.
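To make that concrete: MXFP4 stores values as 4-bit floats where each block of 32 shares a single power-of-two scale. A rough NumPy sketch of the idea follows; real kernels pack bits and run fused on-device, this only illustrates why each new format needs new low-level code:

    import numpy as np

    FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])  # E2M1 magnitudes

    def quantize_block(block):
        # One shared power-of-two scale per block (32 values in MXFP4).
        scale = 2.0 ** np.ceil(np.log2(np.abs(block).max() / FP4_GRID[-1] + 1e-30))
        scaled = block / scale
        # Round each value to the nearest representable 4-bit magnitude.
        idx = np.abs(np.abs(scaled)[:, None] - FP4_GRID[None, :]).argmin(axis=1)
        return np.sign(scaled) * FP4_GRID[idx] * scale  # dequantized approximation

    w = np.random.randn(32).astype(np.float32)
    print(np.abs(w - quantize_block(w)).max())  # worst-case quantization error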
How often do hardware optimizations get created for lower level optimization of LLMs and Tensor physics? How reconfigurable are TPUs? Are there any standardized feature flags for TPUs yet?
Is TOPS/Whr a good efficiency metric for TPUs and for LLM model hosting operations?
> How does Cerebras WSE-3 with 44GB of 'L2' on-chip SRAM compare to Google's TPUs, Tesla's TPUs, NorthPole, Groq LPU, Tenstorrent's, and AMD's NPU designs?
There are multiple TPU vendors.
I listed multiple AI accelerator TPU products in the comment you are replying to.
> How reconfigurable are TPUs?
TIL Google's TPUs are reconfigurable with OCS (Optical Circuit Switches) that can be switched between, for example, 3D torus or twisted torus configurations.
(FWIW, quantum libraries mostly have Line qubits and Lattice qubits. There is a recent "Layer Coding" paper that aims to surpass Surface Coding.)
But back to classical TPUs: I had already started preparing a response to myself to improve those criteria, and then, paraphrasing from 2.5 Pro:
> Don't rank by TOPS/wHr alone; rank by TOPS/wHr @ [Specific Precision]. Don't rank by Memory Bandwidth alone; rank by Effective Bandwidth @ [Specific Precision].
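To see why, with purely made-up numbers: a part rated 400 TOPS at INT8 might deliver only 100 TOPS at FP16, so at a 200 W envelope the "TOPS/W" headline is either 2.0 or 0.5 from the same silicon, a 4x swing hidden by a precision-free number.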
There are some not-so-niche communities, like the FlashAttention and LinearFlashAttention repos. New code/optimizations get committed on a weekly basis. They find a couple of percent here and there all the time. How useful their kernels actually are in terms of producing good results remains to be seen, but their implementations are often much better (in FLOPS) than what was proposed in the original papers.
It's just like game optimization: cache-friendliness and memory-hierarchy awareness are huge in attention mechanisms. But programming the backward pass in these lower-level stacks is definitely not fun; tensor calculus breaks my brain.
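For reference, the memory problem is visible even in a naive NumPy version, which materializes the full (n, n) score matrix; that is exactly the traffic FlashAttention-style tiling keeps in on-chip SRAM (textbook formula only, not an optimized kernel):

    import numpy as np

    def naive_attention(Q, K, V):
        # Full (n, n) score matrix: the O(n^2) memory traffic that tiled
        # kernels avoid by streaming blocks of K/V through SRAM.
        scores = Q @ K.T / np.sqrt(Q.shape[-1])
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))  # stable softmax
        return (w / w.sum(axis=-1, keepdims=True)) @ V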
I feel like sometimes it’s a form of procrastination.
There are things we don't want to do (talk to customers, investors, legal, etc.), so instead we do the fun things (fun for engineers).
It’s a convenient arrangement because we can easily convince ourselves and others that we’re actually being productive (we’re not, we’re just spinning wheels).
It's the natural evolution to becoming a fun addict.
Unless you actively push yourself to do the uncomfortable work every day, you will always slowly deteriorate and you will run into huge issues in the future that could've been avoided.
I see your point. But accidental complexity is the most uncomfortable work there is to me. Do programmers really find so much fun in creating accidental complexity?
Removing it, no matter whether I created it myself, sure, that can be a hard problem.
I've certainly been guilty of creating accidental complexity as a form of procrastination, I guess. But building a microservices architecture is not one of those cases.
FWIW, the alternative stack presented here for small web sites/apps seems infinitely more fun.
Immediate feedback, easy to create something visible and change things, etc.
Ironically, it could also lead to complexity when in reality, there is (for example) an actual need for a message queue.
But setting up such stuff without a need sounds easier to avoid to me than, for example, overgeneralizing some code to handle more cases than the relevant ones.
When I feel there are customer or company requirements that I can't fulfill properly, but I should, that's a hard problem for me. Or when I feel unable to clarify achievable goals and communicate productively.
But procrastination via accidental complexity is mostly the opposite of fun to me.
It all comes back when you try to solve real problems, and spending work time solving real problems is more fun than working on homemade ones.
Doing work that I am able to complete and achieving tangible results is more fun than getting tangled in a mess of unneeded complexity. I don't see how this is fun for engineers, maybe I'm not an engineer then.
Over-generalization, setting wrong priorities, that I can understand.
But setting up complex infra or a microservices architecture where it's unneeded, that doesn't seem fun to me at all :)
Normally the impetus to overcomplicate ends before devs become experienced enough to even build such complex infra by themselves. It often manifests as complex code only.
Overengineered infra doesn't happen in a vacuum. There is always support from the entire company.
"Do programmers really find so much fun in creating accidental complexity?"
I certainly did for a number of years - I just had the luck that the cool things I happened to pick on in the early/mid 1990s turned out to be quite important (Web '92, Java '94).
Now my views have flipped almost completely the other way - technology as a means of delivering value.
Edit: Other cool technology that I loved like Common Lisp & CLOS, NeWS and PostScript turned out to be less useful...
I see what you mean, sometimes "accidental complexity" can also be a form of getting to know a technology really well and that can be useful and still fun. Kudos for that :)
I like your idea of doing some amount of uncomfortable work every day, internalizing it until it becomes second nature. Any tips on how to start? (other than just do it) :)
My first 5 years or so of solo bootstrapping were this. Then you learn that if you want to make money you have to prioritise the right things and not the fun things.
I'm at this stage. We have a good product with a solid architecture but only a few paying clients due to a complete lack of marketing. So I'm now doing the unfun things!
Because we have a few paying clients who seem pretty happy. We have upsold to some clients and have a couple more leads in the pipeline. We are good at stopping bots and we have managed to block most solvers, which puts us (temporarily) ahead of some very big players in the bot mitigation sector.
If we can do this with nearly zero marketing, it stands to reason that some well thought out marketing would probably work.
Or is it to satisfy the ideals of some CTO/VPE disconnected from the real world that wants architecture to be done a certain way?
I still remember doing systems design interviews a few years ago when microservices were in vogue, and my routine was probing if they were ok with a simpler monolith or if they wanted to go crazy on cloud-native, serverless and microservices shizzle.
It did backfire once on a cloud infrastructure company that had "microservices" plastered all over its marketing, even though the people interviewing me actually hated it. They offered me an IC position (to which I told them to fuck off) because they really hated that I did the exercise with microservices.
Before that, it almost backfired when I initially offered a monolith to a (unbeknownst to me) microservices-heavy company. Luckily I managed to read the room and pivot to microservices during the 1h systems design exercise.
EDIT: Point is, people in positions of power have very clear expectations/preferences of what they want, and it's not fun burning political capital to go against those preferences.
I don't quite follow. I understand monolith vs. microservices, and in the last 3 weeks I had to study for system design and do the interviews to get offers.
It's a tradeoff, and the system design interview is meant to see if one understands how systems can scale to hypothetical (maybe unrealistic) high loads. In this context the only reason for a microservice is independent scaling, and with that also fault tolerance if an unimportant service goes down. But it's really the independent scaling. One would clearly say that a monolith is good for the start because it offers simplicity (low complexity), but it doesn't scale well to the hypothetical of mega scale.
In practice, it seems not to be a tradeoff but an ideology. Largely because you can't measure the counter-factual of building the app the other way.
It's been a long time since I've done "normal" web development, but I've done a number of high-performance or high-reliability non-web applications, and I think people really underestimate vertical scaling. Even back in the early 2000s when it was slightly hard to get a machine with 128GB of RAM to run some chip design software, doing so was much easier than trying to design a distributed system to handle the problem.
(we had a distributed system of ccache/distcc to handle building the thing instead)
Do people have a good example of microservices they can point us to the source of? By definition it's not one of those things that makes much sense with toy-sized examples. Things like Amazon and Twitter have "micro" services that are very much not micro.
I don't disagree, but you can horizontally scale a monolith too, no? So scaling vertically vs. horizontally is independent of microservices; it's just that separating services allows you to be precise with your scaling. I.e., you can scale up a compute-heavy microservice by 100x and the upload service by 10x, but keep the user service at low scale. I agree that one can scale vertically, why not. And I also agree that there are probably big microservices. At my last workplace, we also had people very bullish on microservices, but for bad reasons, and it didn't make sense, i.e. ideology.
Can you elaborate on the fault tolerance advantage of micro services?
For context, my current project is a monolith web app with services being part of the monolith and called with try/catch. I can understand perhaps faster, independent, less risky recovery in the micro services case but don’t quite understand the fault tolerance gain.
I'm no world-leading expert, but as far as I understand, coupled with events, if an unimportant service goes offline for 5 minutes (due to some crash, i.e. a "fault"), it's possible to have graceful degradation, meaning the rest of the system still works, maybe with reduced ability. With events, other systems simply stop receiving events from the dead service. I agree you can achieve a lot of this in a monolith with try/catch and error handling, but I guess there is an inherent decoupling in having different services run on separate nodes.
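A minimal sketch of that degradation pattern (hypothetical service name and endpoint):

    import requests

    def fetch_recommendations(product_id):
        # Unimportant service: if it's down for 5 minutes, the caller
        # still renders its page, just without this section.
        try:
            r = requests.get(f"http://recs.internal/{product_id}", timeout=0.2)
            r.raise_for_status()
            return r.json()
        except requests.RequestException:
            return []  # graceful degradation instead of failing the request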
I fully agree on workplace politics, but for system design interviews, aren't you also just supposed to ask your interviewer, i.e. give them your premises and see if they like your conclusions? I also understand that some companies and their interviews are weird, but that's okay too, no? You just reject them and move on.
If there's a big enough bias, the questions become entirely about finding that bias. And in 90% of cases the systems design questions are about something they designed in-house, and they often don't have a lot of experience either.
Also: if there's limited knowledge on the interviewer side, an incorrect answer to a question might throw off a more experienced candidate.
It's no big deal but it becomes more about reading the room and knowing the company/interviewers than being honest in what you would do. People don't want to hear that their pet solution is not the best. Of course you still need to know the tech and explain it all.
I worked for a company once where the CEO said I need to start using Kubernetes. Why? We didn't really have any pressing use cases / issues that were shouting out for Kubernetes at all.
His reasoning was all the big players use it, so we should be too...
It was literally a solution looking for a problem. Which is completely arse backwards.
It absolutely drives me nuts when people spend so much time building something but make it difficult to show you what they’ve built.
A short code snippet (with syntax highlighting thank you) should be the first thing on your page.
I should not have to scroll through a huge wall of text (probably AI generated) and 2 images (definitely AI generated), miss it, start clicking links, still not find it, hit the back button, and scroll through the slop again.
I want to see the thing, I don’t care about what you have to say about the thing until I can get a sense of the thing.