brap's comments | Hacker News

That’s a very clean API.

It's very close to the SMTLIB API.

Page 19 in https://smt.st/SAT_SMT_by_example.pdf shows an example in both Python and SMTLIB. After looking at a guide like TFA this book is a good next step.
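
For a taste of how close the Python bindings stay to SMT-LIB, here's a minimal sketch using the z3 Python package (pip install z3-solver); it's my own toy constraint, not the example from page 19:

  # Toy example: find integers x, y with x + y == 10, x > y, y >= 0.
  # Each s.add(...) mirrors an SMT-LIB (assert ...) form.
  from z3 import Ints, Solver, sat

  x, y = Ints('x y')
  s = Solver()
  s.add(x + y == 10, x > y, y >= 0)
  if s.check() == sat:
      print(s.model())   # e.g. [y = 0, x = 10]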


>warmer

I actually wish they’d make it colder.

Matter of fact, my ideal “assistant” is not an assistant. It doesn’t pretend to be a human, it doesn’t even use the word “I”, it just answers my fucking question in the coldest, most succinct way possible.


When I buy a refrigerator, if a model has insane power consumption that makes me think about my electric bill, I’d be less inclined to buy it. The refrigerator manufacturer is going to sell less. So in a way, you can say that they’re definitely paying for those wasted resources.

For software companies it’s the same, in a way. They pay whatever the consumers deem appropriate. It’s just that consumers don’t really care for the most part, so the cost is basically $0. RAM is cheap, so if a company has to choose between shipping features and optimizing RAM usage, shipping features is almost always going to win.


First of all, RAM is not cheap. Second, the average consumer is not able to install more RAM even if it was. Consumer hardware often has barely enough RAM to run Windows.

> It’s just that consumers don’t really care for the most part, so the cost is basically $0.

It's zero because most people can't just stop using WhatsApp, M$ Teams, etc.; WhatsApp has a monopoly on its centralized garden.

It's not like Matrix, where I can easily switch clients the way I can switch refrigerators.

Even if someone were able to provide an alternative, Facebook would just "kill" them, like Apple killed Beeper's iMessage alternative.


Power consumption of newly sold refrigerators is regulated by the U.S. government (and, I assume, other big countries), so I don't think this argument works in your favor.

> RAM is cheap

My time isn't


How expensive is it? Is it expensive enough to suffer a month of feature lag? A year of feature lag? A decade of feature lag? Probably not.

Regardless, we’re talking about the average user. As the article implies, on average, users’ time is in fact cheap compared to the alternative, or maybe the time cost is simply insignificant.


What is the alternative?

An additional $100k in cost per year vs. the time, energy and hardware of billions of users?


> RAM is cheap

Have you checked RAM prices this week?


Cheap in the sense that most target users don’t have to worry about allocating it

Are you sure? Or do you think users only run one app on their desktop?

Yeah, I kinda agree with that reluctantly.

As much as I like super snappy and efficient native apps, we just gotta accept that no sane company is going to invest significant resources in something that isn’t used 99%+ of the time. WhatsApp (and the world) is almost exclusively mobile + web.

So it’s either feature lag, or something like this. And these days most users won’t even feel the 1GB waste.

I think we’re not far away from shipping native compiled to Wasm running on Electron in a Docker container inside a full blown VM with the virtualization software bundled in, all compiled once more to Wasm and running in the browser of your washing machine just to display the time. And honestly when things are cheap, who cares.


Do you think most people have a 128GB laptop? It’s more likely to be 8 or 16 GB. The OS is likely taking 4 or more, the browser two or more. And if you add any office applications, that’s another few GB gone. Then how many Electron monstrosities do you think the user can add on top of that?

Some PM: users are locked in to our ecosystem; they can't load other apps after ours! /s

But for real, the average number of apps people download gets smaller year over year. When the most popular/essential apps take up more RAM, this effect will only be exacerbated. RAM prices have also doubled over the last 3 months, and I expect that to hold for a couple more years.


> And honestly when things are cheap, who cares.

It depends on what metrics are considered. We can’t keep transforming the earth into a wasteland just because the narrow window we take as a reference disconnects rewards from long-term effects.


I read 2-3 sentences and then immediately decided to check the author’s credentials. Fashion designer. That’s a hard no from me.


Asking as someone who is really out of the loop: how much of ML development these days touches these “lower level” parts of the stack? I’d expect that by now most of the work would be high level, and the infra would be mostly commoditized.

> how much of ML development these days touches these “lower level” parts of the stack? I’d expect that by now most of the work would be high level

Every time the high-level architecture of models changes, there are new lower-level optimizations to be done. Even recent releases like GPT-OSS add new areas for improvement, like MXFP4, which require the lower-level parts to be created and optimized.


How often do hardware optimizations get created for lower level optimization of LLMs and Tensor physics? How reconfigurable are TPUs? Are there any standardized feature flags for TPUs yet?

Is TOPS/Whr a good efficiency metric for TPUs and for LLM model hosting operations?

From https://news.ycombinator.com/item?id=45775181 re: current TPUs in 2025; "AI accelerators" :

> How does Cerebras WSE-3 with 44GB of 'L2' on-chip SRAM compare to Google's TPUs, Tesla's TPUs, NorthPole, Groq LPU, Tenstorrent's, and AMD's NPU designs?


this is like 5 different questions all across the landscape - what exactly do you think answers will do for you?

> How often do hardware optimizations get created for lower level optimization of LLMs and Tensor physics?

LLMs? all the time? "tensor physics" (whatever that is) never

> How reconfigurable are TPUs?

very? as reconfigurable as any other programmable device?

> Are there any standardized feature flags for TPUs yet?

have no idea what a feature flag is in this context nor why they would be standardized (there's only one manufacturer/vendor/supplier of TPUs).

> Is TOPS/Whr a good efficiency metric for TPUs and for LLM model hosting operations?

i don't see why it wouldn't be? you're just asking is (stuff done)/(energy consumed) a good measure of efficiency to which the answer is yes?


> have no idea what a feature flag is in this context nor why they would be standardized (there's only one manufacturer/vendor/supplier of TPUs).

x86, ARM, and RISC-V have all standardized on feature flags, which can be reviewed on Linux with /proc/cpuinfo or with dmidecode.

  # show per-core model info and feature flags ("flags" on x86, "Features" on ARM)
  cat /proc/cpuinfo | grep -E '^processor|^flags|Features|^BogoMIPS|^CPU'

There are multiple TPU vendors. I listed multiple AI accelerator TPU products in the comment you are replying to.

> How reconfigurable are TPUs?

TIL Google's TPUs are reconfigurable with OCS (Optical Circuit Switches), which can be switched between, for example, 3D torus or twisted torus configurations.

(FWIW also, quantum libraries mostly have Line qubits and Lattice qubits. There is a recent "Layer Coding" paper; to surpass Surface Coding.)

But back to classical TPUs:

I had already started preparing a response to myself to improve those criteria; and then, paraphrasing from 2.5 Pro:

> Don't rank by TOPS/wHr alone; rank by TOPS/wHr @ [Specific Precision]. Don't rank by Memory Bandwidth alone; rank by Effective Bandwidth @ [Specific Precision].

Hardware ranking criteria for LLM hosting costs:

Criterion 1: EGB (Effective Generative Bandwidth) = Memory Bandwidth (GB/s) / Precision (Bytes)

Criterion 2: GE (Generative Efficiency) = EGB / Total Board Power (Watts)

Criterion 3: TTFT Potential = Raw TOPS @ Prompt Precision

LLM hosting metrics: Tokens Per Second (TPS) for throughput, Time to First Token (TTFT) for latency, and Tokens Per Joule for efficiency.
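
As a rough sketch of how those criteria combine (the accelerator numbers below are made-up placeholders, not measurements of any real chip):

  # Hypothetical accelerator: 1000 GB/s memory bandwidth, 1-byte (FP8-ish)
  # weights, 300 W board power. Placeholder numbers only.
  def effective_generative_bandwidth(mem_bw_gb_s, precision_bytes):
      # Criterion 1: EGB = memory bandwidth (GB/s) / bytes per weight
      return mem_bw_gb_s / precision_bytes

  def generative_efficiency(egb, board_power_w):
      # Criterion 2: GE = EGB per watt of board power
      return egb / board_power_w

  egb = effective_generative_bandwidth(1000.0, 1.0)
  ge = generative_efficiency(egb, 300.0)
  print(f"EGB = {egb:.0f}, GE = {ge:.2f} per watt")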


> There are multiple TPU vendors

There are not - TPU is literally a Google trademark:

> Tensor Processing Unit (TPU) is an AI accelerator application-specific integrated circuit (ASIC) developed by Google.

https://en.wikipedia.org/wiki/Tensor_Processing_Unit

The rest of what you're talking about is irrelevant



There are some not-so-niche communities, like the FlashAttention and LinearFlashAttention repos. New code/optimizations get committed on a weekly basis. They find a couple of percent here and there all the time. How useful their kernels actually are in terms of producing good results remains to be seen, but their implementations are often much better (in FLOPS) than what was proposed in the original papers.

It's just like game optimization: cache-friendliness and memory-hierarchy awareness are huge in attention mechanisms. But programming the backward pass in these lower-level stacks is definitely not fun; tensor calculus breaks my brain.
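
For illustration, here's a toy NumPy sketch of the tiling idea (keys/values processed block by block with an online softmax); it's a readability sketch, not FlashAttention's actual kernel, and nothing here is optimized:

  import numpy as np

  # Single-head attention computed one key/value tile at a time with an
  # online softmax, so only a small block of K/V has to stay "hot" at once.
  def blocked_attention(q, k, v, block=64):
      n, d = q.shape
      out = np.zeros((n, v.shape[1]))
      row_max = np.full(n, -np.inf)   # running max of logits per query
      row_sum = np.zeros(n)           # running softmax denominator per query
      for start in range(0, k.shape[0], block):
          kb = k[start:start + block]
          vb = v[start:start + block]
          logits = q @ kb.T / np.sqrt(d)
          new_max = np.maximum(row_max, logits.max(axis=1))
          scale = np.exp(row_max - new_max)        # rescale old accumulators
          p = np.exp(logits - new_max[:, None])
          out = out * scale[:, None] + p @ vb
          row_sum = row_sum * scale + p.sum(axis=1)
          row_max = new_max
      return out / row_sum[:, None]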


A recent wave of interest in bitwise-equivalent execution got a lot of kernels at this level pumped out.

New attention mechanisms also often need new kernels to run at any reasonable rate.

There's definitely a breed of frontend-only ML dev that dominates the space, but a lot of novel exploration needs new kernels.


I feel like sometimes it’s a form of procrastination.

There are things we don’t want to do (talk to customers, investors, legal, etc.), so instead we do the fun things (fun for engineers).

It’s a convenient arrangement because we can easily convince ourselves and others that we’re actually being productive (we’re not, we’re just spinning wheels).


It's the natural evolution to becoming a fun addict.

Unless you actively push yourself to do the uncomfortable work every day, you will always slowly deteriorate and you will run into huge issues in the future that could've been avoided.

And that doesn't just apply to software.


I see your point. But accidental complexity is the most uncomfortable work there is to me. Do programmers really find so much fun in creating accidental complexity?

Removing it, no matter whether I created it myself, sure, that can be a hard problem.

I've certainly been guilty of creating accidental complexity as a form of procrastination, I guess. But building a microservices architecture is not one of these cases.

FWIW, the alternative stack presented here for small web sites/apps seems infinitely more fun. Immediate feedback, easy to create something visible and change things, etc.

Ironically, it could also lead to complexity when in reality, there is (for example) an actual need for a message queue.

But setting up such stuff without a need sounds easier to avoid to me than, for example, overgeneralizing some code to handle more cases than the relevant ones.

When I feel there are customer or company requirements that I can't fulfill properly, but I should, that's a hard problem for me. Or when I feel unable to clarify achievable goals and communicate productively.

But procrastination via accidental complexity is mostly the opposite of fun to me.

It all comes back to bite you when trying to solve real problems, and spending work time solving those problems is more fun than working on homemade problems.

Doing work that I am able to complete and achieving tangible results is more fun than getting tangled in a mess of unneeded complexity. I don't see how this is fun for engineers, maybe I'm not an engineer then.

Over-generalization, setting wrong priorities, that I can understand.

But setting up complex infra or a microservices architecture where it's unneeded, that doesn't seem fun to me at all :)


I 100% agree.

Normally the impetus to overcomplicate fades before devs become experienced enough to even build such complex infra by themselves; it often manifests as complex code only.

Overengineered infra doesn't happen in a vacuum. There is always support from the entire company.


"Do programmers really find so much fun in creating accidental complexity?"

I certainly did for a number of years - I just had the luck that the cool things I happened to pick on in the early/mid 1990s turned out to be quite important (Web '92, Java '94).

Now my views have flipped almost completely the other way - technology as a means of delivering value.

Edit: Other cool technology that I loved like Common Lisp & CLOS, NeWS and PostScript turned out to be less useful...


I see what you mean, sometimes "accidental complexity" can also be a form of getting to know a technology really well and that can be useful and still fun. Kudos for that :)

Oh yes I loved building stuff with all these technologies mostly for my own entertainment - fortunately I was in academia so could indulge myself. ;-)

> Do programmers really find so much fun in creating accidental complexity?

I believe only bad (inexperienced) programmers do.


> a fun addict

Interesting term. Probably pretty on-point.

I’ve been shipping (as opposed to just “writing”) software for almost my entire adult life.

In my experience, there’s a lot of “not fun” stuff involved in shipping.


I like your idea of doing some amount of uncomfortable work every day, internalizing it until it becomes second nature. Any tips on how to start? (other than just do it) :)

You know what, you're right.

I should get off HN, close the editor where I'm dicking about with HTMX, and actually close some fucking tickets today.

Right after I make another pot of coffee.

...

No. Now. Two tickets, then coffee.

Thank you for the kick up the arse.


My first 5 years or so of solo bootstrapping were this. Then you learn that if you want to make money you have to prioritise the right things and not the fun things.

I'm at this stage. We have a good product with a solid architecture but only a few paying clients due to a complete lack of marketing. So I'm now doing the unfun things!

If you have had zero marketing, how do you know what you have is a good product?

Because we have a few paying clients who seem pretty happy. We have upsold to some clients and have a couple more leads in the pipeline. We are good at stopping bots and we have managed to block most solvers, which puts us (temporarily) ahead of some very big players in the bot mitigation sector.

If we can do this with nearly zero marketing, it stands to reason that some well thought out marketing would probably work.


Not really. Even Cloudflare's free bot detection is better.

Is it really for "fun"?

Or is it to satisfy the ideals of some CTO/VPE disconnected from the real world that wants architecture to be done a certain way?

I still remember doing systems design interviews a few years ago when microservices were in vogue, and my routine was probing if they were ok with a simpler monolith or if they wanted to go crazy on cloud-native, serverless and microservices shizzle.

It did backfire once on a cloud infrastructure company that had "microservices" plastered in their marketing, even though the people interviewing me actually hated it. They offered me an IC position (to which I told them to fuck off) because they really hated how I did the exercise with microservices.

Before that, it almost backfired when I initially offered a monolith for a (unbeknownst to me) microservice-heavy company. Luckily I managed to read the room and pivot to microservice during the 1h systems design exercise.

EDIT: Point is, people in positions of power have very clear expectations/preferences of what they want, and it's not fun burning political capital to go against those preferences.


I don't quite follow. I understand monoliths vs. microservices, and in the last 3 weeks I had to study for system design and do the interviews to get offers. It's a tradeoff, and the system design interview is meant to see if one understands how systems can scale to hypothetical (maybe unrealistic) high loads. In this context the only reason for a microservice is independent scaling, and with that also fault tolerance if an unimportant service goes down. But it's really the independent scaling. One would clearly say that a monolith is good for the start because it offers simplicity and low complexity, but it doesn't scale well to the hypothetical of mega scale.

In practice, it seems not to be a tradeoff but an ideology. Largely because you can't measure the counter-factual of building the app the other way.

It's been a long time since I've done "normal" web development, but I've done a number of high-performance or high-reliability non-web applications, and I think people really underestimate vertical scaling. Even back in the early 2000s when it was slightly hard to get a machine with 128GB of RAM to run some chip design software, doing so was much easier than trying to design a distributed system to handle the problem.

(we had a distributed system of ccache/distcc to handle building the thing instead)

Do people have a good example of microservices they can point us to the source of? By definition it's not one of those things that makes much sense with toy-sized examples. Things like Amazon and Twitter have "micro" services that are very much not micro.


I don't disagree, but you can horizontally scale a monolith too, no? So scaling vertically vs. horizontally is independent of microservices; it's just that separating services allows you to be precise with your scaling. I.e. you can scale up a compute-heavy microservice by 100x, the upload service by 10x, but keep the user service at low scale. I agree that one can scale vertically, why not. And I also agree that there are probably big microservices. At my last workplace, we also had people very bullish on microservices but for bad reasons, and it didn't make sense, i.e. ideology.

Can you elaborate on the fault tolerance advantage of micro services?

For context, my current project is a monolith web app with services being part of the monolith and called with try/catch. I can understand perhaps faster, independent, less risky recovery in the micro services case but don’t quite understand the fault tolerance gain.


I'm no world-leading expert, but as far as I understand, coupled with events, if an unimportant service goes offline for 5 min (due to some crash, i.e. a "fault"), it's possible to have graceful degradation, meaning the rest of the system still works, maybe with reduced ability. With events, other systems simply stop receiving events from the dead service. I agree you can achieve a lot of this in a monolith with try/catch and error handling, but I guess there is an inherent decoupling in having different services run on separate nodes.

The point is that it doesn't matter which is better or worse for the case, or if you know the pros/cons of each:

In those interviews (and in real work too) people still want you skewing towards certain answers. They wanna see you draw their pet architecture.

And it's the same thing in the workplace.


I fully agree on workplace politics, but for system design interviews, aren't you also just supposed to ask your interviewer, i.e. give them your premises and see if they like your conclusions? I also understand that some companies and their interviews are weird, but that's okay too, no? You just reject them and move on.

If there's a big enough bias, the questions become entirely about finding that bias. And in 90% of cases the systems design questions are about something they designed in-house, and they often don't have a lot of experience either.

Also: if there's limited knowledge on the interviewer side, an incorrect answer to a question might throw off a more experienced candidate.

It's no big deal but it becomes more about reading the room and knowing the company/interviewers than being honest in what you would do. People don't want to hear that their pet solution is not the best. Of course you still need to know the tech and explain it all.


I worked for a company once where the CEO said I need to start using Kubernetes. Why? We didn't really have any pressing use cases / issues that were shouting out for Kubernetes at all.

His reasoning was all the big players use it, so we should be too...

It was literally a solution looking for a problem. Which is completely arse backwards.


I had the same situation happening with Cloud Native.

EC2 was forbidden, it had to be ECS or EKS if Lambda was not possible.

We did the math and the AWS bill had the cost of about 15 developers.


It's also virtue signaling of what a great engineer they are. Have you wired together ABC with XYZ? No? Well I did... blah blah blah

It absolutely drives me nuts when people spend so much time building something but make it difficult to show you what they’ve built.

A short code snippet (with syntax highlighting thank you) should be the first thing on your page.

I shouldn’t have to scroll through a huge wall of text (probably AI generated), 2 images (definitely AI generated), miss it, start clicking links, still not find it, hit the back button, scroll through the slop again, etc.

I want to see the thing, I don’t care about what you have to say about the thing until I can get a sense of the thing.


> when people spend so much time building something

I do not think that much human time was spent on this actually.


How about we treat people like adults?


What is to reason, if not to generate a seeming of reasoning?

(tips fedora)


You said the quiet part out loud of political debate.

(does something)

