To spell it out for myself and others: if the calculation for each individual attention block approaches exact equivalence, then the combination of them approaches equivalent performance too. And with the error approaching floating-point precision, the performance should be practically identical to regular attention. Elementwise errors of that magnitude can't lead to any noteworthy change in the overall result, especially given how robust LLM networks seem to be to small deviations.
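A toy sketch of that point (NumPy, made-up float32 tensors, not the actual method under discussion): two mathematically equivalent ways of computing the same attention output differ only at float-noise level.

    import numpy as np

    rng = np.random.default_rng(0)
    n, d = 128, 64
    Q, K, V = (rng.standard_normal((n, d)).astype(np.float32) for _ in range(3))

    scale = np.float32(1 / np.sqrt(d))
    scores = (Q @ K.T) * scale  # stays float32 throughout

    # Plain softmax.
    p1 = np.exp(scores) / np.exp(scores).sum(-1, keepdims=True)

    # Max-subtracted softmax: mathematically identical, numerically different.
    s = scores - scores.max(-1, keepdims=True)
    p2 = np.exp(s) / np.exp(s).sum(-1, keepdims=True)

    out1, out2 = p1 @ V, p2 @ V
    print(np.abs(out1 - out2).max())  # typically on the order of 1e-7 in float32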
I like embeddings for natural language documents where your query terms are unlikely to be unique, and overall document direction is a good disambiguator.
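To make "overall document direction" concrete, a toy sketch with made-up 3-dimensional vectors standing in for real embeddings: two documents share a query keyword, but their overall direction separates them.

    import numpy as np

    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Pretend these came from an embedding model (values are invented).
    query      = np.array([0.9, 0.1, 0.2])  # "python performance tips"
    doc_coding = np.array([0.8, 0.2, 0.1])  # post about Python the language
    doc_snakes = np.array([0.1, 0.9, 0.3])  # article about pythons, the snakes

    print(cos(query, doc_coding))  # ~0.99: overall direction matches
    print(cos(query, doc_snakes))  # ~0.27: shared keyword, different direction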
You can't control it at the level of individual LLM requests and the orchestration of those requests. And that level of control is very valuable, practically required, for building a tool like this. Otherwise you just have a wrapper over another big program and can barely do anything interesting or useful to make it actually work better.
What can't you do, exactly? You can send Claude arbitrary user prompts (with arbitrary custom system prompts) and get text back. You can then feed those text responses into whatever larger system you want.
You don't get a simple request/response paradigm with Claude Code: one message from the user kicks off a loop that usually makes many inner LLM requests, interleaved with other business logic, and ends in some user-visible output plus a bunch of less visible effects (filesystem changes, etc.). You control the input to the outer loop, and hooks let you do some limited things inside it, but a lot happens within that loop that you have no say over.
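Schematically, the outer-loop shape looks something like this (a rough sketch of the pattern, not Claude Code's actual implementation; llm, tools, and the response shape are hypothetical interfaces):

    # One user message drives many inner LLM calls plus tool side effects
    # before anything comes back to the caller.
    def agent_turn(user_message, llm, tools, history):
        history.append({"role": "user", "content": user_message})
        while True:
            response = llm(history)           # inner LLM request, one of many
            history.append(response)
            if not response.tool_calls:       # model decided it's done
                return response.text          # the only part you directly see
            for call in response.tool_calls:  # file edits, shell commands, etc.
                result = tools[call.name](call.arguments)
                history.append({"role": "tool", "content": result})
            # Hooks let you observe or veto some of these steps, but the
            # loop's control flow itself is not yours to change.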
A simple example: can you arbitrarily manipulate the historical context of a given request to the LLM? It's sometimes useful to do that. Another: can you build a programmatic flow that makes 3 different LLM requests, then uses an LLM judge to contrast and combine them into a single best answer (sketched below)? Sure, you could write a prompt that asks for that, but it won't yield equivalent results.
These are just examples; the point is that you don't get fine-grained control.
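For the second example, a minimal sketch against the raw messages API using the anthropic Python SDK (the model id, prompts, and judging strategy are placeholders I'm assuming, not anything Claude Code does):

    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    MODEL = "claude-sonnet-4-20250514"  # placeholder model id

    def ask(prompt, system="You are a helpful assistant.", temperature=1.0):
        msg = client.messages.create(
            model=MODEL,
            max_tokens=1024,
            system=system,
            temperature=temperature,
            messages=[{"role": "user", "content": prompt}],
        )
        return msg.content[0].text

    question = "Explain the tradeoffs of optimistic vs pessimistic locking."

    # Three independent attempts at the same question...
    candidates = [ask(question) for _ in range(3)]

    # ...then an LLM judge contrasts and combines them into one final answer.
    numbered = "\n\n".join(f"Answer {i + 1}:\n{c}" for i, c in enumerate(candidates))
    final = ask(
        f"Question: {question}\n\n{numbered}\n\n"
        "Compare these answers and write a single best combined answer.",
        system="You are a careful judge. Keep the strongest parts of each answer.",
        temperature=0.0,
    )
    print(final)

None of this is possible from inside Claude Code's loop; you'd have to approximate it with prompting, which isn't the same thing.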
It has to be suited for human consumption too, though.
I wonder if this has any real benefit over just doing very simple HTML wireframing with highly constrained CSS, which is readily renderable for human consumption. I guess pure text makes it easier to ignore many stylistic factors, since they're harder (if not impossible) to represent. On the other hand, LLMs surely have far more training data on HTML/CSS, and I'd expect them to easily follow instructions to produce HTML/CSS for a mockup/wireframe.
It kind of makes sense if you relate it to ASCII art, which itself is very often not actually ASCII, for similar reasons. The naming evokes that concept, for me at least. Naming is hard in general; I'm sure they went with the name they thought worked best.
I agree that "TUI" is a better fit, though. But not TUI-driven development; more like TUI-driven design, followed by using the textual design as a spec (i.e. spec-driven development) to drive the GUI implementation via coding agents.