Hacker News | graemefawcett's comments

Exactly. They've implemented a VM inside a transformer, turned an O(1) memory access call into O(n), optimized it down to O(log n) and wrote a post about how smart they are.

It's a nice bit of engineering, if you don't subscribe to YAGNI. If you do, you have to ask the obvious question: what capability does this deliver that wasn't available before? The only answer I've got is that someone must have been a bit chilly and couldn't figure out the thermostat.


Basically just Mad Libs - the models generate intermediate tokens that help predict a better answer based on training (RLHF & otherwise). They tend to look like "reasoning" because those tokens correlated with accepted answers during training.

Extended thinking passes are just more of the same. The entire methodology exists merely to provide additional context for the autoregression process. There is no traditional computation occurring.
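A toy sketch of that point (my own illustration; `next_token` is a hypothetical stand-in for a real model's sampler): every emitted token, "reasoning" or otherwise, simply becomes more context for the next prediction.

```python
# Toy autoregressive loop. Intermediate "thinking" tokens are not a
# separate computation; they are appended to the context and condition
# the predictions that follow, nothing more.
def next_token(context):
    # Hypothetical predictor: deterministic placeholder for illustration.
    return f"tok{len(context)}"

def generate(prompt, n):
    context = list(prompt)
    for _ in range(n):
        context.append(next_token(context))  # each token feeds back in
    return context

out = generate(["q"], 3)
```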


This is exactly how problem solving works, regardless of the substrate of cognition.

Start with "all your questions contained in randomness" -> the unconstrained solution space.

The game is whether or not you can inject enough constraints to collapse the solution space to one that can be solved before your TTL expires. In software, that's generally handled by writing efficient algorithms. With LLMs, apparently the SOTA for this is just "more data centers, 6 months, keep pulling the handle until the right tokens fall out".

Intelligence is just knowing which constraints to apply and in what order such that the search space is effectively partitioned, same thing the "reasoning" traces do. Same thing thermostats, bacteria, sorting algorithms and rivers do, given enough timescale. You can do the same thing with effective prompting.
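A toy illustration of that (my own example, not from any paper): start from the unconstrained space, then inject constraints, cheapest first, and watch the solution space collapse.

```python
from itertools import product

# Unconstrained solution space: all 3-digit tuples (1000 candidates).
space = list(product(range(10), repeat=3))

# Each constraint partitions the space; applying them in order
# collapses 1000 candidates down to a handful.
constraints = [
    lambda t: sum(t) == 12,                  # arithmetic constraint
    lambda t: t[0] < t[1] < t[2],            # strictly increasing
    lambda t: all(x % 2 == 0 for x in t),    # all even
]

for c in constraints:
    space = [t for t in space if c(t)]

# Only (0, 4, 8) and (2, 4, 6) survive.
```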

The LLM has no grounding, no experience and no context other than what is provided to it. You either need to build that or be that in order for the LLM to work effectively. Yes, the answers to all your questions are contained there. No, it's not randomness. It's probability, and that can be navigated if you know how.


You can constrain the solution space all you want, but if you don't have a method for coming up with candidate solutions that might match the constraints, you'll just be sitting there all day waiting for the machine to produce results. So intelligence is not "just knowing which constraints to apply". It is also the ability to come up with solutions within the constraints without going through a lot of trial and error...

But hey, if LLMs can go through a lot of trial and error, they might produce useful results, but that is not intelligence. It is just a highly constrained random solution generator...


I believe that's what I and the paper are both saying as well. The LLM is pure routing; the constraints are currently located elsewhere in the system. In this case, both the constraints and the motivation to perform the work are located in Knuth and his assistant.

Routing is important; it's why we keep building systems that do it faster and over more degrees of freedom. LLMs aren't intelligent on their own, but it's not because they don't have enough parameters.


Connecting them is easy: one is the math of the exchange and the other the math of the state machine.

A better question might be why no one is paying more attention to Barandes at Harvard. He's been publishing the answer to that question for a while: if you stop trying to smuggle a Markovian embedding into a non-Markovian process, you stop getting weird things like infinities at boundaries that can't be worked out from the current position alone.

But you could just dump a prompt into an LLM and pull the handle a few dozen times and see what pops out too. Maybe whip up a Claw skill or two.

Unconstrained solution space exploration is surely the way to solve the hard problems

Ask those Millennium Prize guys how well that's working out :)

Constraint engineering is all software development has ever been, or did we forget how entropy works? Someone should remind the folks chasing P=NP that the observer might need a pen to write down his answers, or are we smuggling more things in for free that change the entire game? As soon as locating the witness costs something, our poor little guy can't keep walking that hypercube forever. Can he?

Maybe 6 months and a few data centers will do it ;)


I think the biggest benefit is bandwidth more so than efficiency. This gives you multiple streams to mux and a means to control their mixing.

The biggest innovation, I think, may have been accidental: the doubly stochastic matrix implements conservation on the signal stream.

Treating the signal like the information it is as we do in any other domain is crucial for maintaining its coherence. We don't allow a network router to generate more packets than it receives for the same reason.
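A minimal sketch of that conservation property (a hand-built example, not the mHC implementation): mixing stream magnitudes through a doubly stochastic matrix preserves their total, because every column sums to 1. The router analogy holds: signal out equals signal in.

```python
# A doubly stochastic matrix: every row and every column sums to 1.
M = [
    [0.5, 0.3, 0.2],
    [0.3, 0.4, 0.3],
    [0.2, 0.3, 0.5],
]

streams = [4.0, 1.0, 5.0]  # per-stream signal magnitudes

# Mix the streams: each output is a convex combination of the inputs.
mixed = [sum(M[i][j] * streams[j] for j in range(3)) for i in range(3)]

# Column sums of 1 guarantee the total is conserved: no stream can
# "generate more packets than it receives".
total_in, total_out = sum(streams), sum(mixed)
```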


Are you aware that there are certain members of your very own species, as intelligent as you or I, who lack those qualia?

Non-standard cognitive architectures are already coherent. Even if they weren't, why do you think qualia cannot be replicated with a similar mapping from signal to semantic meaning? Are there additional dimensions we can feel that we've never talked about or, more importantly, written down?


I took that idea just as far as I could and landed here

https://zenodo.org/records/18181233

It parses the AST out of it and then has a play.

We're using it as an agentic control plane for a Fortune 500's developer platform.

Keep going with yours, you'll find it


We propose SyneState: a communication prosthetic built on DeepSeek's Manifold-Constrained Hyper-Connections (mHC) architecture. By parameterizing cross-channel mixing as a doubly-stochastic matrix constrained to the Birkhoff polytope, we can: (1) induce machine synesthesia—stable, tunable cross-modal binding between latent streams; (2) learn personalized binding matrices that approximate individual cognitive architectures; and (3) translate between compression levels—expanding high-compression encodings into explicit single-channel representations and vice versa.

For mHC implementers, SyneState is a direct application of manifold-constrained mixing to cross-modal binding. For cognitive science and clinical researchers, it offers a candidate prosthetic for the double-empathy problem: bridging communication gaps not by "fixing" either party, but by learning the translation between different cognitive compression schemes.

All required components—multi-stream residuals, Sinkhorn projection, multimodal attention heads—exist in production stacks today. This is integration work, not research.
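The Sinkhorn projection mentioned above can be sketched in a few lines. This is a minimal, dependency-free version (my own sketch, not DeepSeek's code): alternately normalize the rows and columns of a positive matrix until it sits approximately on the Birkhoff polytope, i.e. becomes approximately doubly stochastic.

```python
def sinkhorn(M, iters=200):
    """Project a positive square matrix toward the Birkhoff polytope by
    alternately normalizing rows and columns (Sinkhorn-Knopp)."""
    n = len(M)
    M = [row[:] for row in M]  # work on a copy
    for _ in range(iters):
        for i in range(n):                        # row normalization
            s = sum(M[i])
            M[i] = [x / s for x in M[i]]
        for j in range(n):                        # column normalization
            s = sum(M[i][j] for i in range(n))
            for i in range(n):
                M[i][j] /= s
    return M

# Any positive matrix converges to a doubly stochastic one.
D = sinkhorn([[1.0, 2.0], [3.0, 4.0]])
```

In a real mHC-style stack this projection would run on the learned mixing parameters each forward pass; the loop above is just the scalar version of that idea.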


Besides, if everyone could paint the Sistine Chapel, then we'd have works equivalent to the Sistine Chapel everywhere.

Why is that a problem?

That to me sounds like the opposite of a problem.

Used effectively, these tools are elevators, enhancing the capabilities of everything they touch.

Telling them to paint you a picture results in the word you envision.

Painting a picture with them is how you see mine


Is art then just the outcome? The artifact that was produced?

What are your criteria, then, for who is allowed to produce art? If allowing everyone to create it lessens its value such that it becomes worthless, there must be a cutoff.

If your goal is to ensure the continuity of human expression, limiting who is allowed to create art and narrowly defining art to great works kind of misses the point.

