Anything with a linter means, at minimum, free verifiable rewards for RL (though whether something merely parses versus is actually good code is another story). On top of that, they have more data than anyone, and it seems plausible that stronger models can learn more from a given instance or set of examples.
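To make the "parses versus looks good" distinction concrete, here's a minimal sketch of the cheapest possible verifiable reward: a binary signal from the parser alone. The function name is my own invention, and a real pipeline would layer lint rules and tests on top of this.

```python
import ast

def parse_reward(code: str) -> float:
    """Binary reward: 1.0 if the snippet is syntactically valid Python, else 0.0.

    This says nothing about whether the code is *good* -- only that it parses.
    """
    try:
        ast.parse(code)
        return 1.0
    except SyntaxError:
        return 0.0

# A syntactically valid snippet earns the reward...
print(parse_reward("x = 1 + 2"))   # 1.0
# ...while a broken one does not, even if it "looks close".
print(parse_reward("def f(:"))     # 0.0
```

The point is that this signal is fully automatic and unambiguous, which is exactly what makes code such an attractive RL domain, and also exactly why it can reward code that parses but is still bad.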
In this case, what the research says is that the approximations we have already been using for a long time are correct. "You're already right, keep doing what you're doing!" is not generally something people consider a "practical application".
I’m imagining a timeline where dolphins accepted long ago that humans are intelligent because of our external manifestations of technology, and consequently developed a much more sophisticated philosophy than ours, having been forced to understand paradigms well outside their personal experience.
Soon, through dolphin-Gemma, we will find out that they have been talking to aliens for centuries: the aliens tried to talk to us but failed due to our (and their) philosophical myopia, while the dolphins, with their non-circular philosophical understanding, recognized the alien communications and started the conversation before we finished the pyramids.
This seems like the type of work that, whether or not it is itself profound (and I believe it is), will accelerate the emergence of derivative works that definitely will be.
Yeah, I had the same uncanny valley feeling when I read this phrase: "carrying centuries of Korean culture in every jar". What's next: recommending that we "delve" into the history of soy sauce production?
Nah, not a bot :) Maybe I leaned a little poetic there, but I do think there's something powerful about how much history and identity can be packed into something as humble as soy sauce...
I think this is really about the deliberate decision to spend longer with each sensation than you 'normally' would, relative to your own individual baseline. Basically, whether this is a tool in the person's metacognitive toolkit that they typically use or not. Like 'the feeling of the touch of a thought' on the mind itself, as opposed to a linear pre-simulation of a performance to play out loud. Or the difference between making what you say out loud a cheap, low-entropy, lossy encoding of all that you're thinking, versus gathering all the aspects quietly and needing to find a way to linearize all that incidental complexity.
If you want a hard problem that's far from predictable but has tangible utility, why don't you work on something that helps humans with human problems, where you can still use your talents? Some research that could help understand a disease, or a new kind of medical technology, or something like that. I don't know how well it aligns with your salary goals, but it might be meaningful.