We use solvers throughout the stack at work: solvers to schedule home batteries and EVs in people's homes optimally, solvers to schedule hundreds of thousands of those homes optimally as portfolios, solvers to trade that portfolio optimally.
The EU electricity spot price is set each day in a single giant solver run; look up Euphemia for some write-ups of how that works.
Most any field with a clear goal to optimise and real money on the line will be riddled with solvers.
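To make "solver" concrete: at its simplest, scheduling a home battery against day-ahead prices is a small linear program. A minimal sketch using scipy.optimize.linprog, with invented prices and limits (real schedulers also handle discharge, efficiency, and grid constraints):

    import numpy as np
    from scipy.optimize import linprog

    prices = np.array([0.30, 0.10, 0.05, 0.25])  # EUR/kWh for four hourly slots (made up)
    max_charge_kwh = 3.0                         # per-slot charging limit
    required_kwh = 6.0                           # total energy the battery must take on

    # Decision variables: kWh charged in each slot. Minimise total cost = prices . x
    res = linprog(
        c=prices,
        A_eq=[np.ones_like(prices)],   # total charged energy must hit the target
        b_eq=[required_kwh],
        bounds=[(0.0, max_charge_kwh)] * len(prices),
    )
    print(res.x)  # charges in the cheapest slots: [0. 3. 3. 0.]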
I thought this article seemed like well articulated criticism of the hype cycle - can you be more specific what you mean? Are the results in the Apple paper incorrect?
Gary Marcus always, always says AI doesn't actually work - it's his whole thing. If he's posted a correct argument, it's a coincidence. I remember seeing him claim that real long-time AI researchers like David Chapman (who's a critic himself) were wrong any time they said anything positive.
(em-dash avoided to look less AI)
Of course, the main issue with the field is that the critics /should/ be correct. Like, LLMs shouldn't work and nobody knows why they work. But they do anyway.
So you end up with critics complaining it's "just a parrot" and then patting themselves on the back, as if inventing a parrot somehow isn't supposed to be impressive.
I don’t read GM as saying that LLMs “don’t work” in a practical sense. He acknowledges that they have useful applications. Indeed, if they didn’t work at all, why would he be advocating for regulating their use? He just doesn’t think they’re close to AGI.
Marcus has been consistently wrong over many years in predicting the (lack of) progress of current deep learning methods. Altman has been correct so far.
Marcus has made some good predictions and some bad ones. That’s usually the way with people who make specific predictions — there are no prophets.
Not sure I’d agree that SA has been any more consistently right. You can easily find examples of overconfidence from him (though he rarely says anything specific enough to count as a prediction).
You need to read everything Gary writes with his particular axe to grind in mind: neurosymbolic AI. That's his specialism, and he essentially has a chip on his shoulder about the attention probabilistic approaches like LLMs are getting, and their relative success.
You can see this in this article too.
The real question you should be asking is whether the Towers of Hanoi problem reveals a practical limitation in LLMs and LRMs or not, given that any SOTA model can write code to solve the problem and thereby solve it with tool use. Gary frames this as neurosymbolic, but I think it's a bit of a fudge.
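For concreteness, the kind of program meant here is tiny - the standard recursive solution any of these models will happily emit:

    # Classic recursive Towers of Hanoi: move n discs from source to target.
    def hanoi(n: int, source: str, target: str, spare: str) -> None:
        if n == 0:
            return
        hanoi(n - 1, source, spare, target)    # move n-1 discs out of the way
        print(f"move disc {n}: {source} -> {target}")
        hanoi(n - 1, spare, target, source)    # move them onto the target peg

    hanoi(3, "A", "C", "B")  # prints the 2**3 - 1 = 7 moves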
Hasn't the symbolic vs statistical split in AI existed for a long time? With things like Cyc growing out of the former. I'm not too familiar with linguistics but maybe this extends there too, since I think Chomsky was heavy on formal grammars over probabilistic models [1].
Must be some sort of cognitive sunk-cost fallacy: after dedicating your life to one sect, it must be emotionally hard to see the other "keep winning". Of course you'd root for them to fall.
Isn't the point of automating labour, though, to automate that which wasn't already automated?
It would draw on many previously written examples of algorithms to write the code for solving Hanoi. To solve a novel problem with tool use, you need to work sequentially while staying on task, notice where you've gone wrong, and backtrack.
I don't want to overstate the case here. I'm sure there is work where there's enough intersection with previously existing material in the dataset, and few enough sequential steps required, that useful work can be done. But I don't know how much you've tried using this stuff as a labour-saving device; there's less low-hanging fruit than one might think, but more than zero.
There are decent labour savings to be had in code generation, but under strict guidance and with examples.
There are more substantial savings to be had in research scenarios. The AI can read more and synthesize more, and faster, than I can on my own, and provide references for checking correctness.
I'm not confident enough to say that the approaches being taken now have a hard stopping point any time soon or are inherently bound to a certain complexity.
Human minds can only cope with a certain complexity too and need abstraction to chunk details into atomic units following simpler rules. Yet we've come a long way with our limited ability to cope with complexity.
What current models can automate is why they are exciting, and the paper is getting attention because of how it cuts into this excitement. It follows that the attention is somewhat misplaced.
People do - but the actual answer to your question is as you’re implying: it’s not as simple as “you get paid to consume”.
There are negative spot prices in Europe all the time - but they are usually not negative enough to make up for the grid fees and taxes. Or they are, but in countries like Germany that haven't rolled out smart meters, consumers have no way to access spot prices.
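Back-of-the-envelope illustration, with invented fee and tax figures, of why a mildly negative spot price rarely means you get paid to consume:

    spot = -0.02      # EUR/kWh, a mildly negative spot price
    grid_fees = 0.10  # EUR/kWh, hypothetical network charges
    taxes = 0.05      # EUR/kWh, hypothetical levies and VAT

    all_in = spot + grid_fees + taxes
    print(f"{all_in:.2f} EUR/kWh")  # 0.13 - still positive despite the negative spot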
Similarly in Europe: a spot market with a big single pay-as-cleared auction for every quarter-hour, and then a continuous market for the same periods closer to delivery, similar to a normal stock exchange. Millions of residential devices are traded there right now.
My experience is that this is nearly impossible; the solution is new packages written after typing was introduced.
I don’t know about SQLAlchemy, but for libraries like pandas I just don’t see how it can be done, and so people are actively replacing them with modern typed alternatives.
Ha. I just finished a huge rewrite at work from sync SQLAlchemy to async SQLAlchemy, because the async version uses a totally different API (core queries) to sync. So this implies if I want type checking I need to use a different ORM and start again?
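For what it's worth, the 2.0-style API narrows that gap: the same typed select() queries work in both sync and async sessions, and mostly the engine/await plumbing differs. A minimal sketch (the model, table, and connection string are invented; the async engine assumes the aiosqlite driver):

    from sqlalchemy import select
    from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine
    from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

    class Base(DeclarativeBase):
        pass

    class User(Base):
        __tablename__ = "users"
        id: Mapped[int] = mapped_column(primary_key=True)
        name: Mapped[str]

    engine = create_async_engine("sqlite+aiosqlite:///example.db")
    Session = async_sessionmaker(engine)

    async def get_user(user_id: int) -> User | None:
        # The select() construct is identical in sync code; only the
        # session type and the awaits change.
        async with Session() as session:
            result = await session.execute(select(User).where(User.id == user_id))
            return result.scalar_one_or_none()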
I love how Python makes me so much faster due to its dynamic nature! Move fast, break things!
I don't agree that a dynamic nature necessarily makes things faster. If you compare Python to C or Java it's true, but if you compare it to TypeScript it is not.
With a decent typing system and a good editor that makes use of it (and AI assistants nowadays), prototyping can actually be both faster and more stable.
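A tiny sketch of what I mean - with hints, the mistake is flagged in the editor before anything runs (Order is an invented example):

    from dataclasses import dataclass

    @dataclass
    class Order:
        quantity: int
        unit_price: float

    def total(order: Order) -> float:
        return order.quantity * order.unit_price

    print(total(Order(quantity=2, unit_price=9.99)))  # 19.98
    # total({"quantity": 2, "unit_price": 9.99})  # a type checker rejects this dict-for-Order mix-up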
I feel like you can substitute almost any cutesy convenience feature in this sentiment. I wholeheartedly agree.
For software that is going to be maintained, optimising for debuggability, comprehension and minimised ball-hiding is almost always the side to err on.
My brother, a middle school teacher, was talking about TikTok yesterday. Every 2 years he gets a new batch of 10-year-olds.
They all have a “class chat”, and it is used daily for relentless cyberbullying. The trend TikTok is pushing this month is to test the boundaries of calling black kids the n-word without explicitly saying the word. There is one little black girl in his class.
He says every class is the same: horrible ideas from edgelords that TikTok's algorithms push on the kids. Relentless daily bullying. And unlike bullying on the playground or at the Boys and Girls Club, there is no realistic way for adults to intercede beyond disconnecting their kid, shutting them out of the social context entirely.
But can your brother set up a class chat that he moderates?
I'm working on a simple chat app in Go as a learning project [0]; you're welcome to use that, but honestly there are almost certainly better options out there that he can actively moderate. Maybe a WhatsApp group, or something usable through a web interface (old forum tech?).
Group chats can be nice, I'm part of several acroyoga group chats and they're lovely, probably because adults who practice acroyoga tend to be nicer than middle schoolers.
My primary issue here was actually more about TikTok - I don’t think it’s right that software engineers get rich writing code that pushes “bullying challenges” on children to increase engagement and ad sales.
But: all other things equal, if I get to pick between “10-year-olds' primary daily public forum is completely, cryptographically, devoid of any moderating adult presence whatsoever” and - what I had - “10-year-olds have privacy, but there are adults around who have a chance at picking up that things are going off the rails”, I know which one I'd choose.
There is no contradiction between “constructors cannot fail” and “fail early”; nobody is arguing the constructor should do fallible things and then hide the failure.
What you should do is perform the fallible operation outside the constructor, before you call __init__, then pass the opened file, socket, lock, or what-have-you as an argument to the constructor.
Fallible initialisation operations belong in factory functions.
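A minimal sketch of that split in Python (Connection is an invented example):

    import socket

    class Connection:
        def __init__(self, sock: socket.socket) -> None:
            # The constructor only stores an already-open resource; it cannot fail.
            self._sock = sock

        @classmethod
        def open(cls, host: str, port: int) -> "Connection":
            # The fallible work happens in the factory, before __init__ runs.
            sock = socket.create_connection((host, port))  # may raise OSError
            return cls(sock)

    conn = Connection.open("example.com", 80)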
The real problem is that constructors and factory functions are distinct in the first place. They aren't, in Rust, and it's much easier to reason about and requires far less verbiage to write.
This is the exact opposite? They explicitly encourage doing resource-opening in the __enter__ method call, and then returning the opened resource encapsulated inside an object.
Nothing about the contract encourages doing anything fallible in __init__
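A minimal sketch of that pattern (ManagedFile is an invented example):

    class ManagedFile:
        def __init__(self, path: str) -> None:
            # __init__ just records configuration; nothing fallible happens here.
            self._path = path
            self._handle = None

        def __enter__(self):
            # The fallible open happens in __enter__, per the context-manager contract.
            self._handle = open(self._path)  # may raise OSError
            return self._handle

        def __exit__(self, exc_type, exc, tb) -> None:
            if self._handle is not None:
                self._handle.close()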
I’m intrigued as well... My experience is that notebooks struggle as a format for production code. We encourage people who work heavily in notebooks to use them for exploratory work, but to choose other tools when it comes time to ship.
When you are exploring something, experimenting, showing... it’s great: train-of-thought structure, APIs like pandas optimised for writing and terseness, etc.
But when you have a piece of code that will lose a million dollars a minute if someone ships a bug, and which will be maintained by many engineers over many years, then you really want a format that’s optimised for long-term maintenance, incremental change, testability, and APIs optimised for readers.
I write production code, and I also work a lot in Jupyter notebooks.
Personally, I think the fact that notebooks are usually easier/funner for me to work with is a big problem. I'm by no means a Clojure expert, but I did do a semi-large project in Clojure a few years ago, and some of the ideas of true REPL-driven development that exist there are things I wish that Python supported.
It's hard to explain without actually learning it for real (and most Python devs mistakenly think Python has REPL-driven development; I sure did before learning Clojure!). But once you get used to being able to interact with your actual source code, and at any point just being able to write new code and immediately print out its value, then with one shortcut make it part of the regular codebase... that just blurs the distinction that exists between Jupyter Notebooks and production code in a way that makes everything much better.
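There's no exact Python equivalent, but a rough approximation of the loop is editing a module and re-evaluating it in a live session (mymodule and new_function are hypothetical):

    import importlib

    import mymodule  # hypothetical module open in your editor

    # After saving a change to mymodule.py, re-evaluate it in the running
    # session instead of restarting the process:
    importlib.reload(mymodule)
    print(mymodule.new_function(42))  # hypothetical function; inspect its value immediately

In Clojure the granularity is a single form rather than a whole module, and the editor (e.g. CIDER or Calva) sends it to the connected REPL with one keystroke.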
I'd love to hear more about this REPL-driven development. I've heard people bring it up from time to time, but it's clearly very different from the typical "stateless horizontal micro-service" that has become common practice.
What tools are used for "write new code and immediately print out its value, then with one shortcut make it part of the regular codebase" and how does that square with working on a team and getting code reviewed?