We use solvers throughout the stack at work: solvers to schedule home batteries and EVs in people's homes optimally, solvers to schedule hundreds of thousands of those homes optimally as portfolios, solvers to trade that portfolio optimally.
The EU electricity spot price is set each day in a single giant solver run; look up Euphemia for some write-ups of how that works.
Most any field with a clear goal to optimise and real money on the line will be riddled with solvers.
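To make "solver" concrete: at its simplest, scheduling a home battery against day-ahead prices is a small linear program. A minimal sketch using scipy.optimize.linprog, with invented prices and limits (real schedulers also handle discharge, efficiency, and grid constraints):

    import numpy as np
    from scipy.optimize import linprog

    prices = np.array([0.30, 0.10, 0.05, 0.25])  # EUR/kWh for four hourly slots (made up)
    max_charge_kwh = 3.0                         # per-slot charging limit
    required_kwh = 6.0                           # total energy the battery must take on

    # Decision variables: kWh charged in each slot. Minimise total cost = prices . x
    res = linprog(
        c=prices,
        A_eq=[np.ones_like(prices)],   # total charged energy must hit the target
        b_eq=[required_kwh],
        bounds=[(0.0, max_charge_kwh)] * len(prices),
    )
    print(res.x)  # charges in the cheapest slots: [0. 3. 3. 0.]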
I thought this article seemed like well articulated criticism of the hype cycle - can you be more specific what you mean? Are the results in the Apple paper incorrect?
Gary Marcus always, always says AI doesn't actually work - it's his whole thing. If he's posted a correct argument, it's a coincidence. I remember seeing him claim that real long-time AI researchers like David Chapman (who's a critic himself) were wrong any time they said anything positive.
(em-dash avoided to look less AI)
Of course, the main issue with the field is that the critics /should/ be correct. Like, LLMs shouldn't work and nobody knows why they work. But they do anyway.
So you end up with critics complaining it's "just a parrot" and then patting themselves on the back, as if inventing a parrot somehow isn't supposed to be impressive.
I don’t read GM as saying that LLMs “don’t work” in a practical sense. He acknowledges that they have useful applications. Indeed, if they didn’t work at all, why would he be advocating for regulating their use? He just doesn’t think they’re close to AGI.
Marcus has been consistently wrong over many years in predicting the (lack of) progress of current deep learning methods. Altman has been correct so far.
Marcus has made some good predictions and some bad ones. That’s usually the way with people who make specific predictions — there are no prophets.
Not sure I’d agree that SA has been any more consistently right. You can easily find examples of overconfidence from him (though he rarely says anything specific enough to count as a prediction).
You need to read everything Gary writes with his particular axe to grind in mind: neurosymbolic AI. That's his specialism, and he essentially has a chip on his shoulder about the attention probabilistic approaches like LLMs are getting, and their relative success.
You can see this in this article too.
The real question you should be asking is whether the Towers of Hanoi problem reveals a practical limitation in LLMs and LRMs or not, given that any SOTA model can write code to solve the problem and thereby solve it with tool use. Gary frames this as neurosymbolic, but I think it's a bit of a fudge.
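For concreteness, the kind of program meant here is tiny - the standard recursive solution any of these models will happily emit:

    # Classic recursive Towers of Hanoi: move n discs from source to target.
    def hanoi(n: int, source: str, target: str, spare: str) -> None:
        if n == 0:
            return
        hanoi(n - 1, source, spare, target)    # move n-1 discs out of the way
        print(f"move disc {n}: {source} -> {target}")
        hanoi(n - 1, spare, target, source)    # move them onto the target peg

    hanoi(3, "A", "C", "B")  # prints the 2**3 - 1 = 7 moves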
Hasn't the symbolic vs statistical split in AI existed for a long time? With things like Cyc growing out of the former. I'm not too familiar with linguistics but maybe this extends there too, since I think Chomsky was heavy on formal grammars over probabilistic models [1].
Must be some sort of cognitive sunk-cost fallacy: after dedicating your life to one sect, it must be emotionally hard to see the other "keep winning". Of course you'd root for them to fall.
Isn't the point of automating labour, though, to automate that which wasn't already automated?
It would draw on many previously written examples of algorithms to write the code for solving Hanoi. To solve a novel problem with tool use, you need to work sequentially while staying on task, notice where you've gone wrong, and backtrack.
I don't want to overstate the case here. I'm sure there is work where there's enough intersection with previously existing material in the dataset, and few enough sequential steps required, that useful work can be done. But I don't know how much you've tried using this stuff as a labour-saving device; there's less low-hanging fruit than one might think, but more than zero.
There are decent labour savings to be had in code generation, but under strict guidance and with examples.
There are more substantial savings to be had in research scenarios. The AI can read more and synthesize more, and faster, than I can on my own, and provide references for checking correctness.
I'm not confident enough to say that the approaches being taken now have a hard stopping point any time soon or are inherently bound to a certain complexity.
Human minds can only cope with a certain complexity too and need abstraction to chunk details into atomic units following simpler rules. Yet we've come a long way with our limited ability to cope with complexity.
What current models can automate is why they are exciting, and the paper is getting attention because of how it cuts into this excitement. It follows that the attention is somewhat misplaced.
People do - but the actual answer to your question is as you’re implying: it’s not as simple as “you get paid to consume”.
There are negative spot prices in Europe all the time - but they are usually not negative enough to make up for the grid fees and taxes. Or they are, but in countries like Germany that haven't rolled out smart meters, consumers have no way to access spot prices.
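Back-of-the-envelope illustration, with invented fee and tax figures, of why a mildly negative spot price rarely means you get paid to consume:

    spot = -0.02      # EUR/kWh, a mildly negative spot price
    grid_fees = 0.10  # EUR/kWh, hypothetical network charges
    taxes = 0.05      # EUR/kWh, hypothetical levies and VAT

    all_in = spot + grid_fees + taxes
    print(f"{all_in:.2f} EUR/kWh")  # 0.13 - still positive despite the negative spot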
Similarly in Europe: a spot market with a big single pay-as-cleared auction for every quarter-hour, and then a continuous market for the same periods closer to delivery, similar to a normal stock exchange. Millions of residential devices are traded there right now.
My experience is that this is nearly impossible; the solution is new packages written after typing was introduced.
I don’t know about SQLAlchemy, but for libraries like pandas I just don’t see how it can be done, and so people are actively replacing them with modern typed alternatives.
Ha. I just finished a huge rewrite at work from sync SQLAlchemy to async SQLAlchemy, because the async version uses a totally different API (core queries) to sync. So this implies if I want type checking I need to use a different ORM and start again?
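For what it's worth, the 2.0-style API narrows that gap: the same typed select() queries work in both sync and async sessions, and mostly the engine/await plumbing differs. A minimal sketch (the model, table, and connection string are invented; the async engine assumes the aiosqlite driver):

    from sqlalchemy import select
    from sqlalchemy.ext.asyncio import async_sessionmaker, create_async_engine
    from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

    class Base(DeclarativeBase):
        pass

    class User(Base):
        __tablename__ = "users"
        id: Mapped[int] = mapped_column(primary_key=True)
        name: Mapped[str]

    engine = create_async_engine("sqlite+aiosqlite:///example.db")
    Session = async_sessionmaker(engine)

    async def get_user(user_id: int) -> User | None:
        # The select() construct is identical in sync code; only the
        # session type and the awaits change.
        async with Session() as session:
            result = await session.execute(select(User).where(User.id == user_id))
            return result.scalar_one_or_none()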
I love how Python makes me so much faster due to its dynamic nature! Move fast, break things!
I don't agree that a dynamic nature necessarily makes things faster. If you compare Python to C or Java it's true, but if you compare it to TypeScript it is not.
With a decent typing system and a good editor that makes use of it (and AI assistants nowadays), prototyping can actually be both faster and more stable.
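A tiny sketch of what I mean - with hints, the mistake is flagged in the editor before anything runs (Order is an invented example):

    from dataclasses import dataclass

    @dataclass
    class Order:
        quantity: int
        unit_price: float

    def total(order: Order) -> float:
        return order.quantity * order.unit_price

    print(total(Order(quantity=2, unit_price=9.99)))  # 19.98
    # total({"quantity": 2, "unit_price": 9.99})  # a type checker rejects this dict-for-Order mix-up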
I feel like you can substitute almost any cutesy convenience feature in this sentiment. I wholeheartedly agree.
For software that is going to be maintained, optimising for debuggability, comprehension and minimised ball-hiding is almost always the side to err on.
My brother, a middle school teacher, was talking about TikTok yesterday. Every 2 years he gets a new batch of 10-year-olds.
They all have a “class chat”, and it is used daily for relentless cyberbullying. The trend TikTok is pushing this month is to test the boundaries of calling black kids the n-word without explicitly saying the word. There is one little black girl in his class.
He says every class is the same: horrible ideas from edgelords that TikTok's algorithms push on the kids. Relentless daily bullying. And unlike bullying on the playground or at the Boys and Girls Club, there is no realistic way for adults to intercede beyond disconnecting their kid, shutting them out of the social context entirely.
But can your brother set up a class chat that he moderates?
I'm working on a simple chat app in Go as a learning project [0]; you're welcome to use that, but honestly there are almost certainly better options out there that he can actively moderate. Maybe a WhatsApp group, or something usable through a web interface (old forum tech?).
Group chats can be nice, I'm part of several acroyoga group chats and they're lovely, probably because adults who practice acroyoga tend to be nicer than middle schoolers.
My primary issue here was actually more about TikTok - I don’t think it’s right that software engineers get rich writing code that pushes “bullying challenges” on children to increase engagement and ad sales.
But: all other things equal, if I get to pick between “10-year-olds' primary daily public forum is completely, cryptographically, devoid of any moderating adult presence whatsoever” and - what I had - “10-year-olds have privacy, but there are adults around who have a chance at picking up that things are going off the rails”, I know which one I'd choose.
There is no contradiction between “constructors cannot fail” and “fail early”; nobody is arguing the constructor should do fallible things and then hide the failure.
What you should do is perform the fallible operation outside the constructor, before you call __init__, then pass the opened file, socket, lock, or what-have-you as an argument to the constructor.
Fallible initialisation operations belong in factory functions.
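A minimal sketch of that split in Python (Connection is an invented example):

    import socket

    class Connection:
        def __init__(self, sock: socket.socket) -> None:
            # The constructor only stores an already-open resource; it cannot fail.
            self._sock = sock

        @classmethod
        def open(cls, host: str, port: int) -> "Connection":
            # The fallible work happens in the factory, before __init__ runs.
            sock = socket.create_connection((host, port))  # may raise OSError
            return cls(sock)

    conn = Connection.open("example.com", 80)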
The real problem is that constructors and factory functions are distinct in the first place. They aren't, in Rust, and it's much easier to reason about and requires far less verbiage to write.
This is the exact opposite? They explicitly encourage doing resource-opening in the __enter__ method call, and then returning the opened resource encapsulated inside an object.
Nothing about the contract encourages doing anything fallible in __init__
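A minimal sketch of that pattern (ManagedFile is an invented example):

    class ManagedFile:
        def __init__(self, path: str) -> None:
            # __init__ just records configuration; nothing fallible happens here.
            self._path = path
            self._handle = None

        def __enter__(self):
            # The fallible open happens in __enter__, per the context-manager contract.
            self._handle = open(self._path)  # may raise OSError
            return self._handle

        def __exit__(self, exc_type, exc, tb) -> None:
            if self._handle is not None:
                self._handle.close()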
I’m intrigued as well... My experience is that notebooks struggle as a format for production code. We encourage people who work heavily in notebooks to use them for exploratory work, but to choose other tools when it comes time to ship.
When you are exploring something, experimenting, showing... it’s great: train-of-thought structure, APIs like pandas optimised for writing and terseness, etc.
But when you have a piece of code that will lose a million dollars a minute if someone ships a bug, and which will be maintained by many engineers over many years, then you really want a format that’s optimised for long-term maintenance, incremental change, testability, and APIs optimised for readers.
I write production code, and I also work a lot in Jupyter notebooks.
Personally, I think the fact that notebooks are usually easier/funner for me to work with is a big problem. I'm by no means a Clojure expert, but I did do a semi-large project in Clojure a few years ago, and some of the ideas of true REPL-driven development that exist there are things I wish that Python supported.
It's hard to explain without actually learning it for real (and most Python devs mistakenly think Python has REPL-driven development; I sure did before learning Clojure!). But once you get used to being able to interact with your actual source code, and at any point just being able to write new code and immediately print out its value, then with one shortcut make it part of the regular codebase... that just blurs the distinction that exists between Jupyter Notebooks and production code in a way that makes everything much better.
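There's no exact Python equivalent, but a rough approximation of the loop is editing a module and re-evaluating it in a live session (mymodule and new_function are hypothetical):

    import importlib

    import mymodule  # hypothetical module open in your editor

    # After saving a change to mymodule.py, re-evaluate it in the running
    # session instead of restarting the process:
    importlib.reload(mymodule)
    print(mymodule.new_function(42))  # hypothetical function; inspect its value immediately

In Clojure the granularity is a single form rather than a whole module, and the editor (e.g. CIDER or Calva) sends it to the connected REPL with one keystroke.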
I'd love to hear more about this REPL-driven development. I've heard people bring it up from time to time, but it's clearly very different from the typical "stateless horizontal micro-service" that has become common practice.
What tools are used for "write new code and immediately print out its value, then with one shortcut make it part of the regular codebase" and how does that square with working on a team and getting code reviewed?