Don Knuth's MIP, 64 years later (nathanbrixius.wordpress.com)
80 points by todsacerdoti on May 25, 2024 | hide | past | favorite | 55 comments


Not explicitly stated in the article but the main contributor to the speed-up is not the hardware but the software. Solvers have improved immensely.


Ray Hettinger did a nice talk (unsurprisingly) on using modern solvers in Python.

Raymond Hettinger - Modern solvers: Problems well-defined are problems solved - PyCon 2019

https://www.youtube.com/watch?v=_GP9OpZPUYc
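To illustrate the point about modern solvers in Python, here is a minimal sketch using SciPy's `milp` wrapper around the HiGHS solver. The tiny model below uses made-up toy coefficients (it is not Knuth's model from the article, nor an example from the talk) — it just shows how little code a modern MIP solve takes:

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, milp

# Toy integer program with made-up coefficients:
#   maximize   3x + 2y
#   subject to  x +  y <= 4
#               x + 3y <= 6
#               x, y >= 0 and integer
c = np.array([-3.0, -2.0])  # milp minimizes, so negate to maximize
constraints = LinearConstraint(np.array([[1, 1], [1, 3]]), ub=[4, 6])

res = milp(
    c,
    constraints=constraints,
    integrality=[1, 1],          # 1 marks a variable as integer
    bounds=Bounds(0, np.inf),    # nonnegative variables
)

print(res.x, -res.fun)  # optimal point and maximized objective
```

Under the hood this calls HiGHS, one of the modern open-source solvers the parent comment alludes to; swapping in a commercial solver via a modeling layer like PuLP or python-mip is a one-line change.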


Off-topic: SPARCstation 5! I have 40 of them and 15 E450s, all working. Such lovely machines. Unbreakable.


I remember a friend showing me photos from someone who shot a sparcstation and proved they were literally bulletproof. I can’t find them now though.


I used 5s early in my career, and I can't recall a single failure. I very much miss Sun hardware.


Why do you have 40 Sparcstation 5s? Very curious.


I am a hoarder of things that work forever. I have 1000s of old computers that all work. I love this tech that I can solder… oldest one is from 1963.


Oh man, err, person, that reminds me that I need to fix my SparcStation Voyager, whose 2.5" SCSI drive broke. I do have a 2.5" SCSI2SD already, and I also have install media, but it always seems like too much effort. Lovely machine, though.

Anyhow, in the late 90s I was given an SS10 (?) clone made by Axil, which was my main driver for a year or so, running RedHat for Sparc.


Do you have a list of such "bulletproof" computers you'd recommend for a rainy-day project?


“ I asked ChatGPT to render an image of Gomory and Knuth pensively sitting atop a mainframe. It refused for privacy reasons. Here is the best it could do:”

Why? This took me right out of the article. No one needs this energy.


"Please don't pick the most provocative thing in an article or post to complain about in the thread. Find something interesting to respond to instead."

https://news.ycombinator.com/newsguidelines.html


What bothers me about this use case is that AI will generate the most generic middle-of-the-road illustration that adds no insight or value to the prompt. I’d get a better kick if he just wrote a description like “I can only imagine them debugging this while sitting beside a room-sized mainframe with a plethora of wires and bulbs”


Completely agreed. Instantly closed the tab without looking at the article when I saw the distinctly AI rendered image.


Shame, if you'd kept reading you would've been enlightened by the author's shocking discovery that modern computers can solve problems orders of magnitude faster than computers from the 60s or 90s. Just mind-blowing stuff, who'da think it


really? I find AI art perhaps the only non-obnoxious use case.


I find the idea of “AI art” ok, but something about Dall-E 3 specifically just riles me up.

It seems like OpenAI went out of their way to make everything generated look a certain way, while that’s clearly not an inherent limitation of image-generation models in general.

It makes every single blog look like they have the same illustrator, which I really dislike.


So many non-AI article illustrations are just Alegria anyway.


I call it Corporate Memphis 2.0


I don't know if anyone "needs" this energy, but I liked it. Just as I did when I was 6, I enjoy having illustrations when I'm reading, and am generally ok with them being AI-generated; it's better than using random stock photos.


AI generated imagery is usually imbued with inaccuracies and errors. Not to mention the ethical problems of using artwork without consent for training the AI.

So, it might be worse than a random stock photo in those aspects.


> imbued with inaccuracies and errors

Which this one certainly was. Tape drives don't look like that nor does the arrangement make any sense.


“Good artists copy; great artists steal” - Pablo Picasso. By this standard AI must be an amazing artist.


See my comment above. People say that Picasso said that (because Steve Jobs used to quote it as Picasso's), but there's no evidence he did. It seems it was actually T.S. Eliot originally, and then that was "stolen" by Igor Stravinsky, who was unquestionably a great artist but was saying it about Anton von Webern.

https://quoteinvestigator.com/2013/03/06/artists-steal/


Great artists steal, as in, make things their own. The computer programs can't extrapolate without producing artefact-dominated visual noise: they can only interpolate ("copy"). (Besides, you're affirming the consequent.)


I'll let the courts decide the legality of the artwork, I can't myself see any ethical problems. I agree with the parent that an AI-generated image is a lot better than a stock one and enhances the material as long as a human is able to validate that the image is appropriate for the context.

In this case, I'm glad that the image was included. I was about to close the tab but the image caught my attention, and I kept going.


I do not trust the corrupt courts of the United States to decide on anything. Or, more charitably even if the blatant exploitation of artists can't be proven to breach any existing statute -- and the law is always behind -- even then I will shorten this obvious truism as "all AI art is stolen art".


Why not go one step further? All art is stolen art.


Maybe because that’s misleading and not very useful? There are different kinds of stealing, and the platitude “all art is stolen” conflates them. The kind you’re referring to is stealing ideas, and is not illegal. The kind parent is referring to is what copyright is designed for, and it happens to be illegal. It’s helpful to distinguish between these two kinds of stealing, because one can be good for society and art and business, and the other not so much.

The historical context of the idea that “all art is stolen” is intentional hyperbole; it’s meant to be provocative and a little bit humorous, and not meant to be taken absolutely literally like you are implicitly suggesting. It does not mean literally that thieves walked out with the paintings, nor does it mean that art progressed because people were hand-copying the Mona Lisa. It was referring to varying degrees of using other people’s art as inspiration, borrowing subject matter or composition, mimicking techniques, etc.

None of this is what AI does. AI is trained on fixations (to borrow the copyright term) and can only produce remixes of fixations. It can’t do the kind of stealing that T.S. Eliot was talking about. It doesn’t steal ideas, it only steals pixels.


I disagree that it's what copyright is designed for, and I disagree that it's illegal, and I also disagree that it's stealing - or in the sense of the phrase, I disagree that it's a different kind of stealing. I think the sense in which AI steals is exactly and only the sense in which artists steal.

The historical context is taking others' art into your own cognition and building upon it, adding your own spin. That is exactly what AI does.

The idea that AI can only produce remixes - putting aside the battles that were fought in the art world over the validity of remixes - is simply false. It was never true! It's an urban legend that's retold and retold because it'd be very convenient for certain arguments if it were true. But it just isn't.

I seriously recommend that you spend a week's worth of free time just playing around with SDXL, in a UI that's capable of inpainting. IMO, the quickest way to realize this idea is false is to actually engage with the system.


What is the “it” you are announcing disagreement with? You deny that breaking copyright is illegal? Or what are you trying to say exactly? It’s not an opinion that photocopying art is illegal and borrowing ideas is not, that is codified into law. Declaring disagreement with that idea seems silly to me, so I assume you mean something else, but it’s not clear what you mean. This isn’t a convincing argument, it seems like hyperbole to me.

The authors of copyright laws, and the Berne Convention, did happen to write down what they intended, and they do make an actual distinction between stealing fixations and stealing ideas, and they clarified that stealing ideas is not covered by copyright laws. There are other IP laws that cover some forms of idea stealing.

> The idea that AI can only produce remixes […] is simply false

Why do you believe that? We know exactly how it’s built, we know what the algorithms do, and there’s nothing in there capable of original thought. It only interpolates, it does not extrapolate. AI is trained on examples, it builds a Markov model that reproduces what it learned, statistically, and locally reproduces what it was shown. We can prove it without engaging with the system, because we (humans) built the system. It’s deterministic and mechanical and repeatable. What exactly do you claim it’s doing beyond this? Be specific.


I don't think training is photocopying, and I don't think it's breaking copyright.

> Why do you believe that? We know exactly how it’s built

correct

> we know what the algorithms do

Dubiously correct, we know their function but not the purpose of the particular individual patterns learnt, see interpretability research

> and there’s nothing in there capable of original thought.

Incorrect, we have no idea what it takes to be capable of "original thought", the concept isn't even defined, we don't even know if humans do it. It's entirely possible that networks are capable of original thought, in fact there's a lot of evidence for it, like how if you put them in situations that aren't covered in the training they can often handle it fine.

> it builds a Markov model

Okay, do you know how these networks work? No, that's complete nonsense; Markov models have nothing to do with this. For one, LLMs are precisely not "local", because of the long-range attention mechanism. Even architectures like RWKV, which "just" use persistent state, can express correlations far longer than any Markov chain ever built. A Markov chain covering the sorts of correlation distances you get in LLMs would need exponentially more states than there are atoms in the universe, which demonstrates that what's going on cannot be modelled as a Markov process without torturing the concept into incoherence.

> It's deterministic

Sure, it's deterministic plus a random noise factor. Anything in the world is deterministic plus a random noise factor, us included. If you perfectly recorded the thermal noise in a human brain you could deterministically repeat its output forever.

Imo this view is just "well the network can't be doing the same thing we're doing, it's too simple. I'm not that simple am I?" Yeah sorry, the human brain isn't that complicated.


You’re moving your own goal posts. There are different kinds of stealing even when only humans are doing all the stealing. It seems really funny to try to argue that all art is stolen, and that AI is not stealing, in the same breath.

> I don't think training is photocopying, and I don't think it's breaking copyright.

That’s a head in the sand position to take. There are thousands of examples, and dozens of lawsuits that demonstrate that literal copyright-breaking photocopying sometimes happens, I’ve seen it myself with Stable Diffusion and ChatGPT both. OpenAI is seeking copyright exemption for their models, which implicitly and openly admits they’re breaking current laws, and want the laws to change in their favor.

Even when output isn’t obviously breaking copyright, still, we know a-priori that we built a copy-and-remix machine by design. You can carry on arguing with everything I say, and that doesn’t change the fact that NNs are built with the intent to statistically reproduce their training data.

Side note that breaking copyright law does not require that the output is recognizable as a copy of one or more inputs. US Copyright Law specifies that one may not acquire and consume copyrighted material without permission. Some models are trained on legally acquired datasets, but many are not. Whether AI is stealing depends on the specific model and training data, so a blanket disagreement that no laws are broken is obviously incorrect.

> we have no idea what it takes to be capable of “original thought”, the concept isn’t even defined

May be true, but we do know for a fact that today’s neural network aren’t it, regardless of what original thought takes. Maybe we’ll crack the code in the future, and maybe today’s AI has started down the right track. Or, maybe not.

> plus a random noise factor. Anything in the world is deterministic plus a random noise factor, us included. If you perfectly recorded the thermal noise in a human brain you could deterministically repeat its output forever.

Maybe, but this hasn’t been proven or shown to be true, so is speculation. This armchair physics idea is hundreds of years old. Anyway, today’s AI is not deterministic plus a random noise factor, it’s just purely deterministic.

> Yeah sorry, the human brain isn’t that complicated.

Haha, the hubris. Prove it!

BTW this is straw man; I didn’t make any statements about the brain and I wasn’t comparing today’s AI to the human brain. Whether AI is stealing pixels does not depend on comparisons to humans or answering existential questions about consciousness.


> That’s a head in the sand position to take. There are thousands of examples, and dozens of lawsuits that demonstrate that literal copyright-breaking photocopying sometimes happens, I’ve seen it myself with Stable Diffusion and ChatGPT both.

Sure, when a picture appears in the corpus hundreds of times. I guarantee you a human artist could reproduce these pictures with the same level of fidelity, because they are cultural cliches. We choose not to, and when we don't, we do not violate copyright. So who commits the crime here? The model didn't decide to reproduce that image spontaneously, it was told to. The broad majority of its output cannot be shown to violate copyright, even with great effort. When a paper comes out with a model that violates copyright, always you see the prompt egging it on. That the model cannot tell that it is reproducing the input distribution a little too well, doesn't mean that's its default mode.

Besides, I suspect you don't want to rest your argument on a problem that can be fixed with the most basic effort put into dataset cleaning.

(Also, this only happened with some networks, as I remember the paper. I suspect it's a matter of training quality.)

So there are two questions: do companies violate copyright when training models, and do models violate copyright when generating images? I disagree with the former universally, and with the latter almost always.

> Even when output isn’t obviously breaking copyright, still, we know a-priori that we built a copy-and-remix machine by design.

Well, I just don't think this is true.

> statistically reproduce their training data.

Yes, on a high-dimensional abstract manifold, plus random noise. Again, at that point you've thrown enough divergences in that I genuinely don't see how humans aren't "mix-and-match machines" by this metric. Stop calling models names and make a concrete claim.

> we do know for a fact that today’s neural network aren’t it

Yes well I think they are it, or at least of the same kind. Show evidence.

> Maybe, but this hasn’t been proven or shown to be true, so is speculation.

Presumption of innocence still applies, even to code. I have my experience, you have your experience. In my time using models, I have used them to produce samples that were well outside the training distribution. They have managed this with no great difficulty. A mix-and-match machine would, by definition, not be capable of this, unless you use the concept so generically that it could apply to any computable function.


> In my time using models, I have used them to produce samples that were well outside the training distribution

No you haven’t, because it’s not yet possible. Either you didn’t see the training data, or you don’t know how neural networks work, or you’re anthropomorphizing and drinking AI kool aid.


Or you're wrong.

I mean, I'm just saying. I know what I saw. You're asserting I didn't see what I thought I did, because it's a problem for your argument. That's not very convincing though.

edit: Let me clarify, obviously the LLMs/diffusion models I've used can only apply the patterns that they have happened to have learnt. However, due to the high level of abstraction these systems are capable of, the concept of a "pattern" can be very abstract, and even include such things as "recognizing a pattern in the input" as an abstract pattern itself. As such, at the least the output is not constrained to some sort of trivial interpolation of the training set, but a deep nonlinear function of the training set, that has even partially fed back on itself. Again, arguably, just like us.


Because that makes no damn sense.


There's literally a saying that explicitly spells out "great artists steal." Art is imitation plus adaptation, and everybody used to know this, artists included, until they all spontaneously forgot about a year ago, around the time when the artificial imitation machines started becoming commercial threats.


Yes. That saying probably[1] started life with a quote from T.S. Eliot, who said

"Immature poets imitate; mature poets steal; bad poets deface what they take, and good poets make it into something better, or at least something different."

The composer Igor Stravinsky "took" that and changed it to “A good composer does not imitate; he steals.” After that quote was further altered in the LaTeX documentation (believe it or not), Steve Jobs took it and changed it into Pablo Picasso saying "Lesser artists copy, great artists steal".

There doesn't seem to be any evidence of Picasso saying it, although Steve Jobs frequently attributed it to him.

[1] https://quoteinvestigator.com/2013/03/06/artists-steal/


It’s amusing that the sequence of events in that story makes a great example of prior art getting regurgitated and altered by humans. If he were an AI bot, we would say Steve Jobs “hallucinated” a fact about Picasso.


Yeah. Still there is a human mind and effort between the original and the copy or we wouldn't care that much. That's the fundamental difference.

Also https://news.ycombinator.com/item?id=40475364


Sure so call it dehumanized art, but don't call it stolen art just because that sounds more severe.

It's just the '24 version of "You wouldn't download a car."


If you go to art school, what you do is study and analyze what artists have done before you. We humans learn much in the same way as an AI. We study what has been made by others.

I think calling AI-generated work theft is based on the wrong assumptions.

It's not just art. What do you do in computer science courses? Or math, language, physics, or any other subject? You study what has already been made. What you produce based on that knowledge is not considered theft.

Have you ever seen an AI image that, if a human had made it, would be considered theft?


Human art can exist without anything prior. For example, I highly doubt the creators of cave paintings traveled the world to see other such work before doing theirs. But you could look at impressionism or pointillism for that matter.

AI can not.

Indeed, the problem is the utter unoriginality of the current wave of AI. Literally the only thing it can do is to produce the next token which is the most likely to be in the training set.


I doubt the creators of cave paintings stayed their entire lives in that cave. One presumes they saw animals.

Could an agentic AI generate a cave painting if you restricted it to training data of animals and also limited its ability to painting with primitive tools? Sure, why not?


What is this ‘agentic AI’ you speak of? There’s no such thing to date. There is no AI yet that can train on pictures of animals and then decide to produce cave paintings. If you limit the output to be produced only with primitive tools, then the AI is not agentic, you’re just making a fixed-function paint machine.


There is no such thing, but cave paintings are also not relevant to what people do with diffusion AI.

My argument is that art that diverges from reality is largely a function of path dependence and tooling more so than creative choice.


>If you go to art school, what you do is study and analyze what artists have done before you.

You don't have to go to art school to produce art. Even if you do go, there's nothing stopping you from producing art unlike what you studied.

>We humans learn much in the same way as an AI. We study what has been made by others. I think accusing AI generated work to theft, is a based on the wrong assumptions.

What assumptions?

>It's not just art. What do you do in computer science courses? Or math, language, physics, or any other subject? You study what has already been made.

Most of "what has already been made" is not "copyrighted work for which you have no license or exception"

>What you produce based on that knowledge is not considered theft.

It certainly can be.

>Have you ever seen an AI image that, if a human made it, would be considered a theft?

I've certainly seen AI produce things that would be considered copyright violation or misappropriation of likeness or similar offences if they were to be used commercially.


> You don't have to go to art school to produce art. Even if you do go, there's nothing stopping you from producing art unlike what you studied.

Sure, but also you do, presumably, have to see to paint. The overwhelming majority of our training corpus is just visual observation.


What are you trying to suggest is the significance of that?


The point is that nobody, not humans, not AI, develops painting skills in the absence of image data. Humans can learn to paint with less input images than AI, sure, we're a lot more sample efficient, but we still need samples. The processes are not that different.


Okay, but that's not really relevant. The whole issue is about the way copyrighted works are used.


I find the same. Even though the AI image has lots of errors, with the right instructions it usually carries the author's intent far better than any stock photo that can be found in a reasonable amount of time.


Maybe you should install an image-generating reader-plugin then.


I'd actually love to have one - is there a good one?


It worked for me. Trying to share it here but there is an issue with the share API. Will reply to this message or edit it once the sharing is working.



