
It doesn’t matter. The process is irrelevant.

Black box. Input. Output. Is the output the same as some training input?

It’s not rocket science.

The “…but humans…” argument is not relevant. This is not a human.




> It doesn’t matter. The process is irrelevant.

Wrong. Literally the whole thing about law - and especially about intellectual property law - is that the process is as relevant as, if not more relevant than, the outcome. This is why "code as law" efforts are plain suicidal. This is why you can't just print out a hex dump of a pirated MP3 file and claim it's not a copyright violation because it's just a long number that your RNG spat out - it would have been a good argument if your RNG actually had done that, but it didn't, and that's what matters.
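A concrete way to see the "just a long number" point - a minimal Python sketch, with "song.mp3" as a purely hypothetical file path: the bytes are the same number whether they came from ripping a CD or from an RNG, which is exactly why provenance, not the number itself, is what the law looks at.

    # Sketch only: any file's contents can be read as one big number.
    # "song.mp3" is a hypothetical path used purely for illustration.
    with open("song.mp3", "rb") as f:
        data = f.read()

    as_hex = data.hex()                    # the file as a hex-dump string
    as_int = int.from_bytes(data, "big")   # the same file as a single integer

    # The number is identical whether these bytes were copied from a rip or
    # (astronomically improbably) emitted by an RNG; only the provenance differs.
    print(len(as_hex), as_int.bit_length())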

This is what it means when we say that, for just about everyone except computer scientists, bits have colour[0]. Lawyers and regulators - and even ordinary people - track the provenance of things; how something came to be matters as much as, and often more than, what the thing itself is.

This is what makes the generative AI case interesting. It sits right in the middle between two extremes: machines xeroxing their inputs, and humans engaging in creative work. We call the former "copying" and the latter "innovation" or "invention" or "creativity", and the two have vastly different legal implications. Generative AI is forcing us to formalize what actually makes them different, because until now we didn't have a clear answer - we never needed one.

--

[0] - https://ansuz.sooke.bc.ca/entry/23


The Pirate Bay founders made that kind of process argument and lost fairly big. They argued that the prosecutor first had to prove that a copy had been made, and prosecute that, before they could argue that The Pirate Bay had somehow helped with that crime.

The court did not agree. They looked instead towards an anti-biker gang law that illustrated that a biker bar can be found guilty of assisting with gang crime, even if no specific crime can be directly associated with the bar.

The defense team's argument - that prosecutors needed to prove a crime had actually occurred - failed. The court only required that the opposite not be believable, which, given all the facts of the case, was deemed sufficient. On that question the process doesn't matter: if the court does not find it believable that no copying occurred, any argument about "machines xeroxing their inputs and humans engaging in creative work" will be ignored.


I wonder who had more money, PB or the RIAA.


But if I put a human in the black box, that somehow now matters to your argument, because you're saying that it only holds for machines.


I don’t care if there’s a human in the box. If the box spits out training input as output it is copying it.

It doesn’t matter if there is a human doing it or not.

For your supposition to work, the input to the box would have to be only an abstract summary of the logical steps, and the output an exact copy of some other thing that was never an input.

In that case, yes, it would not be copying.

...but is that the case? Hm? Is that what we care about? Is it possible to randomly generate the exact sequence of text - comments and all - with no matching training input?

It seems fabulously unlikely, to the point of being totally absurd.
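A minimal sketch of the black-box test being described, assuming you actually have the training corpus on hand to compare against; "training_docs" and the 50-character run length are illustrative assumptions, not anything a real system exposes.

    # Sketch: flag an output that reproduces a long verbatim run from any training input.
    MIN_RUN = 50  # characters of verbatim overlap we treat as "copying" (arbitrary choice)

    def looks_copied(output: str, training_docs: list[str], min_run: int = MIN_RUN) -> bool:
        # Slide a window over the output and check whether any chunk appears
        # verbatim in any training document.
        if len(output) < min_run:
            return False
        for start in range(len(output) - min_run + 1):
            chunk = output[start:start + min_run]
            if any(chunk in doc for doc in training_docs):
                return True
        return False

    # Usage: looks_copied(model_output, corpus) -> True means the box emitted
    # a training input (or a long stretch of one) verbatim.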


I'm a proponent of not restricting (well, of not trying to restrict) machine learning models and of not treating them as a lossy database, but it must be said here: if humans can recreate copyrighted works from memory and publish them, they are in trouble too.


I agree. I'm not saying that machines don't produce copies of existing data; I'm saying that's not all they produce.


I agree. The problem is that a human has ethical deterrents against copying data while a machine doesn't, so we have to rely purely on legal incentives to keep copies from being produced.


I think the best argument here is that having the work in memory is not illegal, and human brains are not bound by copyright even though they, too, can be considered lossy databases. The question is where we draw the line for a lossy database.
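One way to picture that line-drawing problem: a similarity score with an arbitrary cutoff. A rough sketch using Python's standard difflib; the 0.9 threshold is a made-up number, which is rather the point.

    # Sketch: a "lossy copy" test reduces to picking a similarity threshold.
    from difflib import SequenceMatcher

    def is_lossy_copy(candidate: str, original: str, threshold: float = 0.9) -> bool:
        # ratio() is close to 1.0 for near-identical text, close to 0.0 for unrelated text.
        return SequenceMatcher(None, candidate, original).ratio() >= threshold

    # Whatever threshold we pick is a policy choice dressed up as a number -
    # which is exactly the line we haven't drawn yet.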


If you transcribe a copyrighted book by hand, that doesn't give you the right to publish it. I don't think being a human currently gives you a legal loophole to copy works, so why make the comparison?


Humans (and their creativity) have special status and privileges in law.

Machines don't. It doesn't matter how fancy they are. The law doesn't care.

So yes, a human in a black box is different from a machine in a black box until laws change.



