There seems to be a precedent with tech companies that if you do a lot of illegal stuff in the beginning, you can count on either failing before it matters or buying your way out of the consequences. Bonus points if you induce regulatory capture to make sure no one else can follow in your footsteps.
Please HN do your thing and prove me wrong or tell me the benefits to society outweigh the cons, because otherwise it's a depressing take.
Well, the fruit of it, Llama 3, is public. Maybe it comes with some sort of license, but considering how it was made, I wouldn't feel guilty about violating it.
Then again, I would happily train a model on Anna's Archive myself without feeling guilty about it, if I had the resources to do so.
I think it ranks very, very low on the list of bad things Facebook has done.
The similarities to Google Books are interesting. Google created Books based on their own scans of books, which publishers considered piracy.
The legal battle initiated by the Authors Guild was vicious and it damaged and hampered Books and other entities and scanning projects like the Internet Archive.
Now Facebook pirates all the world's books, uses them without paying authors or publishers, and seemingly faces no consequences.
I would strongly prefer to live in a world where we could easily search and access all books rather than one where the richest guys get to exploit them without consequences.
> Well, the fruit of it, Llama 3, is public. Maybe it comes with some sort of license, but considering how it was made, I wouldn't feel guilty about violating it.
Facebook would quite literally sue you until you killed yourself if you did that. It’s not the same rules for them and for us.
I think you are right on all counts, but it doesn't make me depressed, it's just the way humanity moves forward.
Since DeepSeek, it's clear that regulatory capture just doesn't work in this case, only pure competition. If the USA doesn't let NVIDIA sell GPUs to the world, Huawei will be happy to in a few years.
It is the golden rule: whoever has the gold rules. Laws are for mortals, while big tech can do whatever they please while channeling their masculine energy. Basically, behave in a self-serving manner, violating basic norms such as "don't do unto others what you don't want done to you."
As content gets polluted with ai-garbage, human-generated content will regain value.
Here is a startup idea: a startup that protects creators' content from violations. For a nominal subscription fee, ensure content is not ripped off on major social platforms and by AI. Track regulations in different jurisdictions, and keep a legal team on the beat. Sue big tech for damages when violations are found, i.e., police for the creators.
I think it is a good thing to undermine copyright by creating various carve-outs that will eventually make it untenable to maintain, which is probably one of the faster ways to reform the obsolete system.
Piracy doesn't care about borders, and if US firms don't do it, Chinese firms will. It is a national competitiveness issue in that sense, which is the easiest argument to get the government to do things to weaken copyright.
Laying the groundwork for the eventual defeat of MAFIAA is a great benefit to society if it happens, and in my opinion outweighs the nonexistent "damage" they did. There wouldn't be Llama if they didn't pirate the books, and the authors won't get paid anything either. Bonus points if we can get rid of DRM anti-circumvention as well.
For what it's worth Facebook doesn't seem to be doing the regulatory capture part, unlike "Open"AI and Anthropic.
Often it happens because there is a lot of market demand for the thing they are doing, even if there are regulations against it. You get Uber instead of normal taxis because the taxi service was no good; in this case, you get informed AI instead of the information being tied up in court battles for years. So the benefits to society often do outweigh the cons, though not necessarily in every case.
No, you're just right, and you should feel upset about it. Think about the systems and incentives that create this outcome. Think about what needs to change for better outcomes to be possible. Find people and work towards those goals.
The benefits to society outweigh the cons. If AI and copyright are really destined to fight each other -- an exercise in question-begging if there ever was one -- then copyright must lose.
If only because any other outcome will radically empower corporations (and entire countries!) that don't GAF about copyright.
I can't prove you wrong and I don't think there is any benefit to society when a company with a black box doesn't give back to the community (aka humanity as a whole) and doesn't pay anyone. Copyright may be broken, but the only realistic way to encourage creators is to give them money. We don't live in a utopia, and people need money to eat.
Also, in France and as an individual, if I were to openly torrent all the content of Libgen or Anna's Archive, I would have to spend the rest of my life in jail whereas Zuck can enjoy having more billions with the same behavior.
Some kind of rant: The world is truly fucked right now if such behaviors are rewarded, but at my grand old age of almost 50, I'm starting to move away from all this and it's relaxing: I live a simpler life, I buy from small creators, and I learn whatever I want. I have also gained a lot of time that I can now spend by helping people around me. Life can be fun if you ignore that you'll never be in control of all the bad behaviors around you.
I have found throughout my life that the best gifts were ones that required me to give PII to the giver, and sign a binding agreement that legally restricts my use of said gift.
Bonus points if you harass me about your newsletter.
This is obviously annoying, but I suspect most people use them by downloading them off HF, Ollama, or similar sites which require no agreement. I also wonder why there is so much attention put on Facebook rather than OpenAI for example. Facebook is at least giving the weights to the public.
> who imports copies or phonorecords into the United States in violation of section 602, is an infringer of the copyright or right of the author, as the case may be.
Copying itself is infringing. OC has no clue what they’re talking about.
Read the law and stop being pedantic. The phrase is "or who imports..."
The intent is clear and the penalties are clear and with a few keypresses even the most irrational among us can balance how we wish things were versus how things are.
Given an informed base, you can contemplate how to effect the change we all want instead of just posting misinformation in chat rooms.
> Bulk downloading is often done with BitTorrent, the file-sharing protocol popular with pirates for its anonymity, and downloading with BitTorrent typically involves uploading to other users simultaneously.
Isn't this a twofold misunderstanding of BitTorrent? I haven't used it much, but I've never believed BitTorrent to be popular for anonymity (is it even truly anonymous?); I thought it was popular because it makes downloading go faster by reducing bottlenecks. Also, choosing not to seed a file is extremely simple in every torrent client I've seen, so it seems a bit of a leap to conclude that Meta seeded the pirated books just because the protocol supports it.
I'm far more concerned about the simple fact that they downloaded pirated books at all than I am about the protocol they used to do it.
> [D]ownloading with BitTorrent typically involves uploading to other users simultaneously.
The key word here is "typically". Mutual sharing is designed into the protocol itself, particularly in the early seeding period, where higher priority is given to peers that re-share their portions of the torrent.
Yes, you can turn off uploads, but that's not the default in any client I've ever used. So I don't see anything wrong with saying the typical user will re-upload files.
But Meta specifically says they took precautions to not do that, and in every client I've seen the only thing they would have to do is make sure a single checkbox which is prominently visible on the main download modal isn't checked!
I'm not objecting to the use of the word "typically", I'm objecting to the explicit suggestion that Meta might have seeded the pirated works. It's unlikely and unnecessary to suggest—there's plenty else wrong with this picture so there was no need to include this idea, it's just a distraction from the real problem.
My take is pretty depressing for a different reason -
I understand the outrage against Mark Zuckerberg for nurturing such a culture and making the executive decision, but also understand that at least one engineer was involved in writing and executing the code that does the piracy (along with product managers and other cross-functional employees).
And given the importance and visibility of the work, it's pretty obvious that this person wouldn't be a low-level engineer either (I'd assume they earn about 1 million dollars a year, IC7+ at Meta).
Now comes the depressing statement: a solid engineer, who probably doesn't even need to work another day in their life, is being a puppet of Mark Zuckerberg and robbing creators.
That's a very interesting and refreshing take. But that's the culture that has been nurtured in the capitalist present - climb up to the top and push the ladder behind you.
> In a message found in another legal filing, a director of engineering noted another downside to this approach: “The problem is that people don’t realize that if we license one single book, we won’t be able to lean into fair use strategy”
I hadn't heard that idea before. Any IP law experts able to give useful context on that one?
As Eric Schmidt put it in his infamous conference at Stanford: "It doesn't matter if you steal at first, you will only have to get the lawyers to clean the mess up."
This is one of the reasons there should simply be a big tax on AI profits, assuming they come to pass: they literally require uncompensated training on the sum of human intellectual effort. There is no AI without standing on the shoulders of both giants and thousands of everyday authors.
If you're just giving the result back to humanity, OK, there's a case for it being a fair trade, though there's still the question of how to handle the disruption of the long tradition of valuing and compensating labor.
If you're using it to win capitalism, a 50% tax seems like the starting stakes. Sure, lots of expertise and resources go into the data processing to produce a model that can talk to you, but without the data, the processing doesn't matter (and without the model, we could still do this collectively).
I'm pretty sure Facebook/Meta isn't earning any profits from this. They are mostly doing it to "commoditize their complement" and probably prevent the rise of another company to the Big Tech status.
It could be structured to reward living authors; mechanical licensing has worked this way.
But even if it wasn't -- even if it couldn't be -- you could use it to fund all kinds of public goods which would benefit living authors. Maybe even a basic income if it turns out to be successful enough.
It's a different pro-social bargain than direct compensation for the fruits of labor, but at least it's still a bidirectional one.
Let’s look at this like food, instead of knowledge. You ate a donut, and you got energy. Maybe you’ve eaten 5 donuts per year for 20 years, so a hundred donuts. Maybe you didn’t pay for all of them; you got some at work or at a friend’s house.
These donuts can be replicated an unlimited number of times for free without depriving the creator of the donuts of their donuts. The analogy doesn't work.
The real question is whether the value of these donuts is diminished by being ingested by the AI, and that's still not clear.
There are thousands of posts on HN defending digital piracy and mocking the government and groups like the RIAA that make poor analogies to theft of physical goods.
This is similar to the situation of war: obviously "wrong" actions bring an advantage that cannot be undone, and the winners press onwards without pause. Crowds and commentators are left red in the face and helpless. Extra bonus credit: the economic empires being built with the wins will have this approach to society built in. Ordinary becomes normal.