How to participate in Monday’s oral arguments re: Internet Archive (archive.org)
221 points by gigama on March 18, 2023 | 82 comments



As a strong supporter of the Internet Archive’s primary mission, I am saddened to say that the prosecution appears to have a strong case here. Controlled Digital Lending is a controversial legal theory, not something that has any clear statutory basis. The IA must now hope for some creative judicial interpretation to save them.

It’s doubly frustrating because I think the publishers would have let the IA fly under the radar had they stuck to lending on a strictly one-digital-loan for one-physical-copy basis. The National Emergency Library was a serious lapse of judgement - a moment of madness amongst a backdrop of widespread Covid madness. They poked the hornet’s nest. IMHO they should have immediately apologised, leant into it being an honest mistake during a unique historical event, and come to some minor financial settlement. Instead, they and the EFF are doubling down and risk being flattened with a severe bill for compensation.

I absolutely would support legislation to properly recognize CDL as a lawful function of libraries. Instead, all our hopes are pinned on the judicial branch doing the job of the legislative branch.

P.S. Donate to the IA here: https://archive.org/donate/


While you are correct on the tactical level, with this I do not agree:

> The National Emergency Library was a serious lapse of judgement - a moment of madness amongst a backdrop of widespread Covid madness.

NO! It was a moment of sanity prompted by an exceptional situation in an absolutely insane world, a world that pretends to value property yet undermines property using IP. It is IP that is the real madness. It is IP that is used to suck every drop of life out of culture the same way Exxon sucks oil out of the ground. And it is IP that is used to bind people when other measures are not effective.

I get that the thought that the world is absolutely insane and absurd may not be a popular idea in the startup space which relies on a blind optimism, but IP is simply part of the cancer afflicting this world.

You are correct that it was a tactical mistake that endangered the rest of the project, but the values that prompted that decision are some of the values that should guide us in building a better world if we want to stand any chance of avoiding the Black Mirror-like dystopia towards which the world is very much heading.


I strongly agree. My taste in classical music in my formative years calcified around the 1940s mainly because I couldn't easily obtain sheet music any more modern. I think it's crazy that I lack appreciation, taste, and exposure to nearly a century of music (with some exceptions) simply because of copyright law.


Maybe video game music might catch your interest: https://www.youtube.com/watch?v=oKWgLe-jQjc


Such a coincidence that I was just recently listening to the Persona 5 (then Baccano) soundtrack but that's as close to big band as I get.

While copyright is the main culprit, there are others. Electronic aspects reduce the chance music can be played with your acoustic instrument. Composers like Varese anticipated challenges in reproducibility and instruct their readers to record their own samples so that no two performances will ever be identical. Of course that very action shifts the performer to a more active role: from musician to conductor. The closest I can approach are pieces for prepared piano, such as the music of Franghiz Ali-Zadeh.

The newfound electronic diversity also pushes composers to invent their own notation, such as Iannis Xenakis, George Crumb, and even as far back as Henry Cowell.

Last but not least classical is a highly curated genre. Which contemporary composers will be remembered a century from now? The current contenders are sufficiently popular to swap for their sheet music (the exceptions I mentioned). I have no trouble obtaining the minimalist works of John Adams, Philip Glass etc.

I still blame copyright to a large degree. Copyright enforcement suppresses the sheet music trade which reduces discoverability and renders the entire genre static, preserving the status-quo of established artists. You hear stories of Chopin, Schumann, and Liszt debating and inspiring each other across Victorian drawing rooms. Copyright suppresses the modern equivalent.

Contemporary music notation is unfamiliar and alien due to lack of exposure, which once again is exacerbated by copyright.

As for electronic music, I'd like to imagine that composers are free to share their samples alongside their sheet music - like source code samples to a computer science textbook - but I doubt it.

It's ironic, but your video shows how much less regulated public meetups are than virtual meetups. Show up to a jam session, grab a fake book, and start playing. Totally illegal online though.


> As for electronic music, I'd like to imagine that composers are free to share their samples alongside their sheet music - like source code samples to a computer science textbook - but I doubt it.

Tracker music is exactly that.

Also I do agree that the concept of open source should be spread to other fields. Sharing just the mastered recording or the final picture is equivalent to sharing just the binary for software.


But sadly, going to argue about it in a room with a mandatory dress code from the 19th century is not going to overturn pre-digital norms.


> I am saddened to say that the prosecution appears to have a strong case here.

This is a civil case so plaintiff not prosecution. This comment made me double check that there wasn't an associated criminal case I was unaware of.


Very little of what the IA does is strictly kosher under current law more or less anywhere in the world. They've mostly gotten away with it because, for example, they've mirrored websites that the owners put out in public and generally respect even retroactive requests to take them down. And, as you say, lending out 1 digital for 1 physical is reasonable enough that it's easy enough to believe publishers would overlook.

(And, yes, they're a library/archive but that basically means nada in the digital world.)


I used to donate to them every year, but I stopped when they did this stunt because I was afraid any money I donated now would go to pay for lawyers in a lost cause that I don't believe in. It pains me deeply in my heart because I think the Wayback Machine is a great and noble cause, and I would like to keep donating to it, but I don't want to contribute to what I see as a pointless crusade against copyright at the same time.


I am also a strong supporter of them, but I wrote them immediately and said that this plan of theirs was both very illegal and insulting to artists, writers and creators throughout the world.

The only reason we have professional or semi-professional artists and writers is because they're able to sell copies one way or another. The EFF and the IA would like to believe the writers and publishers are outrageously wealthy and able to sustain the kind of bleeding and pseudo-piracy they endorse. The reality is that most are barely getting by. It's sad to watch the richy riches of Silicon Valley steal from the artists and writers.


I think you have the story backwards: the artists starve not because the IA shared books in a planetary emergency. It's because publishers don't remunerate the creators fairly but turn cash to the executives and shareholders instead. Doctorow and the EFF have multiple explanations of why this happens and it has nothing to do with Napster-like tech.


Yeah. Strictly speaking, the reason why the music industry shat their pants over Napster was because someone found out a more economically efficient way to scam artists. They'd been fighting for decades to ensure that they would be the one ripping off artists, and then charging the public as if the artists were actually making a decent living.

Every time I see an artist worry about online piracy, I roll my eyes. It's not not a threat, but it is a rather weak one unless you're a best-selling author or musician. You're far more likely to either get ripped off by a "for exposure" bro[0] or music label, or just have your work languish in obscurity.

There's actually a bunch of authors that signed a letter of support for the Internet Archive in this suit, specifically because libraries are very, very good at getting mid-list authors into readers' hands. They value the author's work at the expense of the publisher's ownership, which is why publishers hate them. An author that gets a bunch of library exposure can sell people on another book tomorrow, but the publisher is out on "lost sales" today.

[0] I expect this to be replaced with GPT/SD enabled hustles eventually


Is this true? The author Charles Stross wrote "Your typical book publisher is not like the music or movie industry; they run on thin margins, and they're staffed by underpaid, overworked folk who do it because they love books, not because they're trying to make themselves rich on the back of a thousand ruthlessly exploited artists. I think their effort deserves to be rewarded appropriately." (http://www.antipope.org/charlie/blog-static/2009/03/reminder...)


You're blaming the Internet Archive (an online library) for the sins of publishers and the broad devaluation of digital content due to the Internet? For example, Spotify pays pennies per stream because that is what the content is worth when there is so much available. Piracy didn't kill copyright value capture by artists, the Internet did (just as LLMs will devalue intellectual work). This is scarcity inertia having an existential crisis with technology enabled abundance.

This started during Napster times, and is why bands derive most of their income from touring and merch.

Please reconsider your position based on the evidence. We need to figure out a way to compensate creatives, but the Internet Archive providing access to content in a controlled manner is not of material impact to the economic situation. Z-library and LibGen make content available with no controls already.


> the prosecution appears to have a strong case here. Controlled Digital Lending is a controversial legal theory, not something that has any clear statutory basis.

What is that based on? If you could share an expert analysis, that would be great (or do you have expertise?).


"Controlled Digital Lending is a controversial legal theory"

DiVX might like a word with you.


> is a controversial legal theory, not something that has any clear statutory basis

let the court decide that, you are essentially pronouncing judgement. Maybe "internet" or "digital" has changed the situation?

The post asked people to support IA


Digital changed the situation, for the worse. In digital there are no first-sale rights, because bits cannot be sold, only licensed. This was decided a decade ago in exactly the same court the IA is being sued in[0]. The only way they could decide in favor of IA would be to make a very narrow ruling (e.g. 'libraries are special' or 'the DRM makes it OK') or go full judicial activist in ways our extremely conservative SCOTUS would not tolerate.

The reason why libraries are even allowed to exist in the first place is because physicality allows you to do things with books that are not "copying" them. Everything you do to a digital file is covered by copyright. When you "move" a file from one computer to another, you actually copy and delete it. When you "read" a file, you copy it from one storage medium to another[1]. If we want digital first sale then we have to fatally wound the existing copyright system.

I would absolutely love for the IA to be able to roll back this madness even a little, of course. I don't see that happening. They will put up a good fight, and the judge will roadkill them for the trouble.

[0] https://en.wikipedia.org/wiki/Capitol_Records,_LLC_v._ReDigi....

[1] https://en.wikipedia.org/wiki/MAI_Systems_Corp._v._Peak_Comp....

Yes, this is that court case. The one that says "RAM is storage and loading programs into it is infringement".


Would you share any expert analysis that your comment is based on? All I see are comments by random people on the Internet.


The IA is not a religion. We love them dearly but we're still going to call them out when they do something that's blatantly, provocatively illegal, and we're sure as shit not going to support them when they lie about what they did. Especially when everyone was there when they did it and were screaming "oh my God, please do not do this idiotic thing, they are going to smash you like a bug because this is obviously not even hypothetically legal."


I was sad to see the EFF jump off a cliff trying to keep the Register of Copyrights unaccountable to the public and I was sadder to see IA jump off a cliff with this boneheaded move.

Some actions are just catastrophically stupid, even when viewed from orbit.


What I want to know is who was behind it. "You can't drive to the library therefore copyright is suspended" is a feat of such mindbending idiocy it had to be driven by a single powerful individual or a small group. I want to know who, because now there's somebody in the IA management that nobody should ever trust again, and until that info comes out the entire organization is suspect. Can you depend on them when they might decide to play legal Russian roulette at any moment?


> What I want to know is who was behind it. "You can't drive to the library therefore copyright is suspended" is a feat of such mindbending idiocy it had to be driven by a single powerful individual or a small group.

Don't underestimate echo chambers. In this case there were at least two echo chambers at play. The first was the IA team itself, generally all on the same wavelength as transgressive mavericks accustomed to pushing the bounds of copyright law. The second was much broader: mainstream society itself panicking about Covid, creating a zeitgeist of flouting the rules to do something about Covid.

I would be surprised if any part of the IA org pushed back on this idea.


I'm all for pushing the bounds of copyright but the IA is too big and too centralized to blatantly flout it. For that you need a small team, like Z Library. The IA is way too important.


To be clear, I'm not defending what they did. Only explaining why I think they thought they could get away with it. They were already accustomed to operating in a transgression gray-zone (and generally getting away with it), and when covid hit and the zeitgeist became "do something!", they felt they had extra license to push boundaries even further.


I really hope this isn't the end of the Wayback Machine. The IA being a centralized entity with an agenda not everyone agrees with was a problem not explored enough until it was too late.


Indeed, the Wayback Machine is an invaluable resource, more and more so as older sites drop off the net.

But hosting, for example, multiple complete MAME ROM sets (and the existence of turnkey - albeit non-commercial - products which download them automatically) is in a completely different category from the Wayback Machine - and it would be a shame if the latter was endangered by the former.


I think there is quite a bit of difference between archiving stuff and making it publicly available. The first one makes a lot of sense, even if technically the content does not follow copyright laws, because there is important cultural history in it that should be considered. But on the other side, maybe access should be limited in some ways.


The problem is that an Internet Archive where you have to show up at some building in the Bay Area during office hours to access the archive is quite a bit less valuable than the archive in its current form.


I'm not saying in person, but at least some type of vouching before sharing the material.


The details don't really matter. Having to, say, present research credentials from some university, even if virtually, isn't a whole lot more inclusive.


Someone would spin up a nonprofit and buy the assets and keep it going


If it was shut down due to a civil injunction for copyright violation, I don’t think the Internet Archive would be able to transfer the Wayback Machine’s data to anyone else without defying that injunction and risk being held in contempt.

Someone would need to immediately begin trying to mirror the entire Wayback Machine’s archive, ideally hosting the mirror in Luxembourg or the Netherlands.


Hopefully this is already happening...


Archive Team, assemble!


Would it be more difficult to get funding during harsh economic times?


It's already a nonprofit. The shell game works for for-profit corporations. I don't think it'll work when the kind of companies that usually play that game are the ones after them.


specific assets of a non-profit must be transferred to another non-profit, with some oversight about costs. It is possible and does happen.


The internet archive does facilitate mirrors under the LOCKSS (lots of copies keeps stuff safe) philosophy.


how would this painstaking work be performed by a decentralised entity? Human beings have agendas, I am not sure why you expect them to be machines or politically neutral. This is normally understood when one donates to a charity.


> against our library and the longstanding library practice of controlled digital lending

Isn’t this… deliberately misleading? As I understand it there wasn’t really a problem until they decided to embark on “Uncontrolled” digital lending.


No, it is not misleading. Almost all of the lawsuit is about controlled digital lending, with the COVID relaxing being an example of the things people could do while managing a CDL system. And the publishers were already objecting to CDL before COVID.

The end goal for the publishers is definitely the removal of CDL, not punishing the IA for the pandemic actions.


I agree.

I don’t understand why they took this risk. The internet archive is a great resource, why did they pick this digital lending hill to die / risk all that on?

It seems irresponsible.


The IA has basically existed because they did stuff that, for the most part, publishers/companies didn't care about or even secretly appreciated. Storing old web pages, magazines, millions of bits of other ephemera that would otherwise only have existed on musty library shelves if at all. And they would even take something down if some owner wanted some bit of history to disappear.

The idea that, because of COVID, in a world with the Internet and massive quantities of public domain works on Gutenberg and elsewhere, the IA just had to triple down on digital lending makes no sense.


They're not in trouble for sharing books that were available in the public domain.


Lying about it comes across badly. As though they think their best chance of success is to pretend it’s about something else. I support the IA mission generally but this came across as exceedingly dumb when they did it.

Maybe they are right and this existential gamble will work and it’ll clarify an area of ambiguity that means it was legal - I’d love to be wrong. But I’m not placing any bets on that.


> Lying about it comes across badly.

I agree 100%.

It makes me wonder about the leadership that they made this decision, are going through with this legal situation, and keep trying to push this story.


> I don’t understand why they took this risk.

They, and many other people around the world, got it into their minds that Covid had suspended all normal rules and left them free to do whatever they thought to be a reasonable response to the circumstance, which in this case was "the normal libraries are closed so we'll give out free access to all our books."


It’s so strange as my library has ebooks. The libraries were still available in pretty much the same way.


My library has ebooks too, but the collection is abysmal compared to Archive.org. Archive.org digitized a ton of books that you'll be very hard pressed to find ebooks of for sale let alone in public libraries (except Library Genesis, Z-Lib, etc.)

Archive.org's error wasn't in believing that their act would be useful. The error was in their belief that laws were effectively suspended "because Covid".


Yeah I got no problem with them being an elibrary. I agree 100% the "because covid" thing was just dumb.


Does your library restrict the number of copies they "loan" out? I have waited for 6 months to check out an audiobook, and sometimes a few months for other less popular ebooks where they only have 1 license.


They do, but I've never had a long wait. Maybe it depends on the book / I haven't wanted a high demand book and just not run into it.


They're arguing that the IA's library isn't even a library.


On one hand, we can keep acting like companies are good stewards of digital information or we can look at the reality that the industry (books, movies, media, etc) does a reliably bad job at keeping sources around.

Now it is possible to keep media around longer as a primary source which seems extremely valuable given the technology that exists. Copyright laws are due for an overhaul, and maybe we'll see something that embraces the fact that we have a new Gutenberg press capable of spreading information.

Anecdotally, I'm a bit tired of subscription services adding/removing videos or books from their catalog. I have cancelled, since I don't want to pay to be gaslit that I saw something that "doesn't exist". I know, I'm not paying to have everything forever, I'm renting etc. However, whatever their licensing problem is, it is Not My Problem (tm), and so I voted with my wallet. I don't like when companies exploit object permanence in a way that makes me feel like I'm the crazy one.

Lend like print is a good model, but that creates more DRM, which on the long haul has not panned out in the technology world as a good thing. There has to be a balance, but what of the long view? Publishers need less execs/admin, as publishing costs are dropping very deeply.

I think Brewster Kahle and team are doing a great work, and I hope they win.


Does anyone know what the legal basis for uncontrolled online lending was in the first place that caused this mess? I don’t understand why IA is continuing to double down that this was legitimate behavior.


Their justification was that the lockdown removed far more copies of the books from public library circulation than were ever checked out from the IA library. Something their checkout records easily confirm is true.

However this lawsuit isn't really about the lockdown period, it's about CDL as a whole.


They joined up with a specific group of other libraries, so they're not just arguing in general that circulation was down, but that there were enough copies of books in the group to cover the online lending.


That is how their general CDL works, but for the pandemic they removed the restrictions because libraries they did not have a partnership with were closed.

The brief says this, but it's sort of confusing as it's blended in with discussing their general policy.


There isn't even a legal basis for "controlled" online lending (or indeed much of what the IA does). But they've mostly always managed to keep things sufficiently non-provocative that it hasn't been a problem.


Good background here:

https://slate.com/technology/2022/09/internet-archive-nation...

I have this suspicion that the college textbook publishers in particular want to block CDL.


DRM, copyright, and the concept of lending online are impediments to access.

And, indefinite rent-seeking is ridiculous.

Let old data be archived and free.


I wonder how much of the LLM training content came from IA.


As far as is known, IA is poorly represented in existing LLM datasets. They don't allow indexing or scraping, so they don't show up in Common Crawl which is the starting point. (The occasional link to IA might show up, and someone processing CC might choose to follow it, but that's relatively unusual, aside from image links: most people focus on the text inside CC itself.) And their servers are quite slow & overloaded, so if you targeted them manually, your scrapers will be rate-limited, banned, or just incredibly flaky. They contain a lot of highly redundant snapshots, so they're a hassle to post-process. And much of what they contain is implied or covered by easier to get datasets. I also haven't seen any hints in either randomly-generated samples or prompted samples from GPT-3, ChatGPT, GPT-4, or other LLMs of signatures of IA snapshots like their book OCR or their HTML headers. So... yeah, it's possible, and I wouldn't be surprised if data-hungry LLMs like GPT-4 have or will start tapping into IA, but right now there's no real reason to think that.


I wonder what an LLM trained on the entire Wayback Machine would be like


Why do companies always sue the goodies as if that’s somehow going to save their failing business.

Spoiler: children are morons now. The book industry is dead. Their parents can barely read so they sure as hell aren’t going to teach their children to. Everything is iPads. Suing random people won’t change that. Books will live on, but as a niche product rather than a major industry.


so, no. fortunately the last five hundred years of printing is not subject to a binary declaration of ALIVE|DEAD.. I will agree it is a SPOILER to say it like that, like, go ahead SPOIL my day !


Without taking a position on the legal arguments:

"How to participate in Monday’s oral arguments" is a deeply offensive thing to say. Once something reaches SCOTUS, it's strictly about matters of law. It's not a question of the merits or whether you like IA or not, and rooting is definitely inappropriate.


> SCOTUS

‘…the Southern District of New York will hear…’

> question of merits whether you like IA or not

Whether the court ‘likes IA or not’ may be a relevant question in respect of public policy considerations.

> rooting is definitely inappropriate

First, participating (well, IA really means listening, which is even weaker) does not amount to rooting. Second, there’s nothing wrong with having a view on what the judgment should be, unless writing an article in a law journal disputing a judgment would be ‘inappropriate’.


Days until someone does an audience meter like this during a live SCOTUS debate: I give it 14.

https://www.imediaethics.org/its-entertainment-not-polling/

We have a rule of law, not a mobocracy.


I’ll make sure to remind my friends doing legal research that their informed opinion in fact amounts to ‘mobocracy’.


Why is it deeply offensive? "Participate" here just means to follow, kind of like a Steve Jobs keynote liveblog.

"Rooting" for one side of a legal case is absolutely appropriate for a defendant (or plaintiff) in our adversarial legal system. A defendant with a weak legal case but a strong public policy argument /needs/ to rally public support for their cause, so that if/when the court rules against them, they can push for the law to change.


"participate" will very shortly mean "pushing an up- or down-arrow for every judge's question, and every counsel's answer."

Oral arguments are not a HN posting.


It may be wrong, but it shouldn't "deeply offend" you. You're taking this way too personally.


Court cases have been widely covered in all kinds of media for centuries. They are matters of public concern.


there's a distinction between "watching" and "participating."

You watch a football game. You don't go out and participate in it.


This is judicially illiterate:

"Common law refers to laws that are based on the customs and principles of society, which are used in court case decisions in situations not covered by civil law statutes."

Your post appears oblivious to the distinction between common law and civil law systems, to the role societal customs play when interpreting the law, and to the role SCOTUS plays in the judicial machine


Your post appears oblivious to the distinction between a question of fact and a question of law. You, in fact, are judicially illiterate.

I don't care how it's done in France. This is all about the US. An appellate court here deals only with questions of law. "Societal customs" may have created our system, but that's the system we have.


I make no secret of the fact that I am judicially illiterate, and I can see that so are you.

You can read any of the famous dissenting opinions of a Supreme Court justice and see that they seriously consider 'how will society be affected if we choose A vs B'. That is not a matter of law.

Furthermore, there are constant arguments about how much legislating from the bench the Supreme Court should be doing, and about the fact that lawmakers sleeping on the job forces the SC to do so.

Lastly, the Supreme Court undid its own decision on abortion even though the law hasn't changed.


.. and now I understand that you think "doesn't agree with me" is the same as "illiterate."

You've read a few articles and have opinions. That's fine. Many people do think it's all subjective and political, as you apparently do. Someone else here could argue with you and it would just degenerate into The Culture Wars, iteration 8799.0. I don't feel like doing that today, though.

My background in legal training, publications, and experience would shut down your "illiterate" part. You'd probably just ignore it and repeat what you already said about judicial decisions, though. So I think we're done here.


> I am judicially illiterate, and I can see that so are you

you're half right. Assuming you're not a lawyer: what credentials do you bring to the table?

As far as I can tell, you're as ignorant as the average person.



