Again, it's just insane to me that we don't even have much of a meaningful discussion of:
"Hey, wait, literally everyone could have the entire library of Alexandria in their house for a couple hundred bucks per person. Like, all the knowledge ever. Maybe that should be considered the good default of things.
At least one in every town that everyone could use, for free, forever, without restriction to ANY of the knowledge anyone desires."
This is why I'm vehemently against further increasing copyright law, software patents, and treating Imaginary Property as something that can be owned.
We finally build the tools to allow the free sharing of information and almost immediately a bunch of rent-seeking lawyers lobby Congress to make doing that illegal.
I've always lived by the rule of "If you put it on the Internet, it's not yours anymore." Because that's de facto the truth. Only if you employ lawyers do you have even the slightest real power over something online, even if you're the original source. This isn't something 95% of people can afford. Turning the Internet into just another place where the rich get preferential treatment is a terrible thing.
Can we just offer free textbooks for everyone? Maybe also some online lectures. Everything is there; I think we just need to incorporate it into schools, libraries, and local communities.
That's already the case minus the last ~70 years or so. The overwhelming majority of our knowledge is in the public domain, in particular cultural artifacts.
It's a nice sentiment but like, people can already go to gutenberg.org and download pretty much most important works of literature in existence and most books have like 5k downloads so there's that.
I assume they are referring to the fact that books older than about 70 years are in the public domain. The rest are protected by copyright and not free.
Project Gutenberg is missing a lot of content that is public domain, and the oldest entries are often pretty poor in quality (and some of the oldest entries aren't lucky enough to get redone like Carroll's work has been).
Which just goes to prove parent's point. As much as it might seem otherwise, IPR restrictions are not the main bottleneck to widespread availability of content (at least book-like content); actually making the content available is far more important!
(Also, keep in mind that content is now entering the public domain every year, and projects like PG are nowhere close to keeping up with that flow of newly-unrestricted stuff. So this dynamic is becoming more extreme over time, not less.)
I really don't think it proves any point aside from Project Gutenberg being a bad example, as it doesn't really contain the sum of human knowledge in the sense that a pirate library does.
In a lot of countries it's life plus 70 years. If it was only 70 years, we'd already have everything from 1951 and earlier. However, we have to wait for the authors to die before we even start counting, and good luck if it's somebody obscure and you can't find their date of death.
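The difference between a flat 70-year term and life plus 70 can be sketched with a little arithmetic. This is only an illustration: the function names are made up here, the end-of-calendar-year rounding is the convention used in life-plus-70 jurisdictions such as the EU, and real terms depend on the country and the kind of work.

```python
def free_year_flat(publication_year: int, term_years: int = 70) -> int:
    """First calendar year a work is free under a flat term counted from publication.

    Assumption: the term runs to the end of the calendar year, so the work
    becomes free on 1 January of the following year.
    """
    return publication_year + term_years + 1


def free_year_life_plus(death_year: int, term_years: int = 70) -> int:
    """First calendar year a work is free under a life-plus-N rule.

    Same end-of-year assumption, but the clock starts at the author's death,
    not at publication.
    """
    return death_year + term_years + 1


# Hypothetical author: publishes a book in 1951 and dies in 1990.
flat = free_year_flat(1951)            # flat 70-year term
life_plus = free_year_life_plus(1990)  # life plus 70
gap = life_plus - flat                 # extra years of restriction
```

For this hypothetical author the same 1951 book is free in 2022 under the flat rule but not until 2061 under life plus 70, and the calculation cannot even be started without a date of death.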
It's actually fairly unlikely that the bulk of published knowledge is in the public domain.
By copyright expiry, public domain in the US begins in 1927. Later works may be in the public domain, but all works published prior to 1927 are in the public domain in the United States.
(This may not be the case in other countries.)
There were not many published books prior to the invention of the printing press, and many of those didn't survive. The total number of books (not individual titles but actual bound volumes) in Western Europe as of 1400 may have been as few as 50,000.
By 1800, about 1 million titles had been printed.
Over the course of the 19th century, presses became vastly faster, as they evolved from hand-operated wooden screw presses to iron frames to steam and electric-powered rotary and ultimately web presses. Paper became much cheaper (and less durable --- a factor commented on at length in the Librarian of Congress's annual reports to Congress in the late 19th century). Literacy exploded from ~25% to 95%+ over the 19th century (and probably accounted for numerous revolutions and political upheavals).
Through much of the 20th century, certainly by 1950, US publishers were issuing about 300,000 new titles per year, a rate which stayed remarkably constant through the early 21st century. By the aughts, "nontraditional" self-publishing (a/k/a "vanity press") was nearing or exceeding 1 million titles per year, more than had been published through all time to 1800.
Reports that all recorded data was doubling every few years date to at least the 1960s. That would mean that in any two year period ... half of all recorded information was less than two years old.
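The doubling claim can be checked with a short calculation. This is a sketch under two assumptions that the reports themselves do not guarantee: growth is a steady exponential, and nothing recorded is ever deleted. The function name is illustrative.

```python
def fraction_newer_than(age_years: float, doubling_years: float) -> float:
    """Share of all recorded data that is younger than `age_years`.

    If the total doubles every `doubling_years`, then `age_years` ago the
    total was 2 ** (-age_years / doubling_years) of today's total, and
    everything recorded since then makes up the rest.
    """
    return 1.0 - 2.0 ** (-age_years / doubling_years)


# With a two-year doubling time, half of everything is under two years old
# and three quarters is under four years old.
half = fraction_newer_than(2, 2)
three_quarters = fraction_newer_than(4, 2)
```

Under the same assumptions, the share older than any fixed cutoff shrinks by half with every doubling period that passes.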
The catch is that not all recorded data is published. So I'm not sure what the time-distribution of all publishing looks like. But I'm pretty confident it's skewed far more recently than 1927. And would thus tend to be copyrighted rather than uncopyrighted.
If you want to measure works by significance, you might make a different argument --- there are many great works of literature, philosophy, history, and religion which were first published before 1927. But ranking and tabulating these is more challenging than a simple enumeration.
The lawsuit has been settled. Gutenberg has been unblocked in Germany since late October. The site was completely blocked for three years, from 2018 on. Now a total of 19 German-language books are blocked under the settlement, whose terms are confidential. Everything else works.
I have information that the abysmal filth is still in place: "Sito sotto sequestro".
I wonder how your provider overrides it. Is it too much of a wet dream to think that some operators (behind the DNS) responded to the requests of the vacuously appointed low scum* with disdainful neglect? In terms of "If you really had any credential for existence but some forms of physical expression, we would say that you must be joking - before putting the proof of your vileness in the trash archival it deserves".
*May I remind that they appended the apex of Civilization, Project Gutenberg, to a list of a dozen pirate sites in their "operation". No, really, I cannot find words to label those lowest abysses.
Just tried it at home and loads fine from broadband too. That "Sito sotto sequestro" literally means "site has been seized", which of course isn't the case here; that page might indicate that the carrier has been forced to resolve that domain elsewhere, therefore all it needs is using an external name server to get around the block.
I'm currently using one of Cloudflare DNS addresses and it loads just fine.
I don't use the (ex) national provider (Telecom Italia / Tim) however. Crappy service aside, they're well known for jumping every time the government tells them to, and happily filter a lot of stuff.
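The workaround described above, asking an external name server instead of the carrier's, can be sketched with nothing but the standard library. The function names are illustrative, 1.1.1.1 is Cloudflare's public resolver, and this only reports whether the resolver returned any answers; it does not decode them.

```python
import random
import socket
import struct


def build_dns_query(hostname: str, query_id: int) -> bytes:
    """Build a minimal DNS query for the A record of an ASCII hostname."""
    # Header: ID, flags (0x0100 = recursion desired), one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = Internet.
    return header + qname + struct.pack("!HH", 1, 1)


def ask_resolver(hostname: str, resolver_ip: str = "1.1.1.1", timeout: float = 3.0):
    """Send the query over UDP to the chosen resolver.

    Returns (rcode, answer_count) read from the response header.
    rcode 0 means no error; answer_count > 0 means the name resolved.
    """
    query_id = random.randint(0, 0xFFFF)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_dns_query(hostname, query_id), (resolver_ip, 53))
        response, _ = sock.recvfrom(512)
    resp_id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    if resp_id != query_id:
        raise ValueError("response ID does not match query ID")
    return flags & 0x000F, ancount
```

Calling `ask_resolver("www.gutenberg.org")` and then the same function with the carrier's resolver address would show whether the two disagree. In practice, changing the DNS server in the operating system or router settings has the same effect without any code.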
Go to your local library. There is actually a lot more there than books. There are videos and audio records too. There are newspaper clippings too.
Worse, data is being generated very fast. Let's look[0]
> [In 2012], CenturyLink projects that 1.8 zettabytes of data will be created. By 2015, the projection is 7.9 zettabytes.
> MAST is currently home to an estimated *200 terabytes of data*, which… is nearly the same amount of information contained in *the U.S. Library of Congress.*
But that said, a few petabytes is going to only cost in the 10s of thousands of dollars and max in the low 100s. So this is well within the budget of even every modestly sized city and definitely every university. It would be even easier if this was done with torrenting.
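A back-of-the-envelope version of that estimate is below. The price per terabyte is an assumption picked for illustration, not a quoted figure, and the sketch counts raw disks only: no servers, power, bandwidth, or staff.

```python
ASSUMED_USD_PER_TB = 20.0  # assumption: rough price of raw disk, not a quote


def raw_disk_cost_usd(terabytes: float,
                      usd_per_tb: float = ASSUMED_USD_PER_TB,
                      copies: int = 1) -> float:
    """Cost of the bare drives needed to hold `terabytes`, `copies` times over."""
    return terabytes * usd_per_tb * copies


# The 200 TB figure quoted above for a Library-of-Congress-sized collection.
library_sized = raw_disk_cost_usd(200)

# "A few petabytes", taken here as 3 PB (3000 TB), kept in two copies.
few_petabytes = raw_disk_cost_usd(3000, copies=2)
```

At the assumed price the 200 TB collection comes to $4,000 in drives, and 3 PB stored twice comes to $120,000, which is in line with the ranges given above. A different price per terabyte scales both numbers linearly.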
I think the real question is^: why are we not, as a species, creating this system? It seems very reasonable that we could back up all data. (There is also a dark side to this too! So let's not forget about that) At least, why aren't we creating a full access world library for all scholarly data, which is probably something that could be housed for a few grand. Should we do this on our own given the relatively low cost? Books + sci-hub + arxiv + *xiv?
I'm a writer. I can imagine it. But the landlord, grocer, car dealer and pretty much everyone else dealing in physical things wants me to pay for what I consume. I'm not independently wealthy. The only reason I can afford to write books is because the publisher pays me. The only reason the publisher can pay me is because people buy the books.
I can guarantee you that this type of behavior means I'm writing fewer books. It's very short-sighted.
Writers getting paid by the number of books they sell is a pretty efficient way to allocate their work, but it's not the only way. If book sales vanished overnight I'm sure interest in alternative means of financing would surge, and a new "standard" would establish itself.
Historically most art was financed through patronage/sponsors, wealthy people or institutions that sponsored people for the social status, or commissioned works from them. Both of those systems have been democratized by the internet, with people sponsoring artists they like with systems like Patreon, or just straight paying them money to create something. The same is happening with Twitch subscriptions (essentially donations to the streamer). There's no reason it can't work on a larger scale for most literature. Sure, priorities would shift, but it might also enable authors more creative freedom than publishers are comfortable with.
I understand your point but it is time for the system to change. Most writers in traditional publishing get 50 cents or a dollar for every sold book. My wife wrote a chapter in a science book that is being sold for over $100 and she received...nothing!
Everybody deserves to be paid for their work and I would argue that this is more about preserving knowledge than it is about not having the writers get what they deserve.
I get what you're saying by this, but I think OC is just being practical. This is the world we live in. It's shit and I hate it, but if you want something done, 9 times out of 10 you need to consider these questions.
Maybe if you read the right books you would cultivate a worldview where you don't "need to consider these questions" so much. Maybe if more people did that, we would even help build a world like that.
it's counterproductive to passive aggressively imply that people should cultivate a worldview that you think is correct. even if you are right, you seem not worth agreeing with.
no one worth agreeing with starts a sentence with >Maybe if you read the right books
I was only addressing the "this cannot be understood" aspect.
It's not what most people would want when given a choice, but that does not mean it's very hard to imagine how and why it happens.
I did not mean to imply I approved either, just that, I think "this is insane" is kind of silly. It is, but, it's also inevitable and not new or hard to follow.
In our current world, the currency is money, (vs land or humans in the past) and things pretty much only get done if someone can make money from it, and someone else doesn't make money inhibiting it. That's it. That's the entire mystery.
The unspoken answers to my two questions were "no one" and "many many many". Everything that you can do for yourself, some company somewhere would rather you pay them for it instead. The product doesn't exist because there are zero companies out there with any interest in producing it, and unlike pure information (software), individuals can't just do it for free for the feel goods.
It would require some imaginary and highly different societal structure to alter that equation. The people who would manufacture the home libraries, where do they get the materials and facilities from, and how do they eat, if they aren't selling these either to you directly, or by way of your government and its taxes? Some kind of co-op type organization that takes the place of a for profit company? And they get their electronics parts and their vegetables from other co-ops since they don't have money to buy them? It could all be done some other way than by counting dollars, but literally everyone in the world would also have to be doing things this other way. And all the people currently on top of the current system will not be the ones rewarded with success under any other kind of system, and so they wield their power to keep things exactly as they are.
I don't mean in an illuminati way like there are 11 people running everything, I mean countless people inhabiting all levels and niches all act to preserve what they think is all they have, in countless little ways. Even people who actually have essentially nothing and would live far better in any other system, because it just takes more imagination than most people have to even consider not having dollars, and more generosity of spirit than most people have to even consider not being able to boss other people through their need of dollars.
We would somehow have to collectively figure out how not to reward the very worst of us with all the leadership positions, both private and public. Ultimately it comes from that. The exact wrong kind of people to make rules are just about the only people who gun for those roles. There are all kinds of asshole things I wouldn't do if I were running things. And I have no interest in being anyone else's boss, nor in figuring out ways to parasite off everyone else or harness them to some will of my own or anything like that, and so I will never end up running things. You have to be a sociopath to do what it takes to ever get into such a position. It doesn't happen by any nice fair reward-the-good kind of way, and we have almost no mechanisms to detect and deter such people from succeeding in their shark-like process. It just works and they go right to the top.
I'm not sure what the point is supposed to be of this imagining.
Yes obviously pure capitalism is as shit as pure anything else, and no system made by humans, and made of humans, actually works very well to create sanity and fairness. So even not-pure capitalism creates all kinds of undesirable results, like available tech doesn't get used in ways that would be wonderful for everyone.
What solution to human nature have you imagined in your split second?
Society as a whole. Exactly the kind of initiatives governments are supposed to be taking. Instead our governments are totally captured by the profit-seeking organizations and are hellbent on using their monopoly of violence to imprison activists working on these initiatives.
In order to run Codex you need 4-8 large GPUs connected in the same box or cabinet. Such computers cost about $100k to buy. Then they draw about 5 to 10 kW to run, which adds up to roughly 45 to 90 MWh per year. The Codex model needs to be very low latency, which means more expensive hardware. This is just for one single replica of the model; you need many to serve all the requests.
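The energy figure is easy to sanity-check. The sketch below reads the power number above as a continuous draw of 5 to 10 kW, which is an interpretation rather than a measured value, and assumes the machine runs around the clock. The function name is illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def annual_energy_mwh(power_kw: float) -> float:
    """Energy used in one year by a machine drawing `power_kw` continuously."""
    # kW * hours gives kWh; divide by 1000 to get MWh.
    return power_kw * HOURS_PER_YEAR / 1000.0


low = annual_energy_mwh(5)    # lower end of the assumed draw
high = annual_energy_mwh(10)  # upper end of the assumed draw
```

That works out to about 43.8 MWh per year at 5 kW and 87.6 MWh at 10 kW for a single replica. Note the units: kW is a rate, while kWh and MWh are amounts of energy.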
Not to mention the cost of development - for that they needed thousands of GPUs or TPUs and a large team of top AI talent which is very very expensive to hire.
Regarding the open source copyrights - wasn't that code released to be "open"? This is actually a radical way to open the code, make it more useful for everyone. Open source devs can also use it to create projects.
I think $10 per month for this whole process is justified. It's not like you could run it at home.
I'm sorry but can you run a mirror for possibly the largest collection of written knowledge ever assembled in your home? What exactly is the point of this exercise?
Why should anyone make a profit? Why is it important to guarantee individuals accumulating money belonging to others, instead of a common enrichment of everyone?
No one would make money off of it. That is the point. There are innumerable things which are worthwhile which are not profitable.
The idea that we do not do a world library of free digital copies of every book ever written really highlights the problem with the thinking your comment has demonstrated. The idea that individual pursuits must make a profit to be justified leads to us doing terrible things like: not making a free world library of every book.
Though in this particular case this also demonstrates a major problem with intellectual property concepts. Actually hosting the library isn't very expensive. But we have made doing so illegal. Of course, authors deserve to live a decent life just like everyone else. We currently do that by restricting all access to duplications of information they have produced so that they can charge a fee for access, and that fee provides for their survival.
But we suffer an incalculable loss by making all this information restricted. In my view we would be MUCH better off as a society with respect to creativity, innovation, and other popular metrics for progress, if we actually made sure as a society that every person's survival was provided for with no need for them to pay for it. Then authors wouldn't need to get paid, engineers could do what they love to do and post all their work as open source, and we could have a free library for everyone. This extreme openness would in my mind lead to more rapid innovation, and markets would still function as first movers would maintain an advantage for new product releases, though they would have to keep moving as anything they've done that is worthwhile would be copied. But since no one's livelihood would be at stake, this is not a real issue.
This can all be done in a voluntary, libertarian society as long as we have community ownership of the means of production, and promote these ideals of community support in this society. And I think we would be way better off. Doing this would allow us to offer every book ever recorded for free to every person on Earth. A big change, but one with obviously a very big benefit to humanity.
One note though: people who want to own a lot for themselves really mess this up. So people would need to dissuade those people from acting that way. My preferred method of doing so is by starving them of workers and customers, though when it comes to control of land matters get more serious.
Ah. Reading it again, I can see what you mean. Unfortunately I encounter a lot of people here who would write that comment in all seriousness, so if it was sarcastic I hadn’t noticed. I think I’m still not sure about that.
Aside note, but: "Hear, hear!". Until years ago, I would normally use that kind of mockery of absurd ideas by reciting adherence to impossible beliefs and opinions, with the implicit understanding that nobody would hold them unless insane, so of course I was being purely ironic.
I do not do that anymore. The world is clearly full of cases for which absurdities are tenable - not by an odd minority, but by "(pseudo)random people". If you declare them, the presumption of irony is gone.
I'm not sure what dead serious means, but the comment was quite literal and simple.
Which is not to say that I approve, merely to state the root explanation for the insane state of affairs. You could say dead serious in the sense that I wasn't joking, that really is the explanation.
I can't say what we should or shouldn't do. But I think we would be better off if we did that. Notice I said that in order to make this viable we would want to make sure to provide for everyone so that no one's livelihood is threatened. I look at it this way: Digital information has value. Digital information can be copied for free. Therefore we can copy value for free. If we leverage this principle we can generate a whole lot of value for essentially zero cost. This distribution of value would lead to more value creation, as for example children that grow up with access to way more books are probably more likely to produce more valuable art, science, and engineering output of their own.
I think we would be better off if we did this with movies, music, games, and engineering designs like designs for medical equipment, cars, etc. Most of the standard economic arguments for intellectual property restrictions are wrong and short-sighted, and I have heard them all. I think this concept is well worth exploring both philosophically and in practice.
What are the benefits vs. deadweight losses of a copyright-maximalist regime?
Who benefits? Who loses? And to what extent?
And what would the net total cost of rectifying any shift of advantage to a copyright-minimalist regime be?
For that last, look to the net total annual revenues of commercial media.
And keep in mind that we're carrying out this discussion on a technological ecosystem founded in very large part on FSF Free Software / Open Source software.
> And keep in mind that we're carrying out this discussion on a technological ecosystem founded in very large part on FSF Free Software / Open Source software.
That's about as relevant as saying we should keep in mind we are carrying out this conversation on hardware mostly made in China (and most free software is also written on hardware made in China) so perhaps that is where we should be looking for answers.
Your comment above suggested that stripping direct profit motive and incentives from copyright would produce a net harm. The Free Software model is based on the premise that stripping most exclusive rights under copyright, and specifically the ability to create new copies of, and new derivatives of works, is actually a net benefit to society, and often to the original author of a work who benefits by the collaborative and collective development and advancement of it.
One of the more fascinating characteristics of not only the Web but modern computing is that it's been the abandonment of proprietary interest in works (code, standards, protocols, operating systems) which has delivered the greatest value. AT&T did not voluntarily relinquish UNIX to the world; it was forced to do so under a 1950s US antitrust decree which granted the firm its monopoly in telecommunications but forbade it from engaging in the computer business.
Arpanet, IETF, the GNU Project, TBL's CERN document-distribution project, Linux, multiple programming languages, and other elements are all based on the notion of non-proprietary ownership rights to the extent that free use and (generally) modification are permitted, often with little further restriction (MIT/BSD licenses), or with the requirement that further distribution also requires source availability in preferred form for making modifications (GNU GPL, AGPL, etc.).
That is: my comment was salient to the (implied) central fallacy of your previous statement. A foundation which your subsequent replies suggest is in fact your point.
The broader class of cultural products includes different types of expression as subclasses: the discriminator is in the broader class - cultural products.
Ah hello Walter. I recognize your username as you and I have disagreed on this before.
I’m not saying people shouldn’t have the right to own property. But that a good way of organizing society is collective ownership of the means of production. If you are part owner in something with shares and a contract, that obviously still relies on property rights. That is how the stock market works after all.
EDIT: I am basically proposing a change in norms, rather than a change in rights. Currently the norm is individual private owners or ownership by a board of directors. I am proposing ownership by communities as a collective. Same rights involved, but a different norm.
Changing norms is a nice idea, but you'll have to lead by example I believe.
That said, I think privately- and collectively-owned enterprises can coexist (and compete!) just fine, and in a truly free market we'd see a healthy amount of both.
I am working on leading by example, though putting the pieces together for my career will take some time. I have so far already been working on this for a few years. I am spinning up a non-profit open source project to design solar powered farming robots and I run two YouTube channels to promote my economic ideas and the farming robot project. Still, I do like to discuss the merits of the base philosophy here.
To be perfectly honest this feels like a knee-jerk response that fails to engage with what I am saying. Nowhere did I say I am being prevented from forming a voluntary collective. But obviously if I believe we should form an economy comprised of many collectives then I need to discuss this idea with other people! I literally cannot form a collective by myself. For my part I am working on a career trajectory that will allow me to co-found a collective that owns machines which produce free hot meals for people. But that is a few years away. Still I find it interesting that you want to argue without really thinking about what I am saying. Obviously forming collectives necessarily involves discussing these ideas openly with others.
There certainly is no shortage of people who believe as you do in collectives. Go ahead and form one with them.
> you want to argue without really thinking about what I am saying
I've thought about it for many decades. I'm not trying to suppress your discussion, ideas, or any efforts you may make towards creating a (voluntary) collective.
Feel free.
BTW, I read somewhere that over 10,000 collectives had been created in the United States. Where are they now? Oh, they all failed.
A famous one was the Summer of Love in San Francisco in the 1960s. It only lasted for a summer. Seattle's Summer of Love a couple years ago lasted a few days before imploding. (Google CHAZ - Capitol Hill Autonomous Zone)
> There certainly is no shortage of people who believe as you do in collectives. Go ahead and form one with them.
It really feels like you are not reading my comments. I just said in the comment you are responding to that I am working on this, but it is a few years away. That is not to say I am not doing anything now, but I have to lay the ground work. For me to say I am actively working on this and to read your response as "go ahead and do it then" really feels like you do not want to engage with what I am saying.
> BTW, I read somewhere that over 10,000 collectives had been created in the United States. Where are they now? Oh, they all failed.
They all failed? Walter, I can easily find a long list of active co-ops just in the Bay Area alone. It seems to me you are speaking with authority on a topic you don't know very much about.
https://www.cooperationrichmond.org/coop-movement/worker-coo...
That said, failures do happen. Lots of privately owned businesses fail too. Would you suggest then that privately owned business as a concept is not viable?
I think what you are missing is that people in today's economy are suffering, and people are desperate for something to change. I suspect you are doing okay, and not desperate for change. Good for you. But the number of people in your position is literally dwindling from a statistical perspective in the USA. For everyone else, they cannot just sit comfortably and dismiss suggestions on how to change things. They need change, as for many it is literally a life and death situation.
> I can easily find a long list of active co-ops just in the Bay Area alone
A long list? A handful out of the zillions of companies in the Bay Area. Its statistical significance is zero. Some are charities, which are obviously not self-sustaining. I wouldn't be surprised if others were sustained by government checks. People start collectives in the US all the time, but the test is if they last. A typical collective seems to last about 2 years before imploding.
> Lots of privately owned businesses fail too. Would you suggest then that privately owned business as a concept is not viable?
Compare your list of a few communes with what's in the Yellow Pages for the area for the answer.
> I am working on this, but it is a few years away
While I wish you success in your endeavor, it's not really a collective unless you have other collective members signed up and collaborating with you. From much personal experience, I can attest that announcing one is working on something doesn't mean anything to anybody. It only matters if you've got something to deliver. I know that sounds harsh, and it is, but that's how things work.
> what you are missing is that people in today's economy are suffering, and people are desperate for something to change
I understand very well that they are suffering. They are suffering as the result of leftist government policies, not capitalism.
> A long list? A handful out of the zillions of companies in the bay area. It's statistical significance is zero.
You are moving the goal posts. You said "they all failed" but this is false. I was pointing out your error.
> Compare your list of a few communes with what's in the Yellow Pages for the area for the answer.
I believe from the above two quotes you are saying that co-ops have failed as a concept on their own merits, i.e. we can observe their relative popularity compared to hierarchical corporations and conclude that co-ops as an idea are a failure.
But of course, as you seem to be a libertarian capitalist, you must also believe in the correctness of ideas which have so far failed to gain traction. In fact there are many philosophical ideas throughout history which were superior to the status quo but had not gained traction. For example take democracy versus monarchy. Would you have looked at Europe 1000 years ago and concluded that democracy had failed in the marketplace of ideas? Certainly not. In the same way, cooperatively owned businesses are part of a broader labor movement which was systematically attacked by those in power to dismantle its strength.
One notable example is the Taft–Hartley Act of 1947, which placed significant limitations on the legal right to strike or boycott. I am not an expert on the labor movement in the USA but without going into an entire debate on the subject, you can understand how the existence of unions and cooperatives (which I lump together as part of a common labor movement) may not have failed purely on its own merits, but may have failed due to attacks from organized special interests. So without proving or disproving that claim, you can see how a simple examination of the state of the labor movement today is insufficient to conclude the merits of those ideas as they relate to the average person. Just as you could not look at Europe in 1000 AD and conclude that democracy was an unworkable ideal, and that monarchy was obviously superior.
> I understand very well that they are suffering. They are suffering as the result of leftist government policies, not capitalism.
You mean the leftist government which placed significant limitations on the power of labor organizations? The leftist government which funds the largest war machine in human history? The leftist government that remains the only major country in the world without some form of universal health care? Sorry Walter, but only in the wild fantasies of FOX News commentators looking to get re-hired and Republican politicians trying to scare their base to secure their election is the USA a leftist government. Though to be clear, I am a libertarian communist and I want the US government to stay out of my way. This is why I advocate for cooperative ownership of the means of production and not government control of industry.
I appreciate that you finally chose to engage with what I was saying instead of dropping the same tired arguments I have heard and disproved so many times before. But of course I still disagree with your position. My issue with capitalism is not free markets. My issue with capitalism is direct control of our economy by an elite few who have class interests that run counter to the interests of the average person. The mismatch in interests between those in control (who seek power and profit) and the other 99% of the world population is what creates poverty and strife. And only when the people have control over the machinery that provides for their survival can we truly be free.
You say this as if this forum is not an appropriate place to begin to make such contact. We must discuss with other like minded people the merits of sharing our work as open source, but when people try to do that here it feels like you are saying they should go somewhere else to do it.
And we can certainly discuss the merits of our government granting ever increasing copyright terms. Copyright is not a natural concept, but an artificial one created by government agents. Surely this forum is a good place to question those choices and discuss alternatives.
I'm sorry if you inferred I thought your posts were inappropriate here. I do not think such, and it is not my intent.
Have you heard from any?
> copyright terms
I have a history here of advocating short copyright terms. I speak as someone who made a living selling copyrighted materials. The stuff I write today is all released under the Boost License, which is as close to public domain as it gets (because some countries have no legal concept of public domain).
I also have a history here of advocating dispensing with patent laws, even though I hold some patents.
> I'm sorry if you inferred I thought your posts were inappropriate here. I do not think such, and it is not my intent.
Thank you for the clarification. When I try to discuss these ideas I feel like you respond, without asking what I have been doing to pursue this, by saying "go and start one then" "it's not illegal" "go find other people to talk to about this then", which feels very much like you do not think that is exactly what I am doing when discussing it here. It feels like you think I need to be somewhere else to be doing that. And it feels rude that you tell me to go and start one without understanding that I am laying the groundwork in my own life to be able to do something unprofitable indefinitely (my collective will be a non-profit collective). I cannot just quit my job and do that tomorrow, but I have been working very hard for years to arrange my life appropriately to follow that pursuit, and I have made great progress. Already I am getting several hundred dollars in donations to my non profit. Not enough to live off of, but I continue to produce youtube videos promoting my non profit engineering. Your comment of "go and start one then" feels very dismissive.
> Have you heard from any?
From like minded people? Yes! My generation is extremely interested in libertarian leftist ideas. In providing for everyone regardless of the market value of their skills. I run two small youtube channels where I promote these ideas, though there are many very large channels with millions of subscribers who talk about them with more skill than I can, as most of my time is dedicated to my non profit engineering organization.
I am glad we agree on copyright and patents. I too have a few patents but would prefer we dispense with patent laws. And I release everything of mine under permissive open source licenses.
It would be anarchist-communism. Sure, it has been tried, notably in Spain in the 1930s, but it was suppressed by the Marxists and Fascists. I don't think "it always results in poverty" has been reliably established.
We've basically already had that for decades thanks to public libraries. In fact, between large collections and inter-library loans most provide 1-2 orders of magnitude more content than the library of Alexandria ever held
My relatively uninformed opinion is that the Library of Alexandria was an amazing resource for its time, but in modern context is tremendously overrated. While it certainly contained a vast amount of knowledge for the time, the amount of valuable, useful, accessible information of a modern mid-size city library I would guess is substantially greater.
I think it's particularly insane that (ignoring scihub, which still faces legal battles and is at risk of being our next Library of Alexandria) the world's scientific knowledge is largely behind paywalls and inaccessible to most of humanity, even those millions whose taxes funded it.
> "Hey, wait, literally everyone could have the entire library of Alexandria in their house for a couple hundred bucks per person. Like, all the knowledge ever. Maybe that should be considered the good default of things.
There are projects out there that lean in the direction of offline viewing of lots of content, for example, having an offline backup of Wikipedia, such as:
- https://wiki.kiwix.org/wiki/Main_Page
- http://xowa.org/ (HTTPS seems not to work though)
I just wish that the process of actually accessing the data was a little bit more straightforward: https://dumps.wikimedia.org/backup-index.html (given how many different files there are to choose from, you'll probably want to read a tutorial or two)
That said, while text is perfectly doable, things do tend to get more difficult if you also want images or videos, because those do take up a lot of space.
what's even more mind boggling is that library is more or less publicly accessible _now_ via a home internet connection or a connection from your... friendly neighborhood library! but you're right, having unmitigated access to that is even more incredible.
I don't know about "all the knowledge ever", but to give you a baseline, the entire Wikipedia with images but without editing history, as last archived by Kiwix in May 2022, is 90 GB.
A web server can run just fine on e.g. a Raspberry Pi Zero W ($10), exposing any such content to any smartphone etc. able to connect to it via WiFi (Kiwix sells preconfigured SD cards for their content, even). So, assuming that most people already have a phone or a laptop or even something like a Kindle, the only non-negligible cost here is storage. And a 1 TB SD card can be had for under $150 right now.
So if anything, I think OP is overly conservative, given that their estimate was "$200 per person". Unless that counts the devices used to consume the content, and not just storage and the server.
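To make the "web server on a tiny board" point concrete, here is a minimal sketch using only Python's standard library. It serves a directory of plain files (HTML, PDF, EPUB) over HTTP to anything on the local network. The function name `make_library_server`, the default port, and the example path are my own assumptions for illustration; this is not how Kiwix itself serves its archives, which have their own tooling.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def make_library_server(content_dir, host="0.0.0.0", port=8080):
    """Build an HTTP server that exposes the files under content_dir.

    host="0.0.0.0" listens on all interfaces, so phones and laptops
    on the same WiFi network can reach it. Name and defaults are
    hypothetical, chosen only for this sketch.
    """
    # Bind the handler to the content directory instead of the
    # process's current working directory.
    handler = functools.partial(SimpleHTTPRequestHandler, directory=content_dir)
    return ThreadingHTTPServer((host, port), handler)
```

Usage would be something like `make_library_server("/mnt/sdcard/library").serve_forever()`, where the path is a placeholder for wherever the SD card is mounted. Clients then browse to the board's address on port 8080.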
There are some language models like DeepMind RETRO that can make use of a 1TB text collection after the model is trained. The idea is to chunk up the text and make the blocks searchable with a neural embedding index. When you ask a question to the model, it first searches for relevant information and then adds it to the prompt. The result - you can get GPT-3 quality on a 25x smaller model. That means you could have your own GPT-3. Another advantage is that you can reference the sources and the generated text is more grounded.
An open source project could make a "search-engine-language-model" and in turn make this library much more accessible.
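As a rough illustration of the retrieve-then-prompt idea described above (chunk the text, index the chunks, search for relevant ones, add them to the prompt), here is a toy sketch. Big assumption: the "embedding" is a plain bag-of-words count vector standing in for a neural embedding, and all function names are made up for this example. It is not RETRO's actual architecture, only the general shape of the pipeline as the comment describes it.

```python
import math
import re
from collections import Counter


def chunk_text(text, words_per_chunk=50):
    """Split text into fixed-size blocks of words."""
    words = text.split()
    return [" ".join(words[i:i + words_per_chunk])
            for i in range(0, len(words), words_per_chunk)]


def embed(text):
    """Stand-in embedding: word counts (a real system would use a neural model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question, chunks, k=2):
    """Prepend the retrieved chunks, labeled as sources, to the question."""
    hits = retrieve(question, chunks, k)
    context = "\n".join(f"[source {i + 1}] {c}" for i, c in enumerate(hits))
    return f"{context}\n\nQuestion: {question}\nAnswer:"
```

The numbered `[source N]` labels are what would let a model's answer point back at the passages it drew on, which is the "reference the sources" advantage mentioned above.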
The interesting aspect of GPT-3+ is their reasoning like behavior such as Chains of thought or "step by step" generation. There's no evidence RETRO as is has sufficient capacity for the actually interesting behaviors.
I've specced this out a few times, it's an interesting exercise, and it depends on what "all human knowledge" is.
From the perspective of how much you individually could read in a lifetime: that's about 4,000 weeks (to roughly age 80-ish, and presuming you're not already reading at birth). Multiply that by the number of books you plan on reading a week. You'll probably read fewer than 4,000 books over your entire life. Many people read few if any books after graduating secondary school, and even a highly-motivated reader might be challenged to crack, say, 40,000 (ten per week for life).
The US Library of Congress has the world's largest book collection at about 40 million distinct titles.[1] There are another 130 million total catalogued items, including photographs, films, audio recordings, maps, pamphlets, and other items. Not all are textual. I'll stick to books.
At 4,000 per lifetime, you'd need 10,000 people simply to read all of the LoC's collection at a reasonable pace.
(By comparison, at the turn of the 20th century, the LoC's annual report to Congress noted that its cataloguing department could handle about 3,000 books per cataloguer per year, or a pace of about 60/week or 15/day. Over a 40-year career, a cataloguer might handle, if not read completely, 120,000 books.)
As a rough rule of thumb, a digitised ebook in PDF format runs about 5 MB.
Every single book in the Library of Congress in digital format would occupy about 200 TB of storage.
Current disk prices are running about $5 -- $20 / TB.
200 TB in raw disk would set you back $1,000 -- $4,000. Figure 2-4x multiplier for a disk storage system all told, and it's still roughly $2k -- $16k to have as local storage every last book in the US Library of Congress. That's well within scope of a moderately wealthy US household budget, and would be reasonable to consider for a small-town city library.
That's the technical storage cost, obviously not the rights or acquisition costs for the materials.
The Library of Congress's budget is about $800 million/yr.
Those 4,000 books you might read in a lifetime? They'd fit on about 20 GB worth of disk. If you're a 10x reader, 200 GB, and a truly dedicated 100x reader, about 2 TB. Those are well within the range of present desktop / laptop disk allocations, and represent a few tens of dollars of storage expense.
The Internet Archive computes $2/GB for storage in perpetuity. At 5 MB per book, that's roughly 200 books' worth of storage per gigabyte.
But a small household NAS and a very modest server platform (most routers can effectively operate as media servers) could rival a mid-sized city library for about $100 or less in actual hardware outlay. Those prices are falling by half about every 18 to 36 months, as they have been for decades.
Books are not all published content, and all published content is not all human knowledge. But standard published books are a good proxy for total cultural knowledge, and in all likelihood, then some.
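For anyone who wants to rerun the back-of-envelope numbers with different assumptions, here is a small sketch of the arithmetic. The inputs are the rough figures from this comment (5 MB per book, 40 million titles, $5 to $20 per TB, a 2-4x system multiplier, 4,000 books per lifetime); the names are just labels I picked. Applying the multiplier to both ends of the raw-disk range gives about $2k to $16k.

```python
MB_PER_BOOK = 5            # rough size of a digitised PDF ebook
LOC_TITLES = 40_000_000    # approximate Library of Congress book count
BOOKS_PER_LIFETIME = 4_000


def storage_gb(n_books, mb_per_book=MB_PER_BOOK):
    """Storage needed for n_books, in decimal gigabytes (1 GB = 1000 MB)."""
    return n_books * mb_per_book / 1000


# Whole Library of Congress book collection, in terabytes.
loc_tb = storage_gb(LOC_TITLES) / 1000

# Raw disk at $5-$20 per TB, then a 2-4x multiplier for a full storage system.
raw_cost = (loc_tb * 5, loc_tb * 20)
system_cost = (raw_cost[0] * 2, raw_cost[1] * 4)

# People needed to read the whole collection at 4,000 books per lifetime.
readers_needed = LOC_TITLES // BOOKS_PER_LIFETIME

# One person's lifetime reading, plus the 10x and 100x readers, in GB.
lifetime_gb = [storage_gb(BOOKS_PER_LIFETIME * m) for m in (1, 10, 100)]

print(f"LoC books: {loc_tb:.0f} TB")
print(f"Raw disk: ${raw_cost[0]:,.0f} to ${raw_cost[1]:,.0f}")
print(f"Storage system: ${system_cost[0]:,.0f} to ${system_cost[1]:,.0f}")
print(f"Readers needed: {readers_needed:,}")
print(f"Lifetime reading (1x, 10x, 100x): {lifetime_gb} GB")
```

Changing `MB_PER_BOOK` to something like 1 (plain EPUB rather than scanned PDF) shrinks every figure proportionally, which is why the estimates are best read as upper-ish bounds.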
________________________________
Notes:
1. See: https://www.loc.gov/about/general-information/ That includes about 25 million books in the main collection, and 15 million in "nonclassified print collections, including books in large type and raised characters, incunabula (books printed before 1501), monographs and serials, music, bound newspapers, pamphlets, technical reports and other printed material". I suspect you could roughly halve my estimates above based on 25 million vs. 40 million volumes, as many of the nonclassified works may duplicate the main collection.
"The African economy will start booming once they get UNLIMITED access to those 1800s manuals on measuring the Ether and where the best seal clubbing sites are in Alaska." Nobody ever thinks this through. Books are basically worthless at this point
This is like saying tribal knowledge is useless, or internet access is useless. Books are useless only if they’re inaccessible.
The point you’re trying to make (I think) is that information isn’t particularly useful on its own. True. It has to be relevant. With good, useful information you still need community support (whether that’s your household or your country), funding, time, etc. But that’s going to be the case whether you’re using a book or learning from a forum, or a video. Internet videos are worthless if you have no internet. Tribal knowledge is useless if no one can get the information from the previous generation. The medium is not the problem, and suggesting that relevancy is a fault of the medium is a misrepresentation.
Then maybe we stop letting people keep pushing copyright periods out, so more relevant things come into the public domain as well. This is a nothing argument.
There's a compelling argument that Germany's lax attitudes toward copyright in the 19th century helped propel that nation[1] to a dominant position in science and technology.
Germany now is a leading copyright-maximalist power, unfortunately.
________________________________
Notes:
1. I say nation and not country as the German language and cultural nationality pre-existed the German state in 1871.
So, if I get it right. First there was Libgen, which is mirrorable. Then, some Z-Library copied Libgen and added some more books, without making it mirrorable. The goal is to make these new books, which are not mirrorable — mirrorable (i.e. to "preserve" them).
So, why not just re-upload them to Libgen, then? I guess somebody will do that now anyway, but you could easily have done it in the first place, without making your own mirror, which is not a mirror of Libgen. Just upload them to Libgen and make a mirror of Libgen.
> Q: Should the Z-Library collection be added to Library Genesis?
> A: Yes! However, it is tricky. Library Genesis splits out its collection between non-fiction and fiction. They also have relatively high quality standards. If you are interested in organizing all the books to meet their requirements, let us know.
Kudos to you, but I would probably be wary of admitting to participate in the illegal book piracy scene unless your pseudonymity is really tight. Thanks a lot for helping out, though.
Excellent. I hope more and more people start to see how absurd and evil the concept of "intellectual property" is. It should be totally rejected and any form of keeping useful information for yourself should be shunned and tabooized. In today's world, many have been programmed to believe that the world could not exist without such immoral restrictions, which is horrible.
I can't imagine getting rid of it completely would have good effects, as it would make any large scale production impossible. You might still get a few blog posts, but getting books produced will be tricky and something big like a movie might be outright impossible. This is doubly true in the modern digital world where everything can be copied in a fraction of a second and where the piracy site, not the authors, will be what bubbles to the top of the search results.
It could also result in far more draconian DRM, as that would be the only way left to protect your work.
Now drastically lowering the time of copyright might be well worth it, something in the realm of 20 years should be enough. As copyright needs to get back to a point where things you consumed in your lifetime, make it into the public domain in your lifetime.
There are plenty of ways. Limit playback only to official DRM-locked devices and have those devices film the user. The moment a camera makes it into the picture you block their account for life. That's not even new tech, 3D face detection is standard part of many smartphones and laptops. And companies like Facebook have no problem locking you out forever from their services, for much milder infractions.
Or more practically, just look at cinemas. They already film the audience to prevent filming and with great success. While you still get illegal copies of a movie easily, it's only extremely low quality smartphone rubbish. The high quality pirated videos only show up months later once the films hit streaming services or Blu-ray.
And all of that is just current tech, let's assume VR will become a success in the future. Now you have a device on your head that tracks every little one of your moves, including things like heart rate and eye-tracking. Furthermore, what streams to you isn't an easily rippable 2D copy of the movie, but the 3D view of sitting in a cinema. Good luck trying to rip that. And of course tamper proof hardware is a thing as well, so any attempt at opening it up will automatically self destruct it and phone home that you tampered with it.
You're in their building, so I imagine your consent is stated as part of the ticket verbiage or in small print when you buy it. Regardless, closed-circuit cameras are pretty ubiquitous in private businesses, and have been for decades.
The idea of the poster was that customers are recorded during projection and that there is active real-time monitoring of said reception, e.g. to detect abuse.
So, "you are being watched as you watch". Not by an uncaring attendant, but by some actively processing automation.
>There are plenty of ways. ... have those devices film the user.
I think you've identified pretty much the only way (that I can think of, anyway). If you surveil all consumers and instantly arrest them the moment they make a copy, mission accomplished.
Short of that, though, as long as the data is being presented out in the open (light waves, sound waves, text), it's going to be possible to "rip" it.
As a jumping-off point for thought: Piracy-by-default would work very well in a world where access to a work is not charged, but the social custom is that revenue is collected via pre- or during-production crowdfunding
You may see lower revenues, but how much cost is currently poured into resolving licensing / investing in DRM / etc, all for works to be pirated anyway?
While that is not fundamentally impossible, Star Citizen is so far the only crowdfunded thing I know that managed to collect AAA-game levels of money. Pretty much everything else isn't making nearly enough to actually finance the project and requires more funding outside of the crowdfunding. Furthermore, a lot of successful crowdfunding is built on prior copyrighted work. People spend money on Star Citizen because they liked Wing Commander and Freelancer. Without copyright, those earlier works might not have existed to begin with or never gained the popularity they got.
You can't look at current crowdfunding levels in a world where the crowdfunded product is the exclusive "IP" of those receiving the money and then assume that tells you anything about a different world where there is no copyright and pre-funding creators is the only way to get new content of a certain scale. Or in other words: you're arguing that not having copyright would not work in a world built around copyright.
Yes, AAA games and live-action movies are so costly to make that they generally wouldn't get traction on crowdfunding platforms if they insisted on full funding. So we get this middle-of-the-road solution where crowdfunding pays for a fraction of the costs, with the rest to be recovered via post-release sales. But other kinds of content don't have that issue to anywhere near the same extent.
Many ebook vendors I use (InformIT, Packt, No Starch Press, Pragmatic Bookshelf, Manning, O'Reilly back in the day) work in piracy-by-default mode. Only InformIT watermarks PDF copies, and there's no DRM to speak of.
So, when you have good content with reasonable prices, people also come and buy.
Also, there are some eBooks in Kobo store devoid of any DRM. So, publishers are not forced to use DRM on Kobo, as well.
Another idea is serialization. You release your work little chunks at a time, and if it's not sufficiently supported financially, you stop. A lot of Patreon is effectively funded like this. Obviously the medium has to be amenable to this (e.g. novels, graphic novels, visual novels, some video games).
> something in the realm of 20 years should be enough.
God no. Right now, we'd be getting remakes from every piece of pop-culture that was semi-popular in the 80s-2002 time frame. Not just movies, but TV series, books, theatre, musicals, ...
Sure, copyright should be shortened, but I don't begrudge (e.g.) a one-hit wonder making money off their hit decades later. Life of artist is reasonable, I think. They take a gamble on a profession with risky pay-out; if it works out at least once for them, let them reap the benefits.
Those remakes are exactly why it shouldn't last much longer. As those remakes aren't actually about the work itself, but about the brand recognition that work has in the public consciousness. It's just free advertisement that you don't get with an original work, which is why remakes and sequels are so popular, even if the connection to the original is little more than the title.
It's not the job of copyright to allow people to get lazy or companies profiting forever from the rights they bought. The goal should be to encourage original works and current copyright isn't very good at doing so.
Also it's not like the author would go completely penniless here. Just because everybody can make a StarWars doesn't mean there won't still be a George Lucas approved canon-StarWars. Slapping the author's name on your product to declare it the "Real Thing™" might still be worth a bit and might frankly be better than today's sequels that happen completely without any of the original creators being involved.
> Right now, we'd be getting remakes from every piece of pop-culture that was semi-popular in the 80s-2002 time frame. Not just movies, but TV series, books, theatre, musicals, ...
We are getting these things anyway, except that the originals are far less accessible than they should be. Entertainment trends are cyclical.
I meant a remake of Mr. & Mrs Smith; of Twins, of "Stop or my mom will shoot"... basically of everything that turned a profit, however meager.
Moreover, those remakes will be oblivious to what made the original a success. E.g., we'd get Dave Chappelle offered a Seinfeld show, much in that style, without a view to why it was (considered) funny back then, nor with tailoring it to Chappelle.
Put differently: we'd be inundated with much worse schlock than we get now.
I think 30 years is a decent base in my mind. Let it be renewed a couple times for a few extra years if the owners think it’s worth it. That way most stuff flows into the public domain but some works can keep creating value for their creator.
Upvote for 30 years. Try it for a while, if creative industries aren't meaningfully damaged by the reduced IP rights, then maybe even take it down lower to like 25 or even 20 years.
> Right now, we'd be getting remakes from every piece of pop-culture that was semi-popular in the 80s-2002 time frame
That's exactly what we are getting right now though. Looking at the top ten of the box office right now, only three are not part of an existing franchise. Three of them are reboots of 80s movies, four if you count comic books. Large IP holders recognize under the current system, it is much more profitable to exploit their existing IP than to come up with new concepts. If the copyright terms were significantly shorter, the pressure to be original would be far higher.
It protects nothing important. That people must participate in selling their creativity and creative labors to eat is a more fundamental bug in society and as someone who makes a living off "IP" only I can't wait for the day that my life is secured by something other than armed threats of violence against people sharing ideas and information.
I've just spent 2 years writing something which ain't got anything else like it. It was technically pretty difficult and needed a lot of background knowledge.
Should I be disallowed to commercialise it?
I partly get where you stand but if I was in a society that you seem to endorse my first question would be, other than for the love of doing it, why sink so much effort into a thing only to get nothing back. It almost is the opposite of a meritocracy.
As a musician and filmmaker I see it like this: once my work is published I stop being in total control of it. That means I cannot control (or sometimes even know) what paths my works will take, how they are understood, who will do what with them etc.
Of course I like to be credited for my work — but some fan adding my work to a pirate page would not be a concern but rather a bit flattering. What would anger me would be someone claiming credit for themselves, some rich company taking the material without paying me, things of that sort.
You're being short-sighted if you think you're going to be able to sell material in the future if the recording company or studio goes out of business. The only reason they have money to support you is because the public supports them.
Huh? I do sell material right now without any recording company or studio in the loop. Why would that change if they go bust? If anything the demand for people like me would increase.
There are quite some hoops small self publishers have to jump through to get their music sold in a way it can actually interfere with the big corps in that space. These hoops are all there to make the market entrance harder, they are not there to protect artists.
Before refreshing the thread I wondered if I was overlooking something in your earlier post, it feels so strange to me to assume a record company/studio are involved!
If you as a private person own a patent, you are losing it anyways...because you cannot fight some mega corporation in court to defend it. It's too expensive, and the biggest corps just take what they want due to having more financial resources.
Note that if you don't defend it in court, our justice system thinks it is less valuable to protect. Which in itself is kind of ridiculous.
Also, right to commercialization has nothing to do with intellectual property.
ONLY private persons should own patents, and it should be illegal even for employers or institutions (academic, research) to own patents of people employed to do research. At most, companies and institutions should be allowed to add a clause of "perpetual-free-usage of any patents of employees resulting from direct work" - but an employee or group-of-employees holding a patent should still be able to license it to other companies too. If businesses are hurt, that's GOOD, most should not exist as coagulated entities.
We're not gonna have proper freedom preserving capitalism until we properly decentralize: we all work like swarms of 1-person-companies / solopreneurs contracting between each other. (No, not the gig-economy, in that dystopia we're all still slaves that can't band together to fight the masters.) Legislation will automatically have to be refactored to make this work. With some exceptions, only human individuals should hold most property, not companies and not institutions. Groups/collectives only when the group members directly worked together and know each other.
And Intellectual Property would just "click in" in such a context. IP sounds hellish and dysfunctional because our own practically techno-communist society (yeah, even USA is practically "communist" nowadays in a way - newsflash: "the reds" have won! even the f symbolism is there, "the red pill" is the good one now... all's backwards) is messed up. It makes perfect sense in a hyper-decentralized hyper-individualistic REALLY democratic and REALLY capitalist society.
> If you as a private person own a patent, you are losing it anyways
There's always someone who just has to spread the despair and helplessness. Every bloody time. Tell me, in what way does this add to the discussion? I wish there was a ban on these kind of comments.
> right to commercialization has nothing to do with intellectual property.
I don't understand. If I don't own it I can't market it, right?
> There's always someone who just has to spread the despair and helplessness. Every bloody time. Tell me, in what way does this add to the discussion? I wish there was a ban on these kind of comments.
If you only want answers you like, go talk to a mirror. The way this adds to the discussion should be pretty obvious, but let me spell it out: if one of the main arguments for IP is wrong, its costs/benefits have to be reevaluated.
That's not what they said. They said "the power distribution is so skewed against actual people the current system must be destroyed as a matter of principle". What about that is hopeless? It offers a clear escape and very little ambiguity. If anything, it should be energising to a casual reader.
He said "If you as a private person own a patent, you are losing it anyways...because you cannot fight some mega corporation in court to defend it. It's too expensive, and the biggest corps just take what they want due to having more financial resources"
IOW he didn't say what you claim. Plus he gave no way forward to achieve his goals, and I am not in any position right now to Bring Down The Man, much as The Man may need it, so it was just a hopeless valueless post.
Within the larger context of this thread, the intent was clear. You yourself were responding to a post that gives this necessary context: Intellectual property is absurd and evil. The poster you reply to tells you why it is absurd and evil. Those two things together lead you naturally to the conclusion that it must be abolished.
As such, I would say what they said is absolutely what I claimed. I just explained it in plainer terms and without requiring you to (re)read the rest of the thread.
Edit: The fact that you summarised "this system must be destroyed" as "hopeless" says something about your own fundamental hopelessness and despair. Which you kind of ironically attributed to someone else.
I don't agree, just the excessive enforcement of it may well be.
> The fact that you summarised "this system must be destroyed" as "hopeless" says something
No. What I said was:
> hopeless valueless post.
The post was valueless because it gives no direction, no means. And I detest such posts because they offer nothing useful. They are unconstructive. Hence are valueless.
That's a constructive answer, so thanks, although I have to ask your view of how BSD was used by Apple to build a massive fortune but without paying significantly back to the BSD community.
Overheard from an IP lawyer I stood near to once - something about f/oss software being incorporated into commercial products being a big issue (for the free stuff, not the company doing the 'stealing' of it). Your view?
This is the exact reason why copyleft licenses are important: you can reuse (A)GPL content, but if you do so the result must be given back to the community. If you're not going to pay anything, at least your product benefits everyone
BSD content can be taken without contributing back, it's in the license. Copyleft content cannot. That's the main difference, and even Google doesn't want to touch copyleft content with a 10-foot pole because of the fear they'd have to share their internal sauce, so I presume companies still take licenses into account.
Stop ignoring what I said. It's not about BSD licenses it's about more restrictive ones
You:
> This is the exact reason why copyleft licenses are important: you can reuse (A)GPL content, but if you do so the result must be given back to the community
Me: the frigging licences are being ignored. Giving back to the community is not happening. Code is being stolen. Are you trying to ignore what's being said?
>BSD was used by apple to build a massive fortune but without paying significantly back to the BSD community.
Welcome to free software. :)
If you had to "pay back significantly" to use it, it would not be "free software".
It's regrettable, but it's the price of freedom. Whether that's worth it is subjective. Stallman's answer to this was the GPL. (I bet Apple wouldn't've touched BSD if it was GPL'd.) Newer, hybrid licenses have also emerged (like the MPL) that attempt to strike a better balance between freedom and back-contributions.
The way I see it the problem in both cases here is that the company is using stuff created by others but not giving the same freedom to users of their derived work.
Without copyright this situation is a lot more equalized as now you can have people make modified versions of macOS and redistribute those legally - yes, not having source access makes that more difficult, but not impossible and even if you did need the source, it only has to leak once.
> Can I freely use your toilet then? your electricity?
No because once you used them I don't have them anymore. There's a reason IP has different rules than physical property.
> You pay for the plumber, why don't you pay for entertainment?
Ok, that's a more appropriate analogy, but then again, should my plumber get a recurring fee for the work they already did, when I use the faucet to give drinks to my friends?
> No because once you used them I don't have them anymore.
I mean entering your house, doing my business in your toilet and leave. I won't take your toilet with me, I'll come back when I need it again.
> when I use the faucet to give drinks to my friends?
But your friends all have their own house with their own plumbing they paid him for, so he can continue making a living from his craft.
Writing a book takes months or years, not 2 hours like repairing a toilet, so of course the author needs to ask for money from everyone who wants to access it.
> I mean entering your house, doing my business in your toilet and leave. I won't take your toilet with me, I'll come back when I need it again
We're moving the goalpost here. And we're still talking about physical property (or possession) vs. intellectual property. I suggest we stop with that line of reasoning/metaphor.
> But your friends all have their own house with their own plumbing they paid him for, so he can continue making a living from his craft.
Again, the metaphor does not hold. Such a situation only means that the plumber is the only plumber in town. If we have multiple plumbers (so we can stick to the metaphor) my plumber can't forbid me to use my plumbing for certain uses (like watering my plants or offering water to my friends for free or for a fee).
> Writing a book takes months or years, not 2 hours like repairing a toilet, so of course the author needs to ask for money from everyone who wants to access it.
So it's just a quantitative difference? I can pay 1 cent per 1000 toilet flushes then. Seems fair.
My point is, I think these kinds of metaphors don't work here precisely because intellectual work is its own thing.
> I mean entering your house, doing my business in your toilet and leave. I won't take your toilet with me, I'll come back when I need it again.
If your use of my toilet doesn't affect me in any way then I don't see why I should have a problem with that. If we are talking about you stinking up the place, using all my toilet paper and blocking the john whenever I need to go then we are talking about something very different from "IP".
You can freely use the design of my toilet, certainly.
> You pay for the plumber, why don't you pay for entertainment?
I pay for physical objects that are made for me, like books; and I pay when people come play their music (even if payment is not mandatory).
But TBH - I don't think that's the appropriate moral basis for things. For example, we don't pay for the huge amount of work our parents do for us; nor for the not-for-profit activities we often rely on etc. I would much rather support a non-exchange-based social arrangement.
Exactly. I'm in complete agreement. If someone else wants to give away their work for free, that's fine with me. But if I want to charge, I should be able to charge for my labor just like the farmer or the baker. Certainly they'll charge me for bread.
Writing is hard work. Writing books and getting them technically correct is expensive. This is very short-sighted.
but since no one can ever prove that his was the first incarnation of an idea, nobody can be criminalized for also doing things in a certain way.
the concept of protecting invention for some time to facilitate reward is not without merit, but the implementation of IP law and practise has gone so far astray that it's overdue to rethink the whole thing.
I also do technical writing (including academic publications) and I do feel like I need to get monetary gains from all my works. But I also believe that copyright law is too restrictive and lasts too long (15 years after the death of the original author/artist/writer/etc. should be enough)
A 15-year period should be enough. If you haven't contributed anything to society within the past 15 years, why should you get to live off the 1 thing you did? Shouldn't you be incentivized to be productive? That's the point of copyright law in the first place after all.
It's extremely easy to say what you say without having written a book. Harry Potter into public domain I'm prob OK with but some people put years of their life into technical books. My father did.
EDIT: ach, didn't read your post properly - first line says "I also do technical writing (including academic publications)" so you have a strong position to hold your view - sorry
Fixing the plumbing usually takes much less time than writing a book, so recompensing the plumber for the time spent doing the plumbing in a one-and-done lump sum is much more tenable than doing the same for an author, who might have taken weeks, months or even years writing a book.
Plus fixing the plumbing usually only needs to be done infrequently, whereas a book is read in a comparatively short time, so from that point of view a consumer would also be willing (and able) to spend more per plumbing fix than per book.
So consequently you need some sort of arrangements that allow for splitting the necessary payment to the author up across multiple people and/or over time.
Additionally, artists often speculatively create works without knowing for sure whether the public will take any interest in their work, or not. Copyright certainly has its faults, but it does cater for precisely that scenario by ensuring that you can insist on getting paid afterwards if people enjoy and want access to your work, and you don't need to acquire all the necessary funding up front. If you can't come up with enough money, you can even "just" invest your spare time instead and still get paid back if the work turns out to be successful.
Plumbers on the other hand I assume rarely have the desire to speculatively fix up other people's plumbing and then hope to get paid afterwards if they did a good job.
Rewarding artists after they've already produced the artistic work if they're successful also makes sense in that the quality of artistic output can vary, and so there's a bigger risk of disappointment if you need to pay far in advance, before the work has possibly even been produced.
And because the quality of an artistic work is also very much a subjective matter, it'd also be much more difficult getting your money back in that case, whereas plumbing can mostly be judged according to much more objective standards, so getting your money back – through the legal system if required - is again a more tenable affair.
Of course the existence of Kickstarter and the like or even just plain old pre-orders show that to some extent people are willing to take that risk of paying in advance, but whether that would be enough if it was the only reasonable source of funding for artistic works? It'd also mean that if you can't convince people to pay you in advance (and good luck with that if you're some unknown newcomer), then good luck getting any more money afterwards, even if the book/… then turns out to be wildly popular afterwards.
> Is it wise to sink your time into something you don't really like doing?
If it pays your bills and gives you spare time to do what you like, then often yes.
creating nontrivial intellectual property usually takes a lot of work, work that has to be paid otherwise it could literally not be done, i.e. the great people who create those works could attempt maybe one such work and in most modern cases they would not get close to finishing it before they literally run out of money to pay for rent and food.
the system around intellectual property has some issues but some form of protection / ownership needs to be there.
if you had your wish and the concept of IP was treated as shunned and taboo you would quickly live in a world with vastly diminished amount and quality of art, science and technology.
There should be a fine line in intellectual property rights. I see where you are coming from - quite often intellectual property is used as a moat to protect insane revenues and, as a repercussion, delay or slow down our progress as humanity.
But it is also used to protect unique creator revenue and encourage them to create more.
If you ask where the fine line should be I have no immediate answer, but abolishing intellectual property rights just like enforcing them at all costs doesn't seem to be the optimal course of action to me.
> But it is also used to protect unique creator revenue and encourage them to create more.
This thinking is an artifact of an economic system so dependent on scarcity for its motivation that it is now generating most of the scarcity in the world.
We now have the technology for creative implementations of "From each according to its ability, to each according to its needs". Just keep track of how much each thing is used, and reward creators from a corporate-tax-funded pool. Every for-profit entity contributes proportionally to its profit, and can use any idea for free.
Even just having substantial "prizes" for the most widely used stuff (whether publicly or privately funded) would go a long way. (Most of our public funding for content creation now happens as grants, basically rewarding compelling ideas and then simply trusting that the reward money will be spent on doing worthwhile things. This does not work very well, for obvious reasons.)
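The arithmetic behind such a usage-weighted pool is easy to sketch. Everything below is hypothetical: the 1% levy rate, the company names and profits, and the usage counts are made-up numbers, and `levy_pool` / `split_pool` are illustrative names rather than any real scheme.

```python
from fractions import Fraction

def levy_pool(profits, rate):
    """Each for-profit entity pays the same fraction of its profit."""
    return {name: profit * rate for name, profit in profits.items()}

def split_pool(pool, usage):
    """Share the pool among creators in proportion to measured usage."""
    total = sum(usage.values())
    if total == 0:
        return {creator: Fraction(0) for creator in usage}
    return {creator: pool * Fraction(count, total)
            for creator, count in usage.items()}

# Hypothetical inputs: annual profits and how often each creator's work was used.
profits = {"acme": 1_000_000, "globex": 250_000}
usage = {"alice": 600, "bob": 300, "carol": 100}

contributions = levy_pool(profits, Fraction(1, 100))  # assumed 1% levy
pool = sum(contributions.values())
payouts = split_pool(pool, usage)

for creator, amount in payouts.items():
    print(creator, float(amount))
```

The hard part is not this split but measuring "usage" in a way that can't be gamed, which the sketch simply takes as given.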
Well said. The reality is that even in current work done without copyright, a lot of value is added by creators who never see any real reward for it. Yet instead of looking at how we can better reward creators, we continue with this insane system that has been shown to be easily abused to concentrate wealth while having a massive cost for society by restricting what we all do and making everyone pay for enforcing those restrictions.
I think that's pretty much the case for shortening time frame of IP rights. Which means, they shouldn't be treated as property has been traditionally treated. Although as a socialist, I think perhaps we shouldn't have time unlimited property rights (above certain reasonable boundary, say $10M) in general.
Interesting, but being very opposite of socialist myself I am wholeheartedly with you here on limiting timeframe of IP. It partially solves the problem.
And this arguably should be extended to tangible assets as well - I like Singapore model where housing property is sold for specific timeframe. It simplifies a lot of redevelopment.
We tend to think of ownership as owning an asset absolutely, for an indefinite time. For a lot of things in Singapore you can have ownership (i.e. own it) in that sense, but after some time you have to either return it or stop using it (rendering it useless). This applies to assets like homes, cars, etc.
If it’s not feasible to enforce those policies, government just imposes hefty tax on assets, ensuring you extract (or contribute) sufficient added value from asset.
Such comments remind me of the provocateur methods used by police to suppress any legitimate critique. It works like this:
1. there is some legitimate issue
2. people protest (peacefully)
3. a provocateur does something over the top (violence, absurd statements like "defund the police")
4. legitimate protesters are discredited because of 3.
If you want to drastically reduce the number of new books, songs, content then sure.
Otherwise, I am having a really hard time understanding how you can suggest that I don't own the book I spent a *decade* writing. It is just as much mine as the car you drive is yours.
To tighten regulations around intellectual property to make sure that it is not abused - sure.
Most people writing books, e.g. these, https://www.oreilly.com/ would not do so if they could not monetize it - which would be even harder if there was no legal protection of the work they did. I don't like your idea at all.
It's funny that often the only people who think this are programmers. Everyone else who hopes to make a living doesn't. ATM, not even the NYT Best Sellers make a livable wage, and now with DALL-E, artists won't either. COVID basically killed a lot of musicians' income. DJs and so on. It's probably also why media itself has reached such a mediocre state: no authors, no novels, no adaptations. We are in a future in which much of the current media is a sea of mediocrity. There's lots of content, sure, but barely any that's worth a damn.
Interesting to call it evil, when your whole reasoning is driven by pure greed. I guess you don't see greed as something evil? Or do you think there is a human right to consume? That everyone has the right to experience and own everything for free, what others have created and worked hard for?
Anyway, "intellectual property" has proven to be a driver of quality, as the earned money gives liberty and time for the creators. I don't see how this is a bad thing. Sure, there are warts in the system and we should get rid of them, but not by removing the whole good side.
I think making books, and knowledge in general, available to everybody is an essential public service. At a time when disinformation is so widespread, actual data and understanding are essential weapons against it.
In that sense, these piracy sites are acting like global public libraries open to everybody with an internet connection.
At the same time, I feel the authors, researchers, editors, and other support staff that gift the world with knowledge should be rewarded for their effort.
It'd be great if there's an honor system that enables readers around the world to pay them some amount to show gratitude.
The current system is of two extremes -- either first pay the price set by the publisher to even browse a book (and that price is ridiculously high in underdeveloped countries), or get the full book without paying anything.
There should be a spectrum of rental and gratitude amounts in between. The publishers themselves can together set up such an online library to make it all legal. Not only will they help humanity, but they'll also get some of the revenue they're currently missing out on. A balance seems to have been struck in the music business with most of it being legal and accessible nowadays. They should do it for books too.
Also, the information about nuclear, chemical and bio weapons should be accessible to everyone. Preferably as DIY recipes, that you can follow at home.
If you spend time learning, spend time putting together your knowledge, and spend time sharing, it is very sensible to expect some economic return if you choose to. Alternatively you can open source it. It’s the author's choice.
But we live in an infant society, with many grown-ups acting like spoilt children, saying “I want to get that fancy FAANG job, I want to be wealthy, and I expect to do it copy-pasting other people's knowledge and infringing IP, but if someone else begs to differ I start whining”
I see things like this, and I wonder why the following software doesn't exist:
I want a piece of software to which I can add a collection of files, say multiple TB. The software will then behave a bit like a BitTorrent tracker, and know which peer has which files. A peer joining this swarm will be able to say "I want to donate X GB of space", and the tracker would tell it "OK, then download and seed these files, which are the least seeded".
The peer would download the files from the rest of the swarm and make them available to it. Then, a request layer on top of the swarm could be used to request a file from the peer which had it. Adding/removing files to this collection would also need to be a feature.
Does anyone know if anything like this exists? If not, how easy would it be to make something like it out of BitTorrent? I might give it a go.
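The tracker's "download and seed the least-seeded files" answer could be sketched roughly as below. This is only an illustration under assumptions: `FileEntry`, `assign_files`, the greedy strategy and the sample catalog are all made up here, and a real tracker would also have to handle peers leaving, verification and eviction.

```python
from dataclasses import dataclass

@dataclass
class FileEntry:
    file_id: str
    size: int          # bytes
    seeders: int = 0   # how many peers currently hold the file

def assign_files(catalog, donated_bytes):
    """Greedy pick: rarest files first, skipping any that no longer fit."""
    chosen = []
    remaining = donated_bytes
    # Fewest seeders first; among equally rare files, smaller ones first.
    for entry in sorted(catalog, key=lambda e: (e.seeders, e.size)):
        if entry.size <= remaining:
            chosen.append(entry.file_id)
            remaining -= entry.size
            entry.seeders += 1  # tracker optimistically counts the new seeder
    return chosen

# Hypothetical catalog; sizes are in arbitrary units for readability.
catalog = [
    FileEntry("a", 40, seeders=5),
    FileEntry("b", 30, seeders=0),
    FileEntry("c", 50, seeders=1),
    FileEntry("d", 20, seeders=1),
]
print(assign_files(catalog, 60))  # ['b', 'd']
```

Greedy selection is not an optimal packing, but it keeps the tracker's decision cheap and biased toward whatever is rarest at the moment a peer joins.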
Have always thought this would be a great way for many people to share well organized Plex libraries. Many semi-overlapping libraries basically creating a virtual Netflix, with some way to stream in Plex from the whole library no matter if you actually have the content locally or not
Freenet is built around similar ideas, combined with encryption and anonymization, which hopefully adds the benefit of you not being legally liable for distributing CSAM. The gist of Freenet is that you can operate a freesite or upload files, and the content is redundantly dispersed in encrypted parts among a number of other Freenet peers. They don't know what they're hosting and neither do you; the client just fills up the allotted space and uses some bandwidth, that's all.
As for how easy it would be to make something like this network, I'd wager it's pretty hard. There will be a lot of questions, even while establishing the happy path: for example, how you manage updates (especially when you update the protocol, not just the software), how you effectively manage the volume of search requests, how you distribute the files, etc.
And then there's the abuse the network will inevitably get. How you handle spammers, CSAM, malware, ISPs that throttle/block you, the legal risk you expose your clients to, etc. Nice big can of worms. To begin opening it, I suggest reading through Wikipedia's Peer-to-peer file sharing article, and especially the File sharing modal on the right, which nicely captures the ideas that have been tried so far.
My idea is similar but slightly different. You'd run your client and then choose which providers' data to seed, e.g. you'd add the Internet Archive and Libgen to your datasets.
Only those entities would be able to push data to you, nobody else, so if you trust the providers you specified, you should be good.
I think BitTorrent has all the pieces needed for a fully distributed version of your idea. My initial thought is that you could publish a magnet link that points to a mutable DHT item, which in turn points to a torrent that has a JSON file with some metadata and a list of infohashes the publisher cares about. The client could then scrape the "leaf" torrents from multiple lists to get the peer counts and use that for local prioritization of what to store. By reusing existing torrents you could then share resources with standard torrent clients that are unaware of your system.
The list idea could be extended to nested lists (stavros recommends Internet Archive) for discoverability and composition.
If you go with v2 or hybrid torrents from the beginning you could deduplicate and cross seed files from different collections.
The lists could also be modified to have torrents to exclude, possibly using some salt + rehash idea to make it hard to reverse into a list of e.g. CSAM you don't want to publish as is.
Feels like a neat project that could interoperate nicely with existing torrents.
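A rough sketch of the client side of that idea, with the DHT lookup and tracker scrape left out. The manifest layout (`salt`, `infohashes`, `excluded`) and the function names are assumptions made up for illustration, not an existing BitTorrent extension; peer counts are passed in as a plain dict.

```python
import hashlib
import json

def blind(salt: bytes, infohash: str) -> str:
    """Salted SHA-256 of an infohash, so an exclude list is not a plain list of hashes."""
    return hashlib.sha256(salt + bytes.fromhex(infohash)).hexdigest()

def plan(manifest_texts, peer_counts):
    """Merge several publisher lists, drop excluded torrents, rarest first."""
    manifests = [json.loads(text) for text in manifest_texts]
    candidates = {h for m in manifests for h in m["infohashes"]}

    def is_excluded(infohash):
        # A torrent is dropped if any list's salted exclude set contains it.
        return any(
            blind(bytes.fromhex(m["salt"]), infohash) in m["excluded"]
            for m in manifests
        )

    keep = [h for h in candidates if not is_excluded(h)]
    # Fewest known peers first; the infohash itself breaks ties deterministically.
    return sorted(keep, key=lambda h: (peer_counts.get(h, 0), h))

# Hypothetical 20-byte (v1-style) infohashes, written as hex.
A, B, C = "aa" * 20, "bb" * 20, "cc" * 20
list_one = json.dumps({"salt": "00ff", "infohashes": [A, B], "excluded": []})
list_two = json.dumps({
    "salt": "1234",
    "infohashes": [B, C],
    "excluded": [blind(bytes.fromhex("1234"), A)],
})
print(plan([list_one, list_two], {B: 10, C: 2}))
```

Note that salting only stops someone from reading the exclude list as a ready-made index; anyone who already knows an infohash can still check whether it is on the list.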
Thanks, that's exactly the feedback I was looking for! This sounds like it would work, though I'd have to see if it would scale to thousands or millions of files. Still, great for a PoC, thank you!
Donating space is only half of the equation. I think donating bandwidth is the more significant aspect, especially with ISPs like Comcast, which provide very little upload bandwidth compared to download. You'd expect that uploads wouldn't impact download speeds, but that's not the case. Saturated upload bandwidth means ACK packets get delayed, which means connections get established much more slowly. So it's not a feasible prospect unless the competition takes over.
I'm a senior software developer with experience in developing communications protocols and networking software. I haven't been able to set up QoS properly ever, especially in a way that addresses all my needs. Expecting end users to do it is way beyond the realm of possibility IMHO.
It is a huge pain in the ass to get it working right I will admit. Took me hours of reading docs and tweaking parameters. I'm still not sure I understand all of it, but I managed to get it working such that it at least meets my needs.
Something similar exists: iabackup[1][2]. It is designed to host an independent copy of (some of) the Internet Archive using git-annex. You tell it how much storage you want to donate and git-annex fills your disk with data from the least-seeded files IIRC. Its focus is on data backup, not data serving though.
For your idea, once all the local storage everywhere is filled up with evenly distributed redundant copies, and then a new file is added, would peers arbitrarily choose other files to delete in order to make room for the new file?
IPFS can't give you the least-seeded files. It can't give you any files, you have to manually pin the set you want, and you pin all of it, and it won't automatically change.
It's not very convenient for archival, whereas the system I'm talking about would be (in my opinion, anyway).
You'd trust the publisher, so you'd say "I want to help the Internet Archive with its archiving" and they'd be responsible for what files they pushed to your storage.
This is basically an opportunistic, distributed filesystem that's designed to work with high latency links. Only the tracker can write to its nodes, but anyone can read.
Private trackers do this somewhat via incentives. Files with fewer seeds earn you more bonus points. If you want to farm BP you can sort files by seeders and download poorly seeded torrents.
In my mind this keeps making sense, but then I wonder about the "local" impact: what would a user "donating space" be storing locally, and what might they be risking? Where does the risk lie: with the uploader, the seeder, or the service provider?
I would love to see this pickup but it feels as though it would devolve into one of the listed projects.
It "sounds" as though it should use some kind of Distributed Ledger Technology. IPFS seems the best fit, maybe with some kind of next layer that offers just this service (IPFS being GDPR compliant might mitigate some risk?)
I wonder if there are any search engines dedicated to indexing these kinds of libraries. I know there's a decent one just for scihub, but it would be awesome if I could do a Google-style search that returned the contents of books, magazines and journal articles instead of just websites.
Google Books, like so many Google projects, had a dual purpose. Making books accessible is noble and on-mission. But more importantly natural language models can be trained on the scanned corpus.
The same was true of the original GOOG 411, which provided a free service, but was really put in place to train up their voice recognition projects.
This is a long running strategy of Google, and it's a shrewd one. The main thing is not to mistake it for a public good. It is an act of privatization.
That can't be true, Google Books was 15 years prior to the advent of large language models. Until 2020 nobody could train on such a large collection.
I think Google initially wanted to augment the web results with a large book collection to get "all the world information and make it searchable", same with Google News.
Before there were neural language models, there were n-gram models, skip n-gram models, latent Dirichlet allocation models, a whole zoo of non-neural machine learning models. Google used some of them to power old Google Translate, which was still a very impressive piece of technology.
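For a sense of what such pre-neural models look like, a bigram model (the n = 2 case of an n-gram model) is little more than counting which word follows which. This is a toy sketch on a made-up sentence, with no smoothing, and not a claim about what old Google Translate actually ran.

```python
from collections import Counter, defaultdict

def train_bigrams(tokens):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def next_word_prob(counts, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 for an unseen context."""
    total = sum(counts[prev].values())
    return counts[prev][nxt] / total if total else 0.0

tokens = "the cat sat on the mat the cat ran".split()
model = train_bigrams(tokens)
print(next_word_prob(model, "the", "cat"))  # "the" is followed by "cat" 2 times out of 3
print(model["the"].most_common(1))
```

The estimates only become useful with a very large corpus, which is why a scanned book collection would have been valuable long before neural models.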
> That can't be true, Google Books was 15 years prior to the advent of large language models. Until 2020 nobody could train on such a large collection.
... pull the other one.
Okay, your statement could be true depending on what you mean by "large". But what makes you think that companies like Google haven't been working on language models without releasing them and/or without discussing them publicly? There's an advantage to be had by keeping corporate secrets.
The gap between research and publication is real but its length is less than one year. I don't think even Google has the resources to do it secretly. Who would work on it and how could they have kept the secret so tight? AI researchers want to publish, especially the best ones, it's essential for their careers. Someone else could plant the flag on their discovery and claim the fame.
For a time while I was working at Google I nursed the idea of transferring to the natural language processing / language modeling areas, but I only have a bachelors in linguistics whereas it seems like they strongly prefer PhDs. Linguistics PhDs can pretty much teach linguistics or go into another field. It would not be hard for Google to find a couple hundred of them and entice them to join up.
At the time Google Books started, “big data” was in vogue and there were data-driven ML approaches to NLP e.g. word2vec, that would have benefited from a large corpus
More like an ad company driven by dark patterns that needs to pretend being good and nice and innovative as otherwise it will lose goodwill just like Facebook did.
I think the Google founders were inherently nicer people than the Facebook founder, but their companies ended up converging on advertising because that's really the only way their products could be profitable.
I know it exists, but it has appeared to languish for years. They rely on third parties now for inclusion of books, whereas in the early days they innovated on their own with specialized scanning technology. They seemed quite proud of it a decade ago. When was the last time Google touted their books project? Have they even integrated searching books into their main search (which was supposed to catalogue and make searchable all the world’s information)?
It was paralyzed by legal disputes with book publishers.
In the years the lawsuits were going on, nearly everyone left the project. And then the lawyers have put in so many red lines that it's nearly impossible to make any changes to it.
Yup, I read about that on Wikipedia, but I can't help but not care. If a company touts massive initiatives and then gets bogged down in lawsuits, it seems like they didn't do the basic due diligence to avoid that. (Uber, AirBNB, and others seem to also have these headwinds, though not to the extent that it led to permanent paralysis, so maybe Google made a bet they thought they'd win and then didn't, whereas these other companies did.) I can't help but wonder why, with Google's resources vs. Uber or AirBNB, they couldn't keep moving forward if they wanted to. Strike deals, pay people, whatever. If it mattered (i.e. if it involved ads) they would have done it.
Given Google's behaviour since the end of their period of true innovative excellence, I don't cut them much slack.
This well-written article, "Torching the Modern-Day Library of Alexandria"[0], might change your mind on that. I found it to be a compelling and tragic story.
Edit: actually, I think this would make a good submission. Looks like it hasn't been posted since 2017.
> People have been trying to build a library like this for ages—to do so, they’ve said, would be to erect one of the great humanitarian artifacts of all time—and here we’ve done the work to make it real and we were about to give it to the world and now, instead, it’s 50 or 60 petabytes on disk, and the only people who can see it are half a dozen engineers on the project who happen to have access because they’re the ones responsible for locking it up.
Could have been great but it wasn't perfect for everyone so it was scrapped because surely Congress would take up this noble quest...
I search, see the results, and then click on a result to view the applicable contents of a particular book. In the contents that get displayed, the search term I had entered is highlighted.
a) sometimes the snippet returned doesn't even contain the search term
b) you'll only get the snippets for the first three results, and that's that.
There might even be cases where the book is indexed but the publisher has disabled even snippets, though I'm not sure about that. But a very limited snippets-only search instead of a full preview is certainly a thing.
Does Calibre fit the bill? The program itself is great, but it supports a plugin system that really puts it over the top. One of them automagically strips off Adobe DRM for any book loaded in to it.
Seems that with calibre-web[1] we can have nice web front-end. But it doesn't have full-text search inside the content of the books though, unlike google books.
> Ambar is an open-source document search engine with automated crawling, OCR, tagging and instant full-text search
> *Easily deploy Ambar with a single docker-compose file; *Perform Google-like search through your documents and contents of your images; *Tag your documents; *Use a simple REST API to integrate Ambar into your workflow
But Google could never do it right. It strikes me as obvious that anything that's going to try to be a genuinely modern library can't sidestep, or even "work with" the present capitalist+copyright regime. It will just have to be a fight.
To be fair, Z-Library doesn't charge unless you want to download more than 10 books per 24 hour period. That's per account and although they ask you not to open multiple accounts they don't seem to do anything to stop you.
You don't have to pay them if you go steal them yourself. If you find it more convenient to pay another thief to do it for you then I don't think that's significantly less fair.
Could you please stop posting unsubstantive comments to HN? You've been doing it a lot, unfortunately, and we're trying for a different sort of discussion here.
I don't quite agree. I mean, they provide useful service, and it costs money to run it. It's ok that they earn (even if it's actually making a profit, not just covering the costs).
That being said, 10 downloads/day feels a bit restrictive to me. I'd get if it was 100, or 50, heck, maybe even 20. I mean, I don't appreciate that it's not mirrorable in the first place, but maybe they cannot afford it, I don't know... But 10 feels less than somebody researching a new topic might need to access in a day, even if he won't read them all immediately.
…That being said as well, it has some really nice UI. I wish somebody did it for Libgen.
It's a good thing their hosting provider is okay with providing bandwidth per individual user account on the site of up to—how many books did you say again, 50?
Without sarcasm, I don't think the bandwidth bill cares that you find it restrictive. That even more than a handful are free every day for every account is honestly a lot, since that means virtually nobody will need to contribute to the costs they're collectively incurring. And if you're unable to pay, you can still skim a few dozen books (making two or three accounts isn't that hard to do by hand) every day, and go back to any you've already downloaded previously too. And offer them to friends to offload the server.
Read — no, I don't. Download to skim and see the contents — yes (even though I don't do it everyday, obviously). In fact, I rarely download less than 4 books at once, except for occasions when it's a new book of my favorite writer (in which case I can as well just buy it). Instead, there is some topic, some reason why I need these books, and I somehow can gather a dozen of recommendations, maybe more, then I need to actually get a look inside of them, to see what I'll be reading (if anything). It also happens that I kinda know the book, but not precisely enough, because some authors really like to milk the topic by publishing 5 books kinda the same as the first successful one, and if they are technical they can have 5 revisions each. I may not read them at all, or I may be reading them during the whole next year, but I'll need to get them all at once at first.
And if we also count papers, which this site provides too — easily.
I worked as a volunteer to fundraise for my local library. I can see no meaningful moral difference between this and that. I get that the law is different, I just think the law is off here.
Total compliance with the regulatory capture of publishing companies? Full support of the landgrab claims of the Disney corporation et al?
You don't have to be an anarchist to look at the status quo and think some amount of civil disobedience is the correct, proper and right thing to do. Also believing that a claim, fully Disney supported, that such an amount of civil disobedience is somehow ethically evil is, in fact, somewhat disreputable. As disreputable as the insanely high journal subscription fees for taxpayer funded research, for example.
The publishing companies chose this path willingly and with prejudice for their profit, turning the relevant law against the people. Are they reputable given they did so? It's hardly an outlying position around here to think they really aren't anything of the sort. Refusing to accept that en masse, until appropriate reform is supported and enacted, could be considered quite worthwhile.
Not all authors and not all publishers are alike. It is easy to cherry pick a few very bad apples, like Disney (especially since much of their success can be attributed to retelling stories from the public domain) or those who are actively anti-library (from the perspective of property rights, at the level of a physical book or purchased ebook - not the right to make and distribute copies). On the other hand you also have authors who have reasonable views of copyright and even give away their books. You also have publishers who are willing to have their books lent by libraries or don't place artificial restrictions on consumers when it comes to ebooks.
Of course, access to publicly funded research is another issue altogether (and one that should consider patents as well as copyrights).
The question becomes, how many of those targets of "civil disobedience" have earned that privilege and how many are simply collateral damage? Also, ideally, civil disobedience should be aimed towards changing laws rather than people simply taking what they want.
This is exactly backwards. If you accept some amount of civil disobedience is justifiable then you can't very well go and say anyone "infringing copyright" is "disreputable."
I said "some amount" and you don't disagree. We're not discussing what that precise amount is or how it should be targeted, that's a different discussion and can be framed in multiple, competing ways.
Feel free to lay out what you think is the correct amount and correctly targeted but even if you are wildly wrong in your thoughts, your being wrong doesn't make anyone else with differing thoughts about that amount and targeting disreputable, even if they are for some other reason.
There is a legitimate argument that the law should always be followed and any law-breaking is inherently disreputable. It's not my argument, although I acknowledge it, and it isn't fashionable nowadays given how it would have to be applied to, for example, the civil rights movement & Dr King.
It would depend on how they use the funds. I wouldn't be surprised if bandwidth expenses made up a majority of what it cost to run Z-Library and that money has to come from somewhere.
It's really funny to think about how advances in technology keep changing how we perceive books.
7TB is even a commodity disk these days. And it's a lot less than the torrent of scientific papers that floated around some time ago (that was ~18TB IIRC).
I foresee storage density reaching the point that for most ordinary
people "online" becomes rather unimportant. What would be the effects
of technology when computers behave as in early science fiction, as
stand-alone oracles? [1]
If I could store all the world's present information on the head of a pin, I should still want to access the internet to find out how you feel about it or share my opinion on it with you. Virtually everything we do online isn't reading existing data; it's taking action on it by virtue of communication.
How would the appeal of streamers and live data/content settle out in that case? Sometimes context is available in the moment that makes it easier for all parties to consume and analyze in that moment as well.
Since transient, ethereal meme culture is also basically emergent culture now, it's difficult not to also foresee a greater cultural divide in such a case. This is saying nothing of live data tools as well, even weather data...
> How would the appeal of streamers and live data/content settle out
in that case?
It would be mostly unaffected. As another commenter (Michael) says,
communication, news and collaboration are distinct from storage as
needs/functions.
What you say here is fascinating:
> Since transient, ethereal meme culture is also basically emergent
culture now, it's difficult not to also foresee a greater cultural
divide in such a case.
There have always been bookish or gossipy people. Jane Austen and
Thomas Hardy both make note of that in stories about English culture.
The balance of those qualities may have changed in the internet
age. Perhaps the degree of reach via one-to-many communications has
amplified the ephemeral gossip side. The scholarly life less so.
Though it's enhanced by repositories like Gutenberg, Internet Archive
and SciHub and the like, a "reader" (to use Bill Hicks's take) can
still only process one medium at a time. But as Ted Nelson pointed out,
if you have all of the world's writing at your fingertips, all
hyperlinked and with awesome semantic search tools, reading becomes a
quite different non-linear experience. That's something centralised
walled gardens subtracted from the WWW as its 1990s conceit.
That 90's vision of a multi-modal, multi-media "internet community" in
which people read common news, converse and reference together hasn't
really survived "Social Media". But then it was always a weak
approximation to something like a group seminar in the university
library rather than the town square or local pub.
I've tried to do lossy compression of epubs with a few lines of bash script, i.e. removing the images and fonts that were not needed. Many epubs could be downsized to a third of their size, but then I found a book that needed the supplied fonts and gave up. When doing lossy compression you cannot have that kind of bug.
What I also found was that many of the images in the epubs were already unusable and nothing like their counterparts in physical books.
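For anyone wanting to try the stripping approach described above, here is a minimal Python sketch rather than bash. It copies an EPUB (which is a ZIP container) and leaves out members whose extensions look like images or fonts. The extension list and the function name `strip_epub` are my own assumptions, and the sketch does not rewrite the OPF manifest or check whether the stylesheets reference a dropped font, which is exactly the kind of bug mentioned above.

```python
import io
import zipfile

# Extensions treated as droppable. This list is an assumption, not an EPUB rule.
DROP_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".ttf", ".otf", ".woff", ".woff2")


def strip_epub(epub_bytes: bytes) -> bytes:
    """Return a copy of the EPUB with image and font members left out."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename.lower().endswith(DROP_EXTENSIONS):
                continue  # lossy step: the book may still reference this file
            # The EPUB container expects "mimetype" to be stored uncompressed.
            if info.filename == "mimetype":
                method = zipfile.ZIP_STORED
            else:
                method = zipfile.ZIP_DEFLATED
            dst.writestr(info.filename, src.read(info.filename), compress_type=method)
    return out.getvalue()
```

A safer variant would first scan the CSS for `@font-face` rules and keep whatever they point to, and would leave the cover image alone.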
Good compression of lots of epub files can likely be way more efficient, as deduplication/compression algorithms can be run on lots of books at the same time. Especially so with a good dictionary.
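The shared-dictionary idea can be tried with nothing more than Python's `zlib`, which accepts a preset dictionary through the `zdict` argument. This is a sketch only: the sample boilerplate is made up, and a real setup would build the dictionary from text that is common to many books.

```python
import zlib


def compress_with_dict(data: bytes, dictionary: bytes) -> bytes:
    """Deflate `data`, letting matches point into a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=dictionary)
    return comp.compress(data) + comp.flush()


def decompress_with_dict(blob: bytes, dictionary: bytes) -> bytes:
    """Inverse of compress_with_dict; the same dictionary must be supplied."""
    decomp = zlib.decompressobj(zdict=dictionary)
    return decomp.decompress(blob) + decomp.flush()


# Hypothetical boilerplate that many ebook pages might share.
boilerplate = (b'<?xml version="1.0" encoding="utf-8"?>'
               b'<html xmlns="http://www.w3.org/1999/xhtml"><head><title>')
page = boilerplate + b"Chapter 1</title></head><body><p>Call me Ishmael.</p></body></html>"

packed = compress_with_dict(page, boilerplate)
print(len(zlib.compress(page, 9)), len(packed))  # plain size vs. dictionary size
```

The catch is that the dictionary has to be stored once and shipped with every reader, so it only pays off across a large collection.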
Things can be rendered from compressed container files. For HTML with images, even slow-but-strong compression like LZMA is already fast enough to render pages as fast as you can click through them, even on fairly old hardware.
Kiwix's .ZIM file format is a good example. The entire Gutenberg Library is a single ~65 GB file, and you can read any book from it without unpacking anything.
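ZIM is its own format and I'm not going to guess at its API here, but the same "read one item without unpacking the rest" property is easy to demonstrate with an ordinary ZIP archive using LZMA compression. A minimal sketch with Python's standard `zipfile`; the helper names and sample texts are made up:

```python
import io
import zipfile


def build_library(books: dict[str, str]) -> bytes:
    """Pack every book into one LZMA-compressed ZIP held in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_LZMA) as zf:
        for name, text in books.items():
            zf.writestr(name, text)  # str data is encoded as UTF-8
    return buf.getvalue()


def read_book(library: bytes, name: str) -> str:
    """Decompress only the requested member, not the whole archive."""
    with zipfile.ZipFile(io.BytesIO(library)) as zf:
        return zf.read(name).decode("utf-8")


library = build_library({
    "alice.txt": "Down the rabbit hole.",
    "moby.txt": "Call me Ishmael.",
})
print(read_book(library, "moby.txt"))
```

Each member is compressed on its own here, which is what makes random access cheap; the price is that nothing is shared between books.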
All of library genesis is already available on IPFS (see https://freeread.org/ and https://libgen.fun/dweb.html). Hopefully someone will import this collection into libgen and then these books can be on IPFS too.
Torrents are simpler and more efficient for distribution, but IPFS is better for accessing individual files.
IPFS doesn't really work well for this because you'd never know if the peer hosting the last subset of some books went offline (and you'd lose those until someone who had them came online again).
This is my point: it's a shame it isn't. It seems obvious that IPFS would have both privacy and a self-balancing way to pin a partial set of data. But no, which makes it unsuitable.
What makes you think IPFS has privacy? It has the complete opposite, it is not designed for privacy at all.
When you pin files, you announce to the whole world which files you're hosting. When you download files, you announce to the whole world what you're looking to download.
That would paint a huge red target mark on filecoin's back, given that they raised over a quarter-billion. They would take it down faster than you can have an 8TB HDD shipped to you.
They can and will use legal means to go after anyone trying to use the network this way in order to make the content less discoverable and discourage others from following suit.
Being coined as the Napster on Blockchain is the worst possible PR one can think of in their situation.
But why would it, really? At this point it absolutely plans to rejoin the world economy once things "blow over". And if it becomes clear that it's not going to happen, there is even less reason to host some books in foreign languages that probably have "extremist" content in them.
One of the projects on my secret TODO list is feeding Libgen into Elasticsearch to get a cross-referenced full-text search. For now the hardware is prohibitively expensive, but time is on my side. I'm sure redundantly indexing a few tens of TB will become trivial before the end of the decade.
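The core of what a full-text search engine would be doing here is an inverted index: a map from each word to the documents containing it. A toy, in-memory Python sketch of that idea follows. The function names and sample documents are made up, and this is not the Elasticsearch API; real engines add ranking, stemming, on-disk segments and much more.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_index(docs: dict[str, str]) -> dict[str, set[str]]:
    """Map every token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index: dict[str, set[str]], query: str) -> set[str]:
    """Return the ids of documents containing every word of the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


index = build_index({
    "a": "The whale surfaced.",
    "b": "A white whale!",
    "c": "No fish here.",
})
print(search(index, "white whale"))
```

The index is usually a sizeable fraction of the text it covers, which is why the hardware cost scales with the size of the collection.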
Hm, from some site on the internet, these are the stats:
Enabled users: 5629
Active today: 986
Active this week: 2542
Active this month: 4228
Torrents: 566598
Total Size: 23.40 TiB
Retail Torrents: 421313
Creators: 401395
Seeders: 2421540
Leechers: 232
Snatches: 16523563
Transferred: 398.22 TiB
And those are only books. Library of Alexandria is already here.
The only problem is that there are so many books and so little time :( . Finding the time to read them might be a far bigger problem than book accessibility.
At least 1/8 of those books are ones I would love to read. But I don't have the time to actually pull it off, even if I'm not doing any social networking etc.; the amount is really huge. Maybe, for a startup, someone could index all the content, not only share it.
I've seen it too, and it's a weird mistake to me because I am certain that at least one of my friends who makes the mistake did not make it in the past.
I wonder if there's a kind of memetic effect happening online, where people who lack confidence in their English spelling ability see somebody make this mistake and somehow think that it's correct, so they switch how they write it.
Interesting. What kind of usage have you observed?
I have come across similar use of "revert" in private correspondence from old-fashioned lawyers at small high-street law firms in England. As a rule of thumb, if a solicitor writes that they will "revert to you shortly" you may expect:
* They are fairly close to retirement.
* They don't know much about laws that have changed since about 1980.
* They almost certainly won't get back to you in a reasonable time unless you repeatedly send reminders. They are probably hoping that they will be able to retire before having to invent a meaningful response to your enquiry.
Even so, there are enough ESL readers on here, along with native speakers who may not understand the difference, that it makes sense to point it out once in a while.
Otherwise we end up in a lose/loose situation where I see more people use it wrongly than correctly.
But automatically replacing a valid word with a less-common valid word that makes less sense in the context is not something predictive text should do.
You are presuming it was an automatic replacement. Some still type entire words via a thing called a keyboard, on a computer, others manually pick words to finish, from suggestions, not automatically.
Beyond that, Google searches are not indicative of current usage in SMS/messages, emails, etc. Google search spans all webpages, and many pre-web texts too.
Do you really want your ISP to know which piracy sites you frequent? This is all being sent in plain text. Or they could change the content, insert a redirect, or inject ads without your knowledge. TLS is needed on all websites - not just those with interaction.
Those are problems too, but they aren't exploited nearly as often as MITMing cleartext has been historically. The solution you mention is already becoming widely-supported, as are newer protocols like QUIC that discourage snooping.
There's no reason to ignore a good solution just because it's not 100% perfect.
> They only know the IP address of the remote server.
It's the internet. Everyone can scrape links and measure/correlate which assets were on them to correlate likely visited websites.
Especially if every web page these days is pretty unique in terms of what kind of assets (network streams) with what kind of byte size were loaded at which point in the document loading timeline.
Now include the TLS fingerprint of your web browser and well, privacy went to shit.
HTTP needs an upgrade with scattering and rerouting on the fly, otherwise these deanonymization techniques can never be fixed.
And the SNI. Until ECH is widely adopted, the SNI is leaked in plaintext when connecting to a server. It has to be, because otherwise how would the server know which TLS cert to reply with?
The answer is no. There was a Cloudflare article on ECH a while back that mentioned the fallibility of using reverse DNS, but I am having trouble locating it. In any event, the people working on ECH have coined a term called the "anonymity set". Below is a Cloudflare article that uses this term.
The "anonymity set" refers to the number of possible domains using a single IP address. The existence of that term implies that some IP addresses must have a number of domains associated with them, greater than 1. With these IP addresses, one cannot determine the domain name, the one that the www user sent, from a PTR query alone. Even prior to the introduction of SNI to TLS, when the only way to offer HTTPS was by using a dedicated IP address, discovering the contents of the encrypted Host header via reverse DNS was neither easy nor reliable.
If there are still people reading HN who believe that reverse DNS is reliable and makes plaintext SNI and ECH moot, and are going to comment as such in the future, I would be happy to post the results of an experiment where I take the DNS data for all the domains currently submitted to HN, i.e., a list of IP addresses found in the A records for these names, and do a PTR on each one. We can look at whether "most domains" are identifiable through PTR records.
Also remember the question is not whether ECH protects 100% from someone discovering what domain name the user sent. It does not. The question is whether ECH makes it more difficult to discover than simply sniffing plaintext SNI on the wire, which, of course, is even easier and more reliable than reverse DNS.
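To make the "anonymity set" idea concrete: given the A records for a list of domains, group the domains by IP address and count how many share each one. A minimal Python sketch, assuming the DNS data has already been collected into a dict. The domains and addresses below are reserved documentation values and the function names are my own. Note that it can only see the domains you feed it, so it gives a lower bound on the size of each set.

```python
from collections import defaultdict


def anonymity_sets(a_records: dict[str, list[str]]) -> dict[str, set[str]]:
    """Invert domain -> IPs into IP -> set of domains seen on that IP."""
    by_ip = defaultdict(set)
    for domain, ips in a_records.items():
        for ip in ips:
            by_ip[ip].add(domain)
    return dict(by_ip)


def identifiable_by_ip(a_records: dict[str, list[str]]) -> set[str]:
    """Domains whose every IP hosts nothing else in the sample."""
    sets = anonymity_sets(a_records)
    return {domain for domain, ips in a_records.items()
            if ips and all(len(sets[ip]) == 1 for ip in ips)}


# Hypothetical sample data, not real lookups.
records = {
    "blog.example": ["192.0.2.10"],
    "shop.example": ["192.0.2.10"],
    "solo.example": ["203.0.113.5"],
}
for ip, domains in anonymity_sets(records).items():
    print(ip, len(domains), sorted(domains))
```

Here an observer who sees only `203.0.113.5` can name the site, while `192.0.2.10` leaves two candidates; plaintext SNI removes even that ambiguity.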
I will bet $10 that with reverse DNS + DPI to try to suss out page size and caching behaviour you can identify anyone accessing this website and downloading the 7TB database.
No one has to "access this website" because they can read its contents in Internet Archive, Common Crawl, Google Cache, etc. Page size and caching behaviour will not work if the person is using HTTP/1.1 pipelining to request multiple pages from a variety of websites from Internet Archive, over a single TCP connection. (Using CDX API not HTML form at Wayback Machine page.)
The 7TB is via torrent, not via HTTPS. No rDNS needed.
If the user requests the page from Internet Archive, Common Crawl or even Google Cache, how does the ISP know what the user requested? (NB. Neither IA nor Google Cache require sending SNI,^1 so the ISP may only see IP addresses).
With IA, the IP address alone does not reveal which IA site or page the user is requesting. There is more to IA than only Wayback Machine.
With Common Crawl, the user can send the Cloudfront domain name instead of a commoncrawl.org domain. Are all ISPs going to know that this is Common Crawl? Even if they expend the effort to learn, what benefit is achieved?
With Google Cache, the IP address alone does not reveal which Google site the user is accessing. Needless to say, there are many, many domains using these IP addresses.
There is nothing that requires any web user to retrieve web pages from a given host. The page may be mirrored at a number of hosts. Some of those hosts might offer HTTPS, support TLS1.3 and not require plaintext SNI/offer encrypted ClientHello.
Even assuming an ISP can determine what domain name a customer is sending in a Host header or ClientHello packet, it would still be necessary to subpoena the archive/CDN/cache to figure out precisely what pages were being requested.
1. The same party is controlling all the server certificates. IA controls the certificates for all IA domains, Amazon (issues and) controls all the certificates for Cloudfront customers and Google controls all the certificates for Google domains. Perhaps there are web users commenting on HN who believe that ingress/egress traffic for a site saved/hosted/cached at an archive/CDN/cache is somehow private as against the company running the archive/CDN/cache in a meaningful way. I am not one of them.
As for the question of an ISP modifying the contents of web pages, this is an issue that could be addressed contractually in a subscriber agreement. It stands to reason that if this was a serious issue and not merely a hypothetical one raised by nerds debating the merits of TLS then it would be addressed in such agreements.
As for the "injection of advertising" issue as an argument in favour of the way TLS^2 is being administered on the web, IMO this is a bit silly since (a) it is trivial to filter such advertising (e.g., Javascript in the examples I saw) out of the page and/or block it from running/connecting/loading and (b) the amount of "tech" company-mediated advertising that web users endure in spite of using TLS is enormous. More likely than being seen as a threat to web users, the injection of advertising by ISPs was seen as a threat to the advertising revenue of "tech" companies. The latter are responsible for facilitating the injection of advertising (by their customers, not their competitors, i.e., ISPs), not preventing it.
2. By "TLS administration" I do not mean encryption as a concept nor certificates as a concept. I mean TLS administration measures designed to support "tech" companies first and web users second, if at all. A system where the questions of "threat model" and "trust" are both decided by "tech" companies not users.
No really, I don't understand this argument. A static site served by plain http is perfectly appropriate. It's like a poster hanging on the wall for all to see. Of course people can paint over it, but it doesn't really matter.
That's quite dated by now. If you are in a position to inject traffic, you are likely also able to simply use that uplink to send traffic of your own. I'd be surprised if this is still in use, especially outside of China (or a poor not-so-tech-savvy country like North Korea) and wasn't just a quick hack at the time.
They could serve you javascript that exploits your browser. At the very least, they could replace that bitcoin donation address with their own. That's a tempting target if nothing else.
And "they" isn't just your ISP. It's also that free wifi hotspot you connected to, or the hotel service, or your company's network. Even if you trust your ISP (and you probably shouldn't), there are other bad actors to be aware of.
If you think you're high value enough to have someone target you specifically by getting on your LAN or gaining access to (or coercing) an upstream ISP to serve you a browser 0-day reachable only by lying in wait for you to visit an HTTP site because there is no other way in, that's not going to be for a free books website.
This is a site asking you to commit piracy; I can totally see some agency intercepting it and replacing the onion addresses with theirs so they can track everyone down.
If enough people find it okay to take such extreme measures for mundane, nearly victimless crimes, I hate to think what the future will be like. In the past, hoarding exploits was considered something for the military, for national security, and even there it was a hot debate and controversial and many parties/countries wanted restrictions like time limits until it's reported to the vendor. Entering homes was a thing of warrants because we wanted to limit government overreach. Now it's okay to employ both of these for reading books without permission? If there's one of you then there's probably more. The future is bright.
Consider that the downloads page for this site tells you to use their tor hidden service. If you open http://pilimi.org/ in Tor, you'll go through an exit node that could be MITMing everybody opportunistically, not targeting you specifically.
They don't know the page. In the case of this site it probably doesn't matter, but which page you're looking at is always going to be more interesting and informative than which site you looked at.
When the prosecutor is looking through your internet records and they see 50 wikipedia hits in some relevant time period, they're going to be upset that https exists.
You don’t need to copy and paste your reply everywhere it’s relevant on HN. Even us flea brains can carry your remarks in our head and apply them to similar comments.
This, and also a MITM (like an ISP) needs to make their own request to this site to know what I read. And, technically, they cannot really be sure it's what I've read, since nothing says that this site is static.
I'm not that offended, and torrents are only available via TOR anyway, but I do actually appreciate the sentiment. There's no reason not to be using TLS.
HTTPS only via a CA (instead of self-signing) is even worse for a site of questionable legality. It gives a centralized entity information about, and control over, who can access your site. Yes, you can get a new TLS CA to sign your certs if one gets pressured to kick you out, but that just means the next will too.
Use Tor to access the site then. There was a post here like a day or so ago about how Tor onion site operators were deanonymised due to their TLS certs for their clearnet mirrors.
So why does it have a clearnet address? To have more reach? What’s their threat model such that a clearnet presence could possibly out the people behind this?
I would love to contribute by seeding the torrents. However, what happens when there are new books in z-lib? Is this only a one-time mirror? How do we expand the collection after a certain amount of time?
Is there actually a command-line tool that can bulk-download from libgen/mirrors according to filters? I do not care about speed, and rate limiting would be totally okay, but I do not see such tools: ones that iterate over each mirror and check for content in a service-friendly way.
7TB of compressed text? I don't think humanity has generated that many written words in its entire existence. Although it would be an interesting Fermi problem to estimate (and don't forget just how well text compresses).
This has to be a lot of duplicates or bad formats (images). This would be far more useful to people with some curating.
(Ignoring, as someone else already pointed out, that this is PDFs with images and scans probably, not just plain text... I got curious and did some napkin math about your claim)
The top ten countries average north of a hundred thousand books per year each[1], so let's say a million books per year globally because I haven't got the time to extract the numbers and sum them all up exactly. It probably wasn't as much fifty years ago, but we're also completely ignoring the Internet which is way bigger because of user-contributed content (not to mention things like newspapers, meeting notes, etc.), so let's say this was the case for the past fifty years. That's fifty million books. Average book has 85k words[2], and a word is like 5 characters on average. Not all countries write in English, especially some big ones like India and China, so we could probably double it on a global scale but let's go for a conservative 7.5 bytes per word and add a byte for the space (I'm ignoring other punctuation). That comes out to 8.5×85e3×50e6 ≈ 3.6e13 bytes, which is about 36TB uncompressed. Decent compression iirc makes it a fifth of the size, so roughly 7.2TB compressed.
I've still got to be an order of magnitude off because we write a whole lot more than just books (and even if we were just looking at books: there are also book revisions, drafts, etc. I'm just counting the published words), but that's already more than I expected.
In conclusion, it comes out to about the same as the 7TB compressed, which I would say does not support the claim: under these assumptions, the published books of the past 50 years alone would already fill a collection of that size as plain text.
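Estimates like this are easy to get wrong by a unit prefix, so it can help to redo the multiplication in code. A minimal sketch using the inputs from the comment above (a million books a year for fifty years, 85k words per book, 8.5 bytes per word, 5:1 compression); the function name and the decimal TB unit are my own choices:

```python
def estimate_library_bytes(books_per_year=1_000_000, years=50,
                           words_per_book=85_000, bytes_per_word=8.5,
                           compression_ratio=5):
    """Return (uncompressed_bytes, compressed_bytes) for the stated assumptions."""
    raw = books_per_year * years * words_per_book * bytes_per_word
    return raw, raw / compression_ratio


raw, packed = estimate_library_bytes()
print(f"uncompressed: {raw / 1e12:.1f} TB")     # 36.1 TB
print(f"compressed:   {packed / 1e12:.1f} TB")  # 7.2 TB
```

Changing any one input by a factor of two moves the answer by the same factor, so the result is only good to within an order of magnitude either way.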
> We will release the data in stages, as we are still processing the files. Right now the metadata file and a few of the torrents are available. Note that the torrent files are only available through our TOR mirror.
Presently, only the first four of several dozens of parts are available.