Daily Usenet Feed Size Hits 300TB (newsdemon.com)
70 points by xhrpost 4 months ago | 60 comments



Feel like it's obligatory to say this for people unfamiliar with Usenet, but there are text groups (think text threads like a mailing list or forum) and binary groups (shows/movies/software/etc). The size growth is due to the latter.

I use both types. Text groups almost certainly fell off a cliff earlier this year when Google Groups shut off their spam posting gateway. http://www.eternal-september.org/ is a good free project for the text groups. I've been a newsdemon customer for years but they suck for text groups (their headers got messed up a few years back after a move from highwinds backbone to usenetexpress backbone).


Serious question: why does anybody use Usenet for pirating rather than torrents?

It seems so fundamentally ill-suited to the task.

And if the answer has something to do with privacy or warnings from your ISP, it seems like VPNs would be the answer.

What am I missing?


> Serious question: why does anybody use Usenet for pirating rather than torrents?

No need to seed which reduces legal liability and bufferbloat issues on lines with anemic upload speeds. Most torrent clients penalize peers that don't contribute to the swarm. It's also really convenient to plug in automated downloads with a decent index and sabnzbd.


>Most torrent clients penalize peers that don't contribute to the swarm

I agree with you and I also want to add that I've had problems with trackers penalizing me because I can't get my ratios up, since hardly anybody is torrenting the same outdated 1980s mecha anime I am. I don't usually have trouble finding or downloading them because there's always at least one guy who has it on a seedbox running 24/7, but my ratio is guaranteed to be stuck at 0.0 forever.

I actually got autobanned from a private tracker once because I downloaded a bunch of comic books which were all minuscule in terms of size and also extremely niche. This series was being distributed as one issue per torrent instead of a big archive torrent, with each torrent being on the order of a few MB, which resulted in me having several dozen torrents with a 0.0 ratio and therefore looking like a leech.


> I agree with you and I also want to add that I've had problems with trackers penalizing me because i can't get my ratios up since hardly anybody is torrenting the same outdated 1980s mecha animes i am.

Small trick: pick one popular item that everyone wants right now, pull it down and seed it. That will add to your ratio, which can then be used to grab the obscure stuff that is hard to seed.


Download free popular ones and keep them forever to seed.


Automating *arr with torrents isn't too bad depending on which trackers you use.


especially if you use debrid/premiumize


> why does anybody use Usenet for pirating rather than torrents?

With the exception of the rare, well seeded torrent, a Usenet download will proceed at full speed while the torrent will trickle along at 34kb/sec.

> It seems so fundamentally ill-suited to the task.

It is. But it is there, so....


Seeders on private trackers often use seedboxes. I can saturate my gigabit connection with just one or two torrents downloading.


But then you need to deal with the complexity and maintenance involved in using a private tracker. Easier to pay for a provider or two and an indexer and be done with it.

Of course, private trackers are superior when it comes to older content and general availability (no DMCA or NTD to worry about).


Download speeds, no ratios, and very low risk of any action by rights holders are the things I would consider. Also, setting up a VPN correctly is not entirely trivial: it can appear to be in use while traffic is actually passing through other routes.


Why are the risks of action by rights holders low?


Because you aren't uploading like you are with torrents. In many (most?) jurisdictions downloading for personal use is ok. It's the uploading that exposes you to legal liability.


Easy to see which IP is on a torrent from anywhere (it’s a service you can buy).

Legal differentiations in some places between downloading and distributing.


It is similar to watching illegal streams of copyrighted content. In most legal systems going after that is either hard or very little reward.

On the other hand, with torrents you usually upload automatically. And even a single uploaded copy can result in some extremely questionable math of potential damages, which the legal system has in essence blessed.


At least at one point news servers were in a kind of a limbo rights-wise. They were kinda like ISPs providing a service and thus weren't liable.

Just like I can send you a copyrighted image via email and the email server owner isn't liable for the material on their drive.

Also it's hard to track compared to the ease of monitoring public torrent swarms.

With torrents you can just enter the swarm, log all IP addresses and maybe download a bit to confirm that it's actually infringing material.

With news the stuff is an ASCII jumble spread across posts with weird names, and you need to download it all to figure out if it's what it says. And just looking for material in the 300TB backlog is a huge pain.


There are many things I've found on soulseek that are simply not on the public (and few private) trackers I check. I can only imagine the same applies to usenet.

similarly, soulseek is limited to 1 peer at a time, which some might consider "ill-suited to the task"

anyways I love slsk

clients exist:

- https://github.com/nicotine-plus/nicotine-plus

- https://github.com/slskd/slskd


Serious question: why does anybody use torrents for pirating rather than Usenet?

100% (down) bandwidth saturation, no VPN, no letters from lawyers, and no using my (asymmetrical) upload - what is not to like?


Usenet is a bit of a weird place to get into.

You need to find an indexer, the best ones you need to pay for or get an exclusive invite.

Then you need to find a news provider, and pay them too.

On top of that you need software that can plug into the indexer and the server and actually download the stuff.


Because it will max out your connection speed. Got a 5Gb connection? No problem, it will instantly start downloading at full tilt.


You can still misconfigure a VPN and leak your real IP if your setup is not completely right. With torrents, this leaves you open to large-ish claims in countries like Germany, because technically you are uploading and that's illegal. Usenet doesn't have the uploading component, so that risk doesn't exist.


> Serious question: why does anybody use Usenet for pirating rather than torrents?

In addition to the other answers, Usenet binary newsgroups are much older than torrents. So another reason is that it simply never died. Torrents didn't replace Usenet, they just complemented it.


Afaik, there is no benefit to a VPN for downloading content from Usenet.


If you live in a place where downloading is a gray area legally, it could save you a lot of potential hassle. Even if you win in court, it is far cheaper and better to have never been summoned in the first place.


Yes, but has any such area actually been demonstrated to exist when it comes to Usenet? I am not convinced that there is.

Downloading from Usenet is not like downloading from torrents; it is not p2p. One downloads from the Usenet service provider over SSL encrypted connections. There is no way for anyone else to track it.


But doesn't Usenet require an account that's linked to a payment method? If you use something like a credit card, then the server owner has your real info and presumably a list of every illegal download you've done. Seems like you'd be putting a lot of trust in the server owner, and couldn't it get subpoenaed?


It is safer to use a Usenet provider of a different country.

Secondly and independently, Usenet providers often don't keep very many logs, so there may not exist anything for the server owner to provide in response. For example, if the Usenet provider gets ordered to list all users that downloaded a particular movie, the provider could respond that they don't keep such records.

Also, think of all the pirate streaming websites that exist.

The worst that can happen here is that a court orders someone to pay a fine for downloading something, and this would make big news if it were to happen for Usenet, etc. It hasn't happened to anyone afaik.


Usenet, in my experience, is reachable through a variety of easy-to-use web-based providers. Easynews, etc... I agree that it's ill-suited in the sense that the files are often split into parts and need reassembling or other such work. But it's fairly trivial stuff that a user may already be familiar with. Other than that it's super easy. Torrents, at least in my experience, are not so straightforward due to the required installation of a client. I think users are hesitant to install it on their machines. Just my take based on limited experience.


Honestly there's way more there, and you get consistent solid speeds. Find a provider with a lot of retention and you can find almost all mainstream media regardless of its age. (Public) torrents tend to track what's popular and quickly fade. The masses seem to favour low size encodes too, so if you're looking for more quality (and again, public trackers) you're usually much more out of luck.


VPNs leak and can be tricky to manage. I believe most serious torrent pirates use seedboxes at this point.


It’s really not that bad unless you really don’t want to be found. Most DMCA/copyright firms just do a basic investigation which stops at the IP AFAIK.

Mullvad and, I’m sure, other modern VPN clients have “kill switches” built in that shut down traffic if the tunnel isn’t up. You can also do a leak check before starting anything up.


Unfortunately Mullvad and the majority of VPN providers don't offer port-forwarding anymore, last year when I checked only a couple, maybe three remained...

Having written that, there is another option: under Linux, with network namespaces and WireGuard, it's possible to have a pretty fail-proof VPN...


Wouldn’t seedboxes be far more dangerous? The provider absolutely can see what you are doing on the box and provide that to the authorities whereas a VPN provider that doesn’t log activity, well there’s no paper trail at least retroactively.


Why do you trust VPN providers more than seedbox providers?


Mullvad and Proton at least have had their privacy claims tested by European courts, they really don't have much info on you. And conceptually it is obvious that a hosting provider could provide more information about a permanent activity than a networking provider could provide about a transient one.


Based on this table there's been an 1100% increase over the last 7 years, with a 60% increase between 2020 and 2021.

I'd be interested to see WHY this is the case. Is it attributable to a larger share of data that cannot be compressed vs more compressible data (e.g., Warez/Movies)?

It just seems highly unlikely this is driven by a growing user base; but, without more details other than this data table, I am at a loss for the reasons why.
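To put the 7-year figure on the same footing as the single-year 60% jump, here is a rough sketch that converts it into an equivalent steady yearly rate. It assumes an "1100% increase" means the feed ended at 12x its starting size; the table itself isn't reproduced in this thread, so only the two percentages quoted above are used.

```python
# Assumption: an "1100% increase" means final size = 12x the starting size.
growth_factor = 1 + 1100 / 100
years = 7

# Constant yearly rate r such that (1 + r) ** years == growth_factor
annual_rate = growth_factor ** (1 / years) - 1

print(f"Equivalent steady growth: {annual_rate:.1%} per year")
```

That works out to a bit over 40% per year, so under this reading the 60% jump between 2020 and 2021 is above trend but not wildly so.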


It is a growing user base. Streaming services were convenient, now they are not (ads, increased cost, reduced content availability, fragmentation).

> It just seems highly unlikely this is driven by a growing user base;

If you are old enough, there was a time when everyone pirated stuff due to the alternative being rather expensive or unavailable (physical media). Then a golden age of streaming services that were cheap and had high availability basically killed torrenting for the general public. Now people are returning to piracy as the streaming services got worse for reasons I stated above.


One of the most popular types of content is now remuxed (not reencoded or recompressed) 4k blu-ray rips that can be anywhere from 50-150GB per disc. In the DVD/Blu-ray era movies were often reencoded to lower bitrates the way streaming services now do and had tiny original sources. Those files were orders of magnitude smaller (600MB-1.5GB typically).

The people who still care enough to pirate in the era of ubiquitous streaming are often doing it explicitly to get the best possible quality, because no streaming service now offers anything of comparable fidelity for any price.


some (most?) streaming services don't even allow you to manually set the quality anymore

gotta hope they decide your connection is capable of 4k (and they decide they want to spare you the bandwidth)


rn seems to have had it right: ripped from my youth, the pre-posting warning comes to mind:

“This program posts news to thousands of machines throughout the entire civilized world. Your message will cost the net hundreds if not thousands of dollars to send everywhere. Please be sure you know what you are doing.”


Gosh that's a lot of discussion!


Yeah, it actually requires a lot of work on the posters' end because they choose to do all their discussions in the form of videos apparently.


A very artistic form of discussion.


So 300TB * 365 = ~110PB for a mirror to have 1 year of retention.

That's pretty insane lol. How many mirrors are there that can actually manage that much storage? If you're using 20TB disks that's ~5500 disks per year with zero redundancy. Double or triple that for a bare minimum... not counting the load of actually serving that data to everybody.

How is this economical for anybody at this point? Or are these Usenet mirrors all massive businesses that can support running hundreds of PBs of storage and I'm just naive?
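For anyone who wants to play with the numbers, here is the same back-of-envelope estimate as a small script. It assumes the feed stays flat at 300TB/day for the whole year, decimal units (1 PB = 1000 TB), and no compression or deduplication; `disks_needed` is just a helper name made up for this sketch.

```python
import math

DAILY_FEED_TB = 300   # from the headline
DISK_TB = 20          # disk size used in the comment above
DAYS = 365

yearly_tb = DAILY_FEED_TB * DAYS
yearly_pb = yearly_tb / 1000  # decimal units: 1 PB = 1000 TB


def disks_needed(copies: int) -> int:
    """Disks for one year of retention with `copies` full replicas."""
    return math.ceil(yearly_tb * copies / DISK_TB)


for copies in (1, 2, 3):
    print(f"{copies}x copies: {yearly_pb * copies:.1f} PB, {disks_needed(copies)} disks")
```

Since the feed has been growing, a real mirror would need more than this for the following year.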


> Or are these Usenet mirrors all massive businesses that can support running hundreds of PBs

ding ding ding. The market consolidated toward only a few large backbone operators. Here's a picture of what the network looks like: https://svgshare.com/s/14tF.svg


I'm also curious -- how come you hear big stories about governments going after torrent sites, which don't even host the pirated content?

But I've never heard big stories about the government going after big Usenet providers. Do they? If not, why not? Or does it just not make the news? Or am I just not paying attention?


Usenet files get taken down all the time. Rights holders send notices to the Usenet providers, and they remove enough of the binaries that they can't be repaired with the PAR files.

Indexers are separate entities. Those are the ones who catalog and host the .nzb files. You see those go offline far more often but Usenet is still pretty small compared to torrents so they don't get reported as much.


> But I've never heard big stories about the government going after big Usenet providers.

The most litigious government has a 'safe harbor' provision most of the major Usenet backbones comply with: https://www.copyright.gov/512/

And they do take down a substantial amount of content every day, so while everyone knows piracy is common there, what more are they supposed to do? At this point the main effective thing governments can do to make it annoying to use Usenet for piracy is target the indexers (and they do this, but more always move in to replace the ones that are shut down)


AFAIK, files uploaded to Usenet still need to be ASCII encoded using something like uuencode or base64, which increases file sizes to something like a third larger than their original sizes. So you can decrease the amount of data you need to store quite significantly by only storing the unencoded versions.
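A quick way to check the "a third larger" figure for base64 is to encode some bytes and compare lengths. This is only a sketch: the 3 MB random payload is an arbitrary stand-in for a binary, and it ignores the line breaks and article headers a real post would add on top.

```python
import base64
import os

raw = os.urandom(3_000_000)        # arbitrary stand-in for a binary file
encoded = base64.b64encode(raw)    # 3 input bytes -> 4 output characters

overhead = len(encoded) / len(raw) - 1
print(f"raw={len(raw)} encoded={len(encoded)} overhead={overhead:.1%}")
```

Base64 maps every 3 bytes to 4 ASCII characters, which is where the one-third overhead comes from.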

After that, deduplication can probably bring you down at least another 50% - how many of these files are just the same things being posted over and over? Probably quite a lot of them. Store a file once with a database tracking its SHA1 hash, and whenever you see a file with the same hash come in, throw it away and instead store a reference to the first file.
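The hash-and-reference idea could look roughly like this. It is a minimal in-memory sketch, not how any actual provider works: `DedupStore`, `put`, and `get` are made-up names, and a real system would keep the blobs on disk and the index in a database.

```python
import hashlib


class DedupStore:
    """Stores each distinct file body once, keyed by its SHA-1 digest."""

    def __init__(self):
        self.blobs = {}  # hex digest -> file bytes (stored once)
        self.names = {}  # posted name -> hex digest (cheap reference)

    def put(self, name: str, data: bytes) -> bool:
        """Record `name`; return True only if the bytes were new."""
        digest = hashlib.sha1(data).hexdigest()
        is_new = digest not in self.blobs
        if is_new:
            self.blobs[digest] = data
        self.names[name] = digest
        return is_new

    def get(self, name: str) -> bytes:
        return self.blobs[self.names[name]]


store = DedupStore()
print(store.put("post-1", b"same payload"))  # first copy is stored
print(store.put("post-2", b"same payload"))  # repost becomes a reference
print(len(store.blobs), "blob(s) for", len(store.names), "names")
```

Note that this only catches byte-identical uploads. The same movie re-encoded at a different resolution or codec hashes differently, so how much it would actually save is an open question.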


Most posts are yEnc encoded these days https://en.m.wikipedia.org/wiki/YEnc
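To show why yEnc changes the overhead math, here is a simplified sketch of its core byte transform: add 42 to each byte modulo 256, and escape the few results that would break an article (NUL, LF, CR and `=`) by writing `=` followed by the value plus 64. It deliberately leaves out the line wrapping, the `=ybegin`/`=yend` header and trailer lines, and the CRC that real yEnc posts carry, and the function names are made up for illustration.

```python
CRITICAL = {0x00, 0x0A, 0x0D, 0x3D}  # NUL, LF, CR, '='


def yenc_encode(data: bytes) -> bytes:
    """Core yEnc byte transform only (no headers, line wrapping, or CRC)."""
    out = bytearray()
    for b in data:
        o = (b + 42) % 256
        if o in CRITICAL:
            out.append(0x3D)             # escape marker '='
            out.append((o + 64) % 256)
        else:
            out.append(o)
    return bytes(out)


def yenc_decode(data: bytes) -> bytes:
    """Inverse of yenc_encode for the simplified format above."""
    out = bytearray()
    it = iter(data)
    for b in it:
        if b == 0x3D:
            b = (next(it) - 64) % 256    # undo the escape offset
        out.append((b - 42) % 256)
    return bytes(out)


sample = bytes(range(256))
encoded = yenc_encode(sample)
print(len(sample), "->", len(encoded), "bytes")
```

Only 4 of the 256 possible byte values need escaping here, so on random data the transform itself adds on the order of 1-2%, compared with roughly a third for base64.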


So, binaries, warez, images, and the like?

A backchannel to download papers from sci-hub?

Or are people using Usenet as a way to send encrypted messages in a way that makes traffic analysis more difficult? (If 50,000 people download everything in a group, and post encrypted or steganographic messages to that group, then it's harder to trace than seeing that X sent an email blob to Y.)

Or, Usenet as the new numbers station?


I always encrypt my “hi mom” messages as multigigabyte .rar files to blend in.


Wow, is this all AI generated content? AI bots just discussing between each other constantly?

Is there really even that much media produced every day? Or media that does get uploaded?


You can post binaries on Usenet. And a lot of the stuff is dupes (the same content being re-ripped in different formats and posted by different groups).

Any particular piece of content might be uploaded 50 times or more with different resolutions, codecs, sound formats, languages, etc.


That's part of it, but also most every binary post has additional redundancy added in the form of par2 parity files. This is so that if some of the articles don't make it to the news server or some get DMCA'd, it's possible to calculate and repair downloads. Also, because of those same takedown claims, posters just re-upload the same thing again.
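The repair idea is easiest to see with plain XOR parity, which can rebuild exactly one missing block. This is a deliberately simplified stand-in: par2 itself uses Reed-Solomon coding and can recover several missing blocks, and the block contents and the `xor_blocks` helper below are made up for the sketch.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# Three "articles" of a post, plus one parity block uploaded alongside them.
articles = [b"part-one", b"part-two", b"part-3!!"]
parity = xor_blocks(articles)

# Suppose the second article never arrives or gets taken down.
survivors = [articles[0], articles[2]]
recovered = xor_blocks(survivors + [parity])

print(recovered)
```

XOR-ing the survivors with the parity block cancels out everything except the missing article, which is why the extra parity data is worth its storage cost to posters.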


Wow. Makes me want to break out my archived installer for Microplanet Gravity, assuming it even works on Windows 10 or 11.


And a TON of it is spam and “robots talking to robots”


To a first approximation none of it is, because it’s drowned out by the absurdly massive amounts of piracy.

When it comes to text-only groups there’s still some spam but a lot of it is cranks howling into the void about their insane world views rather than unsolicited and off-topic commercial offers, though those do still exist too.


where's the action nowadays?



