I appreciate that attitude and value it myself, but I'd like to point out that it is not without risk.
If the world around HN (including its community) changes, stasis can damage or kill it as well.
Specifically regarding the issue of the original posting:
- HN is already an important data source for large language model training. [1]
- To the best of my knowledge there is no freely downloadable and current data dump of HN. [2]
- The HN-API does not offer all the data that scraping can get. For example, whether a post ever hit the front page, or the highest front-page position it reached, is an interesting data point that is missing.
- The Algolia-HN-API has the same limitations.
In my opinion this will lead to increased usage of the API and increased scraping which all costs money. HN might be forced to find a solution for this.
[1] For example, the RefinedWeb paper lists HN as one of only 12 websites that were excluded. From what I understand, it was excluded because it went into the final dataset unvetted. RefinedWeb was used for the Falcon model.
[2] The closest thing is probably the Google BigQuery "bigquery-public-data.hacker_news" dataset.
It claims to be updated daily, but really is from late September 2022. Also I could not find the download link which other data sets offered on BigQuery have. Does anyone know if I can download the complete thing anyhow?
I don't expect HN to give a fuck about the scraping. It's pure HTML, no images, probably cached all to hell for users who aren't logged in anyway.
The one thing I see as a future issue is that people are starting to post comments that clearly look like they were manufactured by ChatGPT and friends. Or that could just be the way some people talk and I've spent too long with ChatGPT now and start to smell it everywhere.
HN does have performance / capacity issues, and you'll find that if you're crawling the site rapidly, you'll quickly have your IP banned.
I've had that happen even under manual browsing (when logged out). My front-page analytics project hit that limit quickly (within about 30 requests, probably less). Adding in a reasonable delay got around that.
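For anyone wanting to try the "reasonable delay" approach, here is a minimal sketch. The `fetch` callable and the 5-second default are my own assumptions for illustration; nothing in this thread says what delay HN actually tolerates.

```python
import time
from typing import Callable, Iterable, List


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    delay_seconds: float = 5.0,  # assumed value, not a documented HN limit
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Fetch each URL in turn, pausing between consecutive requests."""
    results = []
    for i, url in enumerate(urls):
        if i > 0:
            sleep(delay_seconds)  # pause before every request except the first
        results.append(fetch(url))
    return results
```

Passing `fetch` and `sleep` in as parameters keeps the pacing logic separate from whatever HTTP client you use, and lets you dry-run it without touching the site.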
Keep in mind that a lot of Web infrastructure tends over time to operate just at the edge of stability, as capacity costs money.
This attitude of "if it ain't broke, don't destroy it" is something I'm finding myself valuing increasingly. I use Stylish to make HN look a bit more readable and prettier, and beyond that, it functions exactly how I want it to and I'm glad to see that the institutional momentum here is a core value.
That's actually a nice way of putting it! Better than the "fix" version, because this is clear on the consequences.
I feel like it might be applied to everything from OS UI design (Windows 11), web platform redesigns (Google's icons) to whatever is going on with social media (the silly enshittification term describes this) and many other things.
Let's say that you are the proud owner of a goose that lays golden eggs. "fixing" would be switching it to a different feed that might make it more productive, or it might make it sick. But this year the trend is to give it a few good kicks to see if that helps.
I’ve been here for 6 years and the format is still a breath of fresh air when I come back from things like Twitter or Reddit.
It’s not that I’m opposed to change really. I love good ideas, and being surprised by new and unfamiliar things is usually a joy. Communication via text is hard to improve upon though, and I’m not convinced any major social media platforms have found ways to improve this in any meaningful ways.
I want to read interesting things and discuss them with interesting people. This is hard on most platforms. HN makes it easier than everything else I use.
You did say, recently, that you don't like the idea of paid third-party apps using the HN API.[0]
I thought that was an odd change for HN. After all, the majority of the value still accrues to the HN owners. In fact, users who are prepared to pay for an app to access a free website largely comprised of adverts are typically more valuable than the rest. Those users have money to burn and skin in the game!
That's a fine point. I just get uncomfortable with it because the currency of HN should be curiosity, not money, so energetically it doesn't feel like a good fit.
Elon says he used it as a way to stop AI data scraping because the servers were suddenly hit by a large load.... (As a weird form of DDOS shielding in other words)
HN could be hit with such a large load too, since we have pretty good and lengthy discussion here, good for AI data training.
Would you believe it a valid tool to keep the community happy?
Since I doubt we would be happy if we can't access the site because some AI decided this was the time to scrape, either.
I agree that it could become a problem but I'd rather wait until the problem shows itself clearly, rather than (potentially) over-reacting in advance. Sometimes the medicine turns out to be worse than the disease; plus it's a better fit for being lazy.
I'd love to see the source control change log for HN. I can't even remember a single visible change in all the years I've been a user. I think there have been a few under the surface though.
I wish all the other sites on the Internet would wake up every morning, look at their TODO list and say "Nah, not today."
It's ambiguous what GP refers to as "next", but if it's the Eternal September part, I believe HN unfortunately already suffers a lot from it. In my subjective opinion, comment quality took a nosedive in the past 2-3 months or so. That said, I have no idea how to fix it, if it needs fixing at all.
I'm not saying you're wrong and anyway it is hard, if not impossible, to evaluate objectively—but I can tell you two things for sure. One is that people have been saying more or less exactly this about HN for at least 15 years; the other is that HN is subject to a lot of random fluctuations, and random swings tend to get interpreted by humans as long-term trends—not because they are, but because that is what humans do.
In addition to my sibling comment: HN also steps in to quash developing negative patterns in all sorts of ways. There's a long list of banned sites, there is the flamewar detector (though I've ... questions ... about that), dupes detection (or flagging), there are weightings and penalties given to various sites. I believe also some keyword and other patterns are looked for as well, "Reddit" being among ones dang's recently discussed.
So yes, occasionally some new pattern or trend will emerge, but HN adapts to those fairly quickly.
Paralleling what dang's said here, I've been looking at 17 years of HN front page activity over the past month or so, and am starting to tackle the question of topic drift and/or focus over that period.
My current tack involves looking at sites (as reported in parentheses at the end of each HN front-page post title) and classifying those. With slightly more than 30% of sites categorised, I can classify about 65% of all HN posts.
For the full dataset (17 years), that's roughly:
Rank  Posts   Share  Category
   1  63913  35.73%  UNCLASSIFIED
   2  22589  12.63%  blog
   3  15112   8.45%  general news
   4  13823   7.73%  tech news
   5  12851   7.18%  programming
   6   8622   4.82%  corporate comm.
   7   8459   4.73%  academic / science
   8   7294   4.08%  n/a
   9   5324   2.98%  business news
  10   3803   2.13%  general interest
  11   2151   1.20%  social media
  12   2074   1.16%  software
  13   1613   0.90%  technology
  14   1463   0.82%  video
  15   1144   0.64%  general info (wiki)
  16   1009   0.56%  government
  17    724   0.40%  misc documents
  18    720   0.40%  law
  19    702   0.39%  tech discussion
  20    620   0.35%  science news
Tons of caveats: this depends heavily on how I classify individual sites, a given site's stories might well be technical, social, or political, etc., etc.
The breakdown-by-year analysis is in development, but if anything programming-specific content has increased in prevalence. Political discussion seems not to have (though it rose significantly ~2014). Cryptocurrency and blockchain-specific sites also peaked about that time (I suspect much of that discussion is now mainstream). General news has always been a huge portion of HN discussion, as have individual (and corporate) blogs.
Note again that this isn't about discussion and comments, or even the titles or article contents (I'm thinking of looking at those, it's ... a challenge for me).
But across nearly 200,000 front-page stories, on which nearly half of all HN discussion occurs (based on another API-based study looking at comprehensive posts), the overall trending seems at first blush to be pretty consistent and if anything improving over time.
(As with all preliminary results, I'm hoping I won't have to eat my words here. Though I'm reasonably confident in most of this.)
From the classifications above, the places you might find some of that "suffering" would be in the general news, general interest, and social media categories. All but the first of those are single-digit percentages, and a lot of that general-news content is about technology, business, finance, and science, all of which would crowd out the sort of social and political issues which seem to generate strong feelings.
The "UNCLASSIFIED" sites are a wide mix, though most are probably a mix of blogs, corporate / organisational communications, and the like. The mean posts per site is 1.739951, so gains from additional site-categorisation are pretty slim. I have captured a lot of obvious patterns via regexes and string matches, so academic/science and major (or even minor) blogging and social media sites aren't a large fraction.
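To make the regex-and-string-match approach concrete, here is a rough sketch of how a site classifier and tally could look. The rules below are hypothetical examples, not the actual rule set behind the numbers above, and the category assignments are purely illustrative.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical rules: (pattern, category). The real rule set is not shown here.
RULES: List[Tuple[re.Pattern, str]] = [
    (re.compile(r"(^|\.)github\.com$"), "programming"),
    (re.compile(r"\.edu$|(^|\.)arxiv\.org$"), "academic / science"),
    (re.compile(r"\.gov$"), "government"),
    (re.compile(r"(^|\.)(medium\.com|substack\.com|wordpress\.com)$"), "blog"),
]


def classify(site: str) -> str:
    """Return the category of the first matching rule, else UNCLASSIFIED."""
    site = site.lower()
    for pattern, category in RULES:
        if pattern.search(site):
            return category
    return "UNCLASSIFIED"


def tally(sites: Iterable[str]) -> List[Tuple[str, int, float]]:
    """Count posts per category; pass one site per post, repeats included."""
    counts = Counter(classify(s) for s in sites)
    total = sum(counts.values())
    return [(cat, n, 100.0 * n / total) for cat, n in counts.most_common()]
```

Because the first matching rule wins, rule order matters: put the specific patterns ahead of the broad ones.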
More on "UNCLASSIFIED": there are 36,520 of those sites.
It's not practical to list all of them. But we can randomly sample. And large-sample statistics start to apply at about n=30, so let's just grab 30 of those sites at random using `sort -R | head -30`:
That's a few foundations, a few blogs, a corporate site (enterprise.google.com), and something about tea, all with a small number of posts (1--7).
I'm looking at some slightly larger samples (60--100) here on my own system, and can actually make some comparisons across samples (to see how much variance there is) which can give some more information on tuning what I would expect to find under the "UNCLASSIFIED" sites.
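A sketch of the repeated-sampling idea in Python, as an alternative to the shell one-liner. The `posts_per_site` mapping (site name to post count) is an assumed input format, and the seed is there only to make runs repeatable.

```python
import random
from statistics import mean
from typing import Dict, List


def sample_sites(posts_per_site: Dict[str, int], n: int, rng: random.Random) -> List[str]:
    """Pick n distinct sites uniformly at random."""
    return rng.sample(sorted(posts_per_site), n)


def sample_means(
    posts_per_site: Dict[str, int], n: int, repeats: int, seed: int = 0
) -> List[float]:
    """Mean posts per site for each of several independent samples."""
    rng = random.Random(seed)
    return [
        mean(posts_per_site[s] for s in sample_sites(posts_per_site, n, rng))
        for _ in range(repeats)
    ]
```

Comparing the spread of the values returned by `sample_means` for n=30 against n=60 or n=100 gives a quick feel for how much the samples vary.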
Fair enough! Personally one thing I would like is some sort of inbox functionality to notify you if someone replies to your comment, but can definitely live without it!
One welcome addition would be support for embedded images - sometimes you need to share some screenshots. For example right now, I am seeing rate limiting messages from Twitter, from the normal account.
I think of features like this on a regular basis that this site needs. Then I catch myself. It's feature-creep like that which has killed practically everything else.
They have so far and have reason to continue to, not just because of how they feel about HN but because the economics of curiosity are a good fit for YC's business. That's the miracle (I would even say) about HN - it occupies a sweet spot where it can be funded to just be good, and the economics work because it's in the interests of the business.
The choice is not between no change and drastic change, but selecting a rate of change that is appropriate. Things which do not change, die, as the only thing that is constant is change. Change or be changed.
I visit a few sites that never changed (or at least changed only things few were even aware of), and they have lived long enough to lose count of the hordes of change-praisers that appear and die.
I believe this eternal loop of change is a trap that impatient people force themselves into. Instead of getting accustomed to the current state and learning it, they rush into another one with a change that has unclear implications. As a result, they never get where they are and lose any track of where they were or where they’re heading. Their only comfort can be found in a constant change.
These few sites are like home to me. One of them I visit with years-long pauses and every time I return it’s the same user experience. That’s invaluable.
Sharks haven't changed in about 450 million years. There are designs that just work and don't need to change unless the environment changes drastically.
For me the main barrier is that I want to have portable/roaming control over my IDENTITY, even if the content hosting is (for now) entirely through a system administered by someone else. If I control the identity, I can at least keep local copies and rehost/repost content later.
Instead, it feels like the current Fediverse demands that I make a blind choice to entrust not merely a copy of my content but also my whole future identity to whatever of these current instances looks the most stable/trustworthy at first glance, hoping my choice will be good for 1-5-10-15 years. It's stressful, and then I look into self-hosting, and then I put the whole thing off for another week...
AFAICT I would need to set up a whole federated node of my own in order to get that level of identity-control. Serious question: Is there any technical limitation preventing the admin of an instance from just seizing a particular account and permanently impersonating the original owner?
In contrast, I was hoping/expecting some kind of identity backed by a private asymmetric key. Even if signing every single message would be impractical, one could at least use it to prove "The person bob@banana.instance has the same private key that was used to initialize bob@apple.instance."
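To illustrate what such a proof could look like, here is a toy sketch. It uses a Lamport one-time signature, picked only because it can be written with Python's standard library; a real system would more likely use something like Ed25519, and a Lamport key must never sign more than one message. The idea is that the public key is published when bob@apple.instance is created, and later a signed claim naming bob@banana.instance shows the same private key is behind both.

```python
import hashlib
import secrets
from typing import List, Tuple

KeyPairs = List[Tuple[bytes, bytes]]  # 256 pairs, one pair per digest bit


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keypair() -> Tuple[KeyPairs, KeyPairs]:
    """Private key: 256 pairs of random secrets. Public key: their hashes."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def _bits(message: bytes) -> List[int]:
    """The 256 bits of the message's SHA-256 digest, most significant first."""
    return [(byte >> (7 - i)) & 1 for byte in _h(message) for i in range(8)]


def sign(private: KeyPairs, message: bytes) -> List[bytes]:
    """Reveal one secret from each pair, chosen by the matching digest bit."""
    return [pair[bit] for pair, bit in zip(private, _bits(message))]


def verify(public: KeyPairs, message: bytes, signature: List[bytes]) -> bool:
    """Check each revealed secret hashes to the expected public value."""
    if len(signature) != 256:
        return False
    return all(
        _h(sig) == pair[bit]
        for sig, pair, bit in zip(signature, public, _bits(message))
    )
```

Anyone holding the public key from the old account can run `verify` on the claim; the instance admins never need to see the private key.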
This is basically the entire point of the Authenticated Transfer Protocol (AT Protocol), which powers Bluesky. I think it does a ton of stuff right, including portable identity backed by solid cryptography (no blockchain or "crypto"!) and has a lot of promise. It's still in development, but I am hopeful that it will live up to its promise.
Yes, at the end of the day a malicious client is always a risk with this sort of thing. But the AT Proto does have some mitigation in place—users have a signing key which their PDS needs to act on their behalf (sign posts, etc) and a separate recovery key which users can hold fully self-sovereign and use to transfer their identity in case they detect malicious behavior. It's not foolproof of course, nothing is, but it is thoughtfully designed.
But yes, the protocol does have a fair bit of trust of your PDS built in. But that's inevitable for decent UX—imo the crypto craze proved that basically no one wants to (or can) hold their own keys day-to-day. If you want to have a cryptographic protocol that the average person can use, some amount of trust is necessary. The AT Protocol artfully threads the needle and finds a good compromise that is a (large) improvement over the status quo, in my opinion.
In theory, kinda, but you can bring-your-own client, and "the" web client is decoupled from the back-end instance.
"bsky.app" works as a web client for the official "bsky.social" instance, but it also works with the instance I self-host (or any other spec-compliant instance). Likewise, 3rd party clients work with the official instance, and also with 3rd party instances.
However, no key-stealing could possibly happen right now in any case because... the PDS ("instance") holds your signing key - the client never even sees it. Having the server hold your signing keys is very user-friendly, but of course not ideal for security and identity self-sovereignty. In general, the security model involves trusting your PDS (just as you trust your mastodon instance admin, or twitter dot com - the improvements are centered around making it easier to jump ship if you change your mind).
Client-signed posting is something that's not even possible right now, but I believe it's somewhere on the roadmap. If it doesn't happen some time soon I'll be implementing it myself. (I'm writing my own PDS software)
That's never going to work for the average person, sadly. And it misses a lot of social features that a lot of people (myself included) want from social media. Simply put, the UX is way too far off what people want and need.
It will, ISPs just need to start providing the basic hosting infrastructure on their routers again, like they used to. Thankfully we're also at a time where IPv6 is mature enough so that this is greatly simplified!
WordPress doesn't have ActivityPub built in; it's a plugin currently in beta. Without AP, there is no client that can pull in website feeds and provide discoverability between WordPress sites, Mastodon posts, etc.
Back in the old days, ActivityPub was my RSS feed reader. Discoverability was driven by good old fashioned cross linking, comment discussions, and skimmable feeds from aggregators like the one we're on.
People love to reinvent the wheel and claim it's a whole new thing. No ideas on the web have really been innovative since the bubble popped. The innovation has all been on delivery and execution (not wanting to discount any of that).
Sure it is? WordPress updates itself and all plugins automatically. I've had WordPress sites running for over a decade with zero security concerns ever popping up.
> For me the main barrier is that I want to have portable/roaming control over my IDENTITY, even if the content hosting is (for now) entirely through a system administered by someone else. If I control the identity, I can at least keep local copies and rehost/repost content later.
This is why I want domains as identities to succeed. I want to own my handle on every platform, but I don’t want to self host.
Do you know of any existing projects in this space?
I was toying with an idea/protocol where:
1. You add a TXT/CNAME that points to a trusted "authentication provider".
2. When you try and login to a website that supports the protocol, it checks the DNS record and redirects you to your provider.
3. You then "prove" that you own the domain to the provider - how this is done would be specific to each provider, but one possible method could be by providing a signed message that can be verified vs. a public key stored in a DNS record.
4. The provider redirects you back to the original website with a token.
5. Finally the original website consumes this token by sending it in a request to the provider. The response contains the domain as confirmation of the user's identity.
This approach removes the need for self-hosting as users can point and setup their names with third party providers.
Users can also trivially switch to a different/self-hosted provider by changing the CNAME.
Communities could also allow direct registration by hosting their own provider instance and pointing a wildcard subdomain at it: (i.e. *.users.ycombinator.com).
Users could then sign up to said provider using traditional email/password and claim a single subdomain: (i.e. tlonny.users.ycombinator.com)
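The five steps above can be sketched as a small in-memory simulation. Everything here is hypothetical: the `Provider` class, the password-based proof (a stand-in for whatever proof method a provider chooses), and the plain dict standing in for DNS lookups. The browser redirects are collapsed into ordinary function calls.

```python
import secrets
from typing import Dict, Optional


class Provider:
    """Hypothetical authentication provider covering steps 3 to 5."""

    def __init__(self) -> None:
        self._passwords: Dict[str, str] = {}  # domain -> secret (assumed proof method)
        self._tokens: Dict[str, str] = {}     # one-time token -> domain

    def register(self, domain: str, password: str) -> None:
        self._passwords[domain] = password

    def authenticate(self, domain: str, password: str) -> Optional[str]:
        """Steps 3 and 4: check the proof, then issue a one-time token."""
        expected = self._passwords.get(domain)
        if expected is None:
            return None
        if not secrets.compare_digest(expected.encode(), password.encode()):
            return None
        token = secrets.token_urlsafe(16)
        self._tokens[token] = domain
        return token

    def redeem(self, token: str) -> Optional[str]:
        """Step 5: the website exchanges the token for the confirmed domain."""
        return self._tokens.pop(token, None)


def login(
    domain: str,
    password: str,
    dns: Dict[str, str],
    providers: Dict[str, Provider],
) -> Optional[str]:
    """Website-side flow: returns the verified domain, or None on failure."""
    provider = providers.get(dns.get(domain, ""))  # steps 1 and 2: DNS lookup
    if provider is None:
        return None
    token = provider.authenticate(domain, password)
    if token is None:
        return None
    confirmed = provider.redeem(token)
    return confirmed if confirmed == domain else None
```

Switching providers in this model is just a change to the `dns` entry, which is the property the CNAME approach is after.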
Sounds like they want self-custody of their keys. This isn't what the general public wants.
Decoupling identity from social is a good idea but you can't just migrate the key storage to a single custodian entity. There'd need to be multiple custodians to ensure the same power imbalances didn't reappear in a different form (e.g. Google owning everyone's logins).
Exactly. There is no magic trustable non-local/distributed system that replaces 'self-hosted' for this purpose.
All that is needed is a way to create your local identity (e.g. like storing fingerprint biometrics on your laptop) and a clever way to sync between physical devices (e.g. through Bluetooth).
We're in this weird situation where people don't want to be responsible for managing their own data/id, but can't trust others to do so for them.
I raised issues on Mastodon and Pleroma advancing this view a few years back, to an initially frosty reception that’s since become a grudging “nice to have but hard to add”.
My recommendation was much like MX records for email, so you can use a hosted server under your own identity.
There are people who want to add distributed identity to ActivityPub. It was left out of the spec but there were things left in to make it possible to add later. That's my understanding from a distance, anyway.
I've been able to switch Mastodon instances without any problems; most instances seem to handle whatever ActivityPub machinery transfers followers.
So as long as both your source and destination support account transfers, you can usually switch and even seamlessly bring along most of your followers without them noticing.
No idea about your admin question. All bets are likely off with a bad admin. If you want actual cryptographically guaranteed communication, that doesn't exist in a usable form (except for Secure Scuttlebutt, and that's reeeeally stretching the "usable" part)
That doesn't actually work, though. Old links don't update or forward to matching instances of your toots on Mastodon. They're isolated. It's a bad experience.
That's a good point. It would be nice to have a streamlined export flow that rewrites your links for you.
(It's technically possible to edit the links in the posts you exported yourself before importing, but technically correct isn't the best kind of "correct")
Yes, Tim Berners-Lee led the Solid project[1], which reverses the client/server identity and data model. The user will store their own data and the service provider can only access it under the policy set by the user.
The promise is that one can not only transfer the identity and all personal data across instances of a single service, but also across different services (imagine from mastodon to Lemmy).
Secondly, signing every message wouldn’t be impractical at all, I don’t think. We’ve had the technology to do this for a long time and it’s very simple. What we don’t have is good key management. For average users, this would have to be something provided by their devices (phone or the Secure Enclave in your Mac or whatever) - managing keys and the web-of-trust shamozzle are the main reasons why encrypted email for everyone never took off.
Not OP but I want to point my dns at a host and have them handle it.
You can pay for that service, but you have to administer the instance, and it’s not able to reuse the server’s RAM for multiple domains; it’s not like email where spam management is built in.
The federation is opt-out, so by default an instance will accept any federation request. You only block bad instances after the fact (or use some kind of shared blocklist)
The main problem with the fediverse is that none of the people I want to read, post there. And they never will, because they are disparate sociopolitical demographics and the fediverse by design keeps them in separate instances that I at some level need to think about.
The majority use-case requires centralization, which is subject to the network effects that constitute 95% of Twitter's value. Great that it works for you and some others, but it cannot work for most.
I tried it but I just can't get into the flow of things, it doesn't feel like a lot happens during the day, but maybe I'm on the wrong server? I just want to expel my bowels and doom scroll bad funny memes.
> it doesn't feel like a lot happens during the day, but maybe I'm on the wrong server?
We are oriented to think that way after ~15 years of the algorithmic engagement-maxi world of Twitter. It always looks like there's a lot happening all the time but look deeper and it's a bunch of people offering their weak takes on hot topics to build their brand.
What was the last thing you remember being must-see stuff on Twitter?
> We are oriented to think that way after ~15 years of the algorithmic engagement-maxi world of Twitter.
I never cared about algorithmic engagement.
Before (and during) the engagement Twitter:
- has everyone you needed there, centralised
- has search, where you can find people and topics you care about
Mastodon has none of that: you have to know which server to join and how to find people and topics. Centralization always beats distribution in convenience.
You may not care about algorithmic engagement, but that doesn't mean that the content you are looking at is free of the incentives created by said algorithmic engagement. I use Twitter search too, but it's full of hot takes, because that's what the network rewards.
It's kinda like the "I don't care about politics" stand. You might not care, but the institutions you interact with every day certainly do.
> What was the last thing you remember being must-see stuff on Twitter?
The Russia circus last weekend made for some pretty good near real-time intrigue. That being said, I don’t care what crazy thing is happening…I’m not creating an account to hear what’s being said on “The Global Town Square ™”
My experience of it was actually opposite. I first started from Twitter, trying to piece together something from the chaos of tweets, but then I went to an actual news site which had a nicely packed timeline with latest action and had more accurate and more up-to-date information.
In theory, Twitter should be that real-time news feed for these kinds of events, but it doesn't actually work. The signal-to-noise ratio is just very low.
How does fediverse intend to pay for server/developer cost? For new technologies many smart people work for free as long as it excites them and when it just comes to maintenance and fixing bugs it wouldn't be cheap for any technology with so many moving parts. Also, early adopters donate with much higher probability than when the masses arrive.
Coming of age in the late 90s/early 00s we had plenty of forums to choose from, hosted by hobbyists, with nary a monetization scheme in sight. And this was in the era when the tech was far less accessible and the hardware far more expensive. Sure, maybe a modern $5/month VPS running basic forum software isn't going to handle 100,000,000 active users, but it sure will handle 10,000 active users, and that's more than enough to have a healthy community.
(Note: I'm of the opinion that fediverse-style federation in the context of forums is merely a nice-to-have; the web is already naturally federated, and people should not feel bad if they want to save money/tech complexity/administration complexity by settling for ordinary self-hosted forums.)
This is specifically why I call out the federated model as a nice-to-have, not a requirement. ActivityPub is way, way more demanding of CPU and transfer than a simple forum, and as a result it's extremely difficult to self-host at scale. You can easily service 10,000 daily active users on a VPS serving lean, statically-rendered forum pages.
Is there a reason for this, though? Need to be able to iterate on features quickly? Maybe not being able to tackle various complexities with the total available resources? Or maybe federation is just inherently expensive?
Why couldn't we have an alternative written in a more performant language/runtime with maybe things like lower quality images/videos or something?
Because performance was not a concern when it was designed - or it could be that it was designed for small communities, and is therefore not possible to scale up cheaply. One of the problems is the caching of pictures from the different connected instances (if I remember correctly), which makes the data storage requirements go up very fast.
The big issue with hosting forums and the like is trying to keep the bots at bay. I have seen very small forums get overrun in next to no time. And putting in bot checks leads to frustration for the users.
Good point, my implicit assumption is that, unlike the classic forums of my youth, forums in the post-LLM age will want to adopt the "tree of invites" model (e.g. how lobste.rs does it) rather than allowing unrestricted write privileges (read privileges can still be public). This creates a localized web of trust that will be mostly manageable at medium scales; ban or revoke invite privileges to any users whose invitees turn out to be bots or sockpuppets.
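A rough sketch of what a "tree of invites" could look like as a data structure. This is not how lobste.rs implements it; the class, its method names, and the policy of revoking the direct inviter's privileges on a ban are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Set


class InviteTree:
    """Tracks who invited whom, so bans can be traced back up the tree."""

    def __init__(self, root: str) -> None:
        self.inviter: Dict[str, Optional[str]] = {root: None}
        self.invitees: Dict[str, List[str]] = {root: []}
        self.banned: Set[str] = set()
        self.can_invite: Set[str] = {root}

    def invite(self, inviter: str, new_user: str) -> bool:
        """Add new_user under inviter; refuse if inviter lacks privileges."""
        if inviter not in self.can_invite or new_user in self.inviter:
            return False
        self.inviter[new_user] = inviter
        self.invitees[new_user] = []
        self.invitees[inviter].append(new_user)
        self.can_invite.add(new_user)
        return True

    def ban(self, user: str, revoke_inviter: bool = True) -> None:
        """Ban user and, by assumed policy, revoke their inviter's invites."""
        self.banned.add(user)
        self.can_invite.discard(user)
        parent = self.inviter.get(user)
        if revoke_inviter and parent is not None:
            self.can_invite.discard(parent)

    def subtree(self, user: str) -> List[str]:
        """Everyone invited, directly or indirectly, by user."""
        found: List[str] = []
        stack = list(self.invitees.get(user, []))
        while stack:
            current = stack.pop()
            found.append(current)
            stack.extend(self.invitees[current])
        return found
```

`subtree` is what makes cleanup manageable at medium scale: when one account turns out to be a sockpuppet farm, a moderator can review everything downstream of it in one pass.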
It's still in the early days, so give it some time. But there are already some lively communities and, IMO, they are generally better in terms of quality since the Fediverse is more niche and has a higher barrier to entry.
This. Twitter is doing what they can to drive people to Mastodon. I've held off closing my Twitter account but I need to get more familiar with Mastodon. I was wondering if there are accounts like NWS <location> (weather updates) over there, or if they plan to have some soon. Also from brief exposure, servers like mastodon social read like left wing echo chambers, which is also cringe.
The main reason is I followed a lot of scientists in my field of research who left the platform. Plus there's been subtle (and some not so subtle) changes to the algorithm and the software.
Most of the people I followed before were clearly left wing. After the Musk takeover a lot of them left the platform. Plus the algorithm now pushes more right-wing content onto the home page. I wouldn't mind if it were real people discussing valid talking points. The problems are 1) they are all coming from blue check mark accounts, 2) most of it is clearly misinformation, and 3) you can tell most of these tweets and replies are from troll and bot accounts. It's just annoying.
Musk boosting his own tweets in the feed was annoying. Had to unfollow him.
I use Twitter via mobile website, and it breaks more frequently than before.
Overall, it's become like Musk's other product, Tesla. It over promises and under delivers. It's not reliable anymore. As a 2x Toyota owner I cannot stand products that are results of crappy engineering. So there you go.
Edit: one more thing: it used to be that I could go to the trending hashtags and get the latest news in a second. That's not the case anymore. Case in point: yesterday France was trending. I saw the tweets and got the impression that some member of a minority community had committed a mass stabbing or rape again, because the Twitter results were brigaded by right wing blue check mark accounts spewing anti-immigration propaganda. It was not until I read a BBC article that I realized what happened was the complete opposite: police executed an immigrant at a traffic stop.
Twitter has lost almost all of its core values under Musk. It's just sad.
The internet is most definitely not fine. There is an incoming tsunami of ChatGPT-generated bullshit that is going to make most open discussion sites more or less useless. Twitter requiring accounts and Reddit shutting down APIs are both related: ChatGPT et al. are a threat to the social media business and, ironically, made possible because of the social media business. TBF I think we should all be exercising extreme (even more than usual) skepticism on any discussion site from now on.
Social media companies are acting like a drug user who was getting their dope for free but now has to turn tricks behind the dumpster. Easy money (aka low loan rates) has dried up or is much harder to justify, so companies that have used their user population as a means to profit are realizing they need to charge money for things like blue check marks or API usage when all the VCs won't give them a hit anymore.
Publicly free content from users devolves to garbage content. I think the ChatGPT effect is that they're realizing it's easier for companies/entities to generate garbage that is at or exceeding the intelligence of comments by actual people (a low bar). Sure there are pockets of usefulness but this is tiny amongst the firehose of garbage.
If all that is publicly available on social platforms is just garbage nonsense, people will just stop going if any barrier is thrown in front of them. The internet as a technology stack is fine. This is how social media dies (hopefully).
A potential saving grace: I bet within a year or so it will be easy to self-host LLMs that are easy to fine tune and run. Then there will be a few open source tools that you can use yourself, privately, to capture your level of interest while reading, and periodically make a reader/summarize/filter agent.
This is not scary if people can fairly easily run it all themselves, keeping their data private. It would help wade through crap, and there is some irony in using LLMs to have personal filtering and summarization.
Compared to future versions of ChatGPT, Bard, etc. the models that individuals can self host will be much weaker, but I think that they will be strong enough for personal agents and eventually be cheap to run, and affordable to fine tune.
Sounds to me like the Internet is, in fact, fine. It's only those "open" discussion sites which are having trouble.
Those sites never fit my definition of open anyway (free, permissively licensed technology and content). The ones that do are smaller, aren't a monoculture and seem to be pretty untroubled so far. No one wants to scrape the little Mastodon or Lemmy instances or other small community sites I pay attention to.
Big deal if something is a threat to the social media business. The social media business is a cancer which should be destroyed anyway. Go outside and engage in real socializing instead of the depression-spawning, teen-girl-murdering version peddled by Zuckerberg and Musk. It's much better and once you change your habits you'll never look back. Maybe it has something to do with all the vitamin D you get from being outside.
If anything, nature is healing. I’ve noticed at least 1 community return to forums due to all of this (https://mholdschool.com/), and while unfortunately AFAIK there isn’t any good FLOSS software for it, it’s certainly a start.
Discourse is going to pull the same shenanigans as Reddit and Twitter, you can be sure of that. No one is going to host millions of users for free, forever, they'll all come around to get their investment back one day.
Yes, the respective projects themselves. If you can't trust a project not to fuck over its users, why are you so concerned about the hosting of their discussion platform in particular, rather than just about everything else?
Do you think self-hosting means "it's free to host"? Do you understand someone is paying the costs of hosting, and that someone can do whatever the hell they want to recover such costs?
Usually those costs, for a specific community, are not very high. Not sure about Discourse specifically, but you can serve thousands of users relatively cheaply.
Yeah. Honestly, at this point you have to think the ketamine isn’t being microdosed[0], or the whole Twitter escapade is an intentional op to burn down an account independent forum that has frequently been a source of pain for those in power.
I suspect the latter. We’ve seen billionaires do it countless times.
Elon is not great at engineering things, either. The main thing that he’s great at is self promotion and convincing people that they should give him (even more) money.
Nah, how many people are using Twitter without being logged in compared to how many people legitimately change every link they receive to Nitter?
I used the 'redirect to nitter' Firefox extension and Android app but it got quite unreliable and nobody else that I know uses nitter at all. I think Nitter users would be a tiny, perhaps even immeasurable minority compared to casual readers that now are incentivized to either log in or fuck off (but... FOMO).
I didn't mean nitter _specifically_ but rather all forms of alternative/anonymous UIs for twitter (that strip all of the engagement/ad/tracking stuff from twitter.)
If anything, I suspect Elon saw what happened with Reddit and had a "wait, we should do that!" moment.
Bingo! It's only a matter of time before another service takes over. I would be extremely surprised if at any given time there weren't at least two startups in the dark trying to be the next Twitter, just waiting for the right moment, when a large group of people gets pissed off at Twitter, to launch their service to the public.
Wait until you realize that highly centralized businesses are a feature, not a bug.
We've BEEN through federated platforms before. We've even been through PROTOCOLS before. They're all horrible. The successor to any platform that currently exists will have slight improvements to what already exists, and that's IF they're able to do so.
I don't have a dog in this fight, but I do have over 30 years of being around social media platforms on the internet.
> It’s been the primary organizational mode for the last 10,000 years.
For most of human history, most businesses were small ma and pa shops operated by a few local people. These days large business chains are the norm. You could say that centralized big business killed the decentralized small ma and pa shops.
As the saying goes, the market can stay irrational longer than you can stay liquid - everything eventually falls, but Twitter's not on the long side of "eventually" here.
The key here is that the environment of near-zero interest rates these services proliferated in is over, so there is a drive to wall up and monetize their content more aggressively. That will probably fail, because all value they had was in community interaction. Who would want to train an LLM on post-2023 twitter content?
Because Big Tech demands big profits. They can not exist in worlds where they can not have a monopoly or oligopoly.
Smaller companies and ISVs, on the other hand, will be better off if their market is commoditized. They won't have to spend so much to compete in R&D, they just need to find each the best way to serve their (comparatively) small customer base.
One is that centralized _business_ has certainly not been the primary organizational mode. You can talk about centralized _government_ (of whatever variety you'd like), but the distinction there is that the centralized government had some sense of itself in an ecosystem - a citizenry, a land, a future, etc. - and businesses do not.
The second is that centralized entities sure wrote a bunch of stuff _down_, but it's hard to say they were the primary organizational mode for any but the last 50-200 years - the reach of serious centralized bureaucracies has only really begun to match their propaganda in the industrial and now computer age. Until extremely recently, the actual effective reach of a centralized bureaucracy was a day's horseback ride - control degrades rapidly as one leaves the core. "Heaven is high and the emperor is far away", as the saying goes.
Edit: With regards to the second note, "Against the Grain" by James C. Scott is a solid read.
It's not centralized businesses that inevitably fail, it's businesses that become top dog that inevitably fail.
If you own a company and want it to last through the ages and you aren't literally the only guy in town, never become number one. Aim high, but don't hit the top.
The same can be said for countries and practically any organization or group. You stay an underdog if you do not want to ever fail.
This doesn't make intuitive sense. Are you sure you're not seeing the results of survivorship bias? Or is there a mechanism in place that kills top players?
It's because once you're top dog, you stop aiming high and start maintaining your top spot. You stop being ambitious and innovative, instead you start being anxious and wearing rose colored glasses.
That leads to complacency, corruption, and delusion, ultimately leading to failure.
Everyone and everything from mundane individuals to megacorporations and empires have all fallen from grace once they became top dog. No exceptions.
If you want to last, aim high but don't become top dog.
To paraphrase Gilmore, “the net interprets [mandatory login and other access fuckery] as censorship and routes around it”
If Twitter locks out more readers, people will stop posting and move elsewhere. If Reddit ejects the mods that made it successful, communities will evolve elsewhere.
The forest will regrow and different paths will form, routing around the dying patches.
This is true of any medium in any free society. The net has nothing to do with it. It certainly isn’t capable of interpreting some website action as censorship. It’s not sentient. And there’s no dynamic network routing/damage at play here. It’s just people going somewhere else.
I have a prediction that will make you happier:
This is a temporary thing, that will generate a TON of press (outrage ; praise by the fans at the amazing bold strategy move) and will also generate a TON of signups right as there is a wave of people leaving Reddit.
And then it will magically open back up.
(More press)
And that's the story of pretty much every one of the outrageous/bold/brilliant/terrible strategic decisions you've heard of since the Twitter takeover.
What's most amazing is that it works every time. I'm surprised there isn't an Onion copy/paste article about this each time.
It’s a gigantically dumbass move that’s killed off all amateur creators and consumers. My feed is just filled with full-time content creators churning out 1/20 threads that are consumed by other full time content creators.
I've been meaning to make a Twitter account since I follow a few accounts for some games that I play.
Between Twitter not being owned by a loon anymore and finally a reason to overcome my laziness, why not?
And yes, I know I'm playing right into Twitter's hand. Whatever, I actually like Musk anyway (an unpopular opinion that will no doubt get me flagged around these parts).
This internet started to suck a long time ago, but today was a red letter day.
Reddit's been circling the drain for years, but today is the day it truly crossed a line for millions of people at once (They killed Apollo and RIF).
Twitter's been getting rapidly worse since Musk bought it, but this is another red line.
We're not even allowed to talk about the influence of bots and astroturfers here. A popular post today was flagged within an hour, just for pointing out that a site claims to sell upvotes on HN.
I don't like it when the tech giants get in sync like this...
You can talk about the influence of bots and astroturfers...so long as you agree it's all one big conspiracy theory, like all the comments did on my post from yesterday [1], "The Gentleperson's Guide to Forum Spies". [2]
It feels like the "natural" development of VC funded websites - they offer a service that's heavily subsidized and losing money hand over fist to displace other services (I guess in Reddit's case it was self-hosted forums and similar?).
But that's clearly not sustainable, it's inevitable they'll pivot to monetizing the service - and that's clearly going to make it less attractive than the heavily-subsidized version people have gotten used to.
I guess the idea is to lock the users in to the degree that the increasing monetization is put up with, and slowly enough that there's no "sticker shock" of a previously-free service suddenly having a price.
I'm constantly amazed people are surprised by this. Isn't it obvious that tying yourself to a loss-leader service isn't sustainable?
The bigger problem is that the plan was basically infinite growth and was only possible due to 0% interest rates.
The last 15 years have been a weird fever dream of free money. Now that interest is a thing again, the "infinite spending expansion, figure out profitability later" model is suddenly no longer sustainable, and Reddit, Twitter, Google, Meta, etc. suddenly have to actually make money again.
Basically the internet everyone has been used to for 15 years is dead as the reality of debt suddenly resumes.
Expect things to keep getting worse or revert to the old version of the internet of scattered low power sites and forums.
Frankly, Microsoft is really the only one that has a somewhat sustainable model for an environment with real interest rates.
> Expect things to keep getting worse or revert to the old version of the internet of scattered low power sites and forums.
This is actually the most mood-lightening comment I've seen on this matter. A return to the internet of 15 years ago doesn't sound so bad to me. (Of course, the mood darkens again when I remember that it won't really be that, because, e.g., that internet of 15 years ago couldn't handle the bot-spam of today.)
Reddit didn't need to be heavily subsidized. Old Reddit was open source, had community-built apps, and could have been run indefinitely with a handful of employees and Reddit Gold. Like a Craigslist or Wikipedia model.
Instead with their VC funding they spent $$$ to introduce things like NFTs and TikTok scrolling, which nobody asked for, but it burned through a lot of cash.
HN doesn’t really have a business model so I’m not sure how much it can be corrupted. I guess it could be if YCombinator ever gets tired of paying the bills, but I’d imagine having a majority(?) of startup/tech employees visit your site almost daily is quite good for them in indirect ways.
I assume that if HN ever caused them any serious PR damage, they'd pull the plug that day. It's always been made clear that it's a single-server site sitting somewhere that was written as a hobby a long time ago. That kind of nothing cost they're willing to take on indefinitely, but not any serious bad media cost. I've always felt that was the reason for the heavy moderation, especially as compared to the anarchic early days. Even with the salaries of dang et al. expenses are probably a rounding error.
That's not independent of the moderation, though. Topics that invite heated discussion but aren't absolutely necessary to talk about on HN get nuked very quickly. If it would be absurd for it not to be discussed here, it gets a pinned note from dang telling everyone to behave, and gets carefully monitored to make sure it doesn't degenerate into a riot.
It’s effective because in this instance their interests align with their audience’s, which is pretty much an ideal situation in such a capitalist economy. Also, I don’t know how much HN costs, but I cannot believe it is disproportionate in the PR and communication budget of something like YCombinator.
Bingo! Being open to access, but heavily moderated, is the perfect spot for them. HN is essentially perfect for its use case and is unlikely to change soon.
I'm cautiously optimistic about Lemmy. Anyone can spin up an instance, so it's decentralized, but instances are connected, so there's community.
It may still be rough around the edges, but to me it feels like the spirit of the old phpBB forums combined with almost 20 years of lessons learned from Reddit.
<rant> Not responding directly, just piggy-backing for lack of a better place to put this comment.
The problem with Lemmy is that one gets sent to some place like https://github.com/maltfield/awesome-lemmy-instances, is immediately confronted with a ton of weird links like "butts.international" and "badblocks.rocks", what even is this? And about 100000 other servers just named "lemmy", "notlemmy", and "lemmy1". So you click a few at random optimistically, then get hit with login page, or a server error, or an apparently empty test-server. You begin to think you're being pranked, like am I supposed to brute-force click like 50 things to find something that's not a joke? Maybe you go to https://join-lemmy.org/ and it says "After you create an account, you can find communities", so great, it's inaccessible anonymously, the same as twitter. You go to https://lemmymap.feddit.de/ and after 15m of page-loading get a hilariously useless cyberpunk-looking word-soup where you can't click any links, much less search for topics/communities (btw there are 2068431 running instances and somehow butts.international is still front and center in my cyberpunk view)
Finally, by ignoring recommended tooling and just using google-search I found a community relevant to my interests, but it has pretty bad content and a whopping 1 user/day. Another google search trying to find a certain topic, I find one, but it has only 3 total comments, and I could not tell what month/year the posts were added.
So, clearly I don't really know what I'm doing here, but this stuff is ridiculous. As long as we're crawling 2068431 instances why don't we look at the communities it hosts and the volume/recency of traffic? At least filter totally empty stuff and/or make it easier to get all the test instances in a sandbox! Discoverability is so bad that I can barely get to the point where I'm considering usability / content.
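To make the "filter totally empty stuff" idea concrete, here is a minimal sketch in Python. It assumes a crawler has already collected, per instance, a monthly active user count and the timestamp of the most recent post. The `Instance` fields, the thresholds, and the example hostnames are all hypothetical and are not taken from Lemmy's actual API or from any existing instance list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Instance:
    # Hypothetical fields; a real crawler would have to map its own data onto these.
    name: str
    monthly_active_users: int
    last_post_at: Optional[datetime]  # None means no post was ever seen


def rank_instances(
    instances: List[Instance],
    now: datetime,
    max_age_days: int = 30,
    min_active_users: int = 10,
) -> List[Instance]:
    """Drop empty or stale instances, then sort the rest by activity (busiest first)."""
    cutoff = now - timedelta(days=max_age_days)
    alive = [
        inst
        for inst in instances
        if inst.last_post_at is not None
        and inst.last_post_at >= cutoff
        and inst.monthly_active_users >= min_active_users
    ]
    return sorted(alive, key=lambda inst: inst.monthly_active_users, reverse=True)


if __name__ == "__main__":
    now = datetime(2023, 7, 1)
    crawled = [
        Instance("a.example", 500, datetime(2023, 6, 30)),
        Instance("test.example", 0, None),                    # empty test server
        Instance("stale.example", 200, datetime(2022, 1, 1)), # abandoned
        Instance("b.example", 1200, datetime(2023, 6, 28)),
    ]
    for inst in rank_instances(crawled, now):
        print(inst.name, inst.monthly_active_users)
```

Even a crude cutoff like this would hide the test servers and dead instances; the thresholds are arbitrary and would need tuning against real data.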
You're making a very good point. I looked at lemmy stuff before but I moved to kbin instead. It has a more familiar interface (to reddit) and it is federated with lemmy. This all sounds good and it is. But even there I basically have no idea what's going on. It has more content than the lemmy instances you visited but every once in a while when I remember to visit I only look at whatever landing page I configured and can't figure out what's kbin, what's lemmy, what "magazine" I am looking at etc. To be fair, it's already a pretty good product and it is likely to get a lot better. That's exactly why I keep trying.
As someone who signed up for Lemmy and is planning to replace my Reddit use with it, I hear these points loud and clear. You also make a good point about Google search and how bad it’s become - I see so many stories of people adding “Reddit” to their search in order to get any decent results. This is the natural result when you have every business paying people for SEO and trying to game the system.
Between account walls and search’s indexing problem, it’s become very hard to find small to mid sized active communities on your own. In fact this problem seems to be something people are trying to solve in Reddit communities via related subreddits on the sidebar.
So having gone from using search engines to crawl for relevant content that was out there, people are now creating content specifically to end up in Google’s search results - destroying the value search once had. Indicated by what people have done on Reddit, and these discussions about finding alternatives, it seems we are well on our way back to webrings. I welcome this.
Good example of irritating comment that led to the odyssey above. What you say also appears to not be true. The correct assertion is maybe: If you can find an instance, and if the instance is configured to allow anonymous, then you can browse communities. Since nothing is ranked or searchable, then if you're willing to do that for N instances and M communities, enduring server errors/login screens the whole way, maybe you can find decent content after weeks of brute-force labor. This isn't practical for someone who just wants to spend a few minutes finding a replacement for /r/math and /r/physics or whatever.
For the short term at least, since discoverability is so broken, I think those who want to advocate for Lemmy will be better served by just linking to content or curating indexes of active communities. It's not that useful to anyone if the focus is always about pretending everything is fine, or presenting prospective users with totally useless machine-generated indexes where we cannot tell the test-servers from production.
IIUC it's federated, not decentralized. Still pretty good but if a server drops offline it will be pretty inconvenient. The communities on that server will be dead.
Matrix does better as the rooms are decentralized so can continue even if the creating server drops offline. But user accounts are still only federated.
Lemmy.ml's admin is pro-Chinese-government and actively censors comments that are critical. What that means to you is your decision, but I want to make people aware before the mass migration date arrives.
Lemmy's team is very politicized, but even then it took significant pushback to change their minds about an issue that the community was decrying for reasons that were almost entirely technical.
It bothers me a little bit that having a strong stance against intolerance is seen as being “politicized.” That should just be normal and expected behavior.
Maybe they were abrasive in initially fighting the request to make technical changes to the slur filter, but hey when you ask for free enhancements to open source code you either do the work and provide a pull request or be prepared to be told no.
I empathize with their concern about becoming another Voat or Gab. They want federation but they don’t want a Wild West.
The problem is that the stance is incredibly shortsighted and in a way bigoted itself. Take a word filter that contains some regex for n**a. They are saying you should never use slurs and this word in particular in public discourse.
So the word above is used in lyrics of a music genre with predominantly black musicians. In addition to saying we don't want our software to be used by racists, they also say "we don't want our software to be used to discuss certain kinds of black music" (arguably a racist stance just by itself). Talk about unintended side effects.
yes, this is one of the trade offs of any system built where one must decide between human moderation/curation vs automating moderation/curation.
if automation is chosen there will absolutely be situations where perfection is impossible. if human’s unparalleled ability to see nuance is chosen then the cost scales along with the amount of information.
the fact is, if we want a community and we want to keep signal above noise, we will need some form of removal of spam, child porn, racism, etc…
automatic tools can’t nuance as well as humans.
then human mods start nuancing and someone will point at stuff and call it biased.
> It bothers me a little bit that having a strong stance against intolerance is seen as being “politicized.” That should just be normal and expected behavior.
It did not seem to me a politicized discussion but a technical issue with filtering using hardcoded blacklists that are just too prone to the Scunthorpe Problem. Perhaps because too many people in the USA despise the mere existence of other languages :)
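For anyone who hasn't run into the Scunthorpe Problem, here is a minimal Python sketch of it. The one-word blocklist is a deliberately mild, hypothetical stand-in and has nothing to do with Lemmy's actual filter; the point is only that naive substring matching flags innocent words that happen to contain a blocked string.

```python
import re

# Hypothetical, deliberately mild blocklist used only for illustration.
BLOCKLIST = ["hell"]


def naive_filter(text: str) -> bool:
    """Flag text if any blocked string appears anywhere, even inside another word."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKLIST)


def word_boundary_filter(text: str) -> bool:
    """Flag text only when a blocked string appears as a whole word."""
    pattern = r"\b(?:" + "|".join(re.escape(word) for word in BLOCKLIST) + r")\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


if __name__ == "__main__":
    innocent = "Open a shell and say hello"
    print(naive_filter(innocent))          # substring match hits "shell" and "hello"
    print(word_boundary_filter(innocent))  # whole-word match leaves it alone
    print(word_boundary_filter("What the hell"))
```

Word boundaries reduce the false positives but don't remove them: a string that is a slur in one language can be an ordinary word in another, and no pattern can tell a quote or a lyric discussion from abuse.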
I think we have to remember that this isn’t a commercial product, it’s a small project. They had a quick and dirty solution and weren’t willing to abandon it but also weren’t initially willing to put in the time to make a more robust solution.
This seems to be inaccurate. When I go to join-lemmy.org, and click join a server, the first servers on the list are at least semi-randomized recommended servers. Every time you refresh the page the selections change.
As far as the “popular” list, which is placed below the recommended list, lemmy.ml doesn’t have any special privileges there. It just happens to be the most popular. If something becomes more popular it will go on the top.
There's nothing stopping you, the source code is freely available to edit when you start up your instance. It would probably limit the appeal of your instance to many people and maybe some very ideological smaller instances might defederate, I doubt the other major instances would care much since only those signed up at or browsing through your instance would see the ads.
I see absolutely no reason why an instance might not decide to fund itself on local ads. I see no reason why you couldn't choose an ad-supported or non-ad instance
Forums used to be ad supported, nothing particularly wrong with being ad supported. Problems occur when your investors expect 10x return on something ad supported.
But to have something pay for itself? I'd rather not lose money on it.
Actually, there is something wrong with ad supported platforms. Advertisers start imposing restrictions on the actual content, or the owners of the platforms enforce restrictions pre-emptively so it never arises.
I was on a forum, and the kind of language and even topics for discussion were severely restricted, partly because of advertising.
I mean I consider this a huge win: delete your Twitter account and never again will you be tempted to go read a tweet. If only I could set an anonymous expat cookie for all the services I've left behind letting them know "No, seriously: I left and I'm never coming back. No reason to track me, show me your content or ask me to login." Where's my restraining order cookie telling Facebook to fuck off outta my life, never to return?
Was using Fritter for Twitter and Infinity for Reddit. Both apps allowed local subscriptions without forcing a login. They were perfect. Both now dead in the water for me.
I’m so jaded with the internet at this point. Most phone apps and games are a microtransaction hell. Most social media apps are a race to the bottom with clickbait. Every website is littered with ads and popups and cookie banners.
Time to go outside and forget that the online world exists.
There is an unbelievable amount of content being made where the value is so absurdly low that I welcome this change. For everyone that wants to set up a copy/paste version of a CRUD app or YouTube channel, that is a ton of time and effort being wasted. I’m not saying we should be focused on solely optimizing everyone’s time and efforts, but it’s clear we have tipped the scale too far with how much time and effort is being dedicated to bullshit.
If we as a society decided to channel these efforts into building infrastructure improvements and homes, I think this would help a lot more. I understand the biggest problems in that respect are legal and cultural, but I can’t help but feel people have tried nothing and are all out of ideas.
I have Starlink, which gives me a US IP when I connect to it, and I was surprised to see the amount of ads people there are subjected to. Most don’t do business where I am, so their free tier is effectively without ads for me (except Twitch, which chooses to show me US ads for some reason).
Reddit's Eternal September began no later than 2018 when Tumblr banned porn. What's happening now is that it has hit an apparent growth ceiling (most of the most upvoted posts are from more than a year ago) and the owners have realized that they have to make it profitable now or never.
As long as the site is growing, it doesn't have to be profitable. But when the music stops...
I think the big problem with profitability comes directly from why the userbase grew so large to begin with. People use it as entertainment, and are uploading content that is expensive to host. Video and images are orders of magnitude larger than simple text and links to that content. If Reddit didn’t take it upon themselves to host media, they wouldn’t be in such a crunch. They also wouldn’t have grown so large, true, but there was no question of Reddit’s sustainability then. With mainly text and links, weathering the slowdown in growth would be easier.
> This version of the Internet is starting to suck. :(
Totally! The internet was better before pay walls / auth walls.
I get why we're here today, and I get that the last phase was just about acquiring an audience, and getting people hooked, and now this part of getting everyone to pay was always the plan, but this part really does suck.
Just feels like all the services lined up to start shitting on users at the same time. Netflix, YouTube, Reddit, Twitter, NYT (and all the newspapers really)... you can't watch Amazon Prime without having 300 "buy now" buttons in your face.
I remember seeing a documentary way back in 2000 about the internet that said something like "the internet is freely accessible now, but some people believe eventually pay walls will begin to block access to more and more portions of the internet." And I thought how ridiculous, that will never happen, information wants to be free!
Well, looks like I was only partially right. Most access is still "free", but at the cost of enshittification.
Ah, the F2P (free-to-play) model. Indeed I’m surprised that microtransactions and pay-to-win type dark patterns have mostly not been adopted on the internet at large. And loot boxes! Can’t forget loot boxes.
If P2P micropayments could somehow have succeeded, then a different Internet could have been possible. Tipping content creators directly is impossible without megacorporations taking a cut, be it Apple, Paypal, Patreon etc, and their unit economics work better with recurring payments, which lands us in subscription hell.
One would almost be tempted to ask for a cheque in the mail like in the old days.
P2P payments/transaction media have been on the way out for decades due to the State's interest in controlling the financial transaction medium as tightly as possible.
The database has doomed us all! For it is the Seed of all Evil in the hands of the Wicked! ...and arguably the Good, but misguided!
It doesn't solve it at all, I am talking about real money you can spend in a store, not tokens that need to be converted through an exchange into cash each time.
There are stores where you can pay with cryptocurrencies. That said, GP said "almost". We will see if it ever reaches the point where this word will not be needed.
The reason why you can't spend it at the store is because the store doesn't accept it. The reason why the store doesn't accept it is because of the aforementioned scaling and complexity issues.
I'm at least glad that people will start looking into the alternatives.
The next lesson they need to learn is TANSTAAFL. Nothing good can come with an internet where publishers are paid with eyeballs. We need to rescue the idea that "voting with your wallet" is the best and fairest way to have quality content.
Somewhat agree, though the income disparity across nations (and even within) makes pricing somewhat more difficult- but I agree with the spirit of it having to be non-free.
Also killed my IRC bot that scrapes tweets to display them. IMO it won't help Twitter to have no free read API calls; people won't click on every Twitter link just to read two lines of text. I'd only need ~30 a day but I'm not paying 100 dollars a month (?!) just to read a few tweets.
Years ago I was drawn to Twitter over Facebook exactly because you could read without being logged in.
After a few years it had become a hellish place with lots of flames and arguments, but it still had some value.
It became clear that my engagement was mainly to discuss with random people about things knowing they would never change their minds (neither would I) on things like Covid vaccines.
It was a huge waste of time, but I found out that browsing it while logged out would allow me to read without being able to reply to the most stupid comments. Some sort of read-only Twitter.
Now that it is gone, Twitter has become irrelevant.
Fuck.
I guess I'm done with Twitter.
Reddit is in Eternal September. Twitter is login-walled. If HN is next, I'll probably be mostly done with the Internet.
This version of the Internet is starting to suck. :(