I wholly appreciated the openness to accepting bot accounts, and migrated some projects there from Twitter during the big exoduses of the last couple of years. And while it worked for bot purposes and was fun to tinker with etc. (not unlike Twitter, tbf), it was just some server in space, a blip in the fediverse, and the lack of traction and proper network effects for accounts meant it wasn't much use.
I'm not hot on the fediverse in general, and this just sours me on it a bit more. A bunch of dedicated admins keeping instances going, basically running hobby servers/websites like it was the 90s/early 00s, is never gonna work for the kind of scale services grow to these days. I know not everything requires scale, and lots of ppl are happy existing in their little silos, but that's just it: they're silos. Might as well be back on separate forums for our separate interests again. When you want the power of a mix of accounts/networks/interests, everything balloons and can't be run without funds and larger centralization. Sigh. It's a tough one that has yet to be solved in full; all the existing approaches are sort of half-solutions. Maybe that's the way forward in general (an internet of islands), but it sucks to have things going up and down and to have to migrate around the net (with or without our own data) like nomads.
> When you want the power of a mix of accounts/networks/interests, everything balloons and can't be run without funds and larger centralization.
I don't really get the appeal of this. Different topics/interests often demand different moderation and forum features. Trying to shove everything into the lowest common denominator of social media results in things like people posting essays as screenshots of Apple Notes.
The only obvious benefit I see of this kind of large scale centralization is for marketing. And that's not a benefit to me as a user.
The benefit of a centralized forum like Reddit, for a user like me, is the ability to use one account for everything (i.e. easy to join new forums) and the ability to have everything on one algorithmic or chronological feed.
Traditional forums can solve the former by supporting social sign-ins, and the latter by having RSS. However, support for either of these is inconsistent, so it's usually easier to just use subreddits.
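To make the RSS half concrete, here's a minimal sketch of a unified chronological feed built from several forum feeds. It assumes the feedparser package, and the feed URLs are made-up placeholders:

```python
import time
import feedparser  # pip install feedparser

# Hypothetical forum feeds; any RSS/Atom URL would work here.
FEEDS = [
    "https://forum-a.example/latest.rss",
    "https://forum-b.example/posts.atom",
]

entries = []
for url in FEEDS:
    feed = feedparser.parse(url)
    for entry in feed.entries:
        # published_parsed is a time.struct_time (absent on some feeds).
        if entry.get("published_parsed"):
            source = feed.feed.get("title", url)
            entries.append((entry.published_parsed, source, entry))

# One chronological feed across all forums, newest first.
for published, source, entry in sorted(entries, key=lambda e: e[0], reverse=True):
    stamp = time.strftime("%Y-%m-%d %H:%M", published)
    print(stamp, f"[{source}]", entry.get("title", "(untitled)"))
```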
Finally, money. Forums cost money to host, while subreddits don't.
Those are real and well-understood benefits, and they're the reason Reddit won. The costs of moving a community from a forum to Reddit are nowadays mostly understood as well; unfortunately, they weren't as obvious back then.
True, I'd forgotten that's the whole purpose of federation: to create a unified home feed.
But then the Fediverse unfortunately runs up against the classic issue of network effects. For me personally, my Unified Home Feed of Reddit has more content relevant to me than my Unified Home Feed of Lemmy. And it seems it'll stay that way, since Reddit's communities have thousands or millions more subscribers in general.
So, sticking to Reddit is easier. This isn't helped by the admin wars on the Fediverse which introduce messy and silent breaks in the network, often requiring multiple accounts to view everything you're interested in.
To the Fediverse's credit, its network is a lot wider, so it still has some uses, such as hosting communities banned on Reddit. But I'm not sure if some of the more "normal" subreddits have much incentive to move over.
Hmh, that's a drawback to me. I like to have different accounts for different things and keep them more or less strictly separated. I use two subreddits and each has its own account.
On average I'd say account creation has become harder because everybody and her child tries to shunt you into some social sign-in but it still is not difficult to make accounts. "Modern" password managers make it a breeze to juggle many accounts, and frankly, what Netscape Navigator could do almost 30 years ago was already enough for that.
Same goes for forums. phpBB [1] can run on any LAMP potato, and you get a whole boatload of potatoes for 2 EUR/month at Hetzner. I don't know what HN requires, but I guess it's not much more. The hard part of forums is the people needed to keep order, and Reddit is not helping there, quite the opposite.
That's fair. Don't worry, you didn't sound grumpy to me :)
I do acknowledge that there's a big benefit of having different accounts: different personalities! You don't have all your social eggs in one basket, so to speak.
If this is just a matter of motivation or lack of time I can understand, but if cost is an issue, why not just move to hetzner? A dedicated server there can be had for around 40 bucks, e.g.:
€42.48 max. per month.
CPU: Intel Core i7-7700
RAM: 64 GB
Drives: 2 x 4.0 TB Enterprise HDD
That's with unlimited traffic, but no DDoS protection or similar, so I don't know how essential that was at DO. Also, you're on physical hardware, which is always more annoying if you have to call in because of a failing disk. But from my years of experience this is as smooth as it gets: shut down the server, open a ticket requesting replacement ASAP with the drive's serial number, and the server will be up again within 20 minutes. Absolutely acceptable for a side project that doesn't offer anything mission-critical. But I'd really be curious what the bill currently is at DO; maybe you have some monster hardware there that can't be matched here. Genuinely curious.
I think the author assumes the server requirements will continue to increase in the future, if only because of the ever-growing number of statuses, so no amount of server hardware right now is going to put one at ease anyway.
Sure, with 191GB we'd need to know what the growth looks like. But 4TB might work for a good while. Unfortunately we know nothing about the rest. Was it a mid-tier VPS and all the money went into storage and traffic? Then this should be plenty for another couple years at least.
It may also be that the moderation effort required will continue to increase in the future, while the human capacity to do that work is exhaustible, i.e. it may actually decrease at some point. You might want to act on these opposing trends before they reach a crisis point.
I don’t do that because I’m in the US and currently live, work, and operate my servers in the same legal jurisdiction. That’s handy! For instance, if one of my users pisses off Turkey, and they order me to take it down, I can ignore it. If the server were in the same court system as Turkey that may not be so simple.
I’ve run pgbench on some 12 euro/month VM at Hetzner and it outperformed our 18k/year AWS RDS instance. Sure, it isn’t managed, etc., but there is a lot of room between 12 and 1500 euros a month.
Aw too bad, this has been a really useful service. I wonder if anyone wants to pick it up? The post mentions part of the problem is Mastodon's implementation being a poor match to high volume bots. You could imagine other architectures that were more efficient for this use case, it'd be a fun yak shaving exercise.
If anyone needs to migrate their own projects, I've had good luck with feed2toot for posting RSS to a Mastodon account on an ordinary server. It's been around a long time now and seems reliable.
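feed2toot is driven by its own config file, but for anyone who wants to see the moving parts, here's a rough DIY equivalent using feedparser and Mastodon.py; the instance URL, access token, and feed URL are placeholders you'd supply:

```python
import feedparser              # pip install feedparser
from mastodon import Mastodon  # pip install Mastodon.py

# Placeholders: your instance URL and a bot account's access token.
masto = Mastodon(
    api_base_url="https://example.social",
    access_token="YOUR_ACCESS_TOKEN",
)

feed = feedparser.parse("https://example.org/feed.xml")
for entry in feed.entries[:5]:       # only the newest few entries
    status = f"{entry.title}\n{entry.link}"
    masto.status_post(status[:500])  # Mastodon's default limit is 500 chars
```

A real bot would also track which entries it has already posted (feed2toot keeps a cache for exactly that) so restarts don't produce duplicates.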
>Over the years, the server has grown to have around a few thousand active accounts, which isn't all that many. However, they've generated something like 32 million statuses. Just to put that in perspective, mastodon.social has over 2 million users, who have generated around 110 million statuses.
unsurprising that the bots would outpace organic users, but wow, what a ratio. i'd be curious to see this data charted over time
It makes sense. Assuming the average bot toots ~once an hour (24 times a day) and has been tooting for two years, you get on the order (nearest multiple of 5000) of 20,000 toots per user, which works out to 1600 users.
Also, there's a change in measurement from active accounts to plain old users. I don't know the proportion that are active, but if I recall right, the fediverse as a whole had under 1 million active users. Assuming 500,000 active accounts that pull all the weight, it's 220 toots per user on average.
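A quick back-of-the-envelope check of both estimates above:

```python
# Bots: ~24 toots/day for ~2 years.
toots_per_bot = 24 * 365 * 2              # 17,520, on the order of 20,000
bots_needed = 32_000_000 / toots_per_bot  # ~1,800 bots to produce 32M statuses

# Humans: 110M statuses over an assumed 500,000 genuinely active accounts.
toots_per_human = 110_000_000 / 500_000   # 220 statuses per active account

print(round(bots_needed), round(toots_per_human))  # -> 1826 220
```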
"I live in fear of an AI scraper figuring out how to scrape all of these files and bankrupting me overnight."
What is the best way around this for a hobby project similar to botsin.space?
I don't mind the service going down in case of a DOS attack. I want to handle TLS myself though (so no free Cloudflare).
Most important thing is my good sleep at night, so no fine print that allows the provider to pass on the cost to me in case something goes wrong. (If that means higher fixed cost, that's how it is, I'm not asking for a dream house, just reliable cost control).
I could see a huge business opportunity here. Amazon S3 has a concept of "Requester Pays": you can have an S3 bucket that loads of people download data from, yet you aren't hit with the bill; instead the requester pays for the bandwidth.
With AI scraping becoming more of a thing, a cloud platform could roll out a feature where an AI is allowed to scrape the daylights out of your site, but they must pay for the bandwidth or bandwidth + premium.
In this scenario you wouldn't wake up bankrupt but instead with a windfall of cash because TikTok decided to scrape all your stuff.
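The bucket-owner side of this already exists on S3 today; what doesn't exist is the automatic "scraper pays" billing for anonymous crawlers. A minimal boto3 sketch of the existing feature (bucket and key names are placeholders):

```python
import boto3

s3 = boto3.client("s3")

# Bucket owner: make requesters pay for data transfer out of this bucket.
s3.put_bucket_request_payment(
    Bucket="my-dataset-bucket",  # placeholder name
    RequestPaymentConfiguration={"Payer": "Requester"},
)

# Requester: must explicitly acknowledge the charges on every request,
# and must be an authenticated AWS principal (anonymous access is refused).
obj = s3.get_object(
    Bucket="my-dataset-bucket",
    Key="dump.tar.gz",
    RequestPayer="requester",
)
```

The authentication requirement is the gap: a scraper has to opt in with its own AWS credentials, which is exactly what a rogue AI crawler won't do.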
One of the reasons I maintain a node with only one user is I fear the day I'll be responsible for other people's social media presence; I could easily see myself going "It's just a few thousand users" and the next thing I know I'm asking whether I can keep this thing going (and agonizing over what it'll do to my users to cut the service). And unlike Colin, I despise Rails and wouldn't have the patience to hammer on it when it starts to misbehave.
Not decentralized: federated. Some instances are a lot bigger than others, and some work in different ways, such as one instance with lots of bots making procedural content.
Why are these Twitter clones always so resource intensive and finicky? Twitter is just "IRC in reverse": if you take the literal descriptions of the relations in IRC(v2) and move the nouns around, lots of it should apply to Twitter/Bluesky/Mastodon well. RDB gurus can probably recreate lots of the APIs as tables on bare MySQL too.
It doesn't make sense to me that such a thing takes so much dev and ops cost compared to IRCv2 servers, other than for the fact that modern webdev just so happens to be extremely bloated, especially when extremely competent and high-spirited developers are giving up like this.
Are we doomed to keep adding more RAM and more disk and more bandwidth to catch up with ever-growing bloat?
>extremely competent and high spirited developers are giving up like this.
I'm pretty sure the median IRC server runs for much less than 7.5 years. I don't think anyone expects volunteers to dedicate decades of their life to admin duty, and it seems fine and healthy for the ecosystem that he is telling people they have a few months to move their bots to a different server.
I didn't read TFA, but was there a way to find the migrated bots? Honestly, for some quirky reason, the bots are a big chunk of my enjoyment of social media (from Possum Every Hour to randomly generated three-body simulations of suns in 3D to flight tracking to weather alerts to CO2 levels), and I have so much enjoyed botsin.space bots since joining Mastodon (which is by far the most enjoyable/least addictive/least evil social media I've found).
The post shows where the cost is: storage and bandwidth. With IRC servers you're not expected to serve all the history forever, with a website around it, persistent subscriptions, outbound queued notifications, etc. On IRC people also pretty much expect missed messages and splits from time to time. Those are very different services.
Yeah but are those valid reasons? Bandwidth is unlimited for most dedicated servers, and 190 GB for 7 years of data isn't a lot; it could fit on my phone 5 times.
190GB in the database. That didn't include media. I'm assuming the media part is not served from the same host since that can easily overshadow other traffic.
Not related to Mastodon, but in the case of Matrix, the server software ranges from "runs on a raspberry pi with zero issues" (Conduit) to "even with 16 GB of RAM, federating with a large enough room will exhaust Python's heap" (Synapse).
In the case of Conduit, a Matrix server with a few private rooms and users consumed only 32 MB of RAM, using RocksDB for storage. The equivalent on Synapse required about 5x as much memory, despite using SQLite. In practice, Synapse instances will use Postgres since many appservice plugins specifically require Postgres and don't support SQLite. Not to mention, SQLite isn't optimized for frequent, concurrent writes.
I do sincerely think the choice of Rails, and the fact that Ruby only recently got a JIT compiler people actually use, mean that most Ruby programs require fairly beefy processors and plenty of memory in order to keep up with a few hundred clients.
Of course, I am extrapolating based off my experience running Synapse (a Matrix server) with Postgres. There is a chance Mastodon scales much better.
From what I understand, IRC doesn't hold messages for extended periods of time or allow media uploads, while fediverse does, so that's one big difference.
Do they have to? Twitter does and that's noble, but can't the server, say, logrotate per user, sign that with the webserver's cert, and send it via email, force a download, or push it to an HTML localStorage thing when the user is on desktop, and then forget about it?
it really, really varies by implementation. mastodon is popular (for some reason), but far from the most efficient activitypub server. akkoma derivatives are more limited by postgresql's IO performance than the phoenix app itself. unfortunately, what people know is a really slow rails app.
i haven't personally operated a misskey derivative, but based on my experience writing network servers on node.js it probably performs better than rails XD
the same applies for clients. there are nice native apps and some pretty efficient web clients, but they aren't the default on the most popular server software so nobody uses them.
Read the post. This specific server was created as a "playground for bots", run by one hobbyist volunteer. Nothing in the post is surprising or says anything about the architecture of Mastodon-like software.
I think there are issues with hotspots. The most popular tweets are seen by a big chunk of the userbase, which means they have to operate on a fanout model where each tweet is pushed to individual followers.
I believe IRC doesn't operate like that. The messages delivered to each user don't need to be retained, and I assume the size of the largest channels is in the tens or hundreds of thousands.
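To make the fanout contrast concrete, here's a toy sketch of the push (fanout-on-write) model the parent describes, using plain in-memory Python structures standing in for what would be Redis lists or similar in production:

```python
from collections import defaultdict, deque

followers = defaultdict(set)    # author -> set of follower ids
timelines = defaultdict(deque)  # user -> home timeline, newest first

def follow(user: str, author: str) -> None:
    followers[author].add(user)

def post(author: str, status_id: str) -> None:
    """Fanout-on-write: one timeline write per follower at posting time.

    Reads become cheap (each user's feed is precomputed), but a single
    post by an author with a million followers costs a million writes,
    which is the hotspot problem described above. IRC, by contrast,
    broadcasts to whoever is connected right now and retains nothing.
    """
    for user in followers[author]:
        timelines[user].appendleft(status_id)

follow("alice", "bot")
follow("bob", "bot")
post("bot", "status-1")
print(timelines["alice"])  # deque(['status-1'])
```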
Unfortunately it's written in Ruby, and it has no per-account quotas on the amount of images and videos uploaded or downloaded. It's also not designed to scale horizontally or to leverage any form of P2P.
These clones are often written in unimaginably inefficient languages like Ruby or Python.
If I’m not mistaken, Twitter uses Scala. That would have been a good start. For all the indie-ness, one of these clones could have been written in hand-tuned Rust or C# or Kotlin to respect the resources of people who would run them out of their own pocket. But sadly this has not happened yet.
Do you really think those "extremely competent and high spirited developers" haven't tried? Not only have there been numerous attempts to extend or replace IRC, but those attempts generally understood what is fundamentally different between an ephemeral, room-based chat protocol and a protocol that allows efficient traversal, aggregation, and streaming of a possibly large social graph and its interactions.
Very incisive. Your post got me thinking: rather than a federated system like Mastodon, what sort of protocol could (a) function as a temporary, room-based, privately hosted chat, while also (b) encoding the social graph and aggregated interactions in a distributed way that could be polled by any client? It seems like the past 30 years of designs have gone either for the decentralized, chat-first model or for the centralized, social-first model (federated or not). I'm thinking of what an LLM could do in terms of summarizing and compressing both at the client level, so large aggregate searches would know where to look in a decentralized universe of chat rooms, more or less emulating the data-retrieval functionality of a massive centralized social network...
While technically different, relays in the ATProto protocol serve a similar purpose; a relay can be thought of as a materialized view in an RDB, as far as I understand. So if ATProto proves successful in the future, extensions to relays might make that possible transparently. (One big limitation of relays right now is that they have to consume the entire repository at once, making it hard for individuals to host their own relays.)
What about like just readable fragments of materialized views that were encoded into the messages themselves. So that a sharp local context and a blurrier larger context could be reconstructed from any given message. Sort of like a Mipmap. And with maybe 10% of the messages in a thread you could reconstruct a fairly accurate representation of the whole thread, at least good enough to run a search on. Every client could serve as a relay that stored its own threads and a constellation of associated mipmaps, and, if some were missing messages, it would be obvious which other clients needed to be checked for the missing portions the next time they logged on. Old/archaic data could be warehoused by clients that chose to do so. No central servers at all, you just crawl client to client looking for the connections, and build your own graph based on what you're looking for.
Damn, I asked the other day if it was down after noticing a ton of timeouts in sidekiq. Reached out and Colin said they were looking into it. Guess fixing it up just proved to be too much.
File transfer costs on a typical provider will absolutely eat away at any money you could save through CPU or memory efficiency gains, I think. If you move to a different provider (like others have mentioned) to fix this then you'll often get good CPU and memory to go with it so the whole calculus changes. Those two resources are often the cheapest in the whole stack, for better and worse. You'll save $20 by using a smaller droplet but still pay $80 for outgoing traffic and another $30 for disk storage, and those are the costs that increase the fastest. The design of the existing Fediverse means that they just use a lot of bandwidth and storage. I think it would be a wash at the end of the day.
This would work for an instance of human accounts who need real UX. A bot instance should really work with the most basic Fediverse software, if it supports posting, reading, replying, etc.
This is a typical excuse I always hear in defense of Ruby, but I have yet to see proof that it's the case, while there are plenty of arguments demonstrating that Ruby also happens to be a productivity loss the moment a project's scale goes beyond trivial.
It has an embarrassing failure mode that is unthinkable in statically typed, compiled languages. Rails is not even the better choice at its main selling point, developer UX: it has been many years since the rest of the industry caught up and surpassed it. Nowadays, as a primarily C# developer, I'm always baffled by the crutches RoR developers have to deal with - one would just not tolerate these in .NET.
At the end of the day, if every line of code costs 100x more to run, it's very difficult to come up with a good reason why such a Ruby tax is worth it in projects that cannot afford to throw more compute and memory at the problem.
The proof is that Mastodon and his server were implemented in Ruby. They don't exist in another language. And Ruby is the same language he uses at his day job; according to him, that's part of what compelled him to work on it at all.
The fact that he built it and maintained it is the proof for the claim. Without Ruby it simply doesn't exist, so it doesn't matter how much hypothetically better your favorite language is.
No relationship to the author, but this would turn it into running a commercial service.
So, at a rough guess you need:
- Some kind of company entity for all this to belong to, possibly an LLC, and all the associated paperwork that goes with it.
- Bank accounts and some way to handle card payments
- Some level of requirement to provide customer service/support, at least for billing related issues
- Have to now also deal with card fraud, refunds, disputes, charge-backs - even if you use some service that will handle most of it, you'd still need some level of involvement
- Have to handle billing related tech stuff - issuing bills, ensuring accounts are activated/deactivated based on billing events
- And now you need to charge enough to cover all of the above, and your time, and the time of any professionals involved in the above
Starts to sound like at least a part time job. The OP may not want to go there.
For my personal projects that I provide as a service to others, I do them for the fun of it. In the past I've bailed or cut access to them when they've started to feel like a job.
Isn't there a nonprofit association aimed at making the managing/finance side of such small projects easy? I mean, there are all kinds of orgs for free software and other hobbies. Having a dedicated organization for collecting small amounts of money or donations and handling taxes worldwide for small projects might be something useful, I guess.
Federated networks like Mastodon and Lemmy are going to get people well-acquainted with websites shutting down. It's hard work (time, money, etc.) to run these things for people, and people start to really lean on them.
It's almost novel nowadays getting sucked into something that shuts down. killedbygoogle.com is a meme partly, I think, because websites shutting down is just so uncommon in areas that we get personally invested in.
I run my own Lemmy instance just for myself, and even that can be trying sometimes. I enjoy using it instead of Reddit, but one day I will probably shut it down and be sad.
I'm starting to think nostr was barking up the right tree after all. Put as much complexity into the client as possible and make the servers dumb, completely uncoordinated, and utterly interchangeable. Spam your broadcasts to any relay that will listen. No idea if it actually works (I've read a lot about nostr and AT proto but never used either of them), but I think it's very obvious that any system where everyone becomes extremely reliant on some company (including AT proto/Bluesky) is only a couple of steps away from the same sorts of problems as centralization.
Of course, the real gold standard would be P2P, if only it could work. But... mobile phones can't burn battery running P2P clients in the background, everyone's under a NAT these days, and some types of software (like microblogging networks) would be horrifically intractable as a P2P system.
Oh well. At the very least, I really love the concept of Decentralized Identifiers (DIDs). I'd like to see more stuff like that.
The atproto team came from the p2p space. We had a lot of experience running client-side computation. There are challenges you can try to solve in that design -- key sync, key recovery, reliable hosting, reliable discovery, etc -- but what you can't get around is the need to run queries against aggregations of data. Even scales that we consider "mid" start to strain on the client-driven model. Federated queries might be able to solve it, but it's extremely challenging to get reliable performance out of that. The design we landed on was replaceable big nodes.
The common comparison people raise is email/Gmail, but we used the DID system and account portability to try to get a better outcome on provider migration. It's hopefully more like web/Google -- which still has the centralizing pressures of scale benefits, but hopefully injects enough fluidity into the system to move the situation forward. Sometimes you pick the design that moves the ball down the field, not the one that guarantees a touchdown. If somebody wanted to improve on what we've done, I'd tell them to focus on the federated queries problem.
In theory AT proto doesn't seem like a bad design. I've read a fair bit of the docs, although mostly skimming. (I've been meaning to read the paper on Merkle Search Trees so I can figure out what exactly is going on with PDSes.)
On the other hand, in practice it seems like the AT proto infrastructure is still very centralized for now. DIDs are excellent in theory, but everyone is using PLC DIDs, which depends on the centralized plc.directory. You can run your own PDSes, but there's only one relay for now (that I am aware of.) I also don't think there is more than one instance of the Bluesky AppView, and the official instance is locked into the Bluesky Moderation Service, which seems to limit the usefulness of some of the censorship resistance of the protocol.
I'm not sure how much of that is social problems rather than technical, but I worry that since Bluesky and AT proto are gaining massive popularity with this status quo (millions of users!) it'll have a problem akin to Matrix.org, where in practice almost everyone is using the same infrastructure anyways.
It's still relatively early days, but millions of people is definitely enough to where you start to hit problems with having everyone under one roof. I really hope we get to see how things play out when more of the network is operated independently.
The necessary future is more providers, more relays, a move of PLC to an independent body, and more DID methods.
I will also say - there are ~100 self-hosting PDSes in the network, about 25 relay consumers, 3 alternative appviews that I know of (smoke signals, frontpage.fyi, and whitewind), the firehose & backfill are fully available, the specs are getting fairly complete, and the software is open source. This is a priority for us.
fwiw roughly 50% of Matrix is on the matrix.org instance currently. we consider this a bug, but also prioritise ease of onboarding over decentralisation purity ideology.
I hate to be a downer but there's a lot of things Matrix could prioritize over decentralization. That said, the decentralization works pretty badly. Large federated joins are somewhere between comically slow and horrifically slow. Status does not seem to work well across federation either.
I'm also a bit miffed that Dendrite was positioned as a "next generation" Matrix server but now it feels nearly orphaned with missing support for newer features, issues with various appservice bridges, few updates at a very slow pace, and no migration path out in sight. I know it came with a fair number of disclaimers, but that still bums me out as it seemed like it would be okay for a small non-critical homeserver, and now it seems likely I'll have to engineer my own path out when clients finally stop working with Dendrite. (It already happened once...)
You have no idea how bad I want to love Matrix, but frankly if it was due to a focus on usability that decentralization "purity" has suffered, it simply does not show in the resulting usability improvements over years of time. Sorry to be harsh.
I share your feelings. I had switched to Dendrite 1 year ago and its flaws made me open my eyes to the major problems I had with the design of the protocol.
I started to worry that it would not improve and I reexamined XMPP.
Finally I switched back to XMPP, which I had abandoned more than 5 years ago, because I think there is a better chance that the community will finally offer clients with the important features, on all platforms, in the coming year, and that it will last over time.
I did see the Matrix 2.0 announcement, though for obvious reasons I can't actually use any of the features listed in it. Obviously, improvements to the basic chat functions of Matrix would be great. For now though, I am stuck with the reality that joining a large federated channel sometimes takes more than 6 hours. I wish I were exaggerating.
edit: I guess though that faster room joins weren't a part of Matrix 2.0. Actually, I don't really have a huge problem with the sync taking too long personally. So maybe Matrix 2.0 wouldn't bring that big of an improvement for me anyway.
I, too, ran my own instance. I enjoyed it for some time but I've now moved to the omg.lol ecosystem. I feel that by paying a little money for it that I have a higher chance of the server not shutting down.
ActivityPub seems to require a lot of hardware resources in order to run properly, which is unfortunate. It’s not something I would ever want to run myself, especially to the public.
Please be wary of conflating ActivityPub with the code on top of it, like Mastodon for instance, which is, on both the front and back end, proven to be resource intensive and difficult/costly to scale, especially over time (the older the DBs get, the more inactive users pile up, etc.).
Versus something like Pleroma, which I've used since its inception: incredibly janky and lightweight, prone to breaking, but later versions have ironed out most of those catastrophic bugs. It has its own challenges as well, but it historically scales better, is more flexible, and is less intensive per instance.
One demands a lot of money and time, where the other demands a lot of time and not so much on the money side.
I'm not going to spend the time to give you a history of pleroma/mastodon instances, as it's a controversial history at best and there's a lot of people who know little, yet who will believe themselves an oracle. (ofc that could also be me so take it with a grain of salt and all)
Though if you are willing to read through a bunch of highly opinionated accounts, and you pay close attention to what actually happened, the answer is pretty clear.
ActivityPub is intensive, but not the main culprit.
I'm guessing the syncing process between instances is really brutal on resources. Imagine constantly syncing external databases for your service to function properly.
AP is push based (which actually causes the "every instance has its own set of comments" problem). You can run pullers on a small instance to get a better experience if the remote sides support listing posts, but the standard sync process is no more than receiving HTTPS calls and storing the JSON contents in the right place.
There's some additional overhead (doing HTTPS calls for verifying signatures, for instance) but that information can be cached pretty effectively.
Pushing content is no more than making HTTPS POST calls to the servers of everyone in your followers list, and possibly exposing said content in a GET API for pullers, though that's entirely optional.
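As a rough sketch of what that push amounts to: the activity is a JSON document POSTed to each follower server's inbox. This is simplified (real servers must attach an HTTP Signature header, which is omitted here, and the URLs are placeholders):

```python
import requests

# A minimal ActivityPub Create activity wrapping a Note (placeholder URLs).
activity = {
    "@context": "https://www.w3.org/ns/activitystreams",
    "id": "https://example.social/statuses/1/activity",
    "type": "Create",
    "actor": "https://example.social/users/bot",
    "to": ["https://www.w3.org/ns/activitystreams#Public"],
    "object": {
        "id": "https://example.social/statuses/1",
        "type": "Note",
        "content": "Hello, fediverse!",
        "attributedTo": "https://example.social/users/bot",
    },
}

# Delivery: one POST per follower server's inbox.
for inbox in ["https://other.example/users/alice/inbox"]:
    requests.post(
        inbox,
        json=activity,
        headers={"Content-Type": "application/activity+json"},
        # Real implementations sign this request (HTTP Signatures) so the
        # receiving server can verify the sender before storing the JSON.
    )
```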
Mastodon is heavy because of the way the backend is written (I blame Ruby on Rails for that one), but there are fully featured ActivityPub servers out there that are orders of magnitude more efficient. Mastodon devs prefer ease of development over performance, but that's a choice, not an inherent problem of ActivityPub.
Besides the monetary costs of operating small fora, there are also significant competence hurdles. Site owners who manage to hit a couple thousand users have to figure out spam handling, automated content moderation (including PhotoDNA and the required reporting if they host images), registering a DMCA agent with the Copyright Office, setting up an LLC, assessing their needs under COPPA, GDPR, and CCPA, their site's tax situation, and anything they may want to do involving employing others (such as T&S), all without getting burnt out. The median size for a forum getting its first subpoena is 4,300 users.
Managing all of that is easily learnable in a couple months if they have time, disposable income and few distractions but surprisingly few people who have site management thrust upon them know about these things in advance. To most people who think about running an internet anything the above are unknown unknowns. You can't go looking for things you don't know exist, so burnout is high.
Citation highly needed. I'm close to quite a few people who run larger instances than that, including my own, and none of them have ever told me about getting subpoenaed. That's exactly the kind of war story we'd tell each other, too.
I've seen no evidence that running a fediverse server is nearly so legally fraught.
I'll go looking for the citation, and it may take a couple hours, but I'll point out that FWIW my info is drawn from an academic survey of vBulletin and XenForo site owners in the 2010s, so I wouldn't be surprised if the numbers aren't applicable anymore.
No problem. I thought that seemed on the high side but didn’t have any stats to counter with. Which paper does the 4,300 figure come from? I’d like to squirrel that away for later.
Eh, it's not new to federated sites. Many of the websites I frequented a decade ago are dead. Or enshittified into uselessness, so I rarely visit them anymore.
It was sad to see, say, the "User Friendly" webcomic go away. But I enjoyed it at the time. Just live in the moment. Don't expect even the big websites and apps of today to last, at least not in the form you enjoy.
> Federated networks like Mastodon and Lemmy are going to get people well-acquainted with websites shutting down. It's hard work (time, money, etc.) to run these things for people, and people start to really lean on them.
Mastodon is filled with such utterly basic UX issues. You move instances because the old one announces a shutdown? No old posts visible, no import possible. You want to see the history of an old account on another instance? The oldest toot you'll see is the first one that your instance picked up from that account. You have to switch to their instance to see older toots; there's a helpful link at the end of the feed, but it's still annoying. "Trending" topics only carry stuff happening on your server, and most of it is days-old garbage. Search is horribly broken and inconsistent.
> You move instances because the old one announces a shutdown? No old posts visible, no import possible.
But this isn't "utterly basic" to solve on the backend due to how ActivityPub (currently) works. First you have to allow backdated posts [0] (not supported in the spec), which requires a mechanism to stop them being sent out (else they'll appear in current timelines [0]), but you also need a mechanism to send them out to update the old URLs (except how does the new instance know where the old instance sent the status? And how do you prove that you have the right to even request the change?)
These are probably not insurmountable but they do require a lot of thinking about!
[0] I ran into these importing an old Twitter bot into my Akkoma instance. I had to modify the server code and it was not a fun time.
> But this isn't "utterly basic" to solve on the backend due to how ActivityPub (currently) works.
They're basic for the user. I know a few people who left Mastodon for good after the second or third time they had to shift servers. That kind of stuff should have been thought of from the beginning...
> to update the old URLs except how does the new instance know where the old instance sent the status? And how do you prove that you have the right to even request the change?
The same way an account move is currently reported to the instances where followers reside and handled there - the account-move operation would only need to do a full re-scan of the old profile. That's a ton of traffic for people with followers from many instances, I agree, but the source instance could trigger the creation of something like a data dump that destination instances can download without hitting the API.
> something like a data dump that destination instances can download without hitting the API
That gets you the old statuses, great. How do you then insert them into your existing instance? You can't just repost them, because they'll appear with new timestamps (bad). You can't just repost them with old timestamps, because servers and clients assume "just arrived == now" (bad). If you're using sequential IDs on your status table, good luck with that, because I'm pretty sure someone has taken the shortcut of using those instead of the timestamp.

Assuming you can work all those out, now you need to update the old URLs in the follower timelines to point to the new URLs (unless we punt on this and just let the old timeline sit around as is). Except you don't know who got those statuses when they were posted; the old instance would need to have kept all the queue records for every post and be willing to supply them to the new instance. Or you can "eh" that and send them out to the new followers (except we don't want to do that, because it confuses current instances and clients to get old timestamps at a new time), but that doesn't mean everyone will be updated. Or you can try to persuade the old instance to redirect each old status to its new URL, once you've updated the software and protocol and clients to allow for status redirection, obviously, and worked out how to verify that server X is actually allowed to redirect A@Y's old statuses and isn't some hijacker / spammer / whatever, and ...
I've not even given this much thought - I'm sure people who actually dwell in ActivityPub and security worlds can give a much better explanation of why it's not at all easy to implement.
> That kind of stuff should have been thought of from the beginning...
Yep, can't disagree that a whole heck of a lot more thought should have been put into the AP protocol from the start.
Federated networks like Mastodon strike me as being centralization at scale.
They don’t appear to solve any of the power dynamics of users and operators - users are still at the mercy of the operator - and they run on either altruism or monetization.
Mastodon appears to have successfully created N copies of the Facebook problem, which is definitely better than where we were.
I like Mastodon, it’s the only Twitter like thing I use.
But I think this just reflects the facts. Centralization works and is highly preferable for many users. Just like in the only big federated service: email.
Yes you can run your own. But there are a lot of costs in terms of time/complexity/knowledge/trust to that.
Outsourcing it to someone else is really nice.
You don’t need one big instance like Twitter was. Having a small handful of big ones works well too.
But the dream some people seemed to have where everyone should run their own instance alone or with a few friends was never going to happen.
> Centralization works and is highly preferable for many users.
I don't think users care about that at all, and if they have it explained to them, they hate it. I think the real problem is that we haven't decentralized ownership and decision-making; instead we shattered big dictatorships into little fiefdoms, often run by local gangs (as one would expect). Arguing that federation should automatically solve our problems with social media is like the US argument for "states' rights." You had one problem; now you have 50.
This is also exacerbated by the fact that people can't migrate. Enabling that would seem like it should be a developer priority, to allow competition between instances, but instead people get irritated when asked about it at all. Every post locks you further into a particular instance. If people could leave on a whim, bad instances would starve. Instead of people being able to vote with their feet, the politics of Mastodon all revolve around punishing other instances for various examples of wrongthink by defederating. So now it's little fiefdoms at war with each other, you have to be in the in-crowd of your likely randomly chosen instance to have a say about it, and if you leave you lose everything.
My experience is that tons of people heard about Mastodon when the first wave of Musk bullshit hit Twitter and they immediately had an allergic reaction to having to pick a server. They don't realise there's no practical difference from email (which they're already using) but somehow the need to pick a server baffles and confuses the average social media user.
Instead, everyone seems to be joining Bluesky now, which is also federated but doesn't mention it anywhere so users can just join the main instance.
I expect this will cause massive problems in the future when federation will start taking place on a serious scale and the risks of misleading users by using similar usernames on other servers start applying. People don't know the network is federated and there's no easy way to read up about it without diving into dev documents.
Even the “main instance” on bluesky is a cluster of instances; it’s just not exposed in a way that causes the choice issue. And since you have full account portability, if you ever want to change, it’s at least possible.
>Yes you can run your own. But there are a lot of costs in terms of time/complexity/knowledge/trust to that.
>Outsourcing it to someone else is really nice.
Yeah but the key thing is that you can choose your provider. Email isn't a walled garden that can be enshittified because you can just migrate somewhere else - yes it's a huge pain and has a bunch of drawbacks, but you can do it, and people do do it.
Moving to a different Mastodon instance is a way smaller transition than moving from Twitter to another social media platform entirely.
The Fediverse has a bunch of issues but I don't think we should think about it as "running your own", we should think of it as "choosing the provider that best fits your needs", as many have with Gmail.
As always, it depends. I'm on a mastodon instance centered around a fairly specific topic, whose members donate (more than) enough to cover the costs of running the instance.
Of course, it still relies on the benevolence of the guy who runs and maintains the instance. He actually takes a fee out of the donations each month to pay for his time, but it's a token amount.
I think OP’s point is that most users don’t choose to do so. Whether because of lack of ability, interest, time, whatever, people would mostly rather just be users.
This is why I believe that Bluesky and the AT Protocol are a significantly more attractive system than Mastodon and ActivityPub. Frankly, we’ve tried the kind of system ActivityPub offers before: a decentralized server network ultimately forming one big system, and the same problems have inevitably popped up every time.
XMPP tried to do it for chat. All the big players adopted it and then either realized that the protocol wasn’t complex enough for the features they wanted to offer or that it was much better financially to invest in a closed system. Sometimes both. The big providers split off into their own systems (remember, Google Talk/Hangouts/Chat and Apple iChat/FaceTime both started out as XMPP front-ends) and the dream of interconnected IMing mostly died.
RSS tried to do it for blogs. Everyone adopted it at first, but eventually content creators came to the realization that you can’t really monetize sending out full-text posts directly without a click back to the originating site (mostly defeating the purpose). Content aggregators realized that letting people use any front-end they wanted meant they couldn’t force profitable algorithmic sorts and platform lock-in. And users overwhelmingly wanted social features integrated into their link aggregators (which Google Reader was famously on the cusp of implementing before corporate opted to kill it in favor of pushing people to Google+; that could have led to a very different Internet today if it had been allowed to release). The only big non-enthusiast use of RSS that survives is podcasts, and even those are slowly moving toward proprietary front-ends like Spotify.
Even all the way back to pre-Web protocols: IRC was originally a big network of networks where every server could talk to every other server. As the system grew, spam and other problems began to proliferate, and eventually almost all the big servers made the decision to close off into their own internal networks. Now the multi-server architecture of IRC is pretty much only used for load balancing.
But there are two decentralized systems that have survived unscathed: the World Wide Web over HTTP and email over SMTP. Why those two? I believe it’s because those systems are based on federated identities rather than federated networks.
If you have a domain name, you can move the website attached to it to any publicly routable server and it still works. Nobody visiting the website even sees a difference, and nobody linking to your website has to update anything to stay “connected” to your new server. The DNS and URL systems just work and everyone just locates you automatically. The same thing with email: if you switch providers on a domain you control, all the mail still keeps being routed to you. You don’t have to notify anyone that anything has changed on your end, and you still have the same well-known name after the transition.
Bluesky’s killer feature is the idea of portable identities for social media. The whole thing just ties back to a domain name: either one that you own or a subdomain you get assigned from a provider. That means that picking a server isn’t something the average person needs to worry about, you can just use the default and easily change later if you want to and your entire identity just moves with you.
If the server you’re on evaporates, the worst thing that you lose is your activity, and that’s only if you don’t maintain any backups somewhere else. For most people, you can just point your identity at a different server, upload a backup of your old data, and your followers don’t even know anything has changed. A sufficiently advanced client could probably even automate all of the above steps and move your whole identity elsewhere in one click.
Since the base-level object is now a user identity rather than a server, almost all of the problems with ActivityPub’s federation model go away. You don’t deal with blocking bad servers, you just block bad people (optionally using the same sorts of “giant list” mechanisms already available for places like Twitter). You don’t have to deal with your server operator getting themself blacklisted from the rest of the network. You don’t have to deal with your server operator declaring war on some other server operator and suddenly cutting you off from a third of your followers.
People just publish their posts to a server of their choice, others can fetch those posts from their server, the server in question can be moved wherever without affecting anything for those other users, and all of the front-end elements like feed algorithms, post display, following lists and block lists, and user interface options could either be handled on the client-side or by your choice of (transferable) server operator. Storage and bandwidth costs for text and (reasonable) images are mostly negligible at scale, and advertising in clients, subscription fees, and/or offering ancillary services like domain registration could easily pay for everything.
ActivityPub sounds great to nerds who understand all of this stuff. But it’s too complicated for the average social media user to use, and too volatile for large-scale adoption to take off.
AT protocol is just as straightforward to understand as email (“link a website domain if you already have one or just register for a free one on the homepage, and you can easily change in the future”), doesn’t require any special knowledge to utilize, and actually separates someone’s identity and content from the person running the server. Mastodon is 100 tiny Twitters that are somewhat connected together, AT actually lets everyone have their own personal Twitter and connect them all together in a way that most people won’t even notice.
Good post of historical reminders, and I appreciate the framing of Bluesky's identity approach. I was never sold on the Fediverse/ActivityPub being it, and I'm not yet a fan of Bluesky's slow building-in-public approach, but I'm intrigued by this key facet of the central role the personal domain takes. How can one easily change/migrate their AT identity if they change domains? How is their whole social history transferrable? That was one of the problems/unclear things to most about Mastodon: it actually wasn't that easy to move instances, because sure, your identity could move, but your posts would stay on the old instance, so it wasn't really that portable. I'm all about permanence and data preservation, so I don't want to commit to a platform now without assured control over my data and the ability to maintain history/identity in a move. I've enjoyed the centralization and longevity for too long on a place like Twitter to get all loose and ephemeral now.
Type in the domain you wish to change to. Click next.
It’ll give you some stuff to put into a DNS TXT entry on that domain. Do that. Click “verify DNS record.”
And that’s it. You’re done. Everything is “transferred.”
The history is transferable for the same reason a domain is transferable to another web host: what does URL stand for again? Uniform resource locator? That is, it’s how you locate something, not what that something is. In this case, the domain isn’t actually your identity: your identity is your DID, “decentralized identifier.” To hand wave slightly, all your content is signed with your DID information, not the URL you use. There’s a service that resolves domains to DIDs. So changing your domain means changing what that service resolves to. That’s why I put “transferred” in quotes above; when changing domains, nothing actually moves.
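Concretely, the handle-to-DID resolution described above happens one of two standard ways in atproto: a DNS TXT record at `_atproto.<handle>`, or an HTTPS well-known endpoint on the handle's domain. A rough sketch, assuming the dnspython and requests packages:

```python
import dns.resolver  # pip install dnspython
import requests

def resolve_handle(handle: str) -> str | None:
    """Resolve an atproto handle (a domain name) to its DID."""
    # Method 1: TXT record at _atproto.<handle>, value "did=did:plc:..."
    try:
        for rdata in dns.resolver.resolve(f"_atproto.{handle}", "TXT"):
            text = rdata.to_text().strip('"')
            if text.startswith("did="):
                return text[len("did="):]
    except Exception:
        pass  # no TXT record; fall through to the HTTPS method
    # Method 2: well-known endpoint returning the bare DID as plain text.
    resp = requests.get(f"https://{handle}/.well-known/atproto-did", timeout=10)
    return resp.text.strip() if resp.ok else None

print(resolve_handle("bsky.app"))  # e.g. did:plc:...
```

Pointing a new domain at your existing DID is just publishing one of these two records, which is why the parent's "nothing actually moves" holds.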
Now, if you want to change the server where your data is hosted, your PDS, it’s effectively the same thing: you spin up a new server, backfill your data by a backup or by replaying it from the network, and then say “hey here’s a new PDS” to the network.
All of this is possible because of the fundamental design choices atproto makes over the ones ActivityPub does.
Happy to answer more questions. But if data ownership and preservation is a thing for you, you should like atproto.
> But the recent Mastodon upgrade has caused a significant amount of performance degradation, and I think the only way to really solve it is going to be to throw a lot of money into hardware.
I found that the latest upgrade also made some odd UX decisions. Content warnings got a weird new styling, and it's no longer clear how to hide images separately from hiding the text.
Are the mastodons okay?
There are good things too, don't get me wrong, like grouping notifications instead of getting a notification flood on a popular toot. That's nice. But what's up with perf regressions and (in my opinion) UX regressions?
The content warning styling may be reverted (it's already done in `main`), and we are not aware of any performance issues with 4.3; in fact, all the feedback we got says the opposite. I am really curious about what is happening here and have asked the admin to provide more information.