How Does Bluesky Work? (steveklabnik.com)
248 points by steveklabnik on Feb 24, 2024 | 91 comments


This seems incredibly well thought out, and seems like it solves some of the bigger issues with ActivityPub. I had no interest in Bluesky before; now I think I'll make an account.


My main problem with Bluesky and the AT protocol is not their apparent technical merits so much as the strong "not invented here" approach. Like their "domain as identity" instead of using user@host like everyone else and established protocols like WebFinger. Like using their own mechanism for that instead of looking at how ACME already does it for certificates, leading to someone owning the s3.amazon.com domain [1]. Like using their brand-new RPC protocol, "XRPC". Like whatever Lexicon is, a weird code-generation-based data schema whose only benefit over protobuf and JSON-LD is that it's newer.

It is hard to trust an "open network" that demonstrates time and time again that they don't respect and don't want to talk to anyone else. It also makes you wonder how many more common mistakes have slipped in the design simply because they reject common knowledge.

[1]: https://www.reddit.com/r/programming/comments/138hlf2/s3_dom...


Your comment could have been more constructive if you had included more "common mistakes" than things you just don't like, the only exception being handle squatting. Like, what is a specific benefit of `user@host` as opposed to a single domain---which can be generated on demand---as an identity? (And who is "everyone else" here? Almost all non-Mastodon social networks don't use it.) Also, the value of a common schema language is not that high, especially when that language is coupled to the serialization format and prevents any improvement.


"value is not that high" is exactly the kind of behavior I'm talking about.

What's the benefit of user@host? Deployment simplicity: you don't need wildcard domains or wildcard certificates, or, worse, DNS API access; you can just self-host one app at one domain like everything else. As for who uses this format: email, Matrix, XMPP, Mastodon/ActivityPub, GNU social/Friendica...


If the supposed benefit is only deployment simplicity, the value is indeed not high, because deployment is not expected to occur that many times. Technical problems are among the least important problems in a social network.


Your argument is "the drawbacks of reinventing are not that high", even though I pointed out both usability and security issues. But what's the advantage? Other than giving the (correct) impression that the supposedly-open network doesn't want to have anything to do with anybody else?

You're coming up with excuses, not justifications. And "it looks like it hasn't hurt us too bad yet" is not a great one at that. Self-hosted deployments have barely started rolling out (well, the relay is still centralized).


Do you really want me to use a lengthy bullet list instead of compact prose?

> Deployment simplicity, you don't need wildcard domains or wildcard certificates, or worse DNS API access, you can just self-host one app at one domain like everything else.

At this point, they are not significantly more difficult than single domains or certificates, thanks to Let's Encrypt and ACME. In fact, the requirement for domains and certificates has always been the single biggest hindrance, which inherently prevents self-hosted deployments en masse. Self-hosting is definitely a good option to have, but it is unreasonable to assume that most users can have their own deployment even in the near future. Multiple companies have seriously tried to tackle these problems and have failed so far. As such, individual applications are not in a good position to solve these issues.

> I pointed out both usability and security issues.

You have only pointed out security issues, which I acknowledged as the single actual mistake. And that only relates to the domain ownership check, so it can be fixed without scrapping the entire scheme.

I can't really see which usability issue can arise from `@example.com` as opposed to `user@example.com`. For laypersons who would use federated instances, `@user.example.com` and `user@example.com` is literally a single character difference and doesn't really matter much, except that `user@example.com` is much more prone to being mistaken for an email address (see below). For users with their own domains, any user name is redundant and `@example.com` is definitely superior.

> As for who uses this format: email, matrix, XMPP, mastodon/ActivityPub, gnusocial/friendica...

Among those, email is the only protocol that has achieved common usage, distantly followed by Mastodon. Everything else is a nerdy technology which doesn't count as "everyone else". I should note that I have operated an IRC server for more than a decade, and I'm very confident that public Matrix servers are even rarer than the dwindling number of public IRC servers. IRC was once popular only because there were no alternatives, not because it was open (mIRC was the dominant implementation in its heyday anyway).

And even Mastodon doesn't exactly use `user@example.com`; it uses `@user@example.com` with a `@` prefix, presumably because that allows for shortened handles (`@user`) while avoiding confusion with email addresses, but the same can be done with a domain-only handle if desired. Like, that has been a feature of DNS the entire time. The fact that neither Mastodon nor Bluesky uses the exact email address format shows that the exact handle format is not very relevant, as long as it can account for distributed registration and verification and can't be confused with existing usages (i.e. email addresses).


You counter "it makes deployment difficult" by "lay people won't be deploying, it's too difficult". I can't understand what you're trying to say at all.

People are deploying Mastodon, and it will keep being an actually-federated network, while Bluesky can be a centralized, could-have-been-federated network run by a couple of companies. But is that the goal as you see it?


My counter is rather that deployment was already too difficult for laypeople even without that; see the emphasized word:

> In fact, the requirement for domains and certificates has always been the single biggest hindrance, which inherently prevents self-hosted deployments en masse.

I'm confident that you never satisfactorily answered that, and even more confident in my belief that such a hindrance is inherent and thus should not be the foremost design factor. Unless you have a clear answer that is universally applicable (that is, no "works for me and my friends"; I too have enough technical friends who don't use Mastodon), please refrain from claiming otherwise.


Answered what? I'm sorry that you can't figure out certificates; I don't see how that excuses a system that is made more complicated than it needs to be, and more complicated than anybody else does it, which was my initial claim. Again, you are still giving me more "it's not that bad" or "it's not the main issue", which are not reasons to do the thing wrong in the first place.

This "conversation" has reached my threshold for puzzlement, so I'm dipping out. Let me show you how it feels to me with an analogy:

Some company is making a new revolutionary car. It gets about 10 miles of autonomy, can only turn right, and stalls every 100 feet. I point that out in the comments.

Then lifthrasiir shows up and says that some people have driven it successfully, that you can get anywhere by only turning right with careful planning, that most vehicles don't get a lot of autonomy (only cars and trucks do), that there are commercial services that will get the car where I need it to go (by loading it on a truck), and anyway, if I really want to get somewhere, why don't I take a bus?

Sure, I never claimed that it couldn't be moved, or that it would prevent its owner from getting places. What I said is that it's a terrible car.


You are claiming that a tram is a bad car. It was designed that way for a reason, and you can't (or don't want to) understand it because you like cars so much. Whenever I say "vehicle", you misinterpret it as "car". Of course it would make a bad car, as in your analogy. That's enough puzzlement for me as well.


ATprotocol seems technically superior, but socially set up for failure.

Because there’s no social incentive for anyone to set up and maintain relays and moderation services when they’re divorced from actual communities, what’s left are purely financial incentives: paid service, advertisement insertion, data mining, information manipulation, etc.

Also, the centralization of data distribution and moderation means they are huge undertakings now, both in terms of hosting cost and human cost. It almost forces these services to be commercialized, or, you could say, discourages anyone other than Bluesky from setting up their own relay or moderation service.

For all the faults of ActivityPub, the extremely “gated community” like approach provides the social incentive for users to feel a sense of belonging to a server, and for the server owner and moderator to feel a sense of pride and ownership over providing service to their users. Being within a valued community also incentivizes people to donate to pay for server cost and moderation effort.

ATprotocol will technically allow multiple relays and moderation services, but in reality how many will actually materialize?

As such, I feel judging social networking protocols from a purely technical and adversarial point of view (like ATprotocol does) is a mistake. Sure, you are now no longer beholden to a server, but at the same time you are divorced from a tight-knit community and you no longer have skin in the game.

When you have nothing to lose (data, identity and community wise), that’s a double-edged sword.


I’m still fairly new to the team, and don’t claim to represent their point of view here, but my impression is that services (like moderation services) aren’t intended to be divorced from the communities they’re serving. It’s more that they’re decoupled from the physical data placement. The relationship between users and moderation services would be many-to-many. The incentives to self-organize and to provide a valuable service for your community are still there; it’s just that the notion of belonging to a community is more fluid in itself.


A fluid community would disincentivize a tight-knit community, wouldn't it?

I feel that if you decouple the social reward from the grunt work of moderation and the financial burden of hosting the relay, then the incentives just wouldn't be there to do the dirty work unless it becomes paid or monetized in some way.


> For all the faults of ActivityPub, the extremely “gated community” like approach provides the social incentive for users to feel a sense of belonging to a server, and for the server owner and moderator to feel a sense of pride and ownership over providing service to their users. Being within a valued community also incentivizes people to donate to pay for server cost and moderation effort.

are social incentives going to be enough long-term though?

I keep looking at this and thinking of three things: trust, security, and Reddit. These are actual questions, not arguments, I'm genuinely curious.

Can you trust an admin to host and police your social network? Moving servers isn't hugely difficult iirc, but it could be very disruptive and drama-ful. It would have to get pretty bad before people start to overcome the network effect inertia and actually move.

What is there to hold hosts accountable for security? The threat of reputational damage doesn't seem like it is going to mean much to pseudonymous volunteers, so what can be done to ensure hosts are properly protecting the system from compromise?

Reddit moderators are, by and large, reasonable, but there's a non-trivial minority who will do things like stage a coup against the other moderators, start arbitrarily banning or harassing people, or just straight up ghost the site and never enforce the rules. Same as above: what mechanisms exist to prevent, reverse, or take over from a rogue/absentee mod?


> are social incentives going to be enough long-term though?

I'd say all the long-running forum-based communities prove that yes, it is. Not counting Usenet, IRC, and other "old internet" communities which have withstood the test of time.

> Can you trust an admin to host and police your social network?

That's where the federated part comes in, you choose a server and admin that you trust.

Preventing people from moving servers would be holding them hostage, breaking the fundamental trust of the platform, and becoming public enemy numero uno overnight.

I really doubt anyone would do it, but yes the chance isn't zero. Plenty of people host servers with their real identity and reputation on the line though.

If you are still paranoid you can join a commercial server where by contractual obligation they have to provide service to you and allow you to move away.

> What is there to hold hosts accountable for security? The threat of reputational damage doesn't seem like it is going to mean much to pseudonymous volunteers, so what can be done to ensure hosts are properly protecting the system from compromise?

Again, you choose your server based on your risk tolerance. You can go with the main mastodon.social server, or join a server owned by a real person you feel is reputable/trustworthy, or join a paid server.

And I would disagree about pseudonymous volunteers not caring about reputation damage, the vast majority of people would feel a sense of ownership over their online identity, especially if it has a lot of clout.

> Reddit moderators are, by and large, reasonable, but there's a non-trivial minority who will do things like stage a coup against the other moderators, start arbitrarily banning or harassing people, or just straight up ghost the site and never enforce the rules. Same as above: what mechanisms exist to prevent, reverse, or take over from a rogue/absentee mod?

Reddit mods are not server owners, at the end of the day they know they're just hosting someone else's party so there isn't a true sense of ownership.

But if a moderator decided to go rogue? It's just like a forum isn't it? Either the owner gets rid of them or you're out of luck.

You can limit damage by backing up your followers and follows on a regular basis so you can rebuild your network if you are forced to create a new account on a different server.

With a federated system comes the freedom to choose your server, and the consequences thereof; you're gonna have to take some personal responsibility by choosing a server that has a healthy community and a responsible moderation team.

At the end of the day I don't think a perfect system exists, they are just different sets of compromises.


> For all the faults of ActivityPub, the extremely “gated community” like approach provides the social incentive for users to feel a sense of belonging to a server, and for the server owner and moderator to feel a sense of pride and ownership over providing service to their users.

I really misunderstood what you were talking about until you got here.

This mindset is extremely alien to me, so that's probably why it's so hard to make sense of this.

I am glad that you have a system that works for you, and that I have one that works for me.


I'm actually a fan of POSSE (https://indieweb.org/POSSE) where all social networks are auxiliary to my own website/blog.

After the Twitter debacle I'm done with the idea of being "married" to any network.


Since Bluesky published their protocol, I’ve been thinking about how one might integrate a Bluesky-compatible microblog into one’s own weblog software, POSSE-style.

With ActivityPub I can see it: a form of headless ActivityPub server which provides an Inbox and Outbox, backed by the DB. That way you can publish your own posts both on the web and into an outbox. And even better, you get reactions in your inbox, which you could possibly display on your own web pages with enough determination, similar to how the IndieWeb people do it with WebMentions. After all, every Status in Mastodon/Fediverse has a url property.

With the AT Protocol I’m a little bit at a loss. You could possibly expose your backend as an AT-compatible Personal Data Repository, although that would be a lot of work, exposing a Git-like interface.

But I’m still wondering how one would aggregate reactions. It seems it would not be enough just to have a PDS; you would also have to run an Indexer/Relay, which indexes other people’s PDSs. And not just the PDSs of the people you are following, but also the rest of the world – after all, a Like or such can come from anyone. When running an Indexer you are pretty soon in Firehose territory. Too much for a single-person website.

It really seems there isn’t a way for reactions to swim upstream – the social graph of Bluesky in a way lives in the Indexer/App View world, not on your PDS.

Maybe I’m wrong, but the AT Protocol docs are not very expansive on such points. Neither is Klabnik’s blog. Everyone talking about ATP seems to be excited about the Solid-inspired DIDs and PDS, but never talk about the transmission mechanisms. Whereas I’m sitting here and thinking that an ActivityPub Actor’s Outbox is already a form of PDS – and it has an Inbox. Addressable via the Web and findable via Webfinger.


Reactions would be a Lexicon. They’d then get consumed like anything else.

Or at least, that’s my initial guess, would have to actually work through the design to know if it would work that way.


You’re talking about the data model, I’m asking about the transmission of those and its directions.

“Lexicons” are Bluesky’s Not Invented Here variant of ActivityPub’s JSON-LD meta-syntax and vocabularies. Obviously necessary for the transmission of information, but it doesn’t tell you how that information is transmitted or in which direction.


Sorry, I meant mostly like, "the transmission doesn't work any differently than anything else," you don't need to invent a novel transmission model to make something work.


> But I’m still wondering how one would aggregate reactions. It seems it would not be enough just to have a PDS, but also to run an Indexer/Relay, which indexes other peoples PDS. And not just the PDSs of the people you are following, but also the rest of the world – after all a Like or such can come from everyone. When running an Indexer you are pretty soon into Firehose territory. Too much for a single person website.

You don't need to run a Relay yourself in order to see reactions; you instead subscribe to an existing Relay's firehose (such as the one that BlueSky PBC operates), and ignore anything that isn't "reaction-shaped" (matches the relevant Lexicon).
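A sketch of that filtering step, with made-up event dicts standing in for the real firehose frames (the actual stream is CBOR/CAR-encoded, and the collection names below are just the ones Bluesky's own app happens to use):

```python
# Hypothetical filter over a Relay firehose: keep only "reaction-shaped"
# records, i.e. those whose collection matches a reaction Lexicon ID.

REACTION_LEXICONS = {"app.bsky.feed.like"}  # extend with your own reaction Lexicons

def is_reaction(event: dict) -> bool:
    """True when the event's record collection matches a reaction Lexicon."""
    return event.get("collection") in REACTION_LEXICONS

# Toy stand-in for a decoded firehose stream.
firehose = [
    {"collection": "app.bsky.feed.post", "did": "did:plc:alice", "text": "hello"},
    {"collection": "app.bsky.feed.like", "did": "did:plc:bob",
     "subject": "at://did:plc:alice/app.bsky.feed.post/3k"},
    {"collection": "app.bsky.graph.follow", "did": "did:plc:carol"},
]

reactions = [e for e in firehose if is_reaction(e)]
print(len(reactions))  # only the like survives
```

In practice the decoding (and the websocket subscription to the Relay) is the bulk of the work; the filtering itself is this cheap.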

> Everyone talking about ATP seems to [..] never talk about the transmission mechanisms. Whereas I’m sitting here and thinking that an ActivityPub Actor’s Outbox is already a form of PDS – and it has an Inbox. Addressable via the Web and findable via Webfinger.

Here's my attempt to summarise the transmission difference between ActivityPub and AT Protocol: ActivityPub is symmetric and push-based, AT Protocol is asymmetric and pull-based.

With ActivityPub, your client talks to your server. The server handles both outgoing data (you making posts) and incoming data (viewing other people's posts).

When you create a post:

- The post is placed in your server's outbox.

- Your server looks up the servers for every one of your followers, and places your post in their inbox.

When someone creates a reaction:

- The reaction is placed in their server's outbox.

- Their server looks up your server, and places the reaction in its Inbox.

- Your website checks the Inbox, discovers the reaction, and handles it.

With AT Protocol, your client talks to your PDS and the client's App View. Your PDS handles outgoing data (you making posts), and your client's App View handles incoming data (viewing other people's posts). Relays are what provide transmission between the two: Relays subscribe to your PDS, and App Views subscribe to the firehose of at least one Relay.

When you create a post:

- The post is added to your repository on your PDS.

- The Relay pulls the post from your PDS and broadcasts it on its firehose.

- App Views see the post in the firehose, and include it within the feeds of users that follow you.

When someone creates a reaction:

- The reaction is added to their repository on their PDS.

- The Relay pulls the reaction from their PDS and broadcasts it on its firehose.

- Your website's App View (which might be your client's App View if "reactions" are also posts, or some other App View if this is a different kind of semantic data) sees the reaction in the firehose, and handles it (e.g. includes it within the feed of reactions that your website uses).


That is rather helpful, thanks!

But in a way depressing: I fear subscribing to a firehose is not that realistic for small single person websites. Maybe you can filter the firehose subscription.


Had a similar response. Dusted off my account and finally posted something.


Same, actually. The fact that I can get the @stavros.io handle just by adding a DNS record is a nice touch.
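For the curious, the DNS side of that is tiny: the AT Protocol looks for a TXT record at `_atproto.<handle>` containing the account's DID. A sketch with the lookup mocked as a dict and a made-up DID value:

```python
# Mocked DNS zone; in practice you'd query a real resolver for TXT records.
dns_txt = {
    "_atproto.stavros.io": ["did=did:plc:ab12cd34ef56gh78ij90kl12"],  # hypothetical DID
}

def resolve_handle(handle: str):
    """Return the DID a handle verifies to, or None if no record exists."""
    for record in dns_txt.get(f"_atproto.{handle}", []):
        if record.startswith("did="):
            return record[len("did="):]
    return None

print(resolve_handle("stavros.io"))  # the DID from the TXT record
```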


I believe there’s currently a single relay. When there are multiple relays, each will have a different view of the network, and the one that includes more posts will be better than the others.

It might be quite hard to get relay diversity rather than using a single dominant relay, sort of like attempting to get traction for an alternative DNS system. Then, if your posts don’t get picked up by the dominant relay then it seems like you’re out of luck?

But it’s still very early, so perhaps it won’t work out that way. And even if there is a dominant relay, well, GitHub isn’t so bad, and there are alternatives.


I have previously found Bluesky kind of confusing, but I've recently realized that it was mostly the terminology that was confusing me.

Imagine if you instead referred to the system as

[microblogs] -> [indexer] -> [client]

There's a lot more to it, but I think this is the basics of how Bluesky works. They just call everything something different than you might think it should be called.

The indexer is only necessary because the user IDs do not remain static on the system. Instead there is some weird cryptographic reshuffling going on, where a DID:PLC starts as a substring of a hash and then, because of 'key rotation', the user ID of every user actually changes out from under them over time.

Imagine a system such as LiveJournal, but your handle is constantly changing, so you need an authority to keep track of the handle changing. And of course you are only allowed to post short messages, with no titles or markdown support.

I'm personally waiting for a new DID:PLC where you can get a static pubkey where you sign the messages with a private key because I think that'll be way less stuff to keep track of and it'll make self-hosting and local-first development much easier.


> The indexer is only necessary because the user ids do not remain static on the system. Instead there is some weird cryptographic reshuffling going on where DID:PLC starts as a substring of a hash and then because of 'key rotation' the user id of every user actually changes out from under the user themselves over time.

The user IDs (e.g. did:plc or did:web) do not change over time. You are correct that a user that uses did:plc has that generated from a hash (of their initial repository state), but subsequent additions to or deletions from the repo don't change the DID. Likewise, if you set up your repository with did:web, the domain you pick is fixed, and you lose access to your repo if you lose control of that domain.

What is changing with key rotation is the repo signing keys that authorize updates to the user repo. If you run an entirely self-hosted PDS then you can choose to never rotate that key, and then you have a "static private key" repository. But key rotation is a necessary part of the design, in part to enable account migration (you wouldn't want an old PDS to still be allowed to make changes to your repository, so you'd rotate those keys to remove the old server's key and add the new server's key).
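A toy illustration of that invariant (not the real did:plc algorithm, which hashes a signed, CBOR-encoded genesis operation and base32-encodes it; this just uses sha256 over JSON): the identifier is a function of the genesis operation alone, so rotating keys later can't move it.

```python
import hashlib
import json

def make_did(genesis_op: dict) -> str:
    # identifier = truncated hash of the *first* operation only
    digest = hashlib.sha256(
        json.dumps(genesis_op, sort_keys=True).encode()
    ).hexdigest()
    return "did:plc:" + digest[:24]

genesis = {"type": "create", "rotationKeys": ["key-1"], "handle": "alice.example"}
did = make_did(genesis)

# A key rotation appends a new operation to the log; the DID is untouched,
# because only the genesis operation feeds the identifier.
op_log = [genesis, {"type": "rotate", "rotationKeys": ["key-2"], "prev": did}]
assert make_did(op_log[0]) == did
print(did)
```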


I 100% agree with you, but I think it's an issue of terminology.

I find myself thinking of the pubkey as the id and the DIDs as an alias or a name pointer towards that id.

I think the indexer would need to be less robust if there was no key rotation, and I'm looking forward to that feature.


The comparison here that comes to mind: imagine if GitHub issued you a keypair to use over SSH to push/pull from a repo, but they changed that keypair for you periodically and also kept track of it for you.

I'd much prefer to generate my own SSH key and upload the public key to GitHub, GitLab, Sourcehut, Codeberg, etc. Forgive me if I forgot a forge.


A useful way to think about it is that the PDS is a "user agent" - they act on behalf of the user, and with the user's implicit trust. This is much the same way that a user trusts webserver software (and the VPS running it) to correctly serve their website, hold onto the private keys for the site's TLS certificates, use those private keys to correctly set up encrypted connections to the website, etc.

The AT Protocol itself does technically allow for all private key material to be held at all times by the user, but that means the user needs to be interactively involved in any operation that requires making a signature (and online / available whenever such an operation is necessary, which isn't necessarily the same time as when the user is interacting with a particular app like BlueSky running on ATProto). The PDS software that BlueSky have implemented instead requires (IIRC) that the PDS has the private key for at least one of the public signing keys (but there can be multiple signing keys, so the user can also hold onto one locally).


Yeah, I hear you about the names. When talking about this on bluesky, someone came up with "data store -> indexer -> view", which I think fits a bit nicer too.

https://bsky.app/profile/aparker.io/post/3km6zvwfen727


"data store -> indexer -> view" works very well too. I hadn't seen this post, I guess we were thinking simultaneously.


I would expect clients (App Views) to be able to use multiple relays in parallel if needed. That would be like having a local GitHub UI pulling data from several GitHub instances (“relays”) that may host different but potentially overlapping sets of repositories.

This would be like a Usenet client aggregating newsgroups from several NNTP servers.


More like a local git repo having multiple remotes?


Yes — the canonical source of truth for a user's data is their "data repository" which is analogous to a git repo. These can really be passed around any way you want. Our current relay architecture was designed to support large, real-time scale, but other configurations are possible as well, since all the pieces here are composable.


I would expect every relay to have everything (modulo different opinions about illegal content). Why do you think they would have different views of the network?


Well, they'll try. The relay plays a similar role to a search engine, though without the ranking. Why do different search engines have different content?

As others have said, spam and illegal content are the most likely reasons.

We might also compare to email delivery; maybe ideally every message would be delivered, but in the real world, they aren't, and open relays are a hazard.


It's virtually guaranteed that different relays will take different views regarding disinformation, harm, etc.

This is probably good, too. I personally don't want my hand held, but recognize that others justifiably want someone else to filter out gore, or even opinions they find sufficiently off-putting.

I don't want to ram particular content down anyone else's throat, but I'm 100% opposed to anyone making that decision for me.


My understanding is that relays will not filter anything except really illegal material (child porn). Moderation is based on labeling, not deleting.


Unless they want to provide free file storage to the world... surely they must delete spam.


Is it possible for a relay to filter individual posts, or do the Merkle trees prevent that? (They would if they need to be signed by the originating user.)


I'm part of a marginalised group that is often considered to be illegal in many countries, or at the very least, very off-putting. (In 50 years, perhaps, things will change and we will be more accepted. To be specific, it's a grey area in the U.S.)

I'm interested in BlueSky but the relays are a worrying "point of failure" if they can just block me from there. I'll be more interested if there are multiple relays that I can use in parallel, with at least one being friendly to us, so I can still participate. Currently on the Fediverse it mostly works since we can connect directly to other servers, and there's no SPOF. But I would love to use BlueSky due to the better protocol design! Just more relays would be good.

... Wait, can we self host relays? And is it plausible for a community to do that? I couldn't find anything about it.


You can host your own relay, but if you want accurate likes/replies/etc. you need to crawl the entire Bluesky network, which could be quite expensive in the future.


One thing I’m curious about here is how BlueSky can credibly commit to only use the open portions of the protocol. BlueSky appears to be de facto quite centralized – given that, it seems like there’s no technical reason why first-party BlueSky clients have to be ATProto clients. Obviously it would be a major betrayal of user trust to do so any time in the near-to-medium-term future, but it seems like the de facto decentralization of ActivityPub gives stronger guardrails here.


What are first-party BlueSky clients? Do you mean that a first-party relay and PDS could communicate over a different protocol? I don't really understand what you are worried about. Two ActivityPub clients could talk to each other over a protocol that is not ActivityPub.


I mean, the code for everything is open source. It's credible because they're doing it. If they started changing to not do it, people would notice, very quickly.


That makes sense. So first they have to go closed-source before that attack vector is even feasible, and doing so would be sufficient on its own to raise alarms.


While I appreciate them getting to the "speech v. reach" thing; Mastodon still feels long-run smarter out of simplicity (if short-run less fun or useful because Mastodon feed curation is definitely harder)

The main question, to me, is: if a hypothetical (likely wealthy or powerful) bad actor wants to put their thumb on the scales here, how can they?

And this article REALLY hand-waved that away very poorly, with the whole "oh, I don't really know how the feeds work" bit. Makes me very suspicious that they know good and darn well not only how the feeds will work, but that they understand that this is where the power and/or money is and will try to keep a hold of that.


I don't think "oh, I don't really know how the feeds work" is the best summary. Let me quote the author:

> Feeds are a recent addition to atproto, and therefore, while they do exist, they may not be feature complete just yet, and may undergo some change in the future. We’ll see. They’re working just fine from my perspective, but I haven’t been following the lower level technical details.

One paragraph above for some more context:

> This to me is one of the killer features of BlueSky over other microblogging tools: total user choice. If I want to make my own [feed] algorithm, I can do so. And I can share them easily with others.


Here’s some docs and some code that give an overview of how feeds work:

https://docs.bsky.app/docs/starter-templates/custom-feeds

https://github.com/bluesky-social/feed-generator


Which part implied we don't know how feeds work? They're open source, and exist today, and you can run them.

> wants to put their thumb on the scales here

Which scales, and in which sense? There's a lot of moving parts here.


The idea of users choosing their moderation provider which is independent of their hosting provider so they don’t have to worry too much about choice of instance sounds cool, except that I imagine that instances will still have to moderate to avoid falling foul of hosting locally illegal content.

This feels like it could end up being quite difficult for people running instances and end up as a bit of a mess. It may be that I have misunderstood, however.


It is true that folks who host content will have to deal with the legal repercussions of the content that they host. That's true no matter what protocol you're using to host it, however.


Yeah, but part of the promise of Bluesky is 'your moderation regime is picked by you and it's not linked to your host'. Except it is linked, so now your moderation regime is determined by two entities, not one.


You haven't. Bluesky's decentralisation is fake. A PDS is a reasonable Ubuntu box, but even this post admits that a relay is a huge, expensive, central point of control.

What's the Bluesky plan for spam? Block it at the relay, maybe.

The DID server is the other central point of control.

But sure, go wild with PDSes.


A fun bit of Bluesky trivia is that it’s also named after a person, apropos that post from a bit ago – the CEO’s first name is literally “blue sky” in Mandarin.


Twitter named the project bluesky before I got involved, funnily enough. Coincidence. But my middle name does mean bluesky in Mandarin.



Does this mean that if Bluesky stops supporting did:plc, all preexisting plc identities go kaput? Since plc resolution involves pinging their service...


https://plc.directory can be mirrored and this new mirror would become the authority if something like this happened


I'm skeptical about third-party labelers. How do you prevent that from being abused?


Users choose which 3p labelers they subscribe to. If a labeler service starts abusing their position, users can drop it.

https://github.com/bluesky-social/proposals/tree/main/0002-l... is a deep-dive on how Bluesky sees labelers fitting into the ecosystem.


What kind of abuse are you talking about? I'd be happy to give you my take, but it's a bit too broad for me to know how to answer.


We'll be publishing a lot more about moderation soon, but the basics of labelers:

- anyone can label anything in the network — including other labelers

- apps can surface labelers for users to subscribe to, similar to how custom feeds currently work in bluesky

- apps can exercise curation over how labelers get surfaced and ranked, similar to how custom feeds get surfaced (currently ranked by popularity expressed through likes and saves)
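To make the subscription model above concrete, here's a minimal, hypothetical sketch of client-side label filtering. The field names (`src`, `uri`, `val`, `neg`) loosely follow atproto's label schema, but treat the shapes and values here as illustrative assumptions, not the actual API:

```python
# Hypothetical sketch: a client hides a post only if a labeler the
# user *subscribes to* applied a label the user has chosen to hide.
# Labels from unsubscribed labelers are simply ignored.

SUBSCRIBED_LABELERS = {"did:plc:labeler-one"}  # chosen by the user
HIDE_VALUES = {"spam", "gore"}                 # user's per-label settings

def visible(post_uri, labels):
    """Return False if a subscribed labeler flagged this post with a
    label value the user hides; True otherwise."""
    for label in labels:
        if (label["src"] in SUBSCRIBED_LABELERS
                and label["uri"] == post_uri
                and label["val"] in HIDE_VALUES
                and not label.get("neg", False)):  # "neg" retracts a label
            return False
    return True

labels = [
    {"src": "did:plc:labeler-one", "uri": "at://bob/post/1", "val": "spam"},
    {"src": "did:plc:unsubscribed", "uri": "at://bob/post/2", "val": "spam"},
]
print(visible("at://bob/post/1", labels))  # False: subscribed labeler flagged it
print(visible("at://bob/post/2", labels))  # True: label is from an unsubscribed labeler
```

The key design point is that the filtering decision lives with the reader's client, not the host: the same label stream can be applied, ignored, or merely surfaced as a warning depending on each user's settings.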


> Moderation tools fall under the “reach” layer: you take all of that speech, but provide a way to limit the reach of stuff you don’t care to see yourself.

Sometimes, people say that BlueSky is “all about free speech” or “doesn’t do moderation.” This is simply inaccurate. Moderation tooling is encoded into the protocol itself, so that it can work with all content on the network, even non-BlueSky applications. Moreover, it gives you the ability to choose your own moderators, so that you aren’t beholden to anyone else’s choice of moderation or lack thereof.

Ah yes, the famous "if you don't want to see this content, just close your eyes" approach to moderation. I know this philosophy is well-liked in Silicon Valley, but I think it's fundamentally flawed: There are legitimate situations in which you want to prevent unrelated other people from talking about a certain thing or acquiring certain content.

Classic examples are cybermobbing, doxxing and revenge porn: Two or more people talking about how to hurt a third person, publishing private, unflattering or false information about the person, etc. Removing this information from the victim's feed is completely useless (in fact it likely won't appear in their feed in the first place) as the harm comes from the fact that other people view the content or engage in the discussion. Nevertheless, the harm is real.

In a system with traditional moderation, a moderator could stop this kind of behaviour by deleting the posts for everyone and/or banning the perpetrators. None of this is possible in a "just hide the content" system.

Shared blocklists or "labels" won't work either, since the consumers of the content have no motivation to block it; indeed, they want to see the revenge porn. The one who wants it blocked is the victim, but they have no power to force everyone to use a particular blocklist. (The whole idea behind this system is that no one can force a blocklist on someone else.)


There is more stuff in this space coming; I am focusing on the normal cases in this post, and not the bad cases.

The team has considered stuff like this, “consulted with experts,” even, but it’s ongoing work and so I don’t feel qualified to talk about it. The “labels won’t work because they want to see it rather than block it” is a part of the problem space I’ve seen them acknowledge explicitly.

EDIT: see here too: https://lobste.rs/s/shseqz/how_does_bluesky_work#c_vjvmei


How does it compare to Nostr?


They're actually more similar than I expected. Nostr seems to have gotten lost in trying to be a decentralised store of everything [1], while BlueSky is focusing on Twitter-style social media and its challenges—moderation, portability, etc. (though I'm unsure if ATproto can support other types of decentralised content)

Nothing wrong with being a store of everything like Nostr, it's just that a totally open model of contribution with no direction or oversight (well, except fiatjaf) tends to end up bogged down following a million different and often competing ideas. It's very active, but moving in 196 directions: https://github.com/nostr-protocol/nips/pulls


https://github.com/nostr-protocol/nips/blob/master/26.md NIP-26 (and if there is anything else similar) is one to point to here.

Once you, the developer, are required to implement (thiskey === thatkey), a relay operator has to index the entire network before a post can be rendered in compliance with that NIP. It does bog things way down, as you mentioned.

This public key does not equal that public key, as much as we might wish it did.


> For example, if every time I post a new update on BlueSky, if I had to send my post to every single one of my followers’ repositories, that would be extremely inefficient, and make running a popular repository very expensive.

Of course it would be inefficient; the Internet is not a broadcast medium!

Which is why AFAIK typically the only thing that is pushed (in large scale applications) is the existence of a message (and maaaybe a short title), your "followers" then pull the message when they want to see it. (Some of course might be big enough "fans" of you that they decide to "subscribe", and set up an "AI" to automatically pull all of your messages as soon as published.)

So it seems to be a weird strawman of a solution nobody would even try to use at large scale, or am I missing something?
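The cost difference between the models being discussed can be sketched with back-of-the-envelope arithmetic. All the numbers below are made-up assumptions purely for illustration:

```python
# Illustrative (made-up) numbers comparing delivery models for one
# post from an account with many followers.

POST_BYTES = 2_000      # full post with metadata
NOTICE_BYTES = 100      # tiny "a new post exists" pointer
FOLLOWERS = 1_000_000
READERS = 50_000        # followers who actually open the post

# Naive push: the author's server sends the full post to every follower.
push_cost = POST_BYTES * FOLLOWERS

# Notify-then-pull: push a tiny notice to everyone; only actual
# readers fetch the full post afterwards.
pull_cost = NOTICE_BYTES * FOLLOWERS + POST_BYTES * READERS

# Relay/firehose model: the author's server hands the post to a relay
# once; consumers subscribe to the relay's aggregate stream instead.
relay_cost_for_author = POST_BYTES * 1

print(f"push:  {push_cost:,} bytes sent by the author's server")
print(f"pull:  {pull_cost:,} bytes sent by the author's server")
print(f"relay: {relay_cost_for_author:,} bytes sent by the author's server")
```

Under these assumptions, notify-then-pull is an order of magnitude cheaper than naive push for the author, and the relay model pushes almost all of the cost onto the aggregator, which is exactly the centralization concern raised elsewhere in this thread.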


Why not use ActivityPub?

ActivityPub is a federated social networking technology popularized by Mastodon.

Account portability is the major reason why we chose to build a separate protocol. We consider portability to be crucial because it protects users from sudden bans, server shutdowns, and policy disagreements. Our solution for portability requires both signed data repositories and DIDs, neither of which are easy to retrofit into ActivityPub. The migration tools for ActivityPub are comparatively limited; they require the original server to provide a redirect and cannot migrate the user's previous data.

Other smaller differences include: a different viewpoint about how schemas should be handled, a preference for domain usernames over AP’s double-@ email usernames, and the goal of having large scale search and discovery (rather than the hashtag style of discovery that ActivityPub favors).

from https://atproto.com/guides/faq


Using double-@ user identifiers for actor discovery is not an ActivityPub feature. It's a bastardization of WebFinger resource search, added and popularized by Mastodon.

In ActivityPub, URLs are the identifiers, be it for actors or anything else. And the URL can be just a domain, so it's perfectly possible to have domain-named identifiers.


Domain usernames are a very good point.


Until trademark laws come into play, and you find yourself obligated by law to give up your domain username to a big corporation. (A famous example in France is “Milka vs. Kraft Foods”, the court favored the big corporation’s registered trademark over Mrs Milka’s name.)

Granted, my comment doesn’t add much to the discussion, since this domain ownership issue would have been a problem in ActivityPub too.


Seems still better than the current status quo, where Twitter can just decide to give your handle to Kraft because they feel like it.

At least there's already a relatively established dispute mechanism for domains.


This is not a problem, because atproto has true account portability, so your domain getting seized is not the end of the world, unlike in ActivityPub.


> Until trademark laws come into play, and you find yourself obligated by law to give up your domain username to a big corporation.

This wouldn't be a big deal in practice (besides losing the domain). Domain usernames are just the combo of you telling Bluesky "I intend to use this domain name" and then you placing a TXT record on the domain to prove you own it. If you want to change domains (or, are forced to), you just give them the new domain name and you set another TXT record (just like if you had set up a domain name as a username for the first time). The underlying DID is still yours.
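The verification step described above can be sketched in a few lines. Per the atproto docs, a handle claims a DID via a DNS TXT record at `_atproto.<handle>` whose value is `did=<the DID>` (with an HTTPS well-known fallback). Here the DNS lookup itself is stubbed out with a dict so the example is self-contained; the domain and DID are invented:

```python
# Sketch of atproto handle resolution: a TXT record at
# _atproto.<handle> holds "did=<identity>". The DNS query is
# replaced by a hypothetical in-memory table for illustration.

FAKE_DNS = {
    "_atproto.alice.example.com": ["did=did:plc:abc123"],
}

def resolve_handle(handle):
    """Return the DID a handle claims, or None if no valid record."""
    records = FAKE_DNS.get(f"_atproto.{handle}", [])
    for value in records:
        if value.startswith("did="):
            return value[len("did="):]
    return None

print(resolve_handle("alice.example.com"))    # did:plc:abc123
print(resolve_handle("mallory.example.com"))  # None
```

Note the direction of the mapping: changing domains just means publishing the same DID under a new name, so the identity (and the data it signs) is unaffected by losing the old handle.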


Didn't GitHub or Twitter or someone force someone to give up @Kik or similar?


That was npm, which led to the left-pad incident.

https://news.ycombinator.com/item?id=11349870 - Kik, left-pad, and npm (2016)


No, npm themselves chose to take the name away from someone and give it to someone else.


That's also not how atproto domain usernames are implemented.

The identity is just a public key, basically. There's a resolution service which uses domains, but you can move the identity, with all its data, to another domain.


Well, now it's up to the person who owns milka.fr, so if it's Kraft... no issue. If it isn't and they complain, it has to go to litigation. And one would think they already own a domain for their products, so it should be all good.


With all due respect to Steve, he only demonstrates how the AT protocol ecosystem has ways of passing the buck from content creator to (uncreative but rich) middlemen, for dissemination, moderation, content curation, etc.

That is antithetical to what the indie web should be, in my opinion: allowing corporations (because, let's face it, they'll be the ones able to scale better) to control who sees what I post and whom I see is not what I want from my social media. We have enough of that as it is.


I am going to be honest, I do not see how I demonstrated that. Can you be more specific?


> if I had to send my post to every single one of my followers’ repositories, that would be extremely inefficient, [..] an additional kind of service, called a relay, that aggregates information in the network, and exposes it as a firehose to others

Therefore a company with the money/resources to build this relay will be in control of whether and how the firehose presents my content as a content creator/PDS owner. This will create the same imbalance of power you find today with YouTube/Twitch/Instagram/etc.

The solution is not to add intermediaries, but to make this dissemination to all of those followers less inefficient.

Designing a whole protocol to cater to the 0.1% of users that might reach a million followers is another level of premature optimization, in my opinion. For everyone else, sending requests to 10k followers should be perfectly reasonable to do with off-the-shelf computing, without requiring protocol-level intermediaries like relays.

AT protocol, in my opinion, caters to the "scale" crowd, and in doing that puts all the power in the hands of the few that can do it better. Instead of democratizing social media, which is what a federated protocol would imply, it actually maintains the status quo, albeit with different companies holding the power.


Okay, thanks. I think the fact that you can run your own relays mitigates this, but I can understand some skepticism.


Sure, but you're never going to do it better than the big guns, and most of the "consumers" of social media will probably use one of their relays instead of yours.

I think that the democratized model that ActivityPub encourages works better (even for those 0.1 percenters I spoke about earlier), but certainly for the average user, because it more or less equalizes the playing field. More than that, nothing in the ActivityPub protocol would prevent users/clients from using relays for disseminating content, "feed generators" for consuming content, or distributed moderation mechanisms for moderating. The limit is only in the imagination of the developers, not in the way the protocol is structured.

I will stop now; I feel pretty strongly about this problem, and I fear I'm coming across as more combative than I intend to.


> The Authenticated Transfer Protocol, aka atproto, is a federated protocol for large-scale distributed social applications.

Well, there's the problem: large-scale and social are incompatible. Humans evolved to be sociable with maybe a few hundred people.

No-one has ever made a good restaurant by inventing a system to deliver millions of calories per second to each diner.


The scale is the whole system, so many different groups of people.

To take your restaurant comparison, that'd be like saying McDonald's is an average-sized restaurant, which might be true for each individual McDonald's, but not for the entire McDonald's chain.



