How Does Bluesky Work? (steveklabnik.com)
248 points by steveklabnik on Feb 24, 2024 | 91 comments


This seems incredibly well thought out, and seems like it solves some of the bigger issues with ActivityPub. I had no interest in Bluesky before; now I think I'll make an account.


My main problem with Bluesky and the AT protocol is not their apparent technical merits so much as the strong "not invented here" approach. Like their "domain as identity" instead of using user@host like everyone else and established protocols like WebFinger. Like using their own mechanism for that instead of looking at how ACME already does it for certificates, leading to someone owning the s3.amazon.com domain [1]. Like using their brand-new RPC protocol, "XRPC". Like whatever Lexicon is, a weird code-generation-based data schema whose only benefit over protobuf and JSON-LD is that it's newer.

It is hard to trust an "open network" that demonstrates time and time again that they don't respect and don't want to talk to anyone else. It also makes you wonder how many more common mistakes have slipped in the design simply because they reject common knowledge.

[1]: https://www.reddit.com/r/programming/comments/138hlf2/s3_dom...


Your comment could have been more constructive if you had included more "common mistakes" than things you just don't like, the only exception being handle squatting. Like, what is a specific benefit of `user@host` as opposed to a single domain---which can be generated on demand---as an identity? (And who is "everyone else" here? Almost all non-Mastodon social networks don't use it.) Also, the value of a common schema language is not that high, especially when that language is coupled to the serialization format and prevents any improvement.


"value is not that high" is exactly the kind of behavior I'm talking about.

What's the benefit of user@host? Deployment simplicity: you don't need wildcard domains or wildcard certificates, or, worse, DNS API access; you can just self-host one app at one domain like everything else. As for who uses this format: email, Matrix, XMPP, Mastodon/ActivityPub, GNU social/Friendica...


If the supposed benefit is only deployment simplicity, the value is indeed not high, because deployment is not expected to occur that many times. Technical problems are among the least important problems in a social network.


Your argument is "the drawbacks of reinventing are not that high", even though I pointed out both usability and security issues. But what's the advantage? Other than giving the (correct) impression that the supposedly-open network doesn't want to have anything to do with anybody else?

You're coming up with excuses, not justifications. And "it looks like it hasn't hurt us too bad yet" is not a great one at that. Self-hosted deployments have barely started rolling out (well, the relay is still centralized).


Do you really want me to use a lengthy bullet list instead of compact prose?

> Deployment simplicity, you don't need wildcard domains or wildcard certificates, or worse DNS API access, you can just self-host one app at one domain like everything else.

At this point, they are not significantly more difficult than single domains or certificates, thanks to Let's Encrypt and ACME. In fact, the requirement for domains and certificates has always been the single biggest hindrance, which inherently prevents self-hosted deployments en masse. Self-hosting is definitely a good option to have, but it is unreasonable to assume that most users can have their own deployment even in the near future. Multiple companies have seriously tried to tackle these problems and have failed so far. As such, individual applications are not in a good position to solve these issues.

> I pointed out both usability and security issues.

You have only pointed out security issues, which I acknowledged as the single actual mistake. And that only relates to the domain ownership check, so it can be fixed without scrapping the entire scheme.

I can't really see which usability issue can arise from `@example.com` as opposed to `user@example.com`. For laypersons who would use federated instances, `@user.example.com` and `user@example.com` is literally a single character difference and doesn't really matter much, except that `user@example.com` is much more prone to being mistaken for an email address (see below). For users with their own domains, any user name is redundant and `@example.com` is definitely superior.

> As for who uses this format: email, matrix, XMPP, mastodon/ActivityPub, gnusocial/friendica...

Among those, email is the only protocol that has achieved common usage, distantly followed by Mastodon. Everything else is a nerdy technology which doesn't count as "everyone else". I should note that I have operated an IRC server for more than a decade, and I'm very confident that public Matrix servers are even rarer than the dwindling number of public IRC servers. IRC was once popular only because there were no alternatives, not because it was open (mIRC was the dominant implementation in its heyday anyway).

And even Mastodon doesn't exactly use `user@example.com`; it uses `@user@example.com` with a `@` prefix, presumably because that allows for shortened handles (`@user`) while avoiding confusion with email addresses, but the same can be done with a domain-only handle if desired. Like, that has been a feature of DNS the entire time. The fact that neither Mastodon nor Bluesky uses the exact email address format shows that the exact handle format is not very relevant, as long as it can account for distributed registration and verification and can't be confused with existing usages (i.e. email addresses).


You counter "it makes deployment difficult" by "lay people won't be deploying, it's too difficult". I can't understand what you're trying to say at all.

People are deploying Mastodon, and it will keep being an actually-federated network, while Bluesky can be a centralized, could-have-been-federated network run by a couple of companies. But is that the goal as you see it?


My counter is rather that deployment was already too difficult for laypeople even without that; see the emphasized word:

> In fact, the requirement for domains and certificates has always been the single biggest hindrance, which inherently prevents self-hosted deployments en masse.

I'm confident that you never satisfactorily answered that, and even more confident in my belief that such a hindrance is inherent and thus should not be the foremost design factor. Unless you have a clear answer that is universally applicable (that is, no "works for me and my friends"; I too have enough technical friends who don't use Mastodon), please refrain from claiming otherwise.


Answered what? I'm sorry that you can't figure out certificates; I don't see how that excuses a system that is made more complicated than it needs to be, and more complicated than anybody else does it, which was my initial claim. Again, you are still giving me more "it's not that bad" or "it's not the main issue", which are not reasons to do the thing wrong in the first place.

This "conversation" has reached my threshold for puzzlement, so I'm dipping out. Let me show you how it feels to me with an analogy:

Some company is making a new revolutionary car. It gets about 10 miles of autonomy, can only turn right, and stalls every 100 feet. I point that out in the comments.

Then lifthrasiir shows up and says that some people have driven it successfully, that you can get anywhere by only turning right with careful planning, that most vehicles don't get a lot of autonomy (only cars and trucks do), that there are commercial services that will get the car where I need it to go (by loading it on a truck), and anyway, if I really want to get somewhere, why don't I take a bus?

Sure, I never claimed that it couldn't be moved, or that it would prevent its owner from getting places. What I said is that it's a terrible car.


You are claiming that a tram is a bad car. It was designed that way for a reason, and you can't (or don't want to) understand it because you like cars so much. Whenever I say "vehicle", you misinterpret it as "car". Of course it would make a bad car, as in your analogy. That's enough puzzlement for me as well.


ATprotocol seems technically superior, but socially set up for failure.

Because there’s no social incentive for anyone to set up and maintain relays and moderation services when they’re divorced from actual communities, what’s left are purely financial incentives: paid service, advertisement insertion, data mining, information manipulation, etc.

Also, the centralization of data distribution and moderation means they are huge undertakings now, both in terms of hosting cost and human cost. It almost forces these services to be commercialized, or, you could say, discourages anyone other than Bluesky from setting up their own relay or moderation service.

For all the faults of ActivityPub, the extremely “gated community” like approach provides the social incentive for users to feel a sense of belonging to a server, and for the server owner and moderator to feel a sense of pride and ownership over providing service to their users. Being within a valued community also incentivizes people to donate to pay for server cost and moderation effort.

ATprotocol will technically allow multiple relays and moderation services, but in reality how many will actually materialize?

As such, I feel judging social networking protocols from a purely technical and adversarial point of view (like ATprotocol does) is a mistake. Sure, you are now no longer beholden to a server, but at the same time you are divorced from a tight-knit community and you no longer have skin in the game.

When you have nothing to lose (data, identity and community wise), that’s a double-edged sword.


I’m still fairly new to the team, and don’t claim to represent their point of view here, but my impression is that services (like moderation services) aren’t intended to be divorced from the communities they’re serving. It’s more that they’re decoupled from the physical data placement. The relationship between users and moderation services would be many-to-many. The incentives to self-organize and to provide a valuable service for your community are still there; it’s just that the notion of belonging to a community is more fluid in itself.


A fluid community would disincentivize a tight-knit community, wouldn't it?

I feel that if you decouple the social reward from the grunt work of moderation and the financial burden of hosting the relay, then the incentives just wouldn't be there to do the dirty work unless it becomes paid or monetized in some way.


> For all the faults of ActivityPub, the extremely “gated community” like approach provides the social incentive for users to feel a sense of belonging to a server, and for the server owner and moderator to feel a sense of pride and ownership over providing service to their users. Being within a valued community also incentivizes people to donate to pay for server cost and moderation effort.

are social incentives going to be enough long-term though?

I keep looking at this and thinking of three things: trust, security, and Reddit. These are actual questions, not arguments, I'm genuinely curious.

Can you trust an admin to host and police your social network? Moving servers isn't hugely difficult iirc, but it could be very disruptive and drama-ful. It would have to get pretty bad before people start to overcome the network effect inertia and actually move.

What is there to hold hosts accountable for security? The threat of reputational damage doesn't seem like it is going to mean much to pseudonymous volunteers, so what can be done to ensure hosts are properly protecting the system from compromise?

Reddit moderators are, by and large, reasonable, but there's a non-trivial minority who will do things like stage a coup against the other moderators, start arbitrarily banning or harassing people, or just straight up ghost the site and never enforce the rules. Same as above: what mechanisms exist to prevent, reverse, or take over from a rogue/absentee mod?


> are social incentives going to be enough long-term though?

I'd say all the long-running forum-based communities prove that yes, it is. Not counting Usenet, IRC, and other "old internet" communities which have withstood the test of time.

> Can you trust an admin to host and police your social network?

That's where the federated part comes in, you choose a server and admin that you trust.

Preventing people from moving servers would be holding them hostage, breaking the fundamental trust of the platform, and becoming public enemy numero uno overnight.

I really doubt anyone would do it, but yes the chance isn't zero. Plenty of people host servers with their real identity and reputation on the line though.

If you are still paranoid you can join a commercial server where by contractual obligation they have to provide service to you and allow you to move away.

> What is there to hold hosts accountable for security? The threat of reputational damage doesn't seem like it is going to mean much to pseudonymous volunteers, so what can be done to ensure hosts are properly protecting the system from compromise?

Again, you choose your server based on your risk tolerance. You can go with the main mastodon.social server, or join a server owned by a real person you feel is reputable/trustworthy, or join a paid server.

And I would disagree about pseudonymous volunteers not caring about reputation damage, the vast majority of people would feel a sense of ownership over their online identity, especially if it has a lot of clout.

> Reddit moderators are, by and large, reasonable, but there's a non-trivial minority who will do things like stage a coup against the other moderators, start arbitrarily banning or harassing people, or just straight up ghost the site and never enforce the rules. Same as above: what mechanisms exist to prevent, reverse, or take over from a rogue/absentee mod?

Reddit mods are not server owners, at the end of the day they know they're just hosting someone else's party so there isn't a true sense of ownership.

But if a moderator decided to go rogue? It's just like a forum isn't it? Either the owner gets rid of them or you're out of luck.

You can limit damage by backing up your followers and follows on a regular basis so you can rebuild your network if you are forced to create a new account on a different server.

With a federated system comes the freedom to choose your server, and the consequences thereof; you're gonna have to take some personal responsibility by choosing a server that has a healthy community and a responsible moderation team.

At the end of the day I don't think a perfect system exists, they are just different sets of compromises.


> For all the faults of ActivityPub, the extremely “gated community” like approach provides the social incentive for users to feel a sense of belonging to a server, and for the server owner and moderator to feel a sense of pride and ownership over providing service to their users.

I really misunderstood what you were talking about until you got here.

This mindset is extremely alien to me, so that's probably why it's so hard to make sense of this.

I am glad that you have a system that works for you, and that I have one that works for me.


I'm actually a fan of POSSE (https://indieweb.org/POSSE) where all social networks are auxiliary to my own website/blog.

After the Twitter debacle I'm done with the idea of being "married" to any network.


Since Bluesky published their protocol, I’ve been thinking about how one might integrate a Bluesky-compatible microblog into one’s own weblog software, POSSE-style.

With ActivityPub I can see it: a form of headless ActivityPub server which provides an Inbox and Outbox, backed by the DB. That way you can publish your own posts both on the web and into an outbox. And even better, you get reactions in your inbox, which you could possibly display on your own web pages with enough determination, similar to how the IndieWeb people do it with WebMentions. After all, every Status in Mastodon/Fediverse has a url property.

With the AT Protocol I’m a little bit at a loss. You could possibly expose your backend as an AT-compatible Personal Data Repository, although that would be a lot of work, exposing a Git-like interface.

But I’m still wondering how one would aggregate reactions. It seems it would not be enough just to have a PDS; you would also have to run an Indexer/Relay, which indexes other people’s PDSs. And not just the PDSs of the people you are following, but also the rest of the world – after all, a Like or such can come from anyone. When running an Indexer you are pretty soon in Firehose territory. Too much for a single-person website.

It really seems there isn’t a way for reactions to swim upstream – the social graph of Bluesky in a way lives in the Indexer/App View world, not on your PDS.

Maybe I’m wrong, but the AT Protocol docs are not very expansive on such points. Neither is Klabnik’s blog. Everyone talking about ATP seems to be excited about the Solid-inspired DIDs and PDS, but never talk about the transmission mechanisms. Whereas I’m sitting here and thinking that an ActivityPub Actor’s Outbox is already a form of PDS – and it has an Inbox. Addressable via the Web and findable via Webfinger.


Reactions would be a Lexicon. They’d then get consumed like anything else.

Or at least, that’s my initial guess, would have to actually work through the design to know if it would work that way.


You’re talking about the data model, I’m asking about the transmission of those and its directions.

“Lexicons” are Bluesky’s Not Invented Here variant of ActivityPub’s JSON-LD meta-syntax and vocabularies. Obviously necessary for the transmission of information, but it doesn’t tell you how that information is transmitted or in which direction.


Sorry, I meant mostly like, "the transmission doesn't work any differently than anything else," you don't need to invent a novel transmission model to make something work.


> But I’m still wondering how one would aggregate reactions. It seems it would not be enough just to have a PDS, but also to run an Indexer/Relay, which indexes other peoples PDS. And not just the PDSs of the people you are following, but also the rest of the world – after all a Like or such can come from everyone. When running an Indexer you are pretty soon into Firehose territory. Too much for a single person website.

You don't need to run a Relay yourself in order to see reactions; you instead subscribe to an existing Relay's firehose (such as the one that BlueSky PBC operates), and ignore anything that isn't "reaction-shaped" (matches the relevant Lexicon).
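A sketch of that filtering step, with made-up event dicts standing in for the real firehose frames (the actual stream is CBOR/CAR-encoded, and the collection names below are just the ones Bluesky's own app happens to use):

```python
# Hypothetical filter over a Relay firehose: keep only "reaction-shaped"
# records, i.e. those whose collection matches a reaction Lexicon ID.

REACTION_LEXICONS = {"app.bsky.feed.like"}  # extend with your own reaction Lexicons

def is_reaction(event: dict) -> bool:
    """True when the event's record collection matches a reaction Lexicon."""
    return event.get("collection") in REACTION_LEXICONS

# Toy stand-in for a decoded firehose stream.
firehose = [
    {"collection": "app.bsky.feed.post", "did": "did:plc:alice", "text": "hello"},
    {"collection": "app.bsky.feed.like", "did": "did:plc:bob",
     "subject": "at://did:plc:alice/app.bsky.feed.post/3k"},
    {"collection": "app.bsky.graph.follow", "did": "did:plc:carol"},
]

reactions = [e for e in firehose if is_reaction(e)]
print(len(reactions))  # only the like survives
```

In practice the decoding (and the websocket subscription to the Relay) is the bulk of the work; the filtering itself is this cheap.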

> Everyone talking about ATP seems to [..] never talk about the transmission mechanisms. Whereas I’m sitting here and thinking that an ActivityPub Actor’s Outbox is already a form of PDS – and it has an Inbox. Addressable via the Web and findable via Webfinger.

Here's my attempt to summarise the transmission difference between ActivityPub and AT Protocol: ActivityPub is symmetric and push-based, AT Protocol is asymmetric and pull-based.

With ActivityPub, your client talks to your server. The server handles both outgoing data (you making posts) and incoming data (viewing other people's posts).

When you create a post:

- The post is placed in your server's outbox.

- Your server looks up the servers for every one of your followers, and places your post in their inbox.

When someone creates a reaction:

- The reaction is placed in their server's outbox.

- Their server looks up your server, and places the reaction in its Inbox.

- Your website checks the Inbox, discovers the reaction, and handles it.

With AT Protocol, your client talks to your PDS and the client's App View. Your PDS handles outgoing data (you making posts), and your client's App View handles incoming data (viewing other people's posts). Relays are what provide transmission between the two: Relays subscribe to your PDS, and App Views subscribe to the firehose of at least one Relay.

When you create a post:

- The post is added to your repository on your PDS.

- The Relay pulls the post from your PDS and broadcasts it on its firehose.

- App Views see the post in the firehose, and include it within the feeds of users that follow you.

When someone creates a reaction:

- The reaction is added to their repository on their PDS.

- The Relay pulls the reaction from their PDS and broadcasts it on its firehose.

- Your website's App View (which might be your client's App View if "reactions" are also posts, or some other App View if this is a different kind of semantic data) sees the reaction in the firehose, and handles it (e.g. includes it within the feed of reactions that your website uses).


That is rather helpful, thanks!

But in a way depressing: I fear subscribing to a firehose is not that realistic for small single person websites. Maybe you can filter the firehose subscription.


Had a similar response. Dusted off my account and finally posted something.


Same, actually. The fact that I can get the @stavros.io handle just by adding a DNS record is a nice touch.
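For the curious, the DNS side of that is tiny: the AT Protocol looks for a TXT record at `_atproto.<handle>` containing the account's DID. A sketch with the lookup mocked as a dict and a made-up DID value:

```python
# Mocked DNS zone; in practice you'd query a real resolver for TXT records.
dns_txt = {
    "_atproto.stavros.io": ["did=did:plc:ab12cd34ef56gh78ij90kl12"],  # hypothetical DID
}

def resolve_handle(handle: str):
    """Return the DID a handle verifies to, or None if no record exists."""
    for record in dns_txt.get(f"_atproto.{handle}", []):
        if record.startswith("did="):
            return record[len("did="):]
    return None

print(resolve_handle("stavros.io"))  # the DID from the TXT record
```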


I believe there’s currently a single relay. When there are multiple relays, each will have a different view of the network, and the one that includes more posts will be better than the others.

It might be quite hard to get relay diversity rather than using a single dominant relay, sort of like attempting to get traction for an alternative DNS system. Then, if your posts don’t get picked up by the dominant relay then it seems like you’re out of luck?

But it’s still very early, so perhaps it won’t work out that way. And even if there is a dominant relay, well, GitHub isn’t so bad, and there are alternatives.


I have previously found Bluesky kind of confusing, but I've recently realized that it was mostly the terminology that was confusing me.

Imagine if you instead referred to the system as

[microblogs] -> [indexer] -> [client]

There's a lot more to it, but I think this is the basics of how Bluesky works. They just call everything something different than you might think it should be called.

The indexer is only necessary because the user IDs do not remain static on the system. Instead there is some weird cryptographic reshuffling going on, where a DID:PLC starts as a substring of a hash and then, because of 'key rotation', the user ID of every user actually changes out from under them over time.

Imagine a system such as LiveJournal, but your handle is constantly changing, so you need an authority to keep track of the handle changing. And of course you are only allowed to post short messages, with no titles or markdown support.

I'm personally waiting for a new DID:PLC where you can get a static pubkey where you sign the messages with a private key because I think that'll be way less stuff to keep track of and it'll make self-hosting and local-first development much easier.


> The indexer is only necessary because the user ids do not remain static on the system. Instead there is some weird cryptographic reshuffling going on where DID:PLC starts as a substring of a hash and then because of 'key rotation' the user id of every user actually changes out from under the user themselves over time.

The user IDs (e.g. did:plc or did:web) do not change over time. You are correct that a user that uses did:plc has that generated from a hash (of their initial repository state), but subsequent additions to or deletions from the repo don't change the DID. Likewise, if you set up your repository with did:web, the domain you pick is fixed, and you lose access to your repo if you lose control of that domain.

What is changing with key rotation is the repo signing keys that authorize updates to the user repo. If you run an entirely self-hosted PDS then you can choose to never rotate that key, and then you have a "static private key" repository. But key rotation is a necessary part of the design, in part to enable account migration (you wouldn't want an old PDS to still be allowed to make changes to your repository, so you'd rotate those keys to remove the old server's key and add the new server's key).
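A toy illustration of that invariant (not the real did:plc algorithm, which hashes a signed, CBOR-encoded genesis operation and base32-encodes it; this just uses sha256 over JSON): the identifier is a function of the genesis operation alone, so rotating keys later can't move it.

```python
import hashlib
import json

def make_did(genesis_op: dict) -> str:
    # identifier = truncated hash of the *first* operation only
    digest = hashlib.sha256(
        json.dumps(genesis_op, sort_keys=True).encode()
    ).hexdigest()
    return "did:plc:" + digest[:24]

genesis = {"type": "create", "rotationKeys": ["key-1"], "handle": "alice.example"}
did = make_did(genesis)

# A key rotation appends a new operation to the log; the DID is untouched,
# because only the genesis operation feeds the identifier.
op_log = [genesis, {"type": "rotate", "rotationKeys": ["key-2"], "prev": did}]
assert make_did(op_log[0]) == did
print(did)
```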


I 100% agree with you, but I think it's an issue of terminology.

I find myself thinking of the pubkey as the id and the DIDs as an alias or a name pointer towards that id.

I think the indexer would need to be less robust if there was no key rotation, and I'm looking forward to that feature.


The comparison here that comes to mind: imagine if GitHub issued you a keypair to use over SSH to push/pull from a repo, but they changed that keypair for you periodically and also kept track of it for you.

I'd much prefer to generate my own SSH key and upload the public key to GitHub, GitLab, Sourcehut, Codeberg, etc. Forgive me if I forgot a forge.


A useful way to think about it is that the PDS is a "user agent" - they act on behalf of the user, and with the user's implicit trust. This is much the same way that a user trusts webserver software (and the VPS running it) to correctly serve their website, hold onto the private keys for the site's TLS certificates, use those private keys to correctly set up encrypted connections to the website, etc.

The AT Protocol itself does technically allow for all private key material to be held at all times by the user, but that means the user needs to be interactively involved in any operation that requires making a signature (and online / available whenever such an operation is necessary, which isn't necessarily the same time as when the user is interacting with a particular app like BlueSky running on ATProto). The PDS software that BlueSky have implemented instead requires (IIRC) that the PDS has the private key for at least one of the public signing keys (but there can be multiple signing keys, so the user can also hold onto one locally).


Yeah, I hear you about the names. When talking about this on bluesky, someone came up with "data store -> indexer -> view", which I think fits a bit nicer too.

https://bsky.app/profile/aparker.io/post/3km6zvwfen727


"data store -> indexer -> view" works very well too. I hadn't seen this post, I guess we were thinking simultaneously.


I would expect clients (App Views) to be able to use multiple relays in parallel if needed. That would be like having a local GitHub UI pulling data from several GitHub instances (“relays”) that may host different but potentially overlapping sets of repositories.

This would be like a Usenet client aggregating newsgroups from several NNTP servers.


More like a local git repo having multiple remotes?


Yes — the canonical source of truth for a user's data is their "data repository" which is analogous to a git repo. These can really be passed around any way you want. Our current relay architecture was designed to support large, real-time scale, but other configurations are possible as well, since all the pieces here are composable.


I would expect every relay to have everything (modulo different opinions about illegal content). Why do you think they would have different views of the network?


Well, they'll try. The relay plays a similar role to a search engine, though without the ranking. Why do different search engines have different content?

As others have said, spam and illegal content are the most likely reasons.

We might also compare to email delivery; maybe ideally every message would be delivered, but in the real world, they aren't, and open relays are a hazard.


It's virtually guaranteed that different relays will take different views regarding disinformation, harm, etc.

This is probably good, too. I personally don't want my hand held, but recognize that others justifiably want someone else to filter out gore, or even opinions they find sufficiently off-putting.

I don't want to ram particular content down anyone else's throat, but I'm 100% opposed to anyone making that decision for me.


My understanding is that relays will not filter anything except really illegal material (child porn). Moderation is based on labeling, not deleting.


Unless they want to provide free file storage to the world... surely they must delete spam.


Is it possible for a relay to filter individual posts, or do the Merkle trees prevent that? (They would if they need to be signed by the originating user.)


I'm part of a marginalised group that is often considered to be illegal in many countries, or at the very least, very off-putting. (In 50 years, perhaps, things will change and we will be more accepted. To be specific, it's a grey area in the U.S.)

I'm interested in BlueSky but the relays are a worrying "point of failure" if they can just block me from there. I'll be more interested if there are multiple relays that I can use in parallel, with at least one being friendly to us, so I can still participate. Currently on the Fediverse it mostly works since we can connect directly to other servers, and there's no SPOF. But I would love to use BlueSky due to the better protocol design! Just more relays would be good.

... Wait, can we self host relays? And is it plausible for a community to do that? I couldn't find anything about it.


You can host your own relay, but if you want accurate likes/replies/etc. you need to crawl the entire Bluesky network, which could be quite expensive in the future.


One thing I’m curious about here is how BlueSky can credibly commit to only use the open portions of the protocol. BlueSky appears to be de facto quite centralized – given that, it seems like there’s no technical reason why first-party BlueSky clients have to be ATProto clients. Obviously it would be a major betrayal of user trust to do so any time in the near-to-medium-term future, but it seems like the de facto decentralization of ActivityPub gives stronger guardrails here.


What are first-party BlueSky clients? Do you mean that a first-party relay and PDS could communicate over a different protocol? I don't really understand what you are worried about. Two ActivityPub clients could talk to each other over a protocol that is not ActivityPub.


I mean, the code for everything is open source. It's credible because they're doing it. If they started changing to not do it, people would notice, very quickly.


That makes sense. So first they have to go closed-source before that attack vector is even feasible, and doing so would be sufficient on its own to raise alarms.


While I appreciate them getting to the "speech v. reach" thing; Mastodon still feels long-run smarter out of simplicity (if short-run less fun or useful because Mastodon feed curation is definitely harder)

The main question, to me, is: if a hypothetical (likely wealthy or powerful) bad actor wants to put their thumb on the scales here, how can they?

And this article REALLY hand-waved that away very poorly, with the whole "oh, I don't really know how the feeds work" bit. Makes me very suspicious that they know good and darn well not only how the feeds will work, but that they understand that this is where the power and/or money is and will try to keep a hold of that.


I don't think "oh, I don't really know how the feeds work" is the best summary. Let me quote the author:

> Feeds are a recent addition to atproto, and therefore, while they do exist, they may not be feature complete just yet, and may undergo some change in the future. We’ll see. They’re working just fine from my perspective, but I haven’t been following the lower level technical details.

One paragraph above for some more context:

> This to me is one of the killer features of BlueSky over other microblogging tools: total user choice. If I want to make my own [feed] algorithm, I can do so. And I can share them easily with others.


Here’s some docs and some code that give an overview of how feeds work:

https://docs.bsky.app/docs/starter-templates/custom-feeds

https://github.com/bluesky-social/feed-generator


Which part implied we don't know how feeds work? They're open source, and exist today, and you can run them.

> wants to put their thumb on the scales here

Which scales, and in which sense? There's a lot of moving parts here.


The idea of users choosing their moderation provider which is independent of their hosting provider so they don’t have to worry too much about choice of instance sounds cool, except that I imagine that instances will still have to moderate to avoid falling foul of hosting locally illegal content.

This feels like it could end up being quite difficult for people running instances and end up as a bit of a mess. It may be that I have misunderstood, however.


It is true that folks who host content will have to deal with the legal repercussions of the content that they host. That's true no matter what protocol you're using to host it, however.


Yeah, but part of the promise of Bluesky is 'your moderation regime is picked by you and it's not linked to your host'. Except it is linked, so now your moderation regime is determined by two entities, not one.


You haven't. Bluesky's decentralisation is fake. A PDS is a reasonable Ubuntu box, but even this post admits that a relay is a huge, expensive, central point of control.

What's the Bluesky plan for spam? Block it at the relay, maybe.

The DID server is the other central point of control.

But sure, go wild with PDSes.


A fun bit of Bluesky trivia is that it’s also named after a person, apropos that post from a bit ago – the CEO’s first name is literally “blue sky” in Mandarin.


Twitter named the project bluesky before I got involved, funnily enough. Coincidence. But my middle name does mean bluesky in Mandarin.



Does this mean that if Bluesky stops supporting did:plc, all preexisting plc identities go kaput? Since plc resolution involves pinging their service...


https://plc.directory can be mirrored and this new mirror would become the authority if something like this happened


I'm skeptical about third-party labelers. How do you prevent that from being abused?


Users choose which 3p labelers they subscribe to. If a labeler service starts abusing their position, users can drop it.

https://github.com/bluesky-social/proposals/tree/main/0002-l... is a deep-dive on how Bluesky sees labelers fitting into the ecosystem.


What kind of abuse are you talking about? I'd be happy to give you my take, but it's a bit too broad for me to know how to answer.


We'll be publishing a lot more about moderation soon, but the basics of labelers:

- anyone can label anything in the network — including other labelers

- apps can surface labelers for users to subscribe to, similar to how custom feeds currently work in bluesky

- apps can exercise curation over how labelers get surfaced and ranked, similar to how custom feeds get surfaced (currently ranked by popularity expressed through likes and saves)
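To make the subscription model above concrete, here's a minimal, hypothetical sketch of client-side label filtering. The field names (`src`, `uri`, `val`, `neg`) loosely follow atproto's label schema, but treat the shapes and values here as illustrative assumptions, not the actual API:

```python
# Hypothetical sketch: a client hides a post only if a labeler the
# user *subscribes to* applied a label the user has chosen to hide.
# Labels from unsubscribed labelers are simply ignored.

SUBSCRIBED_LABELERS = {"did:plc:labeler-one"}  # chosen by the user
HIDE_VALUES = {"spam", "gore"}                 # user's per-label settings

def visible(post_uri, labels):
    """Return False if a subscribed labeler flagged this post with a
    label value the user hides; True otherwise."""
    for label in labels:
        if (label["src"] in SUBSCRIBED_LABELERS
                and label["uri"] == post_uri
                and label["val"] in HIDE_VALUES
                and not label.get("neg", False)):  # "neg" retracts a label
            return False
    return True

labels = [
    {"src": "did:plc:labeler-one", "uri": "at://bob/post/1", "val": "spam"},
    {"src": "did:plc:unsubscribed", "uri": "at://bob/post/2", "val": "spam"},
]
print(visible("at://bob/post/1", labels))  # False: subscribed labeler flagged it
print(visible("at://bob/post/2", labels))  # True: label is from an unsubscribed labeler
```

The key design point is that the filtering decision lives with the reader's client, not the host: the same label stream can be applied, ignored, or merely surfaced as a warning depending on each user's settings.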


> Moderation tools fall under the “reach” layer: you take all of that speech, but provide a way to limit the reach of stuff you don’t care to see yourself.

Sometimes, people say that BlueSky is “all about free speech” or “doesn’t do moderation.” This is simply inaccurate. Moderation tooling is encoded into the protocol itself, so that it can work with all content on the network, even non-BlueSky applications. Moreover, it gives you the ability to choose your own moderators, so that you aren’t beholden to anyone else’s choice of moderation or lack thereof.

Ah yes, the famous "if you don't want to see this content, just close your eyes" approach to moderation. I know this philosophy is well-liked in Silicon Valley, but I think it's fundamentally flawed: There are legitimate situations in which you want to prevent unrelated other people from talking about a certain thing or acquiring certain content.

Classic examples are cybermobbing, doxxing and revenge porn: Two or more people talking about how to hurt a third person, publishing private, unflattering or false information about the person, etc. Removing this information from the victim's feed is completely useless (in fact it likely won't appear in their feed in the first place) as the harm comes from the fact that other people view the content or engage in the discussion. Nevertheless, the harm is real.

In a system with traditional moderation, a moderator could stop this kind of behaviour by deleting the posts for everyone and/or banning the perpetrators. None of this is possible in a "just hide the content" system.

Shared blocklists or "labels" won't work either, since the consumers of the content have no motivation to block it; indeed, they want to see the revenge porn. The one who wants it blocked is the victim, but they have no power to force everyone to use a particular blocklist. (The whole idea behind this system is that no one can force a blocklist on someone else.)


There is more stuff in this space coming; I am focusing on the normal cases in this post, and not the bad cases.

The team has considered stuff like this, “consulted with experts,” even, but it’s ongoing work and so I don’t feel qualified to talk about it. The “labels won’t work because they want to see it rather than block it” is a part of the problem space I’ve seen them acknowledge explicitly.

EDIT: see here too: https://lobste.rs/s/shseqz/how_does_bluesky_work#c_vjvmei


How does it compare to Nostr?


They're actually more similar than I expected. Nostr seems to have gotten lost in trying to be a decentralised store of everything [1], while BlueSky is focusing on Twitter-style social media and its challenges—moderation, portability, etc. (though I'm unsure if ATproto can support other types of decentralised content)

Nothing wrong with being a store of everything like Nostr, it's just that a totally open model of contribution with no direction or oversight (well, except fiatjaf) tends to end up bogged down following a million different and often competing ideas. It's very active, but moving in 196 directions: https://github.com/nostr-protocol/nips/pulls


https://github.com/nostr-protocol/nips/blob/master/26.md NIP-26 (and if there is anything else similar) is one to point to here.

Once you, the developer, are required to implement (thiskey === thatkey), a relay operator has to index the entire network before a post can be rendered in compliance with that NIP. It does bog things way down, as you mentioned.

This public key does not equal that public key, as much as we might wish it did.


> For example, if every time I post a new update on BlueSky, if I had to send my post to every single one of my followers’ repositories, that would be extremely inefficient, and make running a popular repository very expensive.

Of course it would be inefficient; the Internet is not a broadcast medium!

Which is why AFAIK typically the only thing that is pushed (in large scale applications) is the existence of a message (and maaaybe a short title), your "followers" then pull the message when they want to see it. (Some of course might be big enough "fans" of you that they decide to "subscribe", and set up an "AI" to automatically pull all of your messages as soon as published.)

So it seems to be a weird strawman of a solution nobody would even try to use at large scale, or am I missing something?
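The cost difference between the models being discussed can be sketched with back-of-the-envelope arithmetic. All the numbers below are made-up assumptions purely for illustration:

```python
# Illustrative (made-up) numbers comparing delivery models for one
# post from an account with many followers.

POST_BYTES = 2_000      # full post with metadata
NOTICE_BYTES = 100      # tiny "a new post exists" pointer
FOLLOWERS = 1_000_000
READERS = 50_000        # followers who actually open the post

# Naive push: the author's server sends the full post to every follower.
push_cost = POST_BYTES * FOLLOWERS

# Notify-then-pull: push a tiny notice to everyone; only actual
# readers fetch the full post afterwards.
pull_cost = NOTICE_BYTES * FOLLOWERS + POST_BYTES * READERS

# Relay/firehose model: the author's server hands the post to a relay
# once; consumers subscribe to the relay's aggregate stream instead.
relay_cost_for_author = POST_BYTES * 1

print(f"push:  {push_cost:,} bytes sent by the author's server")
print(f"pull:  {pull_cost:,} bytes sent by the author's server")
print(f"relay: {relay_cost_for_author:,} bytes sent by the author's server")
```

Under these assumptions, notify-then-pull is an order of magnitude cheaper than naive push for the author, and the relay model pushes almost all of the cost onto the aggregator, which is exactly the centralization concern raised elsewhere in this thread.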


Why not use ActivityPub?

ActivityPub is a federated social networking technology popularized by Mastodon.

Account portability is the major reason why we chose to build a separate protocol. We consider portability to be crucial because it protects users from sudden bans, server shutdowns, and policy disagreements. Our solution for portability requires both signed data repositories and DIDs, neither of which are easy to retrofit into ActivityPub. The migration tools for ActivityPub are comparatively limited; they require the original server to provide a redirect and cannot migrate the user's previous data.

Other smaller differences include: a different viewpoint about how schemas should be handled, a preference for domain usernames over AP’s double-@ email usernames, and the goal of having large scale search and discovery (rather than the hashtag style of discovery that ActivityPub favors).

from https://atproto.com/guides/faq


Using double-@ user identifiers for actor discovery is not an ActivityPub feature. It's a bastardization of WebFinger resource search, added and popularized by Mastodon.

In ActivityPub, URLs are the identifiers, be it for actors or anything else. And the URL can be just a domain, so it's perfectly possible to have domain-named identifiers.


Domain usernames are a very good point.


Until trademark laws come into play, and you find yourself obligated by law to give up your domain username to a big corporation. (A famous example in France is “Milka vs. Kraft Foods”, the court favored the big corporation’s registered trademark over Mrs Milka’s name.)

Granted, my comment doesn’t add much to the discussion, since this domain ownership issue would have been a problem in ActivityPub too.


Seems still better than the current status quo, where Twitter can just decide to give your handle to Kraft because they feel like it.

At least there's already a relatively established dispute mechanism for domains.


This is not a problem, because atproto has true account portability, so your domain getting seized is not the end of the world, unlike in ActivityPub.


> Until trademark laws come into play, and you find yourself obligated by law to give up your domain username to a big corporation.

This wouldn't be a big deal in practice (besides losing the domain). Domain usernames are just the combo of you telling Bluesky "I intend to use this domain name" and then you placing a TXT record on the domain to prove you own it. If you want to change domains (or, are forced to), you just give them the new domain name and you set another TXT record (just like if you had set up a domain name as a username for the first time). The underlying DID is still yours.
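The verification step described above can be sketched in a few lines. Per the atproto docs, a handle claims a DID via a DNS TXT record at `_atproto.<handle>` whose value is `did=<the DID>` (with an HTTPS well-known fallback). Here the DNS lookup itself is stubbed out with a dict so the example is self-contained; the domain and DID are invented:

```python
# Sketch of atproto handle resolution: a TXT record at
# _atproto.<handle> holds "did=<identity>". The DNS query is
# replaced by a hypothetical in-memory table for illustration.

FAKE_DNS = {
    "_atproto.alice.example.com": ["did=did:plc:abc123"],
}

def resolve_handle(handle):
    """Return the DID a handle claims, or None if no valid record."""
    records = FAKE_DNS.get(f"_atproto.{handle}", [])
    for value in records:
        if value.startswith("did="):
            return value[len("did="):]
    return None

print(resolve_handle("alice.example.com"))    # did:plc:abc123
print(resolve_handle("mallory.example.com"))  # None
```

Note the direction of the mapping: changing domains just means publishing the same DID under a new name, so the identity (and the data it signs) is unaffected by losing the old handle.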


Didn't GitHub or Twitter or someone force someone to give up @Kik or similar?


That was npm, which led to the left-pad incident.

https://news.ycombinator.com/item?id=11349870 - Kik, left-pad, and npm (2016)


No, npm themselves chose to take the name away from someone and give it to someone else.


That's also not how atproto domain usernames are implemented.

The identity is just a public key, basically. There's a resolution service which uses domains, but you can move the identity, with all its data, to another domain.


Well, now it's up to the person who owns milka.fr, so if it's Kraft... no issue. If it isn't and they complain, it has to go to litigation. And one would think they already own a domain for their products, so it should be all good.


With all due respect to Steve, he only demonstrates how the AT protocol ecosystem has ways of passing the buck from content creator to (uncreative but rich) middlemen, for dissemination, moderation, content curation, etc.

That is antithetical to what the indie web should be, in my opinion: allowing corporations (because, let's face it, they'll be the ones able to scale better) to control who sees what I post and whom I see is not what I want from my social media. We have enough of that as it is.


I am going to be honest, I do not see how I demonstrated that. Can you be more specific?


> if I had to send my post to every single one of my followers’ repositories, that would be extremely inefficient, [..] an additional kind of service, called a relay, that aggregates information in the network, and exposes it as a firehose to others

Therefore a company with the money/resources to build this relay will be in control of whether and how the firehose presents my content as a content creator/PDS owner. This will create the same imbalance of power you find today with YouTube/Twitch/Instagram/etc.

The solution is not to add intermediaries, but to make this dissemination to all of those followers less inefficient.

Designing a whole protocol to cater to the 0.1% of users that might reach a million followers is another level of premature optimization, in my opinion. For everyone else, sending requests to 10k followers should be perfectly reasonable to do with off-the-shelf computing, without requiring protocol-level intermediaries like relays.

AT protocol, in my opinion, caters to the "scale" crowd, and in doing that puts all the power in the hands of the few that can do it better. Instead of democratizing social media, which is what a federated protocol would imply, it actually maintains the status quo, albeit with different companies holding the power.


Okay, thanks. I think the fact that you can run your own relays mitigates this, but I can understand some skepticism.


Sure, but you're never going to do it better than the big guns, and most of the "consumers" of social media will probably use one of their relays instead of yours.

I think that the democratized model that ActivityPub encourages works better (even for those 0.1 percenters I spoke about earlier), but certainly for the average user, because it more or less equalizes the playing field. More than that, nothing in the ActivityPub protocol would prevent users/clients from using relays for disseminating content, "feed generators" for consuming content, or distributed moderation mechanisms for moderating. The limit is only in the imagination of the developers, not in the way the protocol is structured.

I will stop now; I feel pretty strongly about this problem, and I fear I'm coming across as more combative than I intend to.


> The Authenticated Transfer Protocol, aka atproto, is a federated protocol for large-scale distributed social applications.

Well, there's the problem: large-scale and social are incompatible. Humans evolved to be sociable with maybe a few hundred people.

No-one has ever made a good restaurant by inventing a system to deliver millions of calories per second to each diner.


The scale is the whole system, so many different groups of people.

To take your restaurant comparison, that'd be like saying McDonald's is an average-sized restaurant, which might be true for each individual McDonald's, but not for the entire McDonald's chain.



