
I believe there’s currently a single relay. When there are multiple relays, each will have a different view of the network, and the one that includes more posts will be better than the others.

It might be quite hard to get relay diversity rather than a single dominant relay, sort of like attempting to get traction for an alternative DNS system. And if your posts don't get picked up by the dominant relay, it seems like you're out of luck?

But it’s still very early, so perhaps it won’t work out that way. And even if there is a dominant relay, well, GitHub isn’t so bad, and there are alternatives.



I have previously found Bluesky kind of confusing, but I've recently realized that it was mostly the terminology that was confusing me.

Imagine if you instead referred to the system as

[microblogs] -> [indexer] -> [client]

There's a lot more to it, but I think this is the basics of how Bluesky works. They just call everything something different than you might think it should be called.
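To make the pipeline concrete, here's a toy sketch of that [microblogs] -> [indexer] -> [client] shape. All names here are illustrative, not the actual atproto API:

```python
from dataclasses import dataclass

@dataclass
class Post:
    author: str
    ts: int
    text: str

# Each user's "microblog" is just their own append-only store.
stores = {
    "alice": [Post("alice", 1, "hello"), Post("alice", 3, "again")],
    "bob": [Post("bob", 2, "hi alice")],
}

def index(stores: dict) -> list:
    """The indexer merges every store into one global, time-ordered list."""
    merged = [p for posts in stores.values() for p in posts]
    return sorted(merged, key=lambda p: p.ts)

def timeline_view(indexed: list, following: set) -> list:
    """A client renders a view: only posts from accounts you follow."""
    return [f"{p.author}: {p.text}" for p in indexed if p.author in following]

print(timeline_view(index(stores), {"alice", "bob"}))
# ['alice: hello', 'bob: hi alice', 'alice: again']
```

The point is only that the "indexer" role is a merge-and-sort over independently hosted stores, and the "client" is a filtered presentation of that index.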

The indexer is only necessary because the user ids do not remain static on the system. Instead there is some weird cryptographic reshuffling going on where DID:PLC starts as a substring of a hash and then because of 'key rotation' the user id of every user actually changes out from under the user themselves over time.

Imagine a system such as Livejournal but your handle is constantly changing so you need an authority to keep track of the handle changing. And of course you are only allowed to post short messages with no titles or markdown support.

I'm personally waiting for an alternative to DID:PLC where you get a static pubkey and sign messages with the corresponding private key, because I think that'll be way less stuff to keep track of and it'll make self-hosting and local-first development much easier.


> The indexer is only necessary because the user ids do not remain static on the system. Instead there is some weird cryptographic reshuffling going on where DID:PLC starts as a substring of a hash and then because of 'key rotation' the user id of every user actually changes out from under the user themselves over time.

The user IDs (e.g. did:plc or did:web) do not change over time. You are correct that a user on did:plc has their identifier generated from a hash (of their initial repository state), but subsequent additions to or deletions from the repo don't change the DID. Likewise, if you set up your repository with did:web, the domain you pick is fixed, and you lose access to your repo if you lose control of that domain.
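For intuition, the did:plc derivation looks roughly like this. (The real method serializes the signed genesis operation as DAG-CBOR; JSON with sorted keys stands in here, and the operation's field values are made up for illustration.)

```python
import base64
import hashlib
import json

def derive_did_plc(genesis_op: dict) -> str:
    """Derive a did:plc-style identifier from a genesis operation.

    Sketch only: the real method hashes the DAG-CBOR encoding of the
    signed operation; JSON with sorted keys is a stand-in here.
    """
    serialized = json.dumps(genesis_op, sort_keys=True).encode()
    digest = hashlib.sha256(serialized).digest()
    # Lowercase base32, no padding, truncated to 24 characters.
    encoded = base64.b32encode(digest).decode().lower().rstrip("=")
    return "did:plc:" + encoded[:24]

# Hypothetical genesis operation; field names/values are illustrative.
op = {"type": "plc_operation",
      "rotationKeys": ["did:key:zExampleKey"],
      "alsoKnownAs": ["at://alice.example.com"],
      "prev": None}
print(derive_did_plc(op))  # did:plc:<24-char hash prefix>
```

Since the hash is taken over the *initial* operation only, later updates to the repo (or later key rotations) don't alter the identifier.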

What is changing with key rotation is the repo signing keys that authorize updates to the user repo. If you run an entirely self-hosted PDS then you can choose to never rotate that key, and then you have a "static private key" repository. But key rotation is a necessary part of the design, in part to enable account migration (you wouldn't want an old PDS to still be allowed to make changes to your repository, so you'd rotate those keys to remove the old server's key and add the new server's key).
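That migration story can be sketched as follows. (Real atproto uses secp256k1/P-256 signatures over the operation; HMAC with per-key secrets is a loudly labeled stand-in here, and the key names are made up.)

```python
import hashlib
import hmac
import json

# Toy model of rotation-key handover during account migration.
SECRETS = {"old_pds": b"s1", "new_pds": b"s2", "user": b"s3"}  # private halves

def sign(key: str, op: dict) -> str:
    payload = json.dumps(op, sort_keys=True).encode()
    return hmac.new(SECRETS[key], payload, hashlib.sha256).hexdigest()

def apply_rotation(authorized: list, op: dict, key: str, sig: str) -> list:
    """Accept an op only if signed by a currently authorized rotation key."""
    if key not in authorized or not hmac.compare_digest(sig, sign(key, op)):
        raise ValueError("op not signed by a current rotation key")
    return op["rotationKeys"]

authorized = ["old_pds", "user"]            # keys authorized today
op = {"rotationKeys": ["user", "new_pds"]}  # migration: swap PDS keys
authorized = apply_rotation(authorized, op, "user", sign("user", op))
print(authorized)  # ['user', 'new_pds'] -- old PDS can no longer sign updates
```

Note the DID itself never appears in the update: only the set of keys authorized to act on the repo changes.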


I 100% agree with you, but I think it's an issue of terminology.

I find myself thinking of the pubkey as the id and the DIDs as an alias or a name pointer towards that id.

I think the indexer would need to be less robust if there was no key rotation, and I'm looking forward to that feature.


The comparison that comes to mind: imagine if GitHub issued you a keypair to use over SSH to push/pull from a repo, but periodically changed that keypair for you and also kept track of it for you.

I'd much prefer to generate my own SSH key and upload the public key to GitHub, GitLab, Sourcehut, Codeberg, etc. Forgive me if I forgot a forge.


A useful way to think about it is that the PDS is a "user agent" - they act on behalf of the user, and with the user's implicit trust. This is much the same way that a user trusts webserver software (and the VPS running it) to correctly serve their website, hold onto the private keys for the site's TLS certificates, use those private keys to correctly set up encrypted connections to the website, etc.

The AT Protocol itself does technically allow for all private key material to be held at all times by the user, but that means the user needs to be interactively involved in any operation that requires making a signature (and online / available whenever such an operation is necessary, which isn't necessarily the same times as when the user is interacting with a particular app like BlueSky running on ATProto). The PDS software that BlueSky have implemented instead requires (IIRC) that the PDS has the private key for at least one of the public signing keys (but there can be multiple signing keys, so the user can also hold onto one locally).


Yeah, I hear you about the names. When talking about this on bluesky, someone came up with "data store -> indexer -> view", which I think fits a bit nicer too.

https://bsky.app/profile/aparker.io/post/3km6zvwfen727


"data store -> indexer -> view" works very well too. I hadn't seen this post, I guess we were thinking simultaneously.


I would expect clients (App Views) to be able to use multiple relays in parallel if needed. That would be like having a local GitHub UI pulling data from several GitHub instances (“relays”) that may host different but potentially overlapping sets of repositories.

This would be like a Usenet client aggregating newsgroups from several NNTP servers.


More like a local git repo having multiple remotes?


Yes — the canonical source of truth for a user's data is their "data repository" which is analogous to a git repo. These can really be passed around any way you want. Our current relay architecture was designed to support large, real-time scale, but other configurations are possible as well, since all the pieces here are composable.


I would expect every relay to have everything (modulo different opinions about illegal content). Why do you think they would have different views of the network?


Well, they'll try. The relay plays a similar role to a search engine, though without the ranking. Why do different search engines have different content?

As others have said, spam and illegal content are the most likely reasons.

We might also compare to email delivery; maybe ideally every message would be delivered, but in the real world, they aren't, and open relays are a hazard.


It's virtually guaranteed that different relays will take different views regarding disinformation, harm, etc.

This is probably good, too. I personally don't want my hand held, but recognize that others justifiably want someone else to filter out gore, or even opinions they find sufficiently off-putting.

I don't want to ram particular content down anyone else's throat, but I'm 100% opposed to anyone making that decision for me.


My understanding is that relays will not filter anything except really illegal material (child porn). Moderation is based on labeling, not deleting.


Unless they want to provide free file storage to the world... surely they must delete spam.


Is it possible for a relay to filter individual posts, or do the Merkle trees prevent that? (They would if they need to be signed by the originating user.)
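(For intuition: records are content-addressed and the user signs over a Merkle root, so a relay can decline to forward individual records but can't forge the rest; a consumer who checks the received records against the signed root can detect the gap. A toy sketch with a plain binary Merkle tree, not atproto's actual MST:)

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list) -> bytes:
    """Root of a plain binary Merkle tree (last node duplicated if odd)."""
    nodes = [h(leaf) for leaf in leaves]
    if not nodes:
        return h(b"")
    while len(nodes) > 1:
        if len(nodes) % 2:
            nodes.append(nodes[-1])
        nodes = [h(nodes[i] + nodes[i + 1]) for i in range(0, len(nodes), 2)]
    return nodes[0]

posts = [b"post-1", b"post-2", b"post-3"]
signed_root = merkle_root(posts)          # what the user's commit attests to
relayed = [p for p in posts if p != b"post-2"]  # relay drops one post
assert merkle_root(relayed) != signed_root      # omission is detectable
```

So filtering individual posts is mechanically possible; the signatures just mean the omission is visible to anyone who bothers to verify.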


I'm part of a marginalised group that is often considered to be illegal in many countries, or at the very least, very off-putting. (In 50 years, perhaps, things will change and we will be more accepted. To be specific, it's a grey area in the U.S.)

I'm interested in BlueSky but the relays are a worrying "point of failure" if they can just block me from there. I'll be more interested if there are multiple relays that I can use in parallel, with at least one being friendly to us, so I can still participate. Currently on the Fediverse it mostly works since we can connect directly to other servers, and there's no SPOF. But I would love to use BlueSky due to the better protocol design! Just more relays would be good.

... Wait, can we self host relays? And is it plausible for a community to do that? I couldn't find anything about it.


You can host your own relay, but if you want accurate likes/replies/etc. you need to crawl the entire Bluesky network, which could become quite expensive in the future.



