This seems like a fair point, and it makes me sad. I guess to the second part of my question: How would you build a community (or set of communities and identity systems) at scale that doesn't suffer from this? Such that you could point someone to it without destroying it.
I think you'd want a comment ranking system that rewarded people for voting according to the thought that went into a comment rather than whether you happen to agree with the comment.
Maybe a two dimensional voting system, one for "quality" and one for "agreement" (literally a 2D voting arrow widget? up-right means "high quality and I agree", down-right means "low quality but I agree", etc).
It would then be pretty simple to see whose "quality" and "agreement" votes don't strongly correlate, as well as who writes quality comments, and weight their votes more heavily.
If you only had one-dimensional voting like everywhere else, then I'm not sure how to do it, but maybe it's possible.
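For the two-dimensional version, here is a minimal sketch of the correlation idea, assuming each vote is stored as a (quality, agreement) pair of +1/-1 values. The names `pearson`, `vote_weight` and `DEFAULT_WEIGHT` are made up for illustration, and the "1 minus positive correlation" weighting is just one plausible choice, not a tested scheme.

```python
from math import sqrt

# Assumption: weight given to users whose correlation can't be computed yet
# (too few votes, or they always vote the same way on one axis).
DEFAULT_WEIGHT = 0.5

def pearson(xs, ys):
    """Pearson correlation of two equal-length series, or None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)

def vote_weight(votes):
    """votes: list of (quality, agreement) pairs, each value +1 or -1.

    Users whose quality votes just track their agreement votes get a
    weight near 0; users whose two votes are unrelated get a weight near 1.
    """
    r = pearson([q for q, _ in votes], [a for _, a in votes])
    if r is None:
        return DEFAULT_WEIGHT
    return 1.0 - max(r, 0.0)  # only positive correlation is penalised

independent = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
partisan = [(1, 1), (-1, -1), (1, 1), (-1, -1)]
print(vote_weight(independent), vote_weight(partisan))
```

A real system would presumably also want a minimum vote count before trusting the correlation, and some way to fold in "who writes quality comments", which this sketch leaves out.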
I've always thought Slashdot's old moderation system was pretty well designed. It's been years since I've been there, but as I recall, there were several categories of upvote -- Interesting, Informative, and Funny come to mind. There was a meta-moderation system to calibrate the moderators. The site was better than HN at showing only upvoted comments by default, in case you just wanted to see the highlights. (Oh, and I think it would send email for belated replies, making it better for ongoing discussion -- something I've definitely missed here at HN.)
Although the site is pretty dead now -- it doesn't have HN's advantage of a wealthy benefactor keeping it ad-free -- its original incarnation had some good ideas.
There are users who won't use the voting system the way its designer intended. The only way to enforce this would be to appoint a team of moderators who read every comment and rate it themselves. Obviously this is problematic, because users might perceive the moderators as authoritarian and the potential for abuse is pretty high. Still, this could definitely increase the maximum size of a high-quality community to something like 10,000 users, but that's still far away from 1 million or more.
I think that can be solved if a majority of early users use the voting system in the way it was intended. Say there's a cluster of 75% "good" initial users who vote similarly on quality. Any new users that vote similarly to them on quality would also be classified as "good" and be given more weight.
Even if the majority of new users don't use the voting system correctly, the system could be weighted more heavily toward the group of existing and new users who agree on what quality is, even if that group ends up being a minority of users.
Actually, I think even if a majority of users don't vote in the way that was intended from the beginning, you might still be able to handle it, because you can throw away the users whose "quality" and "agreement" votes are most correlated. I think that is the most likely way users would deviate from the intended voting system.
Now, if you had a majority of users who all voted the same but were not correlated with "agreement" (e.g. "vote according to the day of the week" or something), or if only very few users were voting correctly, then it might be difficult to distinguish, but that seems unlikely.
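A rough sketch of the seed-cluster weighting described above, assuming quality votes are stored per user as a mapping from comment id to +1/-1. The function names, the `min_overlap` threshold and the rescaling are all hypothetical choices for illustration.

```python
def seed_consensus(seed_votes):
    """seed_votes: {user: {comment_id: +1 or -1}} for the trusted early users.

    Returns {comment_id: +1 or -1}, the majority quality vote per comment,
    skipping comments where the seed users are tied.
    """
    totals = {}
    for votes in seed_votes.values():
        for comment_id, vote in votes.items():
            totals[comment_id] = totals.get(comment_id, 0) + vote
    return {c: (1 if t > 0 else -1) for c, t in totals.items() if t != 0}

def agreement_weight(user_votes, consensus, min_overlap=3):
    """Weight a user by how often they match the seed consensus.

    A user who matches every time gets 1.0; a user who matches half the
    time or less (no better than a coin flip) gets 0.0. Users with fewer
    than min_overlap shared comments get 0.0 until more data arrives.
    """
    shared = [c for c in user_votes if c in consensus]
    if len(shared) < min_overlap:
        return 0.0
    matches = sum(1 for c in shared if user_votes[c] == consensus[c])
    return max(0.0, 2 * matches / len(shared) - 1)

seeds = {
    "a": {1: 1, 2: -1, 3: 1},
    "b": {1: 1, 2: -1, 3: 1},
    "c": {1: 1, 2: 1, 3: 1},
}
consensus = seed_consensus(seeds)
print(agreement_weight({1: 1, 2: -1, 3: 1}, consensus))
print(agreement_weight({1: -1, 2: 1, 3: 1}, consensus))
```

Because the weight depends only on agreement with the seed group, it keeps working when that group becomes a minority, which is the case the comment is worried about. It would not catch the "everyone votes by day of the week" failure mode.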
Moderation and a moat. Metafilter has survived for a very long time with (a) a $5 charge to create an account and (b) a 24-hour cooling off period, so you couldn't make an account to do a driveby comment on a thread.
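The moat part is simple enough to sketch. This is not Metafilter's actual code, just an illustration of the two conditions named above; `can_comment`, `fee_paid` and `COOLING_OFF` are made-up names.

```python
from datetime import datetime, timedelta

# The 24-hour figure comes from the comment above; everything else is assumed.
COOLING_OFF = timedelta(hours=24)

def can_comment(account_created_at, fee_paid, now):
    """Allow commenting only for paid accounts past the cooling-off period."""
    return fee_paid and (now - account_created_at) >= COOLING_OFF

created = datetime(2020, 1, 1, 12, 0)
print(can_comment(created, True, datetime(2020, 1, 1, 18, 0)))   # 6 hours in
print(can_comment(created, True, datetime(2020, 1, 2, 12, 0)))   # 24 hours in
```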
It turns out that you really can't have both community and scale. The attempts age differently, but they inevitably turn into something the core membership doesn't want to be a part of, and so the conditions that enabled the community's existence stop holding. Its A people move on and the B people move in. This is true of every human group, subculture, or social phenomenon, really.
I strongly recommend reading Clay Shirky's commentary on the then-nascent phenomenon of "social software" back in 2003:
The Decline, the formation of cliques and factions, incidents of abuse, of intellectual violence and namecalling. The software becomes encrusted with patches and extensions, the unwritten rules are flouted regularly and the meta-rules all but forgotten. It is a time of either shrinking membership, or overwhelming growth.
The Fall, an incident, whether social or technical that makes everybody realize that things aren't like they used to be. It usually leads to a revision or addition to the software as this is the easiest thing to fix.