I find this to be a particularly interesting question, not because I actually care about the answer, but because it feels like the data should be able to tell us the answer... My main goal was to tease out the ranking algorithm from the data in a simple and elegant fashion. This made it a little more interesting as an endeavor and hopefully makes it a more interesting read as well.
You're right, the details of the algorithm aren't hugely interesting and are generally available. The point here was to use the data to uncover it in a somewhat novel way. Figuring things out can be fun in and of itself, even if the answers are already available.
it's from 2345 days ago, which is < 7 years and I'm guessing it had a reasonable lifetime after that anyway - how come you're thinking that it's > 7 years old?
And my point wasn't so much about the actual algorithm either, just that sometimes a pinch of Google = a dollop of calculus :)
reddit's is open source and available for anyone to see. Knowing the algorithm doesn't help you game it in this case -- it's knowing the spam controls that does, and those are secret.
From the point of view of someone gaming HN, there's nothing to be gained by knowing the ranking algorithm. It doesn't matter if you know five flags will bump you off the front page, or that mods can adjust the position of your story, because that's not something that you can use to your advantage. The only thing you'd be able to do with that info is to use it maliciously, like flag a story off the front page. But everyone already knows that you can remove a story via flags, and it's pretty easy to figure out how many flags are required.
Honestly the argument "there are bad people, we need to worry about them" is getting tired. It was cool when pg was showing off Arc to the community. It felt like we were all discovering something new, and how to build a community together.
A person trying to game the front page could get all kinds of useful info from there. For instance how the flame detector and voting ring detector work and what penalties exist and when they trigger. The ranking algorithm is a lot more than a simple formula.
There was a flame detector (and presumably a voting ring detector) six years ago, but it wasn't revealed by pg showing off that algorithm. Dan could just redact whatever he's not comfortable showing.
This is the article that was discussed in yesterday's "The stories that Hacker News removes from the front page" [1]. After speaking with @dang, it sounds like what happened with the original submission was that a moderator accidentally put "(2010)" in the title and users flagged it because they incorrectly thought it was old. He invited me to resubmit the article today to allow for real discussion and to demonstrate that what happened to the first submission was accidental.
I know that this analysis will get less attention than the one from yesterday, but I personally find it far more interesting and hope that it can stand on its own merits. I'll be around to answer any questions that might come up.
Not by itself, but if the age suggests that the content is stale it could be. How the HN algorithm worked 7 years ago (assuming it has changed) isn't of that much interest, even though the analysis might be.
Flags are for content that's spam, off-topic, inappropriate. While you might claim that the content is dated, it would likely be better to post a comment providing a more recent source for the topic, not upvote the submission, and allow the community to decide for themselves what the best source of information is on the topic.
Flags are downvotes, given the lack of an actual downvote, and hence have much broader uses. What they should be used for is irrelevant to what they're actually used for.
Realize they are used as downvotes which is why using them as part of the ranking system makes no sense. Adding a downvote button also makes no sense. If you see something you do not want to see, but others do want to see it, click "hide" and move on.
That entirely depends on how you see yourself and other users as a member of the community. Do users have a way to influence what content the community has besides what they individually submit and upvote? Yes, flagging. Similarly with comments, we have more than one's own comments and upvotes, we have downvotes. Downvotes and flags both serve as useful signals to others what the community wants to discourage. Hiding / ignoring sends no signals, at least no different signals from never having seen the item at all.
This is a ridiculous position to take. The flagging operation is for things that violate rules and it severely punishes posts. It's not meant to be used as a tool for you to shove your opinions of what is worthy onto the community.
A post will die on its own if it fails to receive upvotes. If you think something else should be on the front page, you go and upvote something else or leave a comment on the article explaining why it is crappy (comments down-weight an article). If you can't find something else that's better, then move on and stop acting like some gatekeeper of worthy content.
If everyone behaved like what you're suggesting, the front page would just be a bland pile of the lowest common denominator content that displeased the fewest number of people.
The front page often is a bland pile of lowest common denominator crap. We couldn't even go a week without dumb political stories which are covered everywhere else (though charitably more because of miscommunication that it was meant to be an experiment and only a week long, still). Lots of people don't even look at it because they gave up, though I'm not exactly close to that point.
This is all beside the point that flagging is (no matter what it ought to be) an extra signal that has broader uses than merely spam/rule violating. I could also argue that "off topic", a use mentioned earlier and on the guidelines, is sufficiently broad and subjective that "something I think the HN community would be better off not discussing" fits "off topic". In any case the flagging mechanism is still there. The site does remove flagging privileges if you use it too often, so there is clearly a sense of how flags ought not to be used too often (or else you lose them) but that hardly influences why flags ought to be used.
Thanks! That's a good question. I feel like I've seen analyses like that before a few times but I'm having trouble finding one right now. I would guess that submitting around 8 EST is probably best in terms of getting the most views because you'll catch most of the US audience during the day. The probability of making it to the front page is another question though and it would definitely require looking at the data there.
I can actually think of some other relevant metrics here that I don't think I've seen quantified before. I'll probably play around with this a bit at some point and if the results are interesting then I'll write them up. If I do though then it will most likely be a few months down the line. I've already written twice as many HN meta articles as I was planning on and need to take a little break :-).
I did a brief analysis a few years ago after shoving the HN Algolia results into Postgres. IIRC, the optimal time (for highest median score) was on a Sunday afternoon or something like that. I figured that meant that you're not competing as much for views on Sunday afternoon and will get higher-on-average points. Then you may still be around on page 1 or 2 for the Monday morning rush and get a lot more traffic.
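For anyone who wants to try this kind of analysis without Postgres, here's a rough Python sketch of the same idea: bucket stories by weekday and hour, then take the median score per bucket. The input shape (a list of UTC datetimes paired with scores) and the sample rows are made up for illustration; they are not the Algolia data and say nothing about which slot is actually best.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import median

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def median_score_by_slot(stories):
    """Group (submitted_at, score) pairs by (weekday, hour) and return the median score per slot."""
    buckets = defaultdict(list)
    for submitted_at, score in stories:
        slot = (DAYS[submitted_at.weekday()], submitted_at.hour)
        buckets[slot].append(score)
    return {slot: median(scores) for slot, scores in buckets.items()}

# Hypothetical sample rows, invented purely to exercise the function.
sample = [
    (datetime(2016, 12, 4, 15, 0, tzinfo=timezone.utc), 3),
    (datetime(2016, 12, 4, 15, 30, tzinfo=timezone.utc), 9),
    (datetime(2016, 12, 4, 15, 45, tzinfo=timezone.utc), 40),
    (datetime(2016, 12, 5, 9, 0, tzinfo=timezone.utc), 2),
    (datetime(2016, 12, 5, 9, 10, tzinfo=timezone.utc), 4),
]

medians = median_score_by_slot(sample)
best_slot = max(medians, key=medians.get)
print(best_slot, medians[best_slot])
```

Median rather than mean is the point here: one runaway story shouldn't make a whole time slot look good.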
> it sounds like what happened with the original submission was that a moderator accidentally put "(2010)" in the title and users flagged it because they incorrectly thought it was old
Which bit? The 2010 thing is precisely what happened. It was a case of sleep deprivation, which is one lesson of how trying too hard to make this place good can mess with a person.
The other bit was just my attempt to explain why users might have flagged the post. User flags were what demoted its rank, and it isn't obvious why people flagged it. There's also the issue that meta posts aren't great for HN in the first place, but those rarely lack for upvotes.
It would be nice if there was a non-filtered view of HN available for users with a rep above 500, much like the "show dead" option for user comments that have been hidden. Basically allowing the submissions to be placed in the submission rank as if their ranking was not pinged due to users flagging the submission.
Understand, though, that I could easily see spammers using the data, and getting to 500 really should not take more than a month if you post a handful of comments a day; for example: 25 days, 5 comments a day, average of 4 upvotes per comment.
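The arithmetic in that example is easy to check. A minimal sketch, assuming karma grows by a constant number of upvotes per comment (the function name and the steady-rate assumption are mine, not anything HN documents):

```python
import math

def days_to_reach(target_karma, comments_per_day, avg_upvotes_per_comment):
    """Days of steady commenting needed to reach target_karma, assuming a constant rate."""
    karma_per_day = comments_per_day * avg_upvotes_per_comment
    return math.ceil(target_karma / karma_per_day)

# 5 comments a day at 4 upvotes each is 20 karma a day.
print(days_to_reach(500, 5, 4))
```

Halve the average upvotes and it takes 50 days, so the "under a month" estimate depends heavily on that average.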
Worth noting: with a rep of 500+, HN users are able to downvote comments.
Yeah, and I think the karma system works well overall. It isn't ideal for people that are significantly more consumer than producer of content, but I would guess that is the minority.
I feel there are a lot more consumers than you might imagine, myself included. I'm on HN constantly but I rarely ever comment. The karma system is not for everyone.
I'm not sure why that information would be valuable to spammers. It would only tell them that their posts are being flagged, but it wouldn't really help them get to the front page.
From the point a spammer is identified until their interaction with the system becomes a burden to deal with, it's better not to let them know that they're known to be a spammer. Craigslist does this by "ghosting" postings that are flagged as spam; that is, the spammer sees the posting as live, and maybe even gets an automated reply to see how/if they respond, but the general public does not see the posting.
If you don't need the filtering from the hivemind, you can read https://news.ycombinator.com/newest (I'm not sure if it has an RSS feed.) Warning: it's full of crap.
And you can enable the showdead flag. Warning: The worst crap is [dead] so this version has even more crap.
Anyway, there are from time to time some interesting articles that are unlucky and don't get even a vote, so after reading the front page please go to the newest page and try to find a hidden jewel there.
To be fair, "the feed" (I assume you mean the new feed) is populated by the hive; it's just not sorted by votes or any other filter, unless of course the mods remove/hide a submission for some reason.
It was asked by someone yesterday, but the question got lost in the noise, whether the voting ring detection extends to flagging rings (whether on posts or comments)?
It would be naive to assume that it doesn't happen...
> (= gravity* 1.8 timebase* 120 front-threshold* 1 nourl-factor* .4 lightweight-factor* .17 gag-factor* .1)
[0]: https://news.ycombinator.com/item?id=1781417
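For anyone who doesn't read Arc, here's a rough Python sketch of how constants like these could plug into a rank function. Only the constant values come from the quoted line. The shape of the formula (points minus one, raised to 0.8, divided by age in hours raised to gravity) and the way the penalty factors are applied are assumptions based on the commonly cited description of the ranking, and front-threshold* is left out because the quote doesn't say how it is used.

```python
# Values taken from the quoted line.
GRAVITY = 1.8
TIMEBASE_MINUTES = 120
NOURL_FACTOR = 0.4
LIGHTWEIGHT_FACTOR = 0.17
GAG_FACTOR = 0.1

def rank(points, age_minutes, has_url=True, lightweight=False, gagged=False):
    """Hypothetical rank function; the formula shape is an assumption, not confirmed by the quote."""
    base = points - 1  # assumption: the submitter's own vote is discounted
    if base > 0:
        base = base ** 0.8  # assumption: this exponent is not in the quoted line
    age_hours = (age_minutes + TIMEBASE_MINUTES) / 60
    score = base / age_hours ** GRAVITY
    # Assumption: penalties are plain multipliers, with the no-URL case checked first.
    if not has_url:
        return score * NOURL_FACTOR
    if gagged:
        return score * GAG_FACTOR
    if lightweight:
        return score * LIGHTWEIGHT_FACTOR
    return score

# The same story ranks lower as it ages.
print(rank(11, 60), rank(11, 600))
```

With a gravity of 1.8 the age term grows faster than the points term, so under these assumptions even a well-voted story sinks within a day or so.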