News Feed FYI: Further Reducing Clickbait in Feed (fb.com)
286 points by frostmatthew on Aug 4, 2016 | 155 comments


This reminds me of FB tacitly approving of those overly aggressive social game ads / updates for a while, until FB itself became strong enough to not need them anymore.

It also reminds me of Twitter encouraging 3rd party developers, then cutting them off once they became strong enough to be alright without them.

Clickbait articles surely padded FB's bottom line, but now that FB has become strong in the news feed / media source space, perhaps it feels that it can separate itself from the devil's bargain.

Yes, I hate clickbait just as much as the next HN'er (and would like to not consider it in the same vein as Twitter API use), but putting aside the nature of the content, the power dynamics at play seem analogous.


You're right that Facebook is starting to exercise its strength in unhealthy ways, using companies until they decide to discard them. Most of these clickbait sites are getting traffic by paying celebrities to post the articles on their Facebook pages, cutting FB out of the revenue loop (example: [1] data: [2]). They have come to realize that they aren't going to get these companies to pay their extortionate rates for boosting page posts, so now they are moving to crush them instead. While this specific move may turn out to be a net positive for users, they are sending a clear message: pay us or we'll kill your business.

Facebook's behavior is getting more and more territorial, anti-competitive, and revenue driven. They are turning into the scary behemoth that many people feared they would.

[1] https://www.facebook.com/georgehtakei

[2] https://medium.com/@jbialer/influencers-partnerships-are-vir...


I'm having trouble seeing how the scenario you describe is 'unhealthy'. The behaviour you describe, driving clickbait into people's feeds, is essentially the same kind of thing as email spam: using a loophole to drive the cost of reach as near to zero as possible in the hopes you'll make it up on the few suckers who click. If this isn't good for the user, how is it unhealthy for them to curb it?

And while I'd rather they be paying me to have my eyeballs available to them, driving up the cost of my eyeballs still benefits me even if I do not monetarily profit from it.

A business doesn't have an inherent right to exist. If it exists because it annoys people in a particularly profitable way I don't know where this sympathy is coming from.


> I'm having trouble seeing how the scenario you describe is 'unhealthy'

I wasn't referring to killing clickbait as being unhealthy. I was referring to the practice of implementing policies that kill businesses because FB isn't getting their cut. There have been several much less cut-and-dry situations - for example, the drastically reduced organic reach of Facebook page posts for small businesses.


Right, but why is that unhealthy? If the cost of advertising on facebook is zero or near it, users suffer very badly (as on any platform). Someone has to force the cost up to make the platform livable, and the only party that realistically can in this transaction is facebook.

Like, I get why other things named in this thread are unhealthy. Particularly twitter pulling the rug out from app developers (which facebook also did, and I think was also relatively unambiguously unhealthy). I am not saying that all control of a platform is necessarily healthy, but you seem to be saying that it is unhealthy because it made an action that most see as undesirable unprofitable (thus causing businesses to fail).

Thus, you seem to believe that these businesses have some kind of inherent right to succeed on the good graces of another business allowing them to. I'm asking you to explain how that's in the public good.


It's unhealthy when they allow it for a long time, benefit from it, and then pull the rug out from under them once they believe it is in their financial best interest to do so. It's exactly the same as your Twitter and Facebook developer examples.

Developers were monetizing the Twitter platform until Twitter pulled the rug out. Facebook has similarly neutered their API to the point where it is nearly useless, and many businesses were crushed in the wake of the changes. You seem to be saying that it's only unhealthy when developers are affected, but that isn't the case. Developers have no "inherent right to succeed on the good graces of another business allowing them to" either, yet you see that as unhealthy.

Publishers and developers are just users of the system, getting crushed when Facebook doesn't believe it's getting a big enough cut. I would also argue that Facebook actively blocking the sharing of external content that its users choose to share with each other simply because they aren't getting their cut is different than the affected businesses believing they have an "inherent right to succeed on the good graces of another business". Businesses that don't directly do anything on Facebook will be affected by this, because if their visitors wish to share their content, they will likely attempt to share it through Facebook.


It doesn't affect users though, it affects pages. Articles shared by users as a post don't seem to be affected, it just ranks pages that consistently post clickbait lower.


It's not just pages. The blog post said that links to domains that tend to publish clickbait will rank lower. In other words, if an article I share happens to reside on a domain that they have classified as a clickbait source, most of my friends won't see it, even though I went to the effort of sharing it and wanted them to see it.


> Facebook's behavior is getting more and more territorial, anti-competitive, and revenue driven. They are turning into the scary behemoth that many people feared they would.

I do not fear whatever Facebook turns into because I can decide to stop using it. If they want to continue doing evil things, I can stop logging in, no? Why would someone fear that FB is turning into something?


Let's say you run a restaurant. Facebook decides that they aren't getting enough money from local restaurants, and thus choose to greatly reduce the exposure of content that users share with their friends about restaurants that aren't paying them. You aren't even a Facebook user, but your competitors are and do pay them. Mentions of their restaurants actually get higher visibility than normal content and go viral, while mentions of your restaurant get no exposure at all. Your restaurant dies, while your competitors thrive.

It doesn't matter if you log in or not. Facebook policies can affect the fortunes of nearly every type of business.


This seems like basic market economics at work; as a restaurant owner, you have to make a choice - do I make more money back by paying Facebook their fee or by not paying them. If the answer is that I make more by paying them, then obviously they are providing a service that is worth what they are charging. If not, then it is not worth it and you don't pay it.

I am really struggling to see the problem with what you describe; how is it any different than advertising anywhere else? You decide if the cost of the ad is worth the ROI. Part of that decision includes knowing that your competitors might also advertise.

Does Pepsi get mad that Coke spends millions of dollars on Super Bowl commercials so they have to, too?


> I am really struggling to see the problem with what you describe; how is it any different than advertising anywhere else?

Facebook holds a near monopoly on social networks, and it's exploiting that to extract rents in the form of advertising.


The network that displays the Super Bowl doesn't get to decide which brand people talk about after the game. Facebook is using News Feed to control conversations and narrative by controlling what is seen and not seen.


> Does Pepsi get mad that Coke spends millions of dollars on Super Bowl commercials so they have to, too?

Probably. Advertising is somewhat analogous to mutually assured destruction.


While I see your point, I would believe this more with a website that was specifically used to search, aggregate, recommend, and communicate with restaurants (or other businesses). When did we start using Facebook to search for restaurants, movies, malls, shopping? I honestly do not consider FB as a source to find such places -- Google Maps (or just any search engine people use), Yelp, or whatever else serves its purpose much better, wouldn't you say?

Is a crowd that uses FB the most also the one with enough spending power to have such an influence?


Where I live now (Philippines) this is exactly how Facebook is used. Local businesses don't even put up websites many times because it's far more important to have an updated and active Facebook page.

Additionally, all sectors of the market use it here.

However, I definitely recognize this isn't the case in the US and many European countries.


This is no different to any other form of advertising in history.


No, it's not. As the preferred social sharing platform, they control conversations about things by controlling the exposure that various shares get. That is entirely different from everything that came before it.


"Let's say you run a restaurant. Facebook decides that they aren't getting enough money from local restaurants, and thus choose to greatly reduce the exposure of content that users share with their friends about restaurants that aren't paying them. You aren't even a Facebook user, but your competitors are and do pay them. Mentions of their restaurants actually get higher visibility than normal content and go viral, while mentions of your restaurant get no exposure at all. Your restaurant dies, while your competitors thrive."

This is no different to word-of-mouth reviews that came from people who found out about a restaurant from a paid advert in a newspaper in 1950. If your restaurant didn't list in the paper (i.e. you didn't want to pay the premium for the advert) then you were at an immediate disadvantage.


Presumably as it became apparent that Facebook was squelching conversations about businesses that had not paid the tax, people would learn to have those kinds of conversations elsewhere instead.


Well, yes, but that's sort of like saying "I don't care what they teach in public schools, I can choose a private school for myself or my family." In the end, since the majority of people are consuming that thing you are rejecting, there's a high likelihood of it affecting you in some manner anyway.

That is, FB is a behemoth, is a major source of current events for a large swath of people, and their decisions unfortunately may have wider consequences than we or they necessarily predict.


I must be missing out on FB because I do not use it for searching products, concerts, movies, news, restaurants, etc. It only shows me updates from family, specific businesses I have "liked", etc. But perhaps your point is that many people spend hours on it instead of news sites, and thus see content shared via friends' walls instead of the main pages of news websites?


> But perhaps your point is that many people spend hours on it instead of news sites, and thus see content shared via friends' walls instead of the main pages of news websites?

Yes. I don't use Facebook, but my significant other does, and of the few news items here that I think are relevant to her, she usually sees them shared through someone's wall post 12-48 hours later. For some people it's a giant peer-to-peer news sharing site, and it's the primary source of news they have. That's an extremely powerful position for Facebook, which is also why they got into so much trouble when it came out a while back that they were tweaking people's feeds slightly based on political views.


Are you sure you're blocking all their cookies? Any site with a like button or Facebook comments will track you and use it to monetize your visits.

Ultimately it's like saying you aren't concerned about drug laws because you aren't in the drug economy... It has a high possibility of affecting your life in some way regardless.


If PrivacyBadger blocks them, then hopefully those are.


Cause you assume that you can just stop using it. If enough companies/people build in hooks on mobile or across the web or in any of the other million places that they are integrated (VR etc...), you're part of Facebook whether you want to be or not.


You can stop logging in, sure, but they still own all your data


> While this specific move may turn out to be a net positive for users, they are sending a clear message: pay us or we'll kill your business.

I mean... facebook needs to make money too. These clickbait companies are making money off of facebook's audience, facebook's hardware and software... it's only fair that facebook gets a cut, isn't it?


How many years has it taken them to get to this point? It's not about money.


You're right about the power dynamics. On the other hand, this is good for consumers, and perhaps for journalists and content providers who don't engage in clickbaity content. In other words, this may drive the news industry to be more substantive, and that's a good thing for consumers.

So I guess what I'm saying is I don't mind when force is used for good. On the other hand, this strength, as you imply, can also be used to conduct undesirable behavior on their part.


You're absolutely right that this is a similar dynamic. At this point this is FB's MO and you have to be a fool not to anticipate it as inevitable in each ecosystem/market they develop.


They did the same thing with overly aggressive affiliate advertising in the past, when they first opened up their self-serve advertising platform.


That's a weird analogy. It's not like there was a lack of content (good or bad) competing for attention before...

I think a better analogy is "we have now implemented a spam filter. F U spammers."


Or: "we have now implemented a spam filter. F U spammers, your ad revenue isn't worth it anymore."


Does it matter? Oftentimes people do things that are socially positive for purely selfish reasons.


Clickbait is just a conspicuous example of a more general problem: sensational and scaremongering news that there is no conceivable benefit in knowing. This category encompasses most political news, and the daily litany of misery, crime and terrorism that fills the headlines. I think FB should filter it all out, and stop news organizations lining their pockets by making us unhappy.


It's just that it's very hard to determine what's 'wrong' and what's not, other than in very extreme cases, and even then you might disagree. Banning some media because you don't consider it right/worthy/beneficial/whatever doesn't help with free speech at all.

Imagine someone banning Justin Bieber music because it's not 'worth listening', or banning some books because they have no value. That's a bit absurd.


The way Facebook chooses articles for its feed has no effect on anyone's free speech. Comparing it to banning books is silly.


"Experimental evidence of massive-scale emotional contagion through social networks" [0]

"Was it funded by the US army? First the university said yes" it's US led information warfare [1]

[0] http://www.pnas.org/content/111/24/8788.full [1] https://www.theguardian.com/science/head-quarters/2014/jul/0...


This always depends on which receiving end you are on. If your book/blog/status is being _hidden_ by the news feed, then you'll be screaming for freedom of speech, for fairness in how people see their news feed, and for letting them decide what is worthy of their attention.


He did say filter, and not ban. To me that implies it's still accessible, but perhaps hidden by default in my feed.


Lots of companies "ban" Justin Bieber, in that there are lots of places he and his music are not found.


People obviously want to read it, because they click on it.


People obviously also want heroin, because if you give it to them, they'll take it and ask for more.

Heroin induces desire for a harmful substance by exploiting the chemistry of our brains. News platforms induce desire for harmful information by exploiting cognitive biases and heuristics.


This is true... but what do we do about it? Outlaw click-bait? That will work even worse than outlawing heroin has.... at least heroin is fairly easy to define and identify.

Click-bait is such a nebulous term that anyone can classify articles they think are vapid and pointless as click-bait, while someone else might find them informative and interesting.


This strikes me as a uniquely Californian point of view.


I'm curious why FB chose to manually label the data vs collecting feedback from users, e.g. via a dialog shown to a large sample of users and articles asking whether the article was clickbait, or asking users to rate it, or some such.

The benefits of this approach, besides a lesser degree of tedious manual data review and entry up-front, include: ease of retraining the system as clickbaiters inevitably adapt in this new arms race (you can regularly gather fresh user feedback and feed the updated corpus and labels into the system), and perhaps also closer affinity with what users actually think qualifies as clickbait (as opposed to FB's internal definition). This sort of approach may also lead to more differentiated filtering on a personalized basis or an affinity-group basis... i.e. they'd have the opportunity to model user-behavior features and create differentiated filters based on user behavior/preferences.

I'm sure there were very good reasons for going this way, so I'm not second-guessing. Just curious what the tradeoffs were in the decision, if anyone knows or can make educated speculation.


Crowdsourcing to the users is an interesting alternative. My startup actually considered doing just this in our browser plugin [1]—adding a feature that lets people assign a clickbait rating to a link. We would then get a crowdsourced rating for links, and low-rated links would be grayed out for our plugin users. We ultimately hoped to release the data on rankings publicly, so that a predictive algorithm could be created and used by others as well. This feature aligns moderately with our mission/product, which is about reading efficiently on the web. But we've had other priorities so far and haven't built it yet.

I think the reason that crowdsourcing wouldn't make as much sense for FB is that their audience is less early-adopter than ours. Some people wouldn't know what clickbait is. It's not just about whether you like the article or not—it's specifically about whether the headline mischaracterizes or inappropriately teases the content. FB probably decided that they wanted to train the algorithm carefully, so they used an internal team instead of a crowdsourced solution.

1: https://chrome.google.com/webstore/detail/beeline-reader/ifj...


> Some people wouldn't know what clickbait is

The term "clickbait" need never be used. One could describe the characteristics of clickbait and ask about them. "Do you feel this article was meaningful?" "Did you feel this article required to click too many times?" "Did you feel this article had a misleading headline?" Or whatever more refined version of these questions might make sense.


I suppose if you were to ask several questions you could avoid this term. But none of the above questions, taken alone, captures the essence of clickbait. Not sure how many people would be willing to answer 2-4 questions many times over. And people who are familiar with the term clickbait might wonder "why are you beating around the bush—just ask me if it's clickbait already!". And as others have pointed out, clickbait creators might try to find ways to game the survey system to make it less effective.


You wouldn't need to ask questions in succession to the same user. If you have, say, 4 questions you can ask 1 of them to each of 4 users.


Most people don't consciously think "News Feed is full of clickbait"; instead they value the overall experience less, e.g. "I find News Feed less entertaining/worthwhile than BuzzFeed". They aren't aware of the curation going on under the hood, and soliciting feedback is considered more disruptive than anything.


I met someone who was somewhat internet-addicted to buzzfeed. They would spend hours a day going through buzzfeed pages and get a tremendous amount of enjoyment from it. So it's understandable that Facebook finds (no matter how hard they have to look for it) the value that users see in such articles and implements a filter that retains as much of it as possible.


I can see that as a reasonable concern. Two points though:

1. Speaking personally as a user, I wouldn't mind this a bit. And would actually love the ability to provide feedback more readily. I often hate the content of my feed and I would feel much better if it seemed like I had greater input on its filtering.

2. Whether gathering feedback has a negative effect or not seems like a testable hypothesis and this could be tried and measured, rather than simply speculated about.

Given the above, I'm not sure if this would be a rigorous rationale for avoiding the active feedback experiment.


re: #1, you can. Each post on your feed has a "show less like this" option on it.


It would be gamed. The people who have the most incentive to take part in a "crowdsourced" labelling exercise to reduce clickbait are, of course, the people generating the clickbait.


Not if they have no control over who is asked to label. You can evaluate how trustable the labels are by manually reviewing a few random ratings and extrapolating to the entire population.
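
To make that concrete, here is a toy sketch (the sample size, names, and the normal-approximation interval are just illustrative): audit a random subset of the crowd's labels against a careful manual review, and put a confidence interval around the agreement rate before trusting the rest.

    import math
    import random

    def audit_labels(crowd_labels, trusted_label_fn, sample_size=200, seed=0):
        """Estimate how trustworthy crowdsourced labels are from a random audit.

        crowd_labels: dict mapping item_id -> crowdsourced label
        trusted_label_fn: function item_id -> label from a careful manual review
        Returns (estimated accuracy, 95% confidence half-width).
        """
        random.seed(seed)
        sample = random.sample(list(crowd_labels), min(sample_size, len(crowd_labels)))
        agree = sum(1 for item in sample if crowd_labels[item] == trusted_label_fn(item))
        n = len(sample)
        p = agree / n
        # Normal-approximation 95% interval; fine for a back-of-envelope audit.
        half_width = 1.96 * math.sqrt(p * (1 - p) / n)
        return p, half_width

    # Hypothetical usage:
    # acc, ci = audit_labels(labels_from_users, my_own_careful_label)
    # print(f"crowd labels agree with manual review {acc:.0%} +/- {ci:.0%}")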


>I'm curious why FB chose to manually label the data vs collecting feedback from users.

FB is collecting data from users in a way, though it is not explicit feedback. Since it is a learning process, asking for feedback on each post would be obtrusive.

For more details, please refer to the following line in the article: "To address clickbait headlines, we previously made an update to News Feed that reduces the distribution of posts that lead people to click and then quickly come back to News Feed."


I did read that. They note that this passive feedback / preference inference was insufficient in terms of achieving their aims. Which is why I'm asking about potentially gathering active feedback / explicit preference.


you know how facebook has been in the news for manipulating news? Now they can legally do that.

They're telling you upfront that some human has influence over the flow of information to users on their platform. This is marketing it as a feature, giving them full legal immunity.


While they're at it, can they do something about the "slideshows" that consist of 8 sentences spread across 25 slides?


Publishers love this. Especially when it causes ads to reload between each click. While we're at it, why don't we add a special "ad slide" that's seamless.


What's so horrible about them? I don't have to go anywhere off the site. I get the simple version of the story (which oftentimes would otherwise be a 500-word piece overexplaining it) and it doesn't require sound.


"It is not a news story, but it is a slideshow" http://youtu.be/IVfslRsNXUc


This ... I ...


Totally agree with ...


... this sentiment.


I can't help but feel these engineering challenges will always be spinning their wheels against human nature. It's your "friends" -- after all -- who are sharing these stories.

I guess it's easier for Facebook to fix its algorithm than for most of us to fix our friends.


Think of it like putting fruit instead of potato chips in the vending machines - you can't change the fact that people like potato chips, but you can make it easier for them to make healthy choices.


Yeah, my general policy is if someone consistently posts stuff like this, either lower their frequency in my Feed or Unfollow altogether.

Similarly, see memes.

Been tempted to remove if they also consistently post severely biased sources, but then I'm culturing my own isolation bubble which I regularly rail against, thus guilting myself into tolerating and trying to understand these individuals.


Before lowering someone's frequency or unfollowing I hide all content from certain sources. My feed is entirely clear of HuffPo and Buzzfeed (neither is worth seeing the rare good content for all the clickbait), among many many other sources. Someone only gets unfollowed if they still manage to keep finding new annoying websites to post clickbait from.


FYI, you can also block pages. It's how I got a lot of the Bernie Sanders spam out of my feed during the primary season, and I didn't even have to unfollow my friends to do that. Unfortunately, it didn't get all of it, because you can't block groups, so every time one of my friends shared a post from a Bernie group, I had to block the individual person who posted it to the group.


How do you lower someone's frequency in your feed?



Human nature reacts to the environment. Change the environment (by reducing clickbait prevalence), and the observed behavior will change.

Your last point hits the nail on the head. Changing core aspects of human nature is not really feasible, so we have to modify the environment or context in some way to better accommodate those aspects of human nature.


Maybe there should be a feature to auto-unfriend someone if I thumbs down their posts a certain number & proportion of times?


You can also "like" pages and see content from non-friend organizations in your feed.


I find this really interesting. There are so many new 'media' companies that have popped up and been able to grow because of the effectiveness of the "You won't believe what happened next" style of headline. Much like modern SEO has moved away from grey-hat tactics and now seems to focus on "aligning content with Google's values", I hope that content producers start being rewarded for building great content that's good for users instead of writing clickbait headlines.


The really fascinating bit is how transient that kind of growth has seemed to be. All it takes is FB tweaking their algorithms to close the hole you discovered and suddenly your growth story is over. So the key is to use the window of time you have to exploit the news feed to transform your company into something that doesn't need to do shabby things like exploiting the news feed to stay alive. If you just keep riding the same pony, FB will eventually shoot it dead under you.

BuzzFeed is the one example I can think of where a company managed to use that transient effect as a sort of "booster stage" into becoming more of a general media organization, but there are plenty of others who failed to pull that transition off (Upworthy, ViralNova).


> A team at Facebook reviewed thousands of headlines using these criteria, validating each other’s work to identify a large set of clickbait headlines.

What a soul-crushing job that would be.


It was actually only number six on our top ten soul crushing jobs list. Number one will surprise you! Click here!


Ouch. That was painfully good.


Looks boring, won't click

Fun fact:

https://medium.com/i-data/29-reasons-youre-reading-this-arti... "29 reasons you’re reading this article or why odd-length BuzzFeed listicles perform better than even ones"


> Looking at ten thousand published BuzzFeed listicles over a period of three months I found a statistically significant difference in the performance of odd-length listicles compared to even ones.

They published at least 10,000 listicles in 3 months! Now, that is what I call a soul-crushing job (writing those articles).


Seems like a similar thing to what goes on behind pricing something at 1.99 instead of 2.00.


- You won't believe how much money they make

- Engineers are shocked with these results

- Here is the top 10 headlines they found. #6 will make you cringe

- The CEO wrote the sweetest message to them

Perhaps they should simply ban buzzfeed or use all of their headlines as examples.


> Perhaps they should simply ban buzzfeed

What a kind, just world that would be.


You'd miss out on the handful of really good real journalism articles per month that Buzzfeed publishes in that case, which would be kind of a shame. Maybe that's how they justify the expense of doing real journalism -- it makes it harder for other sites to ban their domain.


HN bans buzzfeed submissions outright, but I would have loved to see the community's comments on this article, for example:

https://www.buzzfeed.com/andrewrice/the-fall-of-intrade-and-...


FWIW, some buzzfeed articles are supposedly of a higher calibre and have been shared and successful on HN: https://hn.algolia.com/?query=buzzfeed.com&sort=byPopularity...


Thanks. That was an interesting article.

"Roman bookies ran numbers on the election of Renaissance popes until Gregory XIV banned the practice on penalty of excommunication. Around the turn of the 20th century, Wall Street brokers openly traded election futures and newspapers quoted their prices like modern opinion polls. Strumpf estimates that at the peak of this practice, in the election of 1916, around $10 million was bet on these markets — more than $200 million in today’s dollars. By the end of the New Deal era, though, the electoral markets had all but disappeared, due to both competition — modern polling pushed the betting lines out of the newspapers — and legal crackdowns."


FWIW BuzzFeed insists they don't do clickbait because they don't over-promise on expectations (the second requirement listed in Facebook's post).

> "Most clickbait is disappointing because it’s a promise of value that isn’t met — the payoff isn’t nearly as good as what the reader imagines,” Patel said. “BuzzFeed headlines pay off particularly well because they actually make fairly small promises and then overdeliver."

https://www.buzzfeed.com/bensmith/why-buzzfeed-doesnt-do-cli...


This is often required work in supervised learning / any machine learning with labels. Sometimes you get better results with human judgment on the labels, and that means you have to grind through a meaningful corpus of data.

If you believe what you're doing is important and beneficial, it doesn't feel like too much of a grind. Ultimately this was probably about a week of work, if every labeling participant independently reviewed ~3-5K articles.

Some people might have chosen to mechanical turk this, or to gather feedback from a subset of users and use those labels. Doing that might be a better approach, as it might not only be less labor-intensive up-front but also allow the system to be regularly retrained (i.e. by gathering new label information as clickbaiters inevitably try to adapt their headlines). I imagine there were reasons for starting off this way, though.
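
For what it's worth, once the labels exist, the model side can be fairly plain. A minimal sketch of the kind of thing I mean (scikit-learn, invented function names, not whatever FB actually built), which you could simply refit whenever a fresh batch of labels arrives:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    def train_clickbait_model(headlines, labels):
        """Fit a simple text classifier on manually labeled headlines.

        headlines: list of headline strings
        labels: list of 0/1 labels (1 = clickbait), e.g. from the manual review pass
        """
        model = make_pipeline(
            TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
            LogisticRegression(max_iter=1000),
        )
        model.fit(headlines, labels)
        return model

    # Hypothetical usage: retrain whenever a fresh batch of labels arrives,
    # e.g. after clickbait writers adapt to the current filter.
    # model = train_clickbait_model(labeled_headlines, labeled_targets)
    # score = model.predict_proba(["You won't believe what happened next"])[0][1]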


Indeed, I've seen at least one paper where they used Mechanical Turk and aggregated results (by showing the same examples to multiple people) to check for quality. It sounds like they did almost exactly the same thing here.

Paper: 'Antisocial Behaviour In Online Discussion Communities' (Cheng et al., 2015)

Link: http://arxiv.org/pdf/1504.00680v1.pdf

(Search for 'Mechanical Turk' / section Data Preparation -> Measuring text quality.)
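
The aggregation step in that kind of setup is usually something as simple as a majority vote with an agreement cutoff. A rough sketch of the idea (not the paper's actual code; names and threshold are made up):

    from collections import Counter

    def aggregate_ratings(ratings_per_item, min_agreement=0.6):
        """Combine multiple raters' labels per item, as in typical MTurk setups.

        ratings_per_item: dict item_id -> list of labels from different raters
        Returns dict item_id -> majority label, keeping only items where at
        least `min_agreement` of raters agreed (low-agreement items get
        reviewed again or discarded).
        """
        consensus = {}
        for item, labels in ratings_per_item.items():
            top_label, votes = Counter(labels).most_common(1)[0]
            if votes / len(labels) >= min_agreement:
                consensus[item] = top_label
        return consensus

    # Hypothetical usage:
    # aggregate_ratings({"post_1": ["clickbait", "clickbait", "ok"],
    #                    "post_2": ["ok", "clickbait"]})
    # -> {"post_1": "clickbait"}  (post_2 dropped: only 50% agreement)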


If you call that a soul-crushing job you can't have had many bad jobs. Most people can only dream of a "fun" job like that.


It may be fun for a few hours, but reading clickbait headlines excessively is like eating vomit. I'm sure these people have quotas to meet, so I doubt the "fun" element even more.


I have a heuristic that gets evaluated over time on randomly-sampled tweets. (It's the core of the Python "ftfy" package, which fixes Unicode mistakes based on a heuristic for whether text "looks right".)

I frequently have to read the randomly-sampled tweets for debugging purposes. And, yes, random tweets are often so dumb that my brain slightly regrets the time it spent reading them. But that is far outweighed by the benefit that the Internet is delivering me fresh test data all the time.

On the whole, I enjoy the notion that I am converting stupid babble into something somewhat useful.
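
For anyone curious, basic use of ftfy looks roughly like this (the mojibake input is a stock example, not my real test data):

    import ftfy

    # Text whose UTF-8 bytes were mis-decoded as a legacy encoding ("mojibake").
    broken = "âœ” No problems"
    print(ftfy.fix_text(broken))   # -> "✔ No problems"

    # The same heuristic can be run over a stream of sampled tweets; anything
    # the heuristic changes is a candidate test case worth eyeballing.
    # for tweet in sampled_tweets:          # hypothetical iterable of strings
    #     fixed = ftfy.fix_text(tweet)
    #     if fixed != tweet:
    #         print(tweet, "->", fixed)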


That's different from reading randomly-sampled tweets or clickbait headlines all day long, isn't it? At FB (AFAIU) they have people doing just that, which makes sense (to reduce costs), but it's a miserable job regardless of the scope (i.e. whether it's to make more money for Facebook, to help people who `friend` rough entities, or both).


It would be fun designing a system that can automatically detect those headlines, using machine learning and whatever other tricks you can think of.


It's also for a great cause :) that helps.


You only really need a few seconds to classify a headline. Spending an afternoon classifying data with a few colleagues isn't the worst thing ever.

You're making a big change to a system used by many millions of people. There's probably no easier way to validate your system is working as expected.


Anyone who has built a data set for machine learning systems (labeling images etc...) has done this. It's just part of the way you teach ANNs.


Over a year ago, I did a statistical analysis of BuzzFeed's clickbait (http://minimaxir.com/2015/01/linkbait/) and found that it is highly formulaic.

In fairness, the linkbait game has changed since then, with Medium posts needlessly using calls-to-action and "just" in their headlines. Then again, with FB's machine learning expertise, I'm surprised they need a team to manually classify linkbait posts at all.


Even with expertise, you need labeling of some kind for supervised learning.


I'm extremely surprised they didn't outsource to Mechanical Turk or leverage their user base in some way for crowdsourcing the labels.


They may have. Did they release any technical details on this?


I wonder if outlets like Buzzfeed will be able to do something like write Facebook-specific headlines to get past the clickbait filter, while still using clickbaity headlines for the same stories elsewhere.


I've noticed the reverse on many occasions, especially from moderately reputable news sites. The Facebook specific metadata for the page will tell Facebook to annotate the article with a clickbait headline, but the actual title will be more reasonable and informative.


This is actually interesting. When Facebook determines the article title, their crawlers go out and pull down the page. It'd certainly be interesting to catch the user agent and present a different version if Facebook is detected.


I've read that Google penalizes sites that present a different version of the page to bots than to end users (I guess they pretend to be a browser and access the page from an undisclosed IP). I'd be surprised if Facebook didn't do the same.


You don't even have to do anything as complicated as that.

Lots of sites already use a separate title in their OG tags from what the normal version is.


You don't have to do anything more than add some metadata to your page, which Facebook will automatically use.


I believe that Google's crawler is designed to detect when it's being fed false information, and punishes pages accordingly for it. Wouldn't be hard for Facebook to do the same thing.


Someone is probably writing you a job offer as I type this message


But then the post displays the non-clickbaity title, which defeats the point.


Presenting different content to Google's spider vs. users' browsers is the kind of thing that will get you banned from Google. I wonder if Facebook has similar policies.


For the record, since many people are mentioning Facebook's metadata, what it uses is called Open Graph: https://developers.facebook.com/docs/sharing/webmasters
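
To make the earlier point about differing titles concrete, here's a toy sketch (made-up HTML, BeautifulSoup for parsing) of how the og:title that Facebook's crawler reads can differ from the page's visible <title>:

    from bs4 import BeautifulSoup

    # Made-up page: the visible title is reasonable, while the og:title (what
    # Facebook reads when the link is shared) is the clickbait version.
    html = """
    <html><head>
      <title>Study finds moderate coffee intake has no effect on heart disease</title>
      <meta property="og:title"
            content="Doctors HATE this: what coffee really does to your heart" />
    </head><body>...</body></html>
    """

    soup = BeautifulSoup(html, "html.parser")
    print("Browser tab shows:", soup.title.string.strip())
    print("Facebook shares show:", soup.find("meta", property="og:title")["content"])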


Just deduct points from any article that uses these phrases:

"... with this one weird trick..."

"... you won't believe..."

"... people are losing their minds about..."

"... mind blowing pictures of..."

"... what happens next..."


I say yellow-flag any post with a single or double digit number that doesn't match date, currency, or other known formats. Not sure how else to capture one of their favorite forms of clickbait (rough sketch of what I mean below, after the list):

* " ## X you should be Ying"

* " ## ways to make X hate you"

* " ## times X made us laugh"

* " ## times X was almost too X"

* " ## things that make X say, "that's me"

* " The hardest ## rounds of X you'll ever play"

* " ## ways X you is the best you"

* " ## animals that will make you say X"

* " ## memes you'll only get if you did X"

* " ## people you forgot were on TV show X"

Note: all of these were taken from today's Buzzfeed.com front page
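
And the rough sketch mentioned above (the patterns and threshold are purely illustrative; a real system would learn these from data rather than from hand-written regexes):

    import re

    # Illustrative patterns only; a real system would learn these from data.
    CLICKBAIT_PATTERNS = [
        r"^\d{1,2}\s+(things|ways|times|reasons|people|animals|memes)\b",  # "17 things..."
        r"\byou won'?t believe\b",
        r"\bwhat happens next\b",
        r"\bwill make you\b",
        r"\bonly get if\b",
        r"\bthe hardest .* you'?ll ever\b",
    ]

    def clickbait_score(headline):
        """Count how many known clickbait patterns a headline matches."""
        text = headline.lower()
        return sum(1 for pattern in CLICKBAIT_PATTERNS if re.search(pattern, text))

    def is_flagged(headline, threshold=1):
        # A bare one- or two-digit number that isn't a year is itself suspicious.
        has_small_number = (re.search(r"\b\d{1,2}\b", headline)
                            and not re.search(r"\b(19|20)\d{2}\b", headline))
        return clickbait_score(headline) + bool(has_small_number) >= threshold

    # print(is_flagged("21 animals that will make you say aww"))       # True
    # print(is_flagged("Budget deficit narrows to 3 percent of GDP"))  # also True:
    #                                   numbers alone are a noisy signal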


The beautiful thing about machine learning is that nobody needs to sit there and write these patterns. A system can learn from a training set much faster than you can write regular expressions and it can look at features of the text that never occur to you.


Humans can design good features for the ML algorithm to use, though. A simple naive bayes filter, for instance, would not be able to learn these complicated patterns on its own.
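
As a sketch of what "humans designing features" might look like in practice (the feature names are invented, and the model is a stock Bernoulli naive Bayes, not anything Facebook has described):

    from sklearn.feature_extraction import DictVectorizer
    from sklearn.naive_bayes import BernoulliNB
    from sklearn.pipeline import make_pipeline

    def headline_features(headline):
        """Hand-designed binary features that plain bag-of-words may miss."""
        text = headline.lower()
        return {
            "starts_with_number": text[:2].strip().isdigit(),
            "has_question_mark": "?" in text,
            "has_you": " you " in f" {text} ",
            "has_wont_believe": "won't believe" in text or "wont believe" in text,
            "ends_with_ellipsis": text.rstrip().endswith("..."),
        }

    # Hypothetical usage with a handful of labeled headlines:
    # X = [headline_features(h) for h in headlines]
    # model = make_pipeline(DictVectorizer(), BernoulliNB())
    # model.fit(X, labels)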


> Humans can design good features for the ML algorithm to use though

For now. Articles like [1] make me wonder whether this current generation of programmers might be the last.

[1] https://arxiv.org/abs/1606.04474


But someone needs to make that training set.


And in HN: "X is dead", "Why we ditched X stack for Y stack", etc.



TBH, Considered Harmful has taken on such a tongue-in-cheek life of its own that I almost don't even mind it anymore.


"...?" should have decent coverage.


While we're talking about the FB news feed, I'll plug the FB Purity extension for Chrome. It's a really wonderful extension that lets you customize what you see on your feed, from the content to the layout.


Awesome. I didn't realize how bad clickbait had gotten until I subscribed to the Economist for a short while - I could just scroll through its articles, read the title and subtitle and know what the article was about and in most cases it wasn't something I needed to read.

HN could do its readers a huge service here by requiring descriptive headlines, instead of the current rule about not changing the headlines. Even better would be adding a 30-50 word summary under the article.


There must be people who not only click on the clickbait but then read and enjoy the article.

Which is why I wish they had instead done a joint model of features of the user, the headline and the article to predict the likelihood that the user would quickly return to FB.
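
Roughly this kind of thing, as a toy sketch (the feature names are invented; the target is whether the user bounced straight back to the feed):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Each row concatenates user, headline, and article features; the label is
    # whether the click was followed by a quick return to News Feed.
    # Hypothetical feature groups:
    #   user:     [avg_read_time, clicks_per_day, likes_listicle_pages]
    #   headline: [length, starts_with_number, has_you_wont_believe]
    #   article:  [word_count, image_count, ad_density]

    def build_rows(user_feats, headline_feats, article_feats):
        """Stack the three feature groups column-wise into one design matrix."""
        return np.hstack([user_feats, headline_feats, article_feats])

    # X = build_rows(user_feats, headline_feats, article_feats)  # shape (n, 9)
    # y = quick_return  # 1 if the user came straight back to the feed
    # model = LogisticRegression(max_iter=1000).fit(X, y)
    # p_bounce = model.predict_proba(X_new)[:, 1]  # personalized clickbait-ness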



A lot of 'relevant', 'authentic', 'genuine' words there. Yet I'm still afraid that this is sugar-coated heroin. The very nature of facebook MUST induce "clickbaiting"; if not in the title, then in a stream of yet-another-short-video/picture/silly-article.

It's because facebook is optimising the news stream for TIME SPENT. The longer the better. Facebook must optimise for it because:

* facebook's income is directly correlated with time spent, as time spent is correlated with the number of ads shown

* the longer the time spent, the bigger the user retention in the long run, which simply is fb's goal.

So the human/fb relation is screwed no matter what authentic, genuine PR is written about it :(


I wonder if this clickbait filter will also flag ironic blog titles. There have been a fair amount on HN that were clearly a joke, like "We ditched Postgres, what happened next will BLOW YOUR MIND".


I've seen a rise in HN submissions with linkbait titles which are not ironic, usually to Medium posts.

The non-punishment of linkbait posts on Medium has made it a custom, which makes things worse, and it's good that Facebook is publicly doing something about it.


Hope so. I'm happy to be rid of ironic clickbait as well.


I find it ironic considering the majority of clickbait articles that I encounter on Facebook are the sponsored posts. If FB was serious about clickbait, they should simply not accept money from such sites.


One could cynically view this as an extractive measure, dressed as a user-friendly effort. In other words:

Sorry, Mr. Publisher. Your most effective titles are going to be exposed less frequently for free in your audience's feeds. However, you can of course always promote your stories with whatever title you feel is most effective when exposed to users! wink


> People have told us they like seeing authentic stories the most. That’s why we work hard to understand what type of stories and posts people consider genuine

The secret of success is authenticity. Once you can fake that you've got it made.


I wish they would open up a clickbait-score API so places like HN could benefit from their research, but I guess if they did then the perpetrators would use it to test work-arounds.


I'll wait to see it before I believe it.

They also said they were going to crack down on people ripping each other's videos off, and yet I still see that happening all the time.


So, now that FB has "muscled-in" into the lives of most consumers, it is in a strong position to employ the classic protection racket with the producers.

Businesses that do not engage with FB or do NOT pay FB "protection money" will see a gradual drop in their consumer engagement online. FB will gladly "nudge" the consumers in the direction of the businesses who pay FB.


Those click bait and fake news sites can be a pain. About a month ago, I discovered that one of them had taken a pic of me and used it in an article about a Canadian guy with a micropenis being denied euthanasia (!).

Through reverse image search and other tools, I found that at least five other sites had copied the article and pic verbatim.

Took over a week of sending DMCA requests to get all of them removed.


BTW I found the article in the first place when it came up on a friend's FB feed; he recognized the picture and let me know.


I just unfollowed BuzzFeed years ago. Problem solved.


I think it is a good thing for all three groups: facebook itself, because I might start using it a bit more; the users, because I get the info I want (friends and family stories); and the news sites, because they will be able to concentrate on the open web again instead of giving more and more power to facebook.


Facebook shouldn't be deciding what people read, which is what they seem to be trying to do, and people don't find it odd.

If people click on clickbait, it genuinely means they are interested in it. The people who don't could just visit news sites like Al Jazeera, NBC, etc.

It's like banning ads because people are clicking on irrelevant things.

I think it's safe to assume Facebook will apply the clickbait algorithm to their Facebook ads, considering they want their users to have a positive experience ;)


Speaking of market/social manipulation, I had a crazy thought last night... could FB reduce ISIS terrorism by radicalized moles by carefully saturating their feeds with enough anti-extremism education videos?


You created this monster, Facebook. You did this.


Thank god.


Great, but I'd rather they work on eliminating all the engagement/wedding photos from people I DON'T know from my newsfeed.


I keep hoping for a "you should probably have realized by now that your baby is a lot uglier than you think" filter.


OT: What is the distinction between "fb.com" and "facebook.com"?


There is none! It's a redirect, afaik. Fun fact: they paid over $8M to acquire the domain from the "American Farm Bureau Federation". [0]

[0] http://mashable.com/2011/01/11/facebook-paid-8-5-million-to-...


Farm Bureau (fbfs.com) sold fb.com to Facebook a few years back.



