I can't tell you what they are, but there are probably internal Google incentives to filter and internal Google incentives to not filter, and the ones to not filter are probably stronger.
My theory is that Google went from ads in search results to ads on visited pages. By buying DoubleClick etc., they are suddenly incentivised to drive traffic to ad-supported websites.
Almost all the interesting factual websites are not ad-monetized. The SO spam sites etc. are all scrapes of the factual websites with ads injected. If Google simply deprioritized ad-supported websites the search results would be much cleaner, but the part of Google that sells the ads on sites instead of in search results would throw a fit.
We could test this. Take a few hundred search queries, strip the pages that display Google ads, and see if the remainder of the search result is better or worse.
We'd need to get some humans in to rank the results, but that's not a big problem. "How well does this web page answer this query, on a scale of 1-10?"
With a collection of ranked pages, we can answer other questions as well. I'd be interested in running the same test but for Google Analytics, not Google Ads, as I think there might be a misaligned incentive there too.
It's worth bearing in mind that the Stack Overflow clones may actually answer the query just as well as the original site. That is, it might be our definition of "a good result" that's out of whack (because we have an unnecessary bias towards the original source). I doubt this, but again it's something that's testable.
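A minimal sketch of how the scoring could work, assuming each human rating is a record with a query, a URL, a 1-10 score, and a boolean flag saying whether the page carries the thing being filtered on. The field names (`has_google_ads` and so on) and the three sample rows are made up for illustration, not real measurements.

```python
from statistics import mean


def compare_filtered(ratings, flag):
    """For each query, return (query, mean score of all pages,
    mean score of the pages left after dropping those where `flag` is true)."""
    by_query = {}
    for r in ratings:
        by_query.setdefault(r["query"], []).append(r)

    rows = []
    for query, pages in by_query.items():
        kept = [p["score"] for p in pages if not p[flag]]
        if not kept:
            # Filtering removed every result for this query, so there is
            # nothing to compare against; skip it.
            continue
        rows.append((query, mean(p["score"] for p in pages), mean(kept)))
    return rows


# Placeholder ratings, invented purely to show the shape of the data.
ratings = [
    {"query": "q1", "url": "a.example", "score": 8, "has_google_ads": False},
    {"query": "q1", "url": "b.example", "score": 4, "has_google_ads": True},
    {"query": "q2", "url": "c.example", "score": 6, "has_google_ads": True},
]

for query, before, after in compare_filtered(ratings, "has_google_ads"):
    print(query, before, after)
```

Passing a different flag name (say a hypothetical `has_google_analytics`) reruns the same comparison for the analytics question without collecting new ratings.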
I don't doubt it, but obviously something's going wrong between the human-generated training data and the SERP, else why are we getting utter crap back?
(Or, as I said, it's our idea of what constitutes a good result that's wrong).
But the same websites show up in e.g. DDG (through Bing). As far as I know, neither DDG nor Microsoft makes a dime from ad-supported websites the way Google would, so why are these results not nuked the way Kagi does it?
Aha. Couldn't help but scratch my own itch. I wonder if DDG has a deal with Google where they get a cut of the ad profit if they are mentioned as a `ref` in the doubleclick ad request.
:path: /pagead/viewthroughconversion/796001856/?random=1695374589838&cv=11&fst=1695374589838&bg=ffffff&guid=ON&async=1&gtm=45be39k0&u_w=2704&u_h=1756&url=https%3A%2F%2Fwww.geeksforgeeks.org%2Fc-plus-plus%2F
&ref=https%3A%2F%2Fduckduckgo.com%2F. <<<< What does this do?
&hn=www.googleadservices.com&frm=0&tiba=C%2B%2B%20Programming%20Language%20-%20GeeksforGeeks&auid=68284397.1695374483&data=event%3Dgtag.config&rfmt=3&fmt=4
If so, that would provide the same incentive to keep shitty sites like geeksforgeeks in the results.
I guess also geeksforgeeks is incentivized to report these references, so that search engines and other linking services will continue to show their links.
To reproduce:
1. Go to duckduckgo.com and do a search that will turn up a geeksforgeeks page.
2. Click on the link.
3. Watch the network tab as requests are made to googleads.g.doubleclick.net and check the path.
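To read the captured path without squinting at percent-encoding, here is a small sketch using Python's standard `urllib.parse`. The helper name `referrer_fields` is my own, and the sample path is a shortened copy of the request above (most parameters dropped), so treat it as an illustration of the decoding step rather than anything that says what Google does with the value.

```python
from urllib.parse import parse_qs, urlsplit


def referrer_fields(path):
    """Decode the query string of a request path and return only the
    `url` (page being viewed) and `ref` (referring page) parameters."""
    query = parse_qs(urlsplit(path).query)
    return {key: query[key][0] for key in ("url", "ref") if key in query}


# Shortened version of the path copied from the network tab.
path = (
    "/pagead/viewthroughconversion/796001856/?random=1695374589838"
    "&url=https%3A%2F%2Fwww.geeksforgeeks.org%2Fc-plus-plus%2F"
    "&ref=https%3A%2F%2Fduckduckgo.com%2F&frm=0"
)

print(referrer_fields(path))
```

This only shows that the request tells the ad server which page was viewed and where the visitor came from; whether anyone gets paid based on `ref` is still a guess.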
Most other search engines train against Google as a target, or with some form of reward that is bootstrapped on Google rankings. That makes Bing results implicitly behave the same way as Google's. DDG and others just use the Bing API, so Google's incentives pass on through.
That doesn't make much sense to me. Google's interests are not Microsoft's or DDG's interests, and to hold up Google as some sort of ground truth for the optimal search results for a given query is, as proven by Kagi, highly deluded and also quite subjective.
If true, however, it does go to show that Google is really a monopolist in the search space as well... and substantiating this claim would go a long way toward proving that.