
The problem isn't just SEO, it's that Google itself aggressively rewrites queries to produce more results (which I suspect they want to do to show more "relevant" ads).

At the most extreme end of this, I've seen four-word queries produce results in which three of the words were struck out. More often it's just one word, but it's usually exactly the one that makes the difference between a very specific query and a very generic one.

Worse yet, they try to do synonym substitution, but their algorithm has a ridiculously low bar for what counts as a synonym. You might be searching for "FreeBSD", and it will replace that with "Linux", or even "Ubuntu". Or you search for a specific firearm model, and it finds "gun".

Quoting keywords suppresses all of that, but synonyms would actually be useful if it did them accurately...




I left Google in 2010, so it's just a wild guess, but I suspect a big part of the issue is learn-to-rank is probably being trained on everyone's searches. I think it would probably do much better if they used the presence or absence of search operators as a simple heuristic to separate power user searches from common searches, and trained a separate ranking model on power user searches.
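To make the heuristic above concrete, here is a minimal sketch of splitting a query log by the presence of search operators. The operator patterns and the function names (`has_search_operator`, `split_queries`) are my own assumptions for illustration, not anything from Google's code, and a real operator list would be longer.

```python
import re

# Hypothetical operator patterns; a real engine supports more than these.
OPERATOR_PATTERNS = [
    re.compile(r'"[^"]+"'),                                   # quoted phrase
    re.compile(r'(?:^|\s)-\w+'),                              # excluded term, e.g. -ubuntu
    re.compile(r'\b(?:site|intitle|inurl|filetype):\S+', re.IGNORECASE),
    re.compile(r'\b(?:OR|AND)\b'),                            # explicit uppercase booleans
]

def has_search_operator(query: str) -> bool:
    """Return True if the query contains any of the assumed operators."""
    return any(p.search(query) for p in OPERATOR_PATTERNS)

def split_queries(queries):
    """Partition queries into (power_user, common) lists."""
    power, common = [], []
    for q in queries:
        (power if has_search_operator(q) else common).append(q)
    return power, common
```

Each partition could then feed its own ranking model. Note that the exclusion pattern requires whitespace or start-of-string before the hyphen, so a hyphenated term like learn-to-rank is not mistaken for an operator.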

Maybe they're already doing this, but it sure acts like learn-to-rank is always ranking pages as if the query were very sloppy.

It's been a long time, and I certainly never read the code, but I vaguely remember a Google colleague mentioning something (before learn-to-rank) about a back-end query optimizer branch that would intentionally disable much of the query munging if there were any search operators in the query. There was some mention about using cookies / user account information to do the same if the same browser/user had used any search operators in the past N days, but I'm not sure if that was implemented or just being floated as a useful optimization.
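A rough sketch of what such a gate might look like, purely as I imagine it and not from any real code: skip the query munging if the current query has operators, or if the same user used one within the last N days. The name `should_rewrite_query`, its parameters, and the 30-day default are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

def should_rewrite_query(query_has_operators: bool,
                         last_operator_use: Optional[datetime],
                         now: datetime,
                         window_days: int = 30) -> bool:
    """Decide whether query munging (synonyms, dropped words) should run.

    window_days stands in for the "N days" mentioned above; 30 is an
    arbitrary placeholder.
    """
    # Operators in the current query: treat it as a precise request.
    if query_has_operators:
        return False
    # Recent operator use by the same browser/user: assume a power user.
    if last_operator_use is not None:
        if now - last_operator_use <= timedelta(days=window_days):
            return False
    return True
```

Here `last_operator_use` would come from a cookie or account record, and `None` means no operator use is known for that user.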


Google image search also doesn't look for duplicates of the image, but for whatever object the ML recognized in the picture. For image search, I switched to Yandex.


I wonder if this synonym substitution was the use case that led to the invention of word2vec.



