Hacker News

Google engineer here.

We run experiments that show ranking improvements before launching changes to how we interpret query words. I would guess that for every time you notice Google "ignoring the word you asked for," there were several times where we got you the right result even though it didn't have the exact words you asked for, and you didn't even notice. We're not perfect but we're always working on improvements.

We also added "Verbatim Mode" to save you the trouble of putting "each" "query" "term" in "quotes" when you want to exactly match all your query words.




The old mode was, you put a + before a required term, and you quoted exact (multi-word) phrases. That worked reasonably well and was easy to understand. A + could apply to a single word or to a single phrase, as could a -.
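
To make that syntax concrete, here is a minimal Python sketch of a parser for it. It only illustrates the rules as described in this comment (a + or - prefix on a word or on a quoted phrase); the `parse_query` name and the tuple layout are made up for illustration and say nothing about how any real engine parsed queries.

```python
import re

# One token: an optional + or - prefix, then a quoted phrase or a bare word.
TOKEN = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')

def parse_query(query):
    """Return a list of (prefix, text, is_phrase) tuples.

    prefix is "+" (required), "-" (excluded), or "" (no operator).
    """
    terms = []
    for prefix, phrase, word in TOKEN.findall(query):
        text = phrase or word
        if text:  # skip empty quotes such as ""
            terms.append((prefix, text, bool(phrase)))
    return terms

if __name__ == "__main__":
    for term in parse_query('+vb6 -".net" "runtime error" tutorial'):
        print(term)
```

The point of the sketch is that the whole grammar fits in one regular expression, which is part of why it was easy to understand.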

Verbatim was a regression, given that it requires you to use the UI instead of typing inline, and it seems to interact with other search options in hard-to-predict ways.


Putting a "word" in "quotes" is the inline way of exactly matching an individual word. It's conceptually similar to putting a phrase in "quotation marks," so personally I think it makes sense that we have one operator for literal searches.

> A + could apply to a single word or to a single phrase

I don't think a + could apply to a phrase. That was one problem with the + operator; it was not clear to users how it actually worked. In fact, though there were many searches whose results were helped by their use of + (many of them made by users who are commenting here on HN), there were many searches whose results were made considerably worse by their use of +, whether it was applied inadvertently or overzealously.


That was the AltaVista innovation, and one I'm sure Google is happy to weed out of everybody's minds regardless of the fact that it works and works well.


When Google first appeared on the scene, I remember being disappointed in its lack of tweaks, compared to AltaVista. However, the quality of Google's results quickly won me over.


No. Your metrics are deceiving you, because your test suite under-weights or fails to include sophisticated searchers and programmers, and fails to include highly-specific queries with only a few results. For technical people, Google search has gotten so much worse it's hard to ignore.

Google fails for more than half of programming-related queries, because it splits up multi-word identifiers, and spelling-corrects valid technical terms to unrelated English words. It fails when searching for uncommon error messages in quotes.

This might be tolerable if turning on Verbatim mode was easy, but to do it (without a browser plugin) you have to first do a failed search, then click three times, the second of which is hard to aim because the target is animated by the first click.

Google Search fails whenever I'm doing a search where I suspect there are few or no results and want to confirm that. Then there are the queries with one or two matches on StackOverflow or a mailing list that fail to answer the question. You can't just move on, because there are pages and pages of scraper sites cluttering up the results with the exact same message. Improving the ranking doesn't help, because the problem isn't the rankings, it's that you don't know when you're done.

Then there's personalization, and in particular the inter-query persistence. You do this because some people make two queries, and include keywords that accurately indicate what they want in the first query, then omit them from the second query. In my own usage, however, if I do a second query it's often because the first query had something in it that I didn't want, which spoiled the results. Since I always include the keywords that were actually helpful, I get none of the upside. And if my second search then gives bad results, personalization means I can't trust that the reason is part of the query I made.

Basically, you've improved your metrics at the expense of everything on those metrics' blind spots. There are a lot of people whining, most of them unable to articulate what's wrong, but they are right. Now please, go enter these complaints into your bug tracker and fix it.


This is great feedback. Thanks.


Do you guys dogfood your programming search questions or do you have an internal thing for that? Because honestly when I google technical stuff and get corrected despite using "" (since now '+' is deprecated) it makes me super rage. The only thing that escapes it is the super.crazy.java.conventions that are obvious.


You're obviously not vb6 programmers at Google. It kept changing vb6 to 'visual basic', which made fixing obscure behaviours in vb6 really hard, because after the switch all it would find were VB .NET results.

I had a couple of very unproductive days before working out how to force it to stop changing my search terms.


Guilty as charged. :-)

I could see how vb6/"visual basic" would cause some irrelevant results to come up. It's probably good sometimes, but also really bad sometimes. I've recorded it in our list of motivational examples that we use to try to come up with improvements to our algorithm. Thanks, and sorry for your wasted productivity.


> You're obviously not vb6 programmers at Google

ouch.


> "there were several times where we got you the right result even though it didn't have the exact words you asked for"

How do you know that? Just because I click on a result doesn't mean it was the correct one. I usually click on A LOT of results just to check if MAYBE there is something relevant (mostly there isn't, especially with Google lately).


Maybe they should have another metric that if you click on a second result after the first result, then the first result should get penalized a little. Repeat steps for 'n' clicked results.


Well, that might not be true either. Sometimes you just want to check information from more than one source, or the result is correct but you want more on that topic and hope to find it in the next results, etc. For example, you search for a technical term and (naturally) the first result is a definition in Wiki, but you already know the basics and want deeper knowledge, so you just skip to the next results instead. It would not be fair to penalise Wiki for that.


What I meant is not the first result on the list, but the first result that you click: you click it, press back, and then click on a second result.
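
As a rough Python sketch of that idea: given the results a user clicked for one query, in click order, every click that was followed by another click gets a small penalty. The function name, the input shape, and the penalty size are all assumptions for illustration, and this does nothing about the objection above that people sometimes want several sources.

```python
from collections import Counter

def click_back_penalties(clicks, penalty=1.0):
    """Penalize every clicked result that the user came back from.

    clicks: result URLs in the order they were clicked for a single query.
    Each click except the last was followed by a return to the results
    page and another click, so it collects one penalty.
    """
    penalties = Counter()
    for url in clicks[:-1]:
        penalties[url] += penalty
    return penalties

if __name__ == "__main__":
    session = ["a.example/page", "b.example/page", "c.example/page"]
    print(click_back_penalties(session))
```

The last click is left alone because, in this simple model, it is the one the user stopped on.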


Most people I know just open every result in new tabs and continue recursing forward, pruning irrelevant or slow-loading tabs in the process.


I guess you're right that we don't notice. I just searched for "Jim Otteson", only to be given results for "James Otteson". Same person, big help.

At other times, verbatim yields the best results. I guess I just wish there was a link next to the search results that let you repeat the search verbatim, so that we wouldn't have to go all the way to Advanced Search.


I would very much like an expandable section below the search text where I could turn very specific portions of the google magic on and off.

I also wish you would remove the multitude of sites that scrape content from the original sites and SEO the hell out of it to get to the top of search results. I assume the advertising revenue is too lucrative to do so.


I haven't had too many issues with Google guessing what I mean, but the sites that scrape other sites are a real pain. Find the oldest page with that text and show it to me, then exclude the rest from the results. Or something like that. If it has the exact same information then it's useless to me.
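
Something like this Python sketch is what I have in mind. It assumes each result comes with its text and a date it was first seen, which is a big assumption; the `url`, `text`, and `first_seen` keys and the function names are made up for illustration, and it only catches exact copies after whitespace and case are normalized.

```python
import hashlib

def _fingerprint(text):
    """Hash the text after collapsing whitespace and lowercasing."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def keep_oldest_copies(pages):
    """Keep only the earliest-seen page for each distinct text.

    pages: list of dicts with hypothetical keys 'url', 'text', 'first_seen'.
    The surviving pages are returned in their original order.
    """
    oldest = {}
    for page in pages:
        key = _fingerprint(page["text"])
        if key not in oldest or page["first_seen"] < oldest[key]["first_seen"]:
            oldest[key] = page
    return [p for p in pages if oldest[_fingerprint(p["text"])] is p]

if __name__ == "__main__":
    results = [
        {"url": "scraper.example", "text": "Error 0x80 fix", "first_seen": "2012-05-01"},
        {"url": "original.example", "text": "error 0x80   fix", "first_seen": "2010-01-15"},
        {"url": "other.example", "text": "Something different", "first_seen": "2011-07-30"},
    ]
    for page in keep_oldest_copies(results):
        print(page["url"])
```

Scrapers that reword or wrap the text would slip past an exact hash, so a real version would need some fuzzier notion of "same text".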


We're working on the scraped sites. Also, advertising revenue has nothing to do with it. We don't even talk about revenue in our search quality meetings, just the utility and speed of our search results page.


Personally I would be a fan of giving users more "knobs" to turn in their search results. For example, we have the toggle switch to include personal results and personalized ranking, vs. showing un-personalized search results. However, it's a complicated product design problem whenever you want to add complexity to something used by a billion people.

Remember that the thing that made Google so popular and iconic originally was the plain search box.


AV had a plain page, too.


> there were several times where we got you the right result... and you didn't even notice.

1) If I don't notice, then it's not the right result.

2) If I put a + in front of a search term, there's only one reason for that: I explicitly want results containing that word. If you return something not containing that word, then it's not the right result plus I'm going to be annoyed about it.


1) No, I mean, if you search for [nutrition information gm corn] and click on the "right result" for you, you might not even notice that the "right result" didn't have the exact words "nutrition" "information" "gm" but instead had the words "nutritional" "facts" "genetically modified".

2) Do you mean putting a search term in "quotes"? The + operator was retired a while ago. And putting the term in quotes should do what you want, except maybe for high-confidence spelling corrections or extremely long-tail cases where there are just about no search results to show.
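
The example in point 1 can be sketched in a few lines of Python. To be clear, this is a toy illustration and not how our systems work: the `SYNONYMS` table is hand-written from the words in that example, and the `matches` function and its `verbatim` flag are hypothetical.

```python
import re

# Toy synonym table built from the example words above (an assumption).
SYNONYMS = {
    "nutrition": {"nutritional"},
    "information": {"facts"},
    "gm": {"genetically modified"},
}

def _contains(text, term):
    """True if term appears in text as a whole word or whole phrase."""
    return re.search(r"\b" + re.escape(term) + r"\b", text) is not None

def matches(query_terms, page_text, verbatim=False):
    """Check that every query term, or one of its synonyms, is on the page.

    With verbatim=True only the exact query words count.
    """
    text = page_text.lower()
    for term in query_terms:
        candidates = {term}
        if not verbatim:
            candidates |= SYNONYMS.get(term, set())
        if not any(_contains(text, c) for c in candidates):
            return False
    return True

if __name__ == "__main__":
    page = "Nutritional facts for genetically modified corn"
    query = ["nutrition", "information", "gm", "corn"]
    print(matches(query, page))
    print(matches(query, page, verbatim=True))
```

With the loose matching the page qualifies even though only "corn" appears literally; with `verbatim=True` it does not, which is the trade-off being argued about in this thread.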


Tip for you guys: Stop pushing "We couldn't find results. Here are results similar to these" whenever I type in a very long-tail, semi-obscure query. It makes you guys look bad. Just display all the results you find, even the ones that aren't super relevant.


Tip for you: Google thinks that for every one of you, there are 10x as many folks who do exactly the opposite - and want those results.

For most of them there is no "try again", they simply give up and blame Google (and maybe try the query in Bing/Yahoo).


Thanks ^_^ personally I like the results I get.


What it might come down to is whether making one query worse to make others better is actually a good thing. If they were all acceptable to begin with, it might not be.

Are you permitted to share the types of metrics that Google uses to determine whether an algorithm change is producing better results? Is bounce rate the primary metric that is used?


They look at metrics such as whether a user immediately returns to the search results after going to a page and how long they stay on a page (relative to other results, etc.). That's the primary metric, according to a search engineer who told me this during an interview.


Yes, we deal with those sorts of trade-offs every day.

I can't discuss our internal metrics, sorry.


I think you guys are doing a great job. Thanks!



