
Relevance and quality are subjective.



Of course they are. That's the reason Google "won" in the first place, after all: they had an algorithm that better fit what people expected.


But the tide is turning: Google's results are steadily going downhill. Unfortunately, given today's size of the internet, it seems too daunting a task for a viable competitor to jump in.


> But the tide is turning: Google's results are steadily going downhill. Unfortunately, given today's size of the internet, it seems too daunting a task for a viable competitor to jump in.

I wonder what portion of the internet you have to scrape to be able to have good results for 90% of queries. An enormous number of my queries end up on Wikipedia somewhere, for example. A lot end up on Stack Overflow. The two of those probably constitute the majority of my searches, and almost certainly a majority of the searches where I really care about the answer (and I'm not just googling whether pirates and ninjas were contemporaries of each other, because why not).

If Google's results are really slipping that much, you can make up for the lack of data by providing better results on the data you do have.

I'm curious why something like that hasn't become more popular. A search engine that only indexes highly reputable, public sources would be interesting: Wikipedia, dictionaries, reputable newspapers globally, etc. Is it just too big a target for abuse to be viable?
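Roughly, the idea is just an inverted index restricted to a hand-picked whitelist. A toy sketch, where the domains and documents are made up for illustration:

    # Toy search over a curated whitelist of sources.
    # The domains and documents are illustrative assumptions.
    from collections import defaultdict

    WHITELIST = {"en.wikipedia.org", "stackoverflow.com"}
    docs = {
        "en.wikipedia.org/wiki/PageRank": "pagerank link analysis algorithm",
        "stackoverflow.com/q/1": "how to sort a list in python",
        "spam.example.com/buy": "cheap pills",
    }

    index = defaultdict(set)
    for url, text in docs.items():
        if url.split("/")[0] in WHITELIST:  # the curation step
            for word in text.split():
                index[word].add(url)

    print(index["pagerank"])  # -> {'en.wikipedia.org/wiki/PageRank'}

The whole anti-abuse problem collapses into who maintains WHITELIST, which is exactly the curation question raised below.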


Good points.

> I wonder what portion of the internet you have to scrape to be able to have good results for 90% of queries.

But that's exactly it. I don't think it'd be too hard to get 90% of queries right; it's the remaining 10% that are the challenge.

"Page rank" / "link juice" (amount/strength of incoming links) is still a main ingredient in Google's recipe, so they are actually able to determine, without manual intervention (more or less), which sites are "reputable", while your suggestion would require someone to curate the assortments of "reputable" sites.

How would you discover niche blogs? Or even higher-profile ones that just aren't known to the general public?

That said, as you say, you could probably make do with something like this for most of your day-to-day searches, and then just !g (or whatever) for the rest.


What criteria should be used to rank search results?


A search engine should give access to, and filtering on, all metadata so you can narrow your search as you want.
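Something like this, conceptually; the metadata fields (date, lang) are made up for illustration, not any real engine's API:

    # Hypothetical metadata filtering over search results.
    results = [
        {"url": "a.example", "date": 2021, "lang": "en"},
        {"url": "b.example", "date": 2015, "lang": "de"},
    ]
    # Narrow to recent English-language hits.
    filtered = [r for r in results
                if r["date"] >= 2020 and r["lang"] == "en"]
    print(filtered)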


OK, but a search engine has to sort by SOMETHING.


Right, but sorting results isn't the same as removing a subset entirely.


Putting something last in a result list of thousands is basically the same as removing it entirely. Hell, even putting it on the second page is practically equivalent to not returning it at all.



