The future of search is boutique (a16z.com)
144 points by lxm on May 18, 2022 | 109 comments



Google is great at answering questions with an objective answer, like “# of billionaires in the world” or “What is the population of Iceland?” It’s pretty bad at answering questions that require judgment and context like “What do NFT collectors think about NFTs?”

The stated mission of a company worth almost two trillion dollars is to “organize the world’s information” and yet the Internet remains poorly organized.

The first examples are information. The last example is not really information.

I get that search is so convenient and good at the things it's good at that you want it to be good at other things, but you shouldn't expect a search engine to make judgements on your behalf or put things into your context. The searcher has to do some legwork of their own there. Maybe I'm a control freak in this regard, but I trust myself more than any website in this regard.

Wirecutter is one of the curators they bring up. I don't trust that Wirecutter manually curates top products. I trust that it only curates products it can make affiliate links to. Maybe these products are good enough, but maybe not. For that reason, I don't know what the value really is, since their incentives don't necessarily line up with mine.


“What do NFT collectors think about NFTs?” Sounds like the kind of thing you pay consultants for. I mean, it does sound like exactly the kind of question a VC might ask, but it would be really hard for the average person to know what to do with this information. I guess if you could disrupt nuanced cultural mastery, that would be great, but also, impossible to get an objective answer. Also, talk about an entry point for algorithmic bias.


I think the point is not that you'd get an objectively "best answer" from that type of query. NFTs are an odd example here, so let's use 3D printers instead for the boutique search example.

I found a lot more value when looking at 3D printing forum discussions from passionate hobbyists to see their opinions about the best printers on the market for various needs, compared to a search engine. In this case, the curators would be the forum moderators who would need to do the work weeding out obvious shills and misinformation. Presumably this moderation would be increasingly difficult if more people make their purchasing decisions like this and businesses respond with more astroturfing.

For something like NFTs (or investing in stocks or real estate or anything), I'm not sure there will ever be a good answer, because of the super obvious conflict of interest. "It just so happens that the NFT/Stock I hold is the best investment" is not really interesting or useful and would be terribly difficult or impossible to weed out.


The issue is that {total available revenue}<<{cost of curation time} for most niche search space. It's the Consumer Reports problem: if people need your service rarely, and it's only worth a modest amount of money, then it's impossible to balance the books.

And we're just talking different flavors of Ponzi schemes if we're trying to design a system where available revenue doesn't cover costs.

I remember pre-Google, when it was an open question as to whether curated directories or search engines would be the dominant form. Turns out, the former when signal>>noise (early web) and the latter when signal<<noise (current web).

Reddit is a gamified answer to a fundamental imbalancing of profit: convince people to do valuable work for free, pay them in karma/gold stars/Monopoly money, and then sell that work (without having to pay its fair market value).

And for profitable search curation niches, there are already solutions. F.ex. Bloomberg makes $10B of revenue off one.


> Maybe I'm a control freak in this regard, but I trust myself more than any website in this regard.

You are not. Sometimes you need a search engine, not an answer engine. Google/AI can be a force for good, but often it is not; some examples & views of others covered in a thread [0].

[0] https://twitter.com/ColinHayhurst/status/1420698145898520579


an opinion is a type of information

google's mission statement is not "to organize the world's facts" but maybe you'd like it to be


> you shouldn't expect a search engine to make judgements on your behalf

This is true, but Google still tries to despite the fact that they can't. If Google returned completely unbiased results and let the reader filter the results then I think people would have fewer results.


If a search engine is full of documents curated directly by the user, through commands to save and index such and such site and this and that document, then a language model is well capable of making judgments on the user's behalf based on their interests. Further, conversations about such documents over time may provide priming that narrows the AI's judgments down to the ones the user approves of.

Just speaking from my experience in thinking and working on this concept over the last 2 years...I could be wrong, but testing shows some hope in this idea being a reality.

The document set is the critical part here. The user must obtain the documents that are of interest to them. Keeping the document set small and focused allows for higher quality results and discussions about the content.


If you want unbiased results, you are looking for a directory like the Yellow Pages, not a search engine.

And then when you need to search inside the Yellow Pages, you'll ask yourself why YP doesn't do something about the misleading descriptions provided by the companies in their index.


What's wrong with fewer results? I don't want thousands or millions of results for my query, give me 10 good ones. I'd imagine most people are similar here.


I misspoke. I meant people would have fewer *issues. However fewer results would also happen and I agree that it's a good thing (if you reread my original comment, you'll notice that I didn't make any value judgement on whether fewer results was good or bad).


what is an unbiased result in context of a search engine? Google has to decide what to put on page number 1 and what on number 10. This necessarily requires making judgements based on criteria any of which will be considered biased by someone. (which they are, because judging what is most relevant is the point of a search engine to begin with).


With the state of Google these days, I think if I searched “What do NFT collectors think about NFTs?” I'd probably just get back results for NFT themed merchandise from Etsy and monkey pictures on Pinterest.


If you ask a language model:

> NFT collectors think that NFTs are a great way to collect and trade digital assets. They also think that NFTs have the potential to revolutionize the gaming industry.


agreed, this is a false dichotomy. I wouldn't look to any search engine to "answer" questions like “What do NFT collectors think about NFTs?" Answer != Organize


I don't think even the author expects the search engine to actually "answer" the question but that the results of such a query ought to be articles, social media posts, blogs, videos of NFT collectors talking about various NFTs.

Outside of Google's attempts to summarize information in the answer box "questions" posed to a search engine are really "find me documents which discuss the subject I'm querying about and might contain my answer."


In fairness if the best example of a question that proponents for this can come up with is “What do NFT collectors think about NFTs?” it’s not obvious it’s worth solving for in the first place.


The amount of money a reseller is willing to pay for his goods to appear in a curated list is likely to always be higher than the amount of money a consumer is willing to pay for access to a curated list.

That suggests that a platform that aligns with the needs of the advertiser is likely to earn more revenue than a platform that aligns with the consumer.

For this reason I find it hard to see a dominant "boutique search" company that is funded by anything other than advertisers.


This is something that makes more sense as a business run out of your garage than as an investment opportunity. Someone hand-pruning Stack Overflow search results wouldn't need much to keep the lights on and could just sign up a certain number of clients.

"Boutique" to me means high end tradespeople focusing on a specific point of view over growth. I can't imagine a search service fitting that category and allowing advertisements. That's not what either party involved is there for.


This is true for consumers, but not B2B


Sorry for another reply, but a quick thought:

This is the main problem with Uber. There was never any need for investment. The service itself could run perfectly well without anyone skimming off the top, and there wouldn't be as many mouths to feed running things.


Incorruptibility is only possible (and not necessarily guaranteed even) if money is taken out of the equation.

We need an open big-list-of-URLs exchange standard, then one can use some sort of app to create/review, upload/download and import/export lists of favorites/URLs.
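
As a rough illustration of how little such an exchange format might need, here is a minimal sketch. It assumes a purely hypothetical format (plain text, one URL per line, lines starting with "#" are comments); the function name and the sample lists are made up, and nothing here is an existing standard.

```shell
#!/bin/sh
# Hypothetical list format: plain text, one URL per line, "#" starts a comment line.

# Merge any number of list files into one sorted, de-duplicated list on stdout.
merge_url_lists() {
    cat "$@" \
        | sed -e 's/[[:space:]]*$//' -e '/^#/d' -e '/^$/d' \
        | sort -u
}

# Demo with two throwaway lists (invented data).
tmp=$(mktemp -d)
printf '%s\n' '# my favorites' 'https://example.com/a' 'https://example.org/b' > "$tmp/mine.txt"
printf '%s\n' 'https://example.org/b' 'https://example.net/c' > "$tmp/yours.txt"
merge_url_lists "$tmp/mine.txt" "$tmp/yours.txt"
rm -rf "$tmp"
```

Importing someone else's list is then just a merge, and exporting is just sharing the file. Anything richer (tags, ratings, signatures) would need an actual agreed format.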


Money is an instrument, it’s not the source of corruption.

I prefer the auditor's skepticism; anything that can be corrupted, given enough time, will be corrupted.


Meh, this is a lot of words to say nothing really, a sponsored article for A16Z. The core “problems” of the internet business are still up in the air: ads vs payments. The author mentions she prefers her own curated “search engine” company over Google (I wonder why!) but any paid service won’t be able to compete with someone making a free version with ads. One of the biggest lessons of the past 20 years on the internet. I know the author has a vested interest in this not being the case but unfortunately it very much is. This is Google’s whole strategy: make a free thing with ads and they will come. CF Eternal September; why HN is much more pleasant than mainstream social media sites; etc.

Also seriously stretching the definition of a “search engine.” I certainly wouldn’t consider LinkedIn a search engine. It’s a social media site with search, and recruiters can filter their queries on people but it’s a closed system. Only LinkedIn accounts are there, unlike the web which is open and needs a true “search engine” to organize it. If the web is open, then to me LinkedIn already is curated, and then what the author is proposing is further curated still but certainly not a “search engine.”

Snark aside: the random, bold “Nascent token-based business models show early signs of promise.”, tacked on at the end by an author who has no understanding of or experience with them, just makes me picture someone at corporate writing this in with a Sharpie.


I'm really curious to know whether this view is at all reflected in the wider population outside of tech. For myself, being a tech-y person in SF, Google is good enough 99.9% of the time. I have a hard time believing there is a significant percentage of 'normal' people who think Google is a pain point in their daily lives, and good luck to any company who tries to break the "just Google it" habit that we've all developed over the past ~20yrs...


I've started hearing from non-tech people complaining about the spammy results in google results.

Feels like "trust" that they're actually seeing the best results is going down.

That said, many of these same people rarely actually google things outside of simple factual information that Google does okay at (e.g. height of the eiffel tower). Their experience of the internet is mostly through various social media filters (e.g. Facebook, TikTok, Pinterest, Instagram, Twitter, Reddit, etc.)

I also suspect this is part of the reason Google is giving worse results. Most of the actual content being generated on the internet is now occurring in various walled gardens that either pollute search results (Pinterest!) or don't show up (Facebook).

It's a tough problem for a company built on the open web when that web mostly resembles a late-game Risk map with just a few big players.


Sounds like the internet has interpreted the parasitic value exploitation as damage and routed around it. By cutting off its nose.


>I have a hard time believing there is a significant percentage of 'normal' people who think Google is a pain point in their daily lives, and good luck to any company who tries to break the "just Google it" habit that we've all developed over the past ~20yrs...

To 'normal' people Google is the synonym for internet search and yea you are right nobody will be breaking the habit of "just Google it" anytime soon.

Google's lack of awareness regarding the power users is astonishing but a change needs to come from within Google not from outside because Google's top management just doesn't care what power users think.


All it takes for Joe User to change is one click to set the browser's default search engine - this is why google pays so much to Mozilla. There is not much that google offers that captures the user besides the trust that they are getting the best search results.

By now, Joe User has developed a healthy distrust/dislike of Facebook, but they stay there due to the network effect. Google has no such snares.


> All it takes for Joe User to change is one click to set the browser's default search engine

I'm pretty sure that takes a bunch of clicks and is confusing.

> this is why google pays so much to Mozilla.

Google pays Mozilla to fend off antitrust.

edit: Could Google disappear in a day? Maybe. People are still Xeroxing things but nobody is using a Xerox. Google have so much cash that they can pour into marketing that people would gradually drift back after any stumble, though. At the least they could get regulation passed that would be too expensive for newcomers to comply with, or that requires that Google be used as a middleman for some processing.


They have their family of apps tied to Google e.g. Gmail, Google Maps, Google Translate, Google News etc. That could be a drawback if you wanted to switch to a new search engine.


For non-English locales, Google sucks 99.9% of the time for popular queries since they allow spamming from big media outlets. A top result is literally composed of spammy SEO keyword content filled with a chain of questions in the article. So (I think) it sucks for 'normal' people in non-English.


I'm an engineer, but also a writer, so as my writer self I totally feel the inability of Google to find quality information.

Just from a performance perspective I can see that the listicles and how-tos are doing better than any niche topics.


It's interesting how books somehow retained a certain "quality" to them. For example each book has an ISBN ID. Maybe it's because of the publishing cost?

What if there was a premium Web where each website had an ISBN-equivalent, and the dates during which it was "in print?" And to get one published you'd pay a fee to a central register, such as $100.


To the SEO listicle sites those 100 bucks are pocket change. To the high quality blog, it's maybe a deal breaker.


I'm wondering why SEO listicle books are not as big of a problem, and how to replicate that.

I guess there is no Google for books, and no ads/affiliate links. And there are libraries where books are manually curated using the Dewey hierarchy.


People pay money for books. It's a much harder sell for a web page.

Listicles do show up in print though. I see them at the grocery checkout stand.


Books have gatekeepers, publishing houses that put in work to market the book as widely as possible, get it reviewed in newspapers, arrange media appearances.

How many Kindle direct releases are going to get that kind of backing?


That's one of the biggest problems. You have to build something that is better than Google, at least to a certain subset of users that makes them ecstatic about the alternative.

So far all I've seen is shittier versions with terrible UI (you.com), or exact low-risk copycats going at it with the privacy angle.


- our UI at Breeze is currently awful
- we're emphasizing date-based + topic filters
- while we do come with privacy, not main angle for many reasons

1. direct link, https://breezethat.com

2. tons of date filter examples to get newest info or to go back in time, https://twitter.com/search?q=from%3Adotdotjames%20(date%20OR...

3. while there are several experimental topics in the drops menu at top, the best example is probably our jobs filter at 14M+ openings, https://breezethat.com/x/job-search-beta

4. gradually merging topic filters into main UI


Google is a great search engine! However, from time to time I am still enjoying other search engines when I am getting good search results for some queries, just as someone who has been in prison for 20 years enjoys the fresh air once released.


I find the framing of the discussion errant and interestingly, errant due to the perspective of someone who clearly was not there at the time, let alone has any real context for when and where Google emerged. The problem Google had was not at all that the answers one would seek did not exist, it actually solved that problem better than all the other search engines that existed by finding more relevant and accurate answers.

What is often missed, even by those who were around and paying attention back then, is that most people were blinded by the "good enough" factor: they rather mindlessly assumed that Google served up results that were, if not perfect, then approaching it ... that there simply were no better results available. A clear fallacy, as one could then, and especially today can, prove on their own by simply taking some site and page with the most important or even niche information on a topic, and then trying to search Google for it. You will likely find that there is almost no string of basic search terms that you can use to find that resource. Even if it is a page that is part of a major, mainstream site, it is immensely difficult to find information using general terms.

I would argue that all the censorship and corporatizing and centralizing of information and data has only amplified that problem even beyond the early days. As a life long user of Google search, even long before there were public discussions of what a google is and people still thought AOL is the internet, I feel like the peak of Google and search in general occurred some time around the mid aughts, so, 2003-2008.

I would love to hear from anyone with similar direct contextual knowledge.


The article is written by someone who's founded such a search engine, so they're of the opinion that Google is dead and [thing_im_doing] is the way of the future.

> Sari Azout is a founder of startupy.world, a search engine for tech and culture insights.


These articles are muddying the definition of a search engine. I guess in some ways these articles and the general public perception are telling us a pure search engine no longer cuts it for us.

However the point is, search engines were meant to scour the web for data and present them in the order the algorithm deems most relevant. That is it. Google built a vertical integration into it for specific types of data that could be turned into a single answer like bus routes, NBA scores, temperature, wiki info etc. This is not the norm. It can only be done for some data types.

"Boutique" search engine by definition is doing logic analysis based on context, and so that will bring in "someone's" point of view into it. Immediately becomes useless for objective research/analysis. This is supposed to be done by the user. If "boutique" is needed we already have it in reddit and this forum, media outlets like tv and print.

The specific question asked in the article "What do NFT collectors think about NFTs?" is a terrible one because a search engine cannot achieve that objectively. Someone needs to conduct a poll, and even that is subjective because NFT collectors for different verticals/price points might have different opinions. NFT naysayers and NFT platform owners will have different motivations too. So I don't know if we should trust a boutique search engine to tell us a one line answer.


Is A16z just capitalizing on what we Hacker News commenters whine about and then positing a theory about the future of the market based on that? I mean, that's a great secret sauce; we as a collective are very on the pulse ...


Then bakes in NFTs to turn the conversation on to something more edgy. Thus allowing them to tie back to their long push into "Web 3.0" which I would imagine is starting to look a bit bleak nowadays.


My thoughts as well! I've never seen anyone complaining, or even talking about their Google Search experience outside of Hacker News.


Did I miss something or did this Silicon Valley VC gibberish fail to mention a single example of a "boutique search engine", whatever that made-up label means. Assuming he means web search engines such as wiby.me or marginalia.nu, he cannot cite them because they are non-commercial. The problem with SV VC is they cannot imagine a non-commercial use of the web. Everything has to tie back in to some proclaimed "business model", which never amounts to more than manipulating web traffic and capturing attention/eyeballs. Without the traffic, the model fails. In the SV VC's mind anything that does not directly or indirectly lead to money gets filtered out.


She mentioned Spotify, Wirecutter, Thingtesting, On Deck, and Tegus, as well as her own startup, Startupy.


"If the value proposition is signal over noise, how do you scale the signal?"

The author also assumes one must scale signal instead of reducing noise. She plays the futurism card and tries to attack the Yellow Pages. Why. The Yellow Pages has zero noise. Google has infinite noise. Gee, I wonder why. For starters, the telephone companies did not keep rearranging the alphabetical listings to some sui generis order based on some secret algorithm and create a competition to see who can be listed first, then auction off ads to be placed above the first listing.

"How many New York Times accounts did you create before finally giving in to the paywall?"

None. I do not use cookies or Javascript to read the NYT. I only use HTTP and HTML because that is all that is needed.^1 NYT chooses to share information publicly, no password is required. Why. Because if they don't someone else will. The internet allows its users to share data and information. They do not need a printing press. User publishers have already paid for that privilege when they paid for internet access.

Given that we already pay for our computers and our internet subscriptions, why do we need further intermediaries telling us what to consume, for a monthly fee. I say we don't. The time has come to stop trying to exploit people as they access public information, be it through surveillance and advertising, or "subscriptions".

The idea of asking for a subscription fee to search what is available for free over the internet is nothing more than needless "gatekeeping". In 1993, I sat down at a university computer to use gopher, ftp and the www for the first time. Never mind not knowing what to search for, search was slow. Getting access to the internet was neither free nor easy then. It still isn't free today, but it's certainly easier. Search is much faster. The problem of not knowing what to search for is a good thing. It forces one to explore. Intellectual curiosity instead of marketing.

From what I have seen, the number of new www users is growing faster than the size of the www, but Google is unlikely to let anyone know that. Like the Wizard of Oz behind the curtain, it is better for business to keep up the illusion that the web is too large for any mere mortal to investigate without Google's help. But today we can scan the entire IPv4 address space in a matter of hours. That time will keep getting shorter. Exploring obscure corners of the www is not something only Google can do. They have no motivation to do it because unpopular websites with low traffic are not suitable for Google's programmatic advertising.

Now we have Silicon Valley VC telling us that charging a fee to conduct searches is the "future". The reason the www is filled with noise is because of VC funded "tech" companies, like Google, that only focus on the www on condition that it serve as an advertising medium and/or a means to collect user data and generate metadata about users. VC and their "tech" company spawn want monopolies, millions of people all using the same websites, so-called "scale". The possibilities for millions of small businesses, not beholden to a "tech" company providing a "platform", even just people using the internet to exchange text, images, audio and video for non-commercial purposes, without the need for an all-knowing, all-seeing intermediary spying on their every move, is not something VC are interested in. The www has suffered the consequences of VC "thinking", including endless noise as a result of their sponsorship of hype, a race to the bottom competition for lion's share traffic at any cost. The future of the internet is one without the corrupting influence of Silicon Valley VC. Enough of the noise. We can do better.

1. Visiting https://www.nytimes.com gives the public JSON file

   https://static01.nyt.com/services/json/sectionfronts/$1/index.jsonp 
where $1 is a section, World, Business, Travel, Arts and Leisure, etc.

Alternatively we can use sitemaps, e.g,

   https://www.nytimes.com/sitemaps/new/news-$1.xml.gz 
where $1 is 1, 2, 3 or 4.

For example,

   curl -A"" -4o 1.jsonp https://static01.nyt.com/services/json/sectionfronts/world/index.jsonp 
   exec sed '/\"guid\" :/!d;s/\",//;s/.*\"//' 1.jsonp
or

   exec yy059 < 1.jsonp # yy059 is a quick and dirty JSON reformatter I wrote

   curl -A"" https://www.nytimes.com/sitemaps/new/news-world.xml.gz
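
For anyone who wants to try the extraction step without hitting the network, here is a small sketch of the same idea. The function names and the sample input are invented for illustration, and it assumes each "guid" pair and each sitemap <loc> element sits on a line of its own; the real files may be laid out differently.

```shell
#!/bin/sh
# Sketch only: function names and sample data are hypothetical.

# Print the value of every "guid" field found in JSONP text on stdin.
# Assumes each  "guid" : "..."  pair sits on its own line.
extract_guids() {
    sed -n 's/.*"guid" *: *"\([^"]*\)".*/\1/p'
}

# Print every <loc> URL from a gzip-compressed sitemap on stdin.
# Assumes one <loc>...</loc> element per line.
extract_locs() {
    gzip -dc | sed -n 's/.*<loc>\([^<]*\)<\/loc>.*/\1/p'
}

# Offline demo with invented input, so nothing here touches the network.
printf '%s\n' '{' '  "guid" : "https://example.com/a",' '  "title" : "A"' '}' \
    | extract_guids
printf '%s\n' '<url>' '<loc>https://example.com/b</loc>' '</url>' \
    | gzip -c | extract_locs
```

With real data, the same functions would sit at the end of the curl pipelines above (curl output piped into extract_guids or extract_locs), assuming those endpoints still respond as described.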


The reddit hack for reviews succeeds at being a good source for "reviews", _because_ that's not what it's for. As soon as you create a search engine, someone will figure out how to manipulate the results. Even reddit search results can be pretty spammy as I think some people have figured this out.


I find Reddit to be the best source not only for reviews, but for any topic that has even a whiff of commercial relevance.

On Reddit you have a very good chance of seeing multiple high quality, well-informed posts on whatever topic you're interested in, written by real people who genuinely want to offer help in an area they're knowledgeable about.

On Google these days, even once you scroll past the wall of actual ads, you'll usually just see a bunch more de facto ads masquerading as "organic" results.


> On Reddit you have a very good chance of seeing multiple high quality, well-informed posts on whatever topic you're interested in, written by real people who genuinely want to offer help in an area they're knowledgeable about.

The pervasiveness of fake, shill, and/or paid reviews across the rest of the web makes it difficult for me to escape the suspicion that there is pervasive review fraud on Reddit as well.


But are those comments highly upvoted?


Some of them presumably are? Redditors are notorious for not being able to distinguish creative writing exercises from reality. Throw in the various methods of vote manipulation that the less scrupulous marketing folks there are surely using, and it is hard to imagine that a large fraction of highly upvoted Reddit product testimonials aren’t fake.


Exactly. Reddit’s culture acts as an anti-spam, anti-shill immune system. That’s not to say it’s perfect, but it’s quite a bit better than the typical search engine or social network at weeding out content-spam and other bs.


> As soon as you create a search engine, someone will figure out how to manipulate the results.

I see this sentiment a lot. Are there any actual examples of this happening, or does it merely propagate on the basis of sounding true?


The example is the way that the dominance of Google changed the web.

It's "manipulate the results", by way of "be careful what you wish for" - if there's money to be made in pandering to a search engine's whims, then that will happen at scale and that content will swamp the good stuff that you wanted in the first place.


Google is arguably exceptional, in that they are in a position to shape a significant amount of web traffic. Their largest competitor doesn't even have 1% market share.


I agree with this - Google is so dominant and so large, that it distorts the entire web. But I think this is probably just a historical accident, more or less. This is just [Goodhart's Law](https://en.wikipedia.org/wiki/Goodhart%27s_law) writ large:

> When a measure becomes a target, it ceases to be a good measure.

So, when you try to measure content quality in some way, that becomes a target, and starts to distort the quality of the content.

I think this applies equally in any niche, at any scale where people find it worthwhile.

If you have a niche search engine that has different content requirements, and it becomes worthwhile to SEO niche content for that engine, then it will happen, eventually.

I think it just has to become worthwhile (and then for this fact to become known) - but that threshold is going to vary a lot, in different niches, times, places, etc...

There are plenty of people (~everyone?) gaming/SEO'ing the search on Amazon, Etsy, Ebay, etc... for example, by doing niche & platform specific SEO on their listings on those platforms.

I also think that people's definition of "worthwhile" differs too. For the broad SEO industry, it's monetary. For people making posts on HN or Reddit, it's Karma, or Discussion or something - and people will optimize their content towards that, just for "fun".

Actually, I think this applies pretty broadly to making _anything_, more or less, once you have someone or something evaluating/measuring your output?

How _harmful_ this is, depends on what's being optimized for; scale & rewards probably change how "hard" the optimization arms race gets - as well as how many people are affected.


The Panda update Google made in 2010 or so removed enormous volumes of spam websites that contained entirely machine-generated content, usually linking to shops selling counterfeit goods like Prada bags and Jordan sneakers.

And that was when Google was over 12 years old.

There was a time I only saw listicles on Cracked.com. Now most of Page 1 content on Google is structured like that, and most are low quality blogspam.


I don't think these types of conclusions can be drawn based on Google, as they are in an exceptional position, with nearly complete market dominance as well as an economic incentive against cracking down too hard on ads.


This is an interesting paper by the Brave search team:

https://brave.com/goggles/

Essentially “goggles” allow anyone to create their own algorithm and share it. It’s a really interesting concept. Definitely worth reading the paper.


> The stated mission of a company worth almost two trillion dollars is to “organize the world’s information”

Wikipedia, millions of volunteers and bloggers organise the world's information. Google just leeches off them; they can't even organise YouTube.

Does YouTube prioritise videos that have the most informational value, or news that is most objective? No, they financially reward those that produce misinformation and conspiracy theories. You could argue 'well, that's what sells', but then you can't claim any higher purpose or mission - it's just peddling whatever is popular, like fast fashion.

Or do an experiment: search for a government service like 'open a company in UK' or 'apply for a driving licence', queries where there is only a single government service per country, and Google will return a page full of scams.

During the pandemic I tried to find COVID-related rules for entering Moldova - the result was on like page 7. If failing to find official public information is not 'underperforming', I don't know what is.

The premise of this thesis is wrong - no one is seriously trying to make the search engine that produces the best answers, because the competition is basically gone and because the web has been balkanised from an open system into feudal kingdoms.


Kagi.com has a search feature option that focuses on stuff programmers want to see such as results from stackoverflow.

I love it, and yeah, I'm at the point now where I do want my search to be boutique. Me searching for something and my grandma searching for something are two completely different animals.


Kagi has been a game changer for my SO and me. She's been using it for research for her doctoral program for the last month or so, and frequently says how it has become an indispensable tool.


Question: Is it normal to be redirected to some page called 'vladimir245' at typeform.com when trying to sign-up for Kagi beta?


Yes, Vladimir is the founder of Kagi and created the form himself, because Kagi is a bootstrapped effort running on a tight budget.

(vladimir in question here)


Ah, that explains it. Thanks a lot, love the dog as a mascot. And nice bicycle you have (love the frame color combined with the saddle).


> The problem, now so drastically different from a decade ago, is not what to read/buy/eat/watch/etc., but figuring out the best thing to read/buy/eat/watch/etc. with my limited time and attention. (...)

> Someone who wants to find the best freelance designer, or the best sushi restaurant, or the best NFT to buy will not find the answer on Google.

This is a poor article IMHO. The reasoning is as follows:

1/ We are looking for things to consume, and we can only accept the very best of anything -- because we are busy (and, let's face it, exceptional geniuses, contrary to people from a decade ago who had all the time in the world and were far less bright);

2/ Generalist search engines can't help us, because they lack the category knowledge necessary to "rank" things;

3/ Only curators can help us get there.

None of those points are obvious, and they don't follow from one another. Not everyone is looking for things to consume, all of the time. We don't need the very best of everything. There's no such thing as an absolute intrinsic quality, independent of other dimensions such as availability, durability, cost, suitability for a specific purpose, etc.

Mostly, it's extremely naïve to think about "curators" as infallible sources of truth. Curators can be incompetent, and they can be bought -- usually for very cheap.

Eventually, the article asks:

> Who curates the curators?

That's an excellent question! And the answer is: Google. Google points people to Wirecutter, or Booking.com (infinitely more successful than Expedia or Yelp, and not even mentioned).

Google curates the curators.


> We don't need the very best of everything.

I think it's very human, when presented with an array of options, to desire the best. Just imagine you are presented with N options for a vacuum cleaner. Would you really just pick randomly? 10 years ago (or maybe more like 20 years ago), there was not nearly as much choice. You couldn't find hundreds of different iterations on the same item. Now that you can, many people spend a lot of time agonizing over the options. I agree that points 2 and 3 don't logically follow from desiring the best though.


I deliberately go to the closest dentist to my house because I don't have tooth problems and it makes no sense to look for a dentist that does a better than adequate job until/unless I do.


How do you know the dentist is closest to your house unless you considered multiple options and ranked them by distance? In your case, "best" includes time convenience and you did compare options without even being aware of it.


By that definition, everything is always selecting the best.


unless... multiple inadequate cleanings build up over time to a dental problem that could have been avoided in the first place by going to a better dentist


It’s an adequate dentist office.


'how do you scale signal' is a great frame for verticalization

I was in camp 'google search is dead' until I was actually shopping for something and then light dawned

google is an ecommerce search engine; all their product decisions make sense if you think of them as amazon without the warehouse, shopify without the checkout

(unfortunately it also puts them in 'more ads, lower margins' spiral that degrades to infinity times infinitesimal)


*Google is an ad search engine.


Although I run Breeze search, I sometimes explicitly search on Bing / Google to see what ads pop up when shopping for something I haven't ever bought before, as part of product discovery.

Breeze search -> https://breezethat.com * Breeze jobs -> https://breezethat.com/x/job-search-beta * Breeze date filter examples -> https://twitter.com/search?q=from%3Adotdotjames%20(date%20OR...


Whoa! Something lit up in my head. Still processing what it means


One obstacle is that we've all been trained that domain-specific search engines suck. Like Wikipedia search gives bad results unless you know the exact name of the article, so I just use DDG adding the word wikipedia.

Another obstacle is that the searches Google does worst on are for heavily promoted topics. What company would give you an unbiased answer to "are NFTs good?"


Interesting take, but your biggest complaint about vertical search engines seems to be "relevance depends on the sociology of the current moment." Curation is just another form of this, and a lot more momentary.

You also state "With strong opinions on how to organize information, reflected in their choice of filters, vertical search aggregators have distinct advantages that horizontal software can never achieve." I strongly believe Google can do this. Structured data was a big push to make this easier for search engines, but as machine learning models expand to extract structured data from unstructured text, the giants will be better positioned to do this than vertical search engines.

"Spotify doesn’t curate what songs make it to their platform. Instead, it takes the entire universe of music and finds endless ways to discover and search across its library, including a mix of manual curation (via playlists curated by its in-house team of curators and its users) and algorithms (like Discover Weekly)." So would metrics like whether a user follows more links after the first one, tries another query, or hops back to the search results quickly be a form of curation?

And I also think most of your boutique search examples aren't search engines but in-app searches. A search engine needs to search an index of external documents to be a search engine.

Looking at the query "What do NFT collectors think about NFTs?", a curated search engine as you describe it would be terrible at answering that, because it would give only the very limited scope of the curators' opinions. Google should return all the results containing content where people talk about this, and you can form an objective opinion based on what people are saying. A better model for answering this type of question would probably be a massive language model like GPT-3, continually trained on current information.


Random topical question I've been meaning to ask the HN community- has anyone gotten any solid value out of Google's Custom/Programmable Search Engines? They can supposedly be scripted to be much more precise or targeted. Curious if anyone's ever played around with them- I know a couple people in my industry are into them


I have no idea whether they’re right or not, but one thing that I think would be nice with search results is a way to sort them. They’re given to you in a specific order and that’s all you get. For general search, it probably can’t be implemented in a sensible way, but try sorting things on Amazon by brand or price or review rating, or whatever. If there’s a way, I haven’t found it.

If I’m searching for songs on a music service, I may be interested in sorting by band, by album, chronologically by release, alphabetically by song name, by band member name, etc. (and that is usually possible). Likewise with video, film, books, etc., I want similar capabilities, but am not usually able to find them. So many search engines just leave out this ability. It’s quite frustrating.


Seeing as how this is a VC thought leadership piece, how will a boutique search engine work? Deliver pre-2004 style Google results on a free basis while coasting on VC money, then switch to aggressive monetization, advertising and tracking once the SPAC IPO takes place?


Neal Stephenson in his book Fall riffed on a near-future like this, in which young people no longer rely on the recommendations of social media and news aggregators. Instead, they outsource their feeds to a mix of human and AI curators.


This was also a (secondary) feature in Stephenson's _Anathem_; their equivalent of the Web was so full of viruses, artificially intelligent spam, and what we'd now call disinformation (and network effects of that disinformation) that it was impossible to discern any information of value by accessing it directly. Only by using a sophisticated program that was constantly filtering data could the characters determine what was _probably_ happening in the world.


We are building this right now for a niche. In our world the hardest thing is good, clean, indexable data. Google can crawl the web just following links and index web text, but for niche search a lot of the relevant material to index is private.

If you are working on niche search, I’d love to hear how you are being creative about aggregating and cleaning data to index on. It’s a fun, challenging problem and there are going to be a lot of winners on the other side of this!



That first paragraph nearly made me spit my coffee. Self-parody even if you believe in crypto.


If only Google allowed us to submit a list of websites to exclude from search results, it would resolve most, if not all, of our complaints about Google search deterioration.

This might even help Google know which sites people exclude most and down rank them.
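
A minimal sketch of that idea in Python, purely for illustration. The function names, the `.example` domains, and the notion of counting exclusions across users are assumptions of this sketch, not anything Google offers, and it does nothing to resist coordinated voting.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = urlparse(url).netloc.lower()
    return host[4:] if host.startswith("www.") else host


def filter_results(urls: list[str], blocklist: set[str]) -> list[str]:
    """Drop results whose domain is on one user's personal blocklist."""
    return [u for u in urls if domain_of(u) not in blocklist]


def most_excluded(blocklists: list[set[str]], top_n: int = 3) -> list[tuple[str, int]]:
    """Count how many users exclude each domain (a hypothetical downranking signal)."""
    counts = Counter(d for bl in blocklists for d in bl)
    return counts.most_common(top_n)
```

The first function is the per-user filter; the second is the aggregate an engine could look at when deciding which sites to downrank for everyone.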


That's an online public vote which can be gamed. I'm reminded of when 4chan mass-voted on a poll for where Pitbull should perform, and he ended up somewhere in Alaska. This was back when 4chan was known more for generally harmless trolling instead of the dumpster fire it is now.


It's interesting that most of the responses here so far are criticizing the article and positing that Google is, more or less, the best we can expect. Yet numerous other front page posts on HN have been very unanimously critical of Google, e.g.:

https://news.ycombinator.com/item?id=30347719

https://news.ycombinator.com/item?id=29772136

https://news.ycombinator.com/item?id=29392702

My own anecdata is that non-tech people are beginning to really dislike Google's results. Take a very normal-person kind of search: looking for almost any mainstream recipe. The vast majority of results ramble on for pages about the author's childhood experience and Italian grandmother and how the dog liked the recipe so much, painstakingly repeating the SEO phrases in the currently-most-effective combination. Most people despise this stuff.

I think we all know why that is; SEO is unavoidable, gaming the system is unavoidable. But the important question would seem to be: is Google actually doing a bad job, such that someone could disrupt their model?

I can't see that the answer is anything but "yes". Google is deeply addicted to search ad revenue. That ad revenue comes from the exact same people gaming SEO and producing reams of poor quality content. Google is dependent on its abusers; they are more valuable than its users. How can that end in anything other than a general degradation in quality? And as a public company, Google can't exactly back off from its ad revenue focus for a while and fix its business.

At the same time, nothing I've seen yet could disrupt Google. DuckDuckGo et al are largely just copies, and their differences are too minor to get users to switch.

But it's very odd to say that because disruption hasn't happened, it cannot or will not. That seems like a failure of imagination.


> it's very odd to say that because disruption hasn't happened, it cannot or will not

Nobody is saying that. People are upset with Google. They're just recognizing that the arguments made in this article don't make sense. That nonsensicality is heightened by this being, in essence, a venture capitalist's PR statement.


I think the arguments in the article made as much sense as you'd expect. Nobody has yet produced anything that really counters Google. So the best you can do is cast around to all the different quadrants that look like they might produce something and ramble about what could be.

Also, the blog post is republished from somewhere else, not written by a16z.


Not sure, but with specific searches on Google I end up on very concise old personal webpages with deep knowledge. I don't need wannabe reviews on prepubescent community websites that don't make sense. If I search for garden I want personal webpages with sound knowledge, not a Medium article on how to do yoga in a rented community garden on a rooftop. Google helps me find a universe of old webpages.


Here's what I want in a search engine (and I would pay for it!):

A combo of curation and filtered google/bing results.

For curation, pay people who are into whatever niche topic and have a review process that spots and eliminates collusion and corruption.

And where you don't provide curated links give me google/bing results with the crap sites filtered out. Also, upsell me on research librarian services that then feed the curation lists.


building that at Breeze if we haven't intersected here before

links:

Breeze search -> https://breezethat.com

* Breeze jobs -> https://breezethat.com/x/job-search-beta

* Breeze date filter examples -> https://twitter.com/search?q=from%3Adotdotjames%20(date%20OR...

* @dotdotjames on twitter for DMs / etc


I work on alternative search engines. My first was something called Whize, which flared for a bit, hit the front page of Product Hunt, and then died after a few months of trying to support the search experience we've all become used to, à la Google.

We had thousands of users stopping by over the months, but that's not enough for an ad-supported model. Ultimately, indexing the web with enough coverage that results weren't lacking, and then working on a novel enough ranking methodology that combined vector search signals with classical search signals, is a massive undertaking that was running us thousands of dollars a month.

It's possible, but you basically need to be prepared to burn a lot of money, more than my friend and I were willing to support. That's why most of these alternative search engines serve custom Google results combined with other aggregator content, or license Bing. They produce good results, but at the end of the day you're still beholden to the index and, to some degree, the ranking signals of the engines you're building on.

So far in the landscape Kagi is doing a good job of establishing itself with its use of lenses and subscriber-based support for its boutique engine. Neeva and You.com are also both in that space and have the benefit of being supported by gobs of funding (Neeva in particular due to the background of the founding team).

All of these engines are attempting to provide the curated experience this article talks about in the form of programmatic boutique-like searches over Google and Bing's existing index. From a technical standpoint this makes obvious sense.

But.

From the year+ of working on that search engine and my continued interest in alternative search engines here's the thing. Google and Bing index web pages, but there are so many alternative forms of curated information we've created over the last decade and a lot of that information is in semi-public spaces like slack, discord, whatsapp, telegram, facebook groups, newsletters etc.

As more and more spam has propagated over the internet, real useful information and discourse has retreated into these semi-guarded spaces that the existing incumbents do not index and do not serve. That means all of these engines are missing that information, which in turn means you're missing it regardless of where you search.

Want to buy a car? The best review of what car to buy is probably happening in a car enthusiast discord or slack maybe an e-mail newsletter. But you won't see that. So my opinion? I think the next real killer search application is going to figure out how to index and make that content searchable not an alternative take on searching web pages.


>Google and Bing index web pages, but there are so many alternative forms of curated information we've created over the last decade and a lot of that information is in semi-public spaces like slack, discord, whatsapp, telegram, facebook groups, newsletters etc.

That's called the Deep Web ("semi-public spaces like slack, discord, whatsapp, telegram, facebook groups, newsletters etc.") and you can't crawl and index it unless you partner with them and they allow you to do it.

But the biggest question that keeps haunting me is: does the world need another internet search engine? Casual users just don't care if your new internet search engine is 10 or 20% better than Google. What I'm trying to say is that it needs to be something totally different and significantly better in order to catch the attention and engagement of users.


So to the question of whether the world needs another search engine: I think the answer is yes, given that so many people believe it enough to create new search engines and in some cases fund them, especially when Google, for a lot of people (myself included), really doesn't produce useful results outside of basic questions.

It also makes more sense if users aren't the product: you can have a business doing tens to hundreds of millions in revenue by serving some niche, via subscription or some B2B offering, without having to be exploitative. Google is so big it's almost a certainty that they are more than suboptimal at some profitable enough niche. DuckDuckGo proved that with its focus on privacy, which is most certainly a niche, and the 100 million in revenue they do.

Personally I think there's a lot of B2B opportunity for a lot of this information where the search can be a multi billion dollar business just not on the backs of consumers.


> Want to buy a car? The best review of what car to buy is probably happening in a car enthusiast discord or slack maybe an e-mail newsletter.

I thought about this and I disagree. Indexing Discord/Slack (even if it were possible) would probably just add a lot more noise. Yes, there might be an occasional gem in a Discord channel discussion, but the vast majority of the time it would be low-quality content/chatter. There is another problem - how do you rank it? If someone says on Discord that the earth is flat, how do you know whether they are right or wrong? What signal can you use - emoji like count? How do you rank this result vs a Wikipedia result?

It is still more likely that the best review of a car will come from a specialized car review site (that is not ad-ridden).


I don't think you'd search discord for "is the earth flat". Existing search engines have that as a solved problem, fact based queries are low hanging fruit for them.

You'd search it for opinion content, and for that I think it'd be up to you to discern the quality, the same way it works when you stumble on a Reddit post with relevant content that, because it's a niche question, doesn't have a flood of upvotes to legitimize it. That's how a lot of my Reddit-like searches go: I've found perfectly valid answers to my questions in Reddit posts with fewer than 5 upvotes, and something like 60% of the time I wind up on a Reddit post. So to bring it back to Discord, maybe positive emojis are a signal in that case.

In the case of cars to continue my example, If I see a discord message in a search that is like "I own a 2021 Toyota Corolla Hybrid and I have x y z concerns with it." it's an opinion you can choose to value based on ownership.

But I think asking a search engine, even a search engine like google, to make a value judgement on an opinion is asking a lot and not something they really should do. A user has to have some culpability in the information they consume.

It'd be pretty trivial to negatively downrank though after a user interacts with a result and finds it lacking.
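
As a rough illustration of how simple that could be, here is a hypothetical Python sketch. The class name, the multiplicative penalty, and its default value are all assumptions made for the example, not how any real engine does it.

```python
class FeedbackRanker:
    """Re-rank scored results using 'this was lacking' feedback from a user."""

    def __init__(self, penalty: float = 0.2):
        # Fraction of a result's score removed per negative report (assumed value).
        self.penalty = penalty
        self.reports: dict[str, int] = {}

    def mark_lacking(self, url: str) -> None:
        """Record that the user opened this result and found it lacking."""
        self.reports[url] = self.reports.get(url, 0) + 1

    def rerank(self, scored: list[tuple[str, float]]) -> list[tuple[str, float]]:
        """Scale each base score down once per report, then sort best-first."""
        adjusted = [
            (url, score * (1 - self.penalty) ** self.reports.get(url, 0))
            for url, score in scored
        ]
        return sorted(adjusted, key=lambda pair: pair[1], reverse=True)
```

Each report shrinks a result's score by a fixed fraction, so a result a user keeps rejecting sinks without ever being removed outright.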

There are proxies for quality of opinion, but I think even now with Google it's largely up to us as users to discern that, especially on places like Reddit, which Google ranks highly because it's Reddit but which doesn't qualify anything beyond that except in the most crowded of subreddits.


Ok, we had a different view of searching. I think you propose 'vertical' search as a search engine feature where you could choose to search Discord or some other index. I was thinking in the context of general web search results, where ranking Discord results among others would be hard.


Yeah I think that like opinion or review type content is probably a big enough area of information that google fails to really serve these days that going after it is a smart idea both because of how much better it can be for users but also because I think it's a large and potentially profitable niche.

I think general fact based search is super solved and competing there is a waste.

Edit: And when I say general fact based I mean like "Who is the president of the US right now?"

For individual areas of knowledge (programming being an obvious one for me) there's still improvement to be had for sure.


> Indexing Discord/Slack (even if it were possible) would probably just add a lot more noise. Yes, there might be an occasional gem in a Discord channel discussion, but the vast majority of the time it would be low-quality content/chatter. There is another problem - how do you rank it?

This is where a savvy entrepreneur would throw in "AI-augmented search technology" to really get VCs salivating.


I agree with your last point. How would a search engine crawl closed spaces like discord or subscription newsletters? I find most of the reviews generated by bloggers, journalists, or influencers less valuable than an aggregate review on an HN post. I can read through a hundred comments before making a judgement.


Well remember, Discord isn't entirely closed. The benefit of these communities wanting to grow is that, for the most part, it's as simple as finding the link to the Discord and joining it to get your information.

I have a lot of thoughts on how to capitalize on that and on how to go about indexing that type of information, especially after the latest hiQ ruling and after consulting with a lawyer who specializes in this type of thing.

Happy to talk more over e-mail if that piques your interest. One of the e-mails I check is in my profile.


> I think the next real killer search application is going to figure out how to index and make that content searchable not an alternative take on searching web pages.

IMO Google got lucky that they were able to profit from this kind of valuable information being out in the open in the early days, before content producers wised up to it. I think the gates are increasingly closing shut, and TBH I hope it stays that way. No one should have the right to just take this kind of value without giving back equally. Ostensibly Google has been giving back by being an excellent search engine but this seems to be on the decline.



