An Interview with DuckDuckGo's founder, Gabriel Weinberg

dkhenry · on Aug 23, 2012

I like DDG and I even converted Chrome over to use it as its primary engine. Recently, however, I am noticing more and more spam links and fewer and fewer links that really matter. I think I am coming to the conclusion that DDG did this in the wrong order. They are developing a great frontend and using others' backends, but to really beat Google or MS or Yahoo you need to develop a great backend that can filter results well and troll the web efficiently; then you can put an awesome frontend on it.

I still think the real silver bullet will be to make a backend system that can peer with other systems to gather data and can be customized to get deep results from a small subset of the web that interests a particular user (think of a corporation using it for internal search and to make its site show up better on a common front end). I even started a research project to begin working with some of the things needed to accomplish it[1].

1. https://github.com/dkhenry/SimpleMapReduce

Ralith · on Aug 23, 2012

> you need to develop a great backend that can filter results well and troll the web

Trawl. The web gets trolled enough as it is.

Steko · on Aug 23, 2012

Not to hijack further but 'troll' is technically fine there.

troll ... 4.(intransitive, fishing, by extension) To fish using a line and bait or lures trailed behind a boat similarly to trawling; to lure fish with bait. [from circa 1600]

http://en.wiktionary.org/wiki/troll#Verb

nodata · on Aug 24, 2012

It's not fine because "troll" would be recognised here as the standard meaning, not the meaning from 1600.

culturestate · on Aug 24, 2012

The fishing definition is still very much in use. See e.g. http://outdoorsportinggoods.poorfish.com/search?w=trolling+m...

boyter · on Aug 24, 2012

I would imagine there would be a few problems with that which might be insurmountable.

The first being speed. People complain about DDG's speed already. Relying on a host of external searches would only cause more issues. Especially if as Google reports that a huge amount of queries are unique.

Another would be ranking. Who determines what is the most relevant result to a query? When you have multiple sources, which one to go to has to be determined somewhere. If you hand it over to the peers, they can game the system by insisting they are the most relevant. If you leave it to the server, what incentive do the peers have to participate? If you use an open algorithm, what's to stop people from gaming the system?

I don't think there is an easy answer to the search game. You either need to build on others' systems to have something compelling, or have millions (if not billions) in cash to have enough runway to build your own system, improve it to the point it's worth using, and then turn a profit.

I don't even think it's the complexity of the problem that stops the second. It's bandwidth and disk storage. My personal prediction is that once we get disks up to a size where you can store a sizable chunk of the web, coupled with enough bandwidth to crawl it in a reasonable time, you will see more innovation in the search space as the barrier to entry will be lowered.

mmahemoff · on Aug 24, 2012

"Especially if as Google reports that a huge amount of queries are unique."

Google reports 20-25% of queries are unique, way more than I think most of us would expect. http://www.readwriteweb.com/archives/udi_manber_search_is_a_...

We can probably assume it's not evenly distributed across people. Some people would probably always get cached queries, while others would need fresh results at least half the time.
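To make the uneven-distribution point concrete, here is a minimal Python sketch, not anything Google or DDG actually runs. It assumes a hypothetical query log of (user, query) pairs and a single shared cache keyed on the normalized query text; the function name and the log format are invented for illustration.

```python
from collections import Counter


def per_user_miss_rate(log):
    """Return {user: fraction of that user's queries that miss a shared cache}.

    `log` is an iterable of (user, query) pairs in arrival order. A query
    "misses" if nobody has issued the same normalized text earlier in the log.
    This is an illustrative model only, not a real engine's caching policy.
    """
    seen = set()        # normalized queries already answered once
    totals = Counter()  # queries issued per user
    misses = Counter()  # cache misses per user
    for user, query in log:
        key = query.strip().lower()
        totals[user] += 1
        if key not in seen:
            misses[user] += 1
            seen.add(key)
    return {user: misses[user] / totals[user] for user in totals}


# Hypothetical log: user "a" repeats a common query, user "b" asks rare ones.
example_log = [
    ("a", "weather"),
    ("b", "weather"),
    ("b", "rare perl bug"),
    ("a", "weather"),
    ("b", "another rare one"),
]
rates = per_user_miss_rate(example_log)
```

In this toy log, user "a" misses the cache on half of their queries while user "b" misses on two out of three, which is the kind of per-person skew described above.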

boyter · on Aug 24, 2012

That's probably true. I imagine it would also be true to say the people using a new search engine would be the ones throwing unique queries at it most of the time too.

With that in mind, and with the tech world generally driving adoption of players in the search space I can't see the approach working. Nobody would switch when the queries are massively slower, even if they were 99% accurate.

vaksel · on Aug 23, 2012

yeah you'd think with the VC investment they'd be going for that

isaacwaller · on Aug 23, 2012

I have never liked DuckDuckGo after seeing their advertisement tactics, especially the "educational" http://donttrack.us/ minisite. Apparently your anonymous HTTP referer info (from Google Analytics?) will be sold to insurance companies / appear in a background check? It is blatant fear mongering and really made me angry.

stephengillie · on Aug 23, 2012

It all goes into the huge (secret by law) national insurance database, which is accessible by all insurers. All information gained by an insurance company about an individual is put in there, and insurers are prohibited from discussing it.

Jayschwa · on Aug 24, 2012

Citation(s)?

toomuchcoffee · on Sept 4, 2012

http://www.techspot.com/news/49975-how-ants-have-used-the-in...
If you look at http://donttrack.us/, it says your data "can potentially end up" in those places, not "will be sold to..."

MikeCapone · on Aug 23, 2012

I've been using DDG as my default browser for a while (though I sometimes revert to !g to search google), and I'm satisfied with the experience.

The main thing that would really improve my experience would be if it was faster. Google really spoiled me with the instant results and suggestions as you type.

I'm also looking forward to them switching to SPDY as I always use the encrypted version, and this should make it a bit more responsive.

JD557 · on Aug 24, 2012

One of the main reasons I changed to DDG was because I was having some problems with Google:

- Sometimes the instant search simply doesn't work until I press enter (and sometimes, even when I do, the results just don't appear)
- I noticed that Google doesn't take you directly to the page. It takes you first to an intermediate page (probably for tracking purposes) and then redirects you to your result. I would be fine with that, but sometimes my browser just got stuck in the intermediate page (no idea why).

I sometimes need to use !g too and I agree that Google sometimes is faster than DDG (at least when it works). What I really miss from Google is the autocomplete.

MikeCapone · on Aug 24, 2012

Agreed. Autocomplete would be awesome. I think it's pretty much expected from any search engine nowadays. I like how Chrome and iCab will sometimes use google's autocomplete even if I'm doing a DDG search, but I wish it was also available straight from the site's search field.

Steko · on Aug 23, 2012

Google has put so much effort into speed over the last 10 years that they can't perceptibly improve much in this department going forward. So DDG's target is relatively stationary and closing the gap may be expensive but is at least straightforward.

mmahemoff · on Aug 24, 2012

Well, it's only stationary if Google's features are stationary too. e.g. when they introduced live search, that's a new performance target any competitor has to reach.

MikeCapone · on Aug 24, 2012

That's a good point, though I'm afraid it might be one of those things where diminishing returns mean that it becomes very very expensive to catch up, and I don't know if DDG has the resources. Moore's Law should help, though, but personally I haven't really felt (subjectively) like DDG has become much faster since I've been using it.

Steko · on Aug 24, 2012

The scale of capital investment needed is really only feasible in the context of an acquisition or maybe a huge DDG boom stemming from a massive Google privacy scandal.

MikeCapone · on Aug 24, 2012

Probably. But just adding autocomplete and improving speed some (doesn't need to be quite google-equivalent) would go a long way towards making the experience subjectively better.

Another thing that annoys me is that they have ads near the top of the results that don't load quite at the same time as the results, so I sometimes am about to click on the top result but an ad pops up and pushes everything down making me miss my click. If they could somehow avoid that happening, it would also make the experience better.

brokenparser · on Aug 24, 2012

Error: DDG is not a browser.

MikeCapone · on Aug 24, 2012

I meant 'search engine', that was a weird typo. Sometimes my fingers don't listen to my brain.

JohnsonB · on Aug 23, 2012

DuckDuckGo has really fallen behind Google in the zero click information department, which is one of DDG's key selling points. DDG needs to have one or more major competitive advantages over Google in order to grow. Privacy is great, but zero click is something obvious and usability orientated that really stands out. I'm not going to use DDG if Google gives me more zero click information no matter what; it just makes too significant a difference in search experience.

boyter · on Aug 23, 2012

What about things like,

  https://duckduckgo.com/?q=frequency+of+letters+in+The+quick+brown+fox+jumps+over+the+lazy+dog
  https://duckduckgo.com/?q=days+between+6%2F22%2F1979+and+10%2F5%2F1979
  https://duckduckgo.com/?q=hn+duckduckgo+interview
  https://duckduckgo.com/?q=php+xml_parser_create+example
  https://duckduckgo.com/?q=currently+in+theaters
  https://duckduckgo.com/?q=msft
  https://duckduckgo.com/?q=currency+in+panama

The only one I think Google does better is "currency in panama" however it also gets the information wrong in the "zero click" answer. The only reason I like that result more is the Wikipedia answer on the right is just more appealing to my eye.

comex · on Aug 24, 2012

Careful... for "hn duckduckgo interview", at the moment, DuckDuckGo shows a HNSearch widget with three (stale) results, none of which is this page; no relevant results appear in the main listing. Google has no special widget, but this page comes up first in the main listing.

To digress a bit, zero-click is great when the information you want is actually accessible with zero clicks, but it's very, very limited: as soon as you need to click through to a website, special widgets can't compete with a solid backend for regular search results. That's why I can hardly imagine switching to DuckDuckGo...

moollaza · on Aug 24, 2012

Hey, I'm the intern who implemented the HNSearch Zero Click plugin, so let me explain the poor result:

An HNSearch (HNSearch.com) for "interview duckduckgo" doesn't return this thread at all, in any of the results (even on a 'stories' only search). The one thing it does surface, however, is your comment, because that exact phrase was found in it. I realized we weren't showing the right comments, and I found a very small bug which has now been fixed. So thanks for getting me to notice that :)

However, if you search "interview duckduckgo's" (ie. the same wording in the thread's title) the ONLY result returned is this exact thread (an HNSearch limitation).

These same results are fed to us by the HNSearch API and so the fault lies within HNSearch's search methodology (as far as I can tell). We're still looking for a resolution to this, however, any suggestions are welcome!

boyter · on Aug 24, 2012

Fair comment. I was hoping the hacker news search would be a little more accurate. I would argue that it is what failed in this case but it does reflect poorly on DDG...

Interestingly, both Google and DDG have the linked article as result number one for the term "duckduckgo interview" but no Zero click for either.

That's an interesting comment about the zero click. I find myself using zero click info more these days. I tend to craft queries which I know will pull this information back for me. It's much the same way that Google trained me to use their syntax or how the instant search trained me. I agree that a solid index of the web is critical for regular search results though.

I would be curious to know if the apparent issues with Bing/DDG are more down to people perceiving Google to know the answer. I know that when I use alternative search engines I get weird looks from people at work who say, quote, "Why are you using that? It's crap! Just use Google."

AtTheLast · on Aug 23, 2012

I like what I've seen from DDG so far. I don't expect it to be Google, but it seems like the search engine keeps getting better with each iteration.

With the launch of DuckDuckHack it will be interesting to see what people build for the platform. Plus, I'm just excited to see a talented team take on search.

whichdan · on Aug 23, 2012

I used DDG for a few weeks, but ultimately switched back to Google. It was mainly for two things:

1) It's slightly slower than Google, which became more apparent after the fifth or sixth search of the day.

2) No images integrated with search results. I didn't realize how often I searched for images until I used DDG. At Google I usually get a few images and an "Images" link to click. At DDG I needed to add a !gi to my query.

Like JohnsonB said, there really needs to be another draw besides just privacy.

panacea · on Aug 23, 2012

I've found I add !gi to my search string subconsciously nowadays, when I know I want to perform an image search, whereas before I would enter my search terms, wait until google had loaded its results and then hunt for the images link to get the full image search results.

In some way it's 'trained' me to perform more targeted searches from the outset.

The slight lag before getting results is noticeable though.

I'm going to stick with the upstart for the moment nevertheless.
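For anyone who hasn't used the !g / !gi shortcuts, here is a rough Python sketch of how that kind of prefix routing could work. This is a guess at the general idea, not DDG's actual code: the `BANGS` table, the `route` function, and the URL templates are all assumptions chosen for illustration.

```python
from urllib.parse import quote_plus

# Hypothetical table mapping a shortcut name to a search URL template.
BANGS = {
    "g": "https://www.google.com/search?q={}",
    "gi": "https://www.google.com/search?tbm=isch&q={}",
}
# Fallback when no known shortcut is present in the query.
DEFAULT = "https://duckduckgo.com/?q={}"


def route(query):
    """Return the URL a query would be sent to.

    If any word is a known shortcut such as "!gi", strip it out and send
    the remaining words to that shortcut's target; otherwise use DEFAULT.
    """
    words = query.split()
    for i, word in enumerate(words):
        if word.startswith("!") and word[1:] in BANGS:
            rest = " ".join(words[:i] + words[i + 1:])
            return BANGS[word[1:]].format(quote_plus(rest))
    return DEFAULT.format(quote_plus(query))


image_url = route("!gi red panda")
plain_url = route("hn duckduckgo interview")
```

The point is that the decision about where the search goes is made from the query string itself, before any results page loads, which is why typing the shortcut up front saves the "load results, then hunt for the images link" step.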

raghus · on Aug 23, 2012

The article mentions $115K in revenue last year. How is that close to self-sustainable for a 10-15 person company?

dkhenry · on Aug 23, 2012

They average 500% growth. If revenue also scales like that then they are well on their way to profitability.

603techguy · on Aug 24, 2012

That doesn't explain sustainability now.

Mythbusters · on Aug 23, 2012

This is one product I hope people really get behind. Good interface and smart aggregation. Keep it up guys!

mark_l_watson · on Aug 24, 2012

I purposefully use duckduckgo several times a year as my default search engine for several weeks because I like to support alternatives to mega corps.

Google meets my needs better and their recent power search class was very cool.

A little off topic but I have had more than a few fantasies about starting my own micro search engine. Text analytics and knowledge management in general have been an interest of mine since the early 1980s. What stops me is that if I were to invest my personal resources in this I would want tens of thousands of users getting value from my system every day, and frankly, I don't think I could achieve that. Gabriel gets an order of magnitude more than what I would hope for, so I hope that he is very satisfied with what he has achieved. Good job!

petdance · on Aug 24, 2012

Sad that there's no mention of the huge role Perl + CPAN play in their infrastructure:

* http://help.duckduckgo.com/customer/portal/articles/216392-a...
* https://github.com/duckduckgo/duckduckgo/wiki/DuckDuckGoPerl
* http://www.perlmonks.org/?node_id=848999

They've also been visible in sponsoring the Perl community.

pacomerh · on Aug 24, 2012

Also, don't miss Gabriel's interview on Techzing http://techzinglive.com/page/423/techzing-68-gabriel-weinber...