Google Squared is Live (google.com)
95 points by zeedotme on June 3, 2009 | hide | past | favorite | 63 comments



"Religion" is entertaining. Christianity is written by CS Lewis and published by the Oxford University Press, which in fact publishes most religions. The canonical image of Islam is the guy holding the sign "behead those who insult islam". Slavic mythology (the 27th result) is located in the country of Afghanistan. The distinctive image of Humanism is a bus advertisement (I think) saying "Why believe in a god? Just be good for goodness' sake."

Adding the column "adherents" reveals that there are 4,400 Jews in the world to Islam's 300, to Christianity's 1.9 billion. Adding the column "Leader" reveals that the leader of Christianity is Chris Argabright, complete with a phone number I won't reproduce here.

Actually, I have to admit this isn't too bad for a computer algorithm; it's just that the better the algorithm gets, the more entertaining the wrongness gets.

Ah, and poking "science" in gives you a phone number for each science. Nice.

Further edit: Poke in "X-Men Origins" as a column for almost any query. It's like magic. A coworker tried something like the following and discovered the column showing up, then we couldn't resist poking it into many other queries: http://www.google.com/squared/search?q=square&items=tria...

Further, further edit: Add "Rating" (not "ratings") to religion! It's so great that we have Google to sort through the difficult problem of rating religions for us.


Adding "Star Trek" as an additional column is pretty enlightening as well. Apparently Islam has a Star Trek value of "Set Phasers on Awesome!"


It sounds like this will be fun^H^H^Heasy to googlebomb...


Nice. I tried four scenarios. Apparently, one can train the engine, as I discovered in the 4th scenario. There, it said that it couldn't automatically build a square about the topic and asked me to enter up to 5 examples. I entered Alan Turing, Alan Perlis, John McCarthy, Donald E. Knuth, C.A.R. Hoare. I was delighted to see that Google Squared built the square and added the names of Norbert Wiener and Claude Shannon to it.

This is a good application of machine learning.

  Scenario 1: "renaissance artists" florence
    Squared: http://www.google.com/squared/search?q=%22renaissance+artists%22+florence
    Web Search: http://www.google.com/#hl=en&q=%22renaissance+artists%22+florence&btnG=Google+Search&aq=f&oq=%22renaissance+artists%22+florence&aqi=&fp=1mZ_-PL2Zjc

  Scenario 2: "open source" "cryptographically strong" "random number generators"
    Squared: http://www.google.com/squared/search?q=%22open+source%22+%22cryptographically+strong%22+%22random+number+generators%22
    Web search: http://www.google.com/#hl=en&q=%22open+source%22+%22cryptographically+strong%22+%22random+number+generators%22&btnG=Google+Search&aq=f&oq=%22open+source%22+%22cryptographically+strong%22+%22random+number+generators%22&aqi=&fp=1mZ_-PL2Zjc

  Scenario 3: "string theory" problems
    Squared: http://www.google.com/squared/search?q=%22string+theory%22+problems
    Web search: http://www.google.com/#hl=en&q=%22string+theory%22+problems&aq=&oq=&aqi=&fp=1mZ_-PL2Zjc

  Scenario 4: "mathematicians" "computer scientists"
    Squared: http://www.google.com/squared/search?q=mathematicians+%22computer+scientists%22
    Web Search: http://www.google.com/#hl=en&q=mathematicians+%22computer+scientists%22&aq=f&oq=%22open+source%22+%22cryptographically+strong%22+%22random+number+generators%22&


Google Sets (http://labs.google.com/sets) was one of the earlier things available in Google Labs, back when it was just a dumping ground for weird stuff (as opposed to one for non-mainstream features). I'd guess this is where the technology used to generate the square from the samples comes from.

I also wouldn't call it training, for what it's worth. I punched in the same query, and was asked (like you were) to provide some samples. Ideally, shouldn't it recognize that it has failed on this query before, and use the samples you suggested?


Would you trust a single input for immediate future use?

If so, Google Squared has some Viagra to sell you...

In other words: a single user can never be trusted when Google knows nothing else about them. If it did trust them, the spammers would be out in force as soon as the product got any traction at all.


Yeah, but as with the "wiki search", they can trust that you know what you want to see.


Aha, thanks for the recollection about Google Sets. Your guess looks plausible.

I agree with vidarh's point about spam potential, re: reusing previous samples.


Great, I tried the same query. The description says Alan Turing was born in Orissa, India, but the Place of Birth field says London. I think such inconsistencies might limit the use of Google Squared for serious research.


I used it to compare televisions (I'm thinking about buying one) and it was surprisingly useful. It definitely reduces a lot of the noise around doing side-by-side comparisons of multiple products. The data was VERY complete, I am impressed.



Oh sorry, I re-read the article. It says Alan Turing was conceived, and not born, in Orissa, India. Apparently, Google is better than humans at interpreting text :)

PS: I am not sure the place where Alan Turing was conceived is apt information. Why would anyone want to know that?


Once you've wrapped your mind around how to use this, I can see it being really useful - perhaps even more useful than Wolfram Alpha.

It still gives you lots of garbage results at the moment (try a search for religion - it's hard to get relevant data when you add columns or religions), but the ability to do a structured search on a whole array of similar items at once? That's cool! With countries, although you get some obscure nations to begin with, you can add rows and columns (like GDP, population, area, etc...) very easily, and get very useful comparisons.


I like that they show you the source pages, confidence, and alternative sources for the data.

However, it suffers the classic Google app problem; it does interesting stuff, but just looks ugly, and is generally unappealing. Hard for me to explain why, but despite my interest in the technology I just didn't enjoy using it at all.

But it is interesting stuff. I'd been wondering when something like this would emerge since watching this Norvig talk from 2007 [http://www.youtube.com/watch?v=nU8DcBF-qo4].


Is it bad that I just don't get this?


Google is returning structured data.

Google's specialty has been navigating the seas of unstructured data that make up the Web and making sense of it. Squared shows that they understand enough about this data to (semi-) automatically return it in a structured form.

Structured data on this scale is considered a bit of a holy grail, especially from the POV of Semantic Web advocates. The cornerstone of the original Semantic Web vision was to have web developers manually annotate semantics into their HTML sites. Now they don't have to.

The real value of Squared is the ability to query this data via an API. Arguably, Freebase already provides this capability, so the value of Squared is questionable, but presumably, Google is more automated and up-to-date or at a larger scale. I'm not sure -- maybe somebody better informed than I am can comment.


Nice, thanks for the very useful dissemination.


I think it's one of Google's answers to Wolfram Alpha


I don't see the similarity at all. Wolfram Alpha gives you detailed information about one thing. Google Squared gives you general info on things in a category.


This fits right in with Wolfram|Alpha. In order to produce all the amazing comparisons in Wolfram|Alpha, they had to turn databases and the public web into structured data sets. What would make this much more than a novelty is if the collected data sets were made public or licensed for academic use. I'm hesitant to say commercial use because a) they can already do that and probably will, and b) it's the public assisting with collecting the data. One problem with Wolfram Alpha that user-submitted structured data will not solve is the ability to cite sources (or provide some reason to trust the sources). Wolfram Alpha has created a bunch of its own datasets, and academically you just have to "trust them".


And Yahoo YQL


There is no relationship at all. YQL is just a way to query disparate Yahoo data sources with a relational-like language.


I didn't either. Maybe I still don't.

I think it takes a search and is smart enough to give you comparisons between results. Look at their examples to get a feel for how it works.

I could be wrong, someone let me know if I'm completely off base.


Using heuristics, it attempts to form structure from unstructured data. It's a pretty good first attempt.


I did check out the example searches. First one I clicked on - Hawaiian islands. http://www.google.com/squared/search?q=Hawaiian%20islands...

What does "color" refer to here? The island's flower? What colour is melemele?

I should point out, I get what it does technically, but I'm just struggling to think of a repeatable user scenario where this would be useful...

I also noticed a couple of weird things:

1) They're not URL encoding their example searches. Tsk.

2) Searching for "google squared" on regular Google brings up Techcrunch, and the Google Squared site is halfway down the list.
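
For reference, the kind of encoding point 1 is asking for can be sketched with Python's standard library. The base URL is the one from the example links in this thread; the helper name `squared_url` is made up for illustration, and nothing here claims to be how Google generates its own links.

```python
from urllib.parse import urlencode

# Base URL taken from the example links in this thread.
SQUARED_BASE = "http://www.google.com/squared/search"

def squared_url(query: str) -> str:
    """Hypothetical helper: build a Squared search URL with an encoded query."""
    # urlencode percent-encodes reserved characters and turns spaces into '+'.
    return SQUARED_BASE + "?" + urlencode({"q": query})

print(squared_url("Hawaiian islands"))
print(squared_url('"renaissance artists" florence'))
```

`urlencode` writes spaces as `+`; the `%20` form is an equally valid encoding for a query string.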


Try example searches for "roller coaster" and "US presidents" . I think those are better examples to try and get a feel for what's going on.



It works fine for products, but in that capacity it's just a slightly different layout for a Google Product search. (anyone else still think of that as "froogle"?)


This doesn't seem to be consistent in the way it is trying to help me; for example, a search for "tympanuchus cupido" (the genus and species name of the greater prairie chicken) returns only phasianidae, the family to which prairie chickens belong. Meanwhile, a search for just "tympanuchus" (the genus, which includes more than just prairie chickens) returns Attwater's Prairie Chicken National Wildlife Refuge, which seems only tangentially related. Terms with multiple meanings (which is precisely what structured search is supposed to help with) don't seem to recognize this multiplicity--consider a search for "peripatetic", which is both the name of a school of ancient Greek philosophy (which is recognized in the results), and a term meaning wandering or itinerant (which is not recognized, even when the search is changed to "peripatetic definition"). Even the most straightforwardly categorical queries don't seem to work that well ("dungeons and dragons classes" returns a list that is neither complete, nor correct). Speaking of neither completeness nor correctness, if you think my query choices were a little too farfetched, consider the query "search engines"--it returns none of bing, Duck Duck Go, and cuil in the first 50 results (although, somewhat entertainingly, Yahoo is the first result and Google is the second).

Is there something I'm missing that this engine actually does well?



I'm not sure how this is superior (or how anything along this line would ever make it superior) to getting the information via classical search...particularly since it is readily available, more complete, and more organized via wiki (http://en.wikipedia.org/wiki/ABBA_discography). Can you think of any advantage this has?


Now you can find any kind of lists in one place, with any columns you want. For example, add a column for length. Much easier than navigating to each album's Wikipedia page.


Stunningly bad. Are they desperate to show they're not standing still and you shouldn't start using other services or something?

First thing I did, I clicked on one of their examples "US Presidents" - which one would assume would probably produce the best they've got to offer right? The result I guess would've been kind of cool circa 1999, but now.. no:

Google^2: http://www.google.com/squared/search?q=US%20presidents Result: 7, seemingly random presidents, in no apparent order or with rhyme or reason as to why they were selected.. Washington - OK, Jefferson - OK, Obama - Sure, he's current. Rutherford B. Hayes... WTF?

Wolfram Alpha: http://www.wolframalpha.com/input/?i=us+presidents Result: Basic stats on Barack Obama, current president of the United States, a brief list of the past couple of predecessors and their effective start and end months in office, and an AJAX link to expand it to a complete list.

Wikipedia: http://en.wikipedia.org/wiki/Us_presidents Result: --> List of Presidents of the United States.. Awfully nice summary of what that means, notable highlights, and a color-coded complete table with pictures of the presidents, in order, including their full dates and vice presidents.

I don't foresee ever going back until I hear news they've dramatically improved, but based on my past recollection of Google's habits.. it'll sit and suck for a very long time with perhaps small incremental improvements.



In my square for programming languages, SNOBOL and COBOL both made the list (in that order, too -- I've never even heard of SNOBOL!), but none of the following appear: (in no particular order...)

C, C++, C#, Common Lisp, Java, Ruby, Smalltalk, x86

Heh, I listed those in alphabetical order without even planning to...


For me it's 1. Pascal, 2. Fortran, 3. Scheme. Guess the order is somehow scrambled, which would explain that Yahoo turns out number 1 for "search engines" :)


Funny how it partially confuses Pascal with the real thing, but still records its influences.


Try adding these columns: Paradigm, Typing Discipline, Designed By, Major Implementations

Impressive.


Not really -- those are all categories in the Wikipedia sidebar. The data is already structured for easy consumption.

Try something like a "Version" column and it starts to break down. It makes decent guesses overall, but they're not current (e.g. Python says 2.4, where 2.6.2 or 3.0.1 would be better).


My search for "freedom fighters" turned up Indians exclusively. I wasn't signed in. Strange. Are they not called "freedom fighters" in other cultures? It is a semantically dubious phrase.

http://www.google.com/squared/search?q=freedom+fighters


The Indian freedom fighters are unusual, in that most freedom fighters offer violent, armed revolt. Even Wikipedia singles them out in the second paragraph: http://en.wikipedia.org/wiki/Freedom_fighter


"Australian States" is probably the perfect query for this. result set should fit on a page, easily structured data, etc, etc..

Area worked for every state.

Population worked for some (NSW, QLD).

When I added "Capital" it worked for all of them.

I love the idea - still has a while to go


The results rarely make any sense. This kind of thing still needs to have some human intelligence mixed in; compare to http://www.noodlesquares.com.

Search engines

http://www.google.com/squared/search?q=search+engines

http://noodlesquares.com/SearchEngines.html

Cameras

http://www.google.com/squared/search?q=cameras

http://noodlesquares.com/Cameras.html

[disclaimer: associated with noodlesquares]


Ugh, I clicked your link. I want the 15 seconds you took from my life back.


Sorry, we're just getting going. Is it because it's slow or because the content sucks? We want to try a hybrid human/automated approach.


I liked it. I cannot enter queries there though, it's a pity since I really wanted to test that engine.


And it's really quite good: I got meaningful results on a wide range of topics of varying obscurity. A few outright failures, mostly on complex terms; eg 'california TV stations' gives me CA city information.

I sent my list of 12 suggestions to the labs team: highest priority were share/export to docs|base, and entering search terms or [sublist search] in individual fields. If they can make it social and create 'trusted tables' there's the possibility of having users curate a lot of data for them a la Wikipedia.


Well, it's good for a laugh.


"Google Squared couldn't automatically build a Square about Hacker News."

That's odd. It built a square around my personal name, the name of my personal website, and the name of my nonprofit, the last of which I would think would have to be less famous than Hacker News. Richness of semantic associations appears to be key here. My nonprofit's name has more words in it.


I added it as a row - and it could pick it up... (With the info from yesterday's hack post)


A few months ago I attended a talk by Alon Halevy (of Google) on the algorithms behind this. There are several papers with more details for those who are curious. Check his publications listed at http://alonhalevy.googlepages.com/, specifically those about WebTables and dataspaces.


"Porn stars" is resultful, but not in any order of notoriety. "Measurements" is, surprisingly, an available column, and "Sexual Orientation" worked, although not suggested until I started typing it. (I wouldn't recommend searching for this at work, obviously, although the Image column seems to be keeping things PG-13.)


Seems pretty useless right now as far as the actual data is concerned, but I like the interface and the idea.


Much like Wolfram Alpha. ;)


Interesting. Try starting with an empty square. Add a few items, and then add a few custom columns.


This reminded me of Google Sets, which has been in Labs forever: http://labs.google.com/sets

Sure enough, if you ask it to build a Square and it can't figure it out, it uses a Sets-like interface to ask you for five examples.
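
A toy sketch of that kind of expansion from examples, assuming a simple co-occurrence score over a hand-made collection of lists. Neither the data nor the scoring comes from Google; `LISTS` and `expand` are invented here purely to illustrate the idea of growing a set from a few seeds.

```python
from collections import Counter

# Made-up "lists found on the web"; purely illustrative data.
LISTS = [
    {"Alan Turing", "John McCarthy", "Claude Shannon", "Donald E. Knuth"},
    {"Alan Turing", "Norbert Wiener", "Claude Shannon"},
    {"Donald E. Knuth", "C.A.R. Hoare", "Alan Perlis", "Norbert Wiener"},
    {"Pascal", "Fortran", "Scheme"},
]

def expand(seeds, lists, top_n=2):
    """Rank non-seed items by the number of seeds they co-occur with, summed over lists."""
    seeds = set(seeds)
    scores = Counter()
    for items in lists:
        overlap = len(items & seeds)
        if overlap == 0:
            continue  # this list shares nothing with the seeds
        for item in items - seeds:
            scores[item] += overlap
    # Sort by score (descending), then by name so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:top_n]]

seeds = ["Alan Turing", "Alan Perlis", "John McCarthy", "Donald E. Knuth", "C.A.R. Hoare"]
print(expand(seeds, LISTS))
```

With five seeds the unrelated programming-language list contributes nothing, which is roughly why asking for several examples helps narrow down the category.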



It took me a while to figure out what I wasn't seeing on these results pages.

SEO spam.

So I searched for "SEO spam" and it asked me to build a Square for it. No thanks >:(


Searching for "java xml binding frameworks" says that XMLBeans is required. I guess the other frameworks just aren't worth using. :-D


No damn VIM in "text editors", infuriating.


Well, emacs costs $39.95, and XEmacs costs $74.99, so maybe that's just as well.


'Squirrel' returns a good search square, but no red or grey squirrels.




