Google's real monopoly is on the user data (propellernet.co.uk)
49 points by illdave 5 months ago | 36 comments



Story time: many years ago, user testing of search ranking worked with a process called "side by sides". This was where a group of people would compare the search results in production with those from an experiment and say which they preferred. The goal of Google is to make the result the user wants #1 on that list. So you would have known searches and you could compare the old result and the new.

This was actually labor-intensive at Google scale and thus expensive. It also had problems. For example, certain results might be time-sensitive. Searching for "Simone Biles" now should have her Olympic results near the top of the list. Run that same query in 2 years and the desired result will be different.

Also, user behaviour changes over time. If you had user search data from 2000, how would you deal with Reddit search results now? Or the SEO content farms that came later?

You need ongoing user behaviour data to continually refine search results. It's a constant arms race.

So along comes Chrome. Originally, then-CEO Eric Schmidt didn't really see why Google should get into the browser business. This was in the 2000s. We were still in the grip of IE dominance, although it was waning, and Firefox was floundering. In the 2000s browser support was a much bigger problem than it is today. It's why we got things like jQuery.

But Chrome went ahead and even from Chrome 2 or 3 it was so much better than anything else. One big innovation was a process-per-tab. FF was known for freezing the entire browser because each tab was a thread.

But why did Google invest so much in this? Search results.

Chrome gave Google insights into how users interacted with search. Side by sides were no longer necessary because Google had direct insight into how users interacted with search results: whether they clicked on a link and immediately left (i.e. bouncing; this absolutely hurts how Google ranks your site, and Google uses it to downrank SEO content farm sites) and which link gave the user the result they wanted.
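To make the bounce idea concrete, here is a minimal Python sketch. It is not Google's actual method or data model: the `Click` record, its field names, the `BOUNCE_SECONDS` threshold, and the `bounce_rates` helper are all hypothetical, chosen only to show how "clicked a link and immediately came back" could be turned into a per-URL signal.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional

# Hypothetical threshold: a return to the results page sooner than this
# counts as a bounce. The real cutoff, if one exists, is not public.
BOUNCE_SECONDS = 10.0


@dataclass
class Click:
    """One click on a search result (hypothetical log record)."""
    query: str
    url: str
    clicked_at: float             # seconds since the session started
    returned_at: Optional[float]  # when the user came back to the results, or None


def is_bounce(click: Click) -> bool:
    """True if the user returned to the results page almost immediately."""
    if click.returned_at is None:
        return False  # never came back, so the result presumably satisfied them
    return click.returned_at - click.clicked_at < BOUNCE_SECONDS


def bounce_rates(clicks: Iterable[Click]) -> Dict[str, float]:
    """Fraction of clicks on each URL that were bounces."""
    totals: Dict[str, int] = {}
    bounces: Dict[str, int] = {}
    for click in clicks:
        totals[click.url] = totals.get(click.url, 0) + 1
        if is_bounce(click):
            bounces[click.url] = bounces.get(click.url, 0) + 1
    return {url: bounces.get(url, 0) / totals[url] for url in totals}
```

A ranking experiment could then compare these rates between the old and new result orderings instead of asking human raters, which is roughly the substitution described above.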

This accumulated knowledge and insight from the user's browser is something that no one other than possibly Microsoft could theoretically compete with.

Why did I tell this story? Because this article talks about Google's search dominance and doesn't mention Chrome. You cannot talk about one without the other. If you don't mention Chrome, I really question your knowledge on the subject.

Disclaimer: Xoogler


Good points. All the more reason to avoid Chrome! I do not consent to my browser behaviour being used to train ad serving models and other "AI" cases.

Glad I use Firefox. It works, and while not perfect, is considerably more respectful of my privacy and less misaligned with my interest.


Makes complete sense. Vending the data (i.e. search results) and having direct insights into the consumption of those results (i.e. Chrome browser behavior) is a game changer. I imagine this is also part of the value of Google Analytics.


> Chrome gave Google insights into how users interacted with search.

How did this work? Google would know what result was clicked and if another result was clicked shortly thereafter in any browser.


> But the real question is, given that Google has around 90% of the search market share, and so has around 90% of the available user data, is the advantage it gives them so unfair that it should be illegal?

In the U.S. it’s not illegal to be a monopoly; what is illegal is using your monopoly position to prevent competitors from entering the market. (This is why it’s not illegal for a small town to have a single gas station, for example.) If 90% of users freely choose to search with Google, that’s fine from a strictly legal point of view. The U.S.’s case was that Google was coercing that choice by bidding extremely high amounts to be the default search on web browsers, thus hindering competitors from entering the search market.


They also hurt Firefox a lot by promoting Chrome on every Google search query.


They hurt Firefox by giving them money. If not for the big fat Google paycheck, Firefox would have had to be a good browser instead of the piggy bank for the personal projects of its former CEO.


There aren’t a ton of medium-sized successful browsers out there that didn’t take Google’s money. I think funding web browser development is just hard.

IMO the real anti-competitive behavior was iterating so fast on web standards. Chrome seems to be the vanguard on a lot of new web features, and has started up this whole silly JavaScript performance as a figure-of-merit competition.

The fact that Firefox has to pay dozens of full time engineering salaries or whatever just to tread water is the real problem IMO. Because not only can they not do it well, but most groups can’t do it at all.

As a result, the web as a standards-based platform for sharing documents is basically dying, being killed more effectively than Microsoft ever could have managed. And without the obvious appearance of being evil.


That's hardly the only possible outcome of the scenario; we might not have it at all.


They also hurt Firefox by literally sabotaging them: https://archive.is/tgIH9

And by promoting Chrome on every Google property, from YouTube to Search to Google Docs.


The other way a monopoly is illegal is if you use it to try to acquire a monopoly on some other market. (Microsoft using the OS monopoly to try to get a monopoly on browsers, for example.)


And they own YouTube...

That will be a massive benefit when it comes to training AI-powered robots, I think. Conceivably, an AI that's seen every plumbing-related video on YouTube would be superior to any human plumber you can imagine, assuming a good mapping between the human hand movements in the video and its own robotic arms. And so on for other domains.


Eventually we'll develop an AI that, if shown every plumbing video, would be better than any human.

Right now there are a few million Tesla cars on the road. In principle all of them could gather training data, adding up to tens of billions of miles of real world experience each year, compared to the low single-digit millions of miles that a professional driver would see in their lifetime… and despite that, FSD has yet to earn the "F".

Cracking the data requirement for training AI is high value research. I don't think anyone has solved it yet, but this field being what it is, it may have happened months ago behind closed doors, or it might be something that's just been added to the HN feed while I write this, or we may still be saying the same things in 2034.


It's a scary thought to me that 90% of online videos are hosted on YouTube. I know Google won't fail, but if it ever did, or if YouTube went down for any reason, I think the internet would be truly broken.


In a sense, I am slowly preparing for a variant of that event now by using youtube-dl to archive the items I want to keep.

I can't honestly say YouTube disappearing won't have an impact, but it will not be the end of the world even if I don't get to archive all the items I want. Something will replace it. And if it doesn't, that is ok too.


For me personally it wouldn't matter too much if YouTube were gone. I think I go several days or even weeks without using YouTube at all. But I agree that for a lot of people the entertainment factor (and for some the educational one) would be hugely broken if YouTube went offline.


Personally I would be fine too. But there is a lot of information which is not written down anywhere online, but exists on YouTube. This is anything from conference talks, to people showing how to use or fix some equipment/machinery. And it seems that informational content is getting more and more video centric. So there is a long tail of valuable information there, I believe.


As an avid YouTube junkie, I mainline it right to the jugular several times a day, and if it went away maybe I could finally find the time to tackle my ever-growing todo list.


Go and join the Distributed YouTube Archive project


For a little bit of time. Vimeo, etc. offer readily available alternatives, there are some open source initiatives to improve availability, and frankly VOD is a commoditized service these days. So YouTube going away would impact a mostly-not-connecting ad channel, and savvy creators that preserved their videos and farmed emails will survive without too much hiccup.


Imagine if Chrome had a bug that required users to reinstall it.

Imagine the falloff of internet traffic.


I mean, in a sense, the entire internet is likely ephemeral.


Is owning YouTube a big advantage? I assumed that nearly every public video has already been downloaded and is being used to train non-Google AI. Google definitely has some extra data (viewing behavior, clickstream, likes, etc.) that others don't, but that doesn't seem critical to AI training.


Early in this article, there's this sentence: Google’s results are really good.

Reminds me of the comedian who, when asked, "How's your wife," would say, "Compared to what?"


I can't get behind the article's conclusions.

1) "Google’s results are really good." Not quite. Google's results used to be really good but are now a mix of helpful to some, less helpful, and fully counterproductive.

2) "and they’re really good because they have a monopoly on the user data." User data may elevate some products, but the worth of search results stems from the crawled index and search algorithms.

2a) "every engineer could leave and start a new search engine with the exact same source code and that search engine would be worse." Only until their crawlers built a sufficient index. If they can filter out results gamed for Google SEO, they'll be better.

3) "In a landmark case, a US judge has ruled that Google’s monopoly on search is unlawful." Trim the first 3 letters of the last word and this conclusion will be sound.


Whenever this comes up I'm glad to have read "The Age of Surveillance Capitalism".

Not an easy read by any means, but its core hypothesis is spot on IMO: the core business of Google (and others) is aggregating and processing as much user data as possible. Every product they create and every decision they make aims to increase or protect their access to data.


After hearing about this book, I understood why every shop nowadays is pushing its customers to relinquish as much personal data as possible.

It seems like their real business model is to gather customer data in order to sell it to brokers, and the shops are the bait.


While "The Age of Surveillance Capitalism" is great, I don't understand why everyone assumes that Google uses the user data it gathers _only_ to enhance search and for selling advertising. Look at all the things people are using chatbots for today and then ask yourself what could you use a chatbot trained on user behavior for? This is like having a window into the psyche of the world, an oracle of global scope. Maybe they use it to evaluate which companies to buy? For doing market research and product evaluation? For analyzing corruption in government? Seems like there would be a million uses of it, all of them exclusive to Google because they have the most comprehensive data. Remember that they collect location data also. Imagine what China and Russia would do with that data if they had it (and look at what they are already doing with the data they have).


They barely make use of it for things like Gemini (or else their model would likely surpass GPT-4), so I’m not that convinced by this.


Google is playing catch-up in the LLM space. Before 2023 they did not put anywhere near as many resources or as much focus into it as OpenAI, who at that point had already been building for 3+ years. It is actually astounding how quickly, and how many, have caught up to OpenAI. But I would not expect Google's advantages in data to become visible for another 2 to 5 years, if they are able to harness it at all.


I'm sure all that user data was acquired from consenting users who knew what they were doing, and not at all illegally. Right? And certainly not while taking advantage of a monopoly (or duopoly) in many areas?


Canceling the end of 3rd party cookies is also an abuse of their monopoly. Users should be given the choice of how their data is used.


Firefox and Safari have massive hard coded lists of carve outs.

There is a colossal amount of specification work to do to figure out how to make various necessary web flows possible again after removing third party cookies.

Chrome has an incredible commitment to web standards, to not throwing in arbitrary browser-specific web functionality, to going through the process of improving the web holistically. Trying to live up to that high expectation is hard, and we can see a dozen third party cookies or storage isolation concerns they're having to tackle if we go look at https://github.com/orgs/explainers-by-googlers/repositories?...


Funnily enough, it's exactly the other way around. Google was prevented by competition regulators from turning down third party cookie support by default.

(And users already have a choice, Chrome has a setting for whether 3p cookies are allowed, blocked, or blocked in just incognito mode. Reading between the lines, they are probably going to ask users to explicitly choose, since they're not allowed to change the default.)


...users should have the choice to not have their data used, period.


Nonsense.



