How Bezos built his data machine (bbc.co.uk)
126 points by colinprince on Feb 18, 2020 | 40 comments



This is an interesting article, but I'd like to bring some attention to a bit of relevant history. From https://en.wikipedia.org/wiki/Rick_Dalzell

"Prior to joining Amazon.com in 1997, Dalzell was vice president of information systems at Wal-Mart starting in 1990. At Walmart he developed their datawarehouse strategy from the ground up, giving their suppliers direct access to demographic sales information."

https://www.cnet.com/news/wal-mart-sues-amazon-others/

"Then, in July, Amazon hired Jimmy Wright as vice president and chief logistics officer. Wright worked for Wal-Mart for 13 years, serving as a "key logistics leader," according to Amazon.

Observers say Wal-Mart, one of the world's largest retailers, prides itself on its computerized supply chain and retail management system and is aggressive in protecting it."

I was working in WalMart Information Systems Division at the time, and Dalzell's exit, along with some others, was a Big Deal. We had meetings about it.

At the time, I and pretty much all of my peers thought the internal reaction was kind of absurd. WalMart was, in the late 1990s, ascendant in many ways, and Amazon was just some 'west coast dot com selling CDs and books in the mail'.

Funny story: in 1997, we were deploying routers to over 2000 WalMart stores worldwide, and they were made by a company called "Advanced Computer Communications". We didn't switch over to Cisco routers for a couple of years, because the ACC routers properly supported SNA protocol encapsulation and Cisco didn't. Anyway, the name of the ACC routers we were buying was 'Amazon', which, I was told later, momentarily caused some panic and confusion in senior leadership.


I don't know how I feel about this. I have been an Amazon customer since 1996, and the company almost certainly has data on me for a period spanning more than half of my life, including very private things like information on my medical history.

On the other hand, I have also owned Amazon stock since shortly after their IPO, and at some level have financially benefitted from all the data that the company has gathered on its customers.


Amazon has a similar amount of data on me. And yet, their product recommendations are never really helpful. A toddler would probably be better at recommending products for me than Amazon is. So I'm not sure if the amount of data actually helps them financially.


I am constantly creeped out by how well Amazon markets things to me.

There are two types of recommendations I get from Amazon though. The first is the ads I see on sites like Facebook, which are for things I would never look up, and I'm curious whether that is part of their strategy. I get ads on Facebook for things like a motorized unicycle that can go like 35mph. Half the time I end up clicking on these insane ads just to read the reviews or the description.

The on site recommendations work very well for me though (they know I love woodworking and Cooking/Kitchen stuff).

I'm curious if the outrageous ads are to get you on site, then the targeted on-site ads are to actually get you to buy things?


Agreed that their recommendation engine is nearly useless.

But that's not really where the data benefits them. The power is in aggregate data: product sales history, trends, etc. The things you can do with that kind of data, in that volume, across nearly endless markets and verticals... Throw in an army of third-party sellers paying you while they do your research and power your analytics for you for free. The real mystery is that they're not eating the internet retail industry faster.


I am being half sarcastic half serious here, but what if they are actually able to properly recommend things to you, but they want you to buy more of the other in addition to what you are currently consuming?

If you look at google search results it seems more obvious that at least this is what G is interested in doing.


Interestingly, I interviewed with Amazon, with a person who worked on the [new] AI recommendation engine. Of course, I did point out that the new recommendation system was nearly unusable and tried to explain why.

Did not get the job. Which means they might be quite comfortable with the status quo.


The better the product recommendations, the more creepily obvious it is just how well they know you.

Nah. Things like search would be better if they were hiding competence.


Hint: Reader Mode [ enabled ]

This is a horrible presentation but reader mode makes it much better.


+1

I am big in MTB community, and horrible presentation is super prevalent on quite a few websites for brands.

Perhaps the worst example: https://www.yeticycles.com/bikes/sb165


How else are you going to sell some rich idiot an $8K _bicycle_?


I had no idea it was possible to obtain one so cheaply.


You scared me, but I found it an enjoyable and tasteful presentation that enhanced the content.


For UK-based visitors, it's alongside the release of a Panorama episode on Amazon:

Panorama, Amazon: What They Know About Us: https://www.bbc.co.uk/iplayer/episode/m000fjdz


mm. this is what annoys me a bit with BBC online - much of their interesting content is simply promotional fluffing for TV media they have on otherwise.

I get why they do it and don't particularly feel they shouldn't - but there's been so many times I've clicked on ostensibly interesting articles on the BBC website only to find it's just a mechanism to encourage me to watch TV... which I try to avoid.


Is anyone else always bothered by titles that make these celebrity CEOs sound like they did everything single-handedly? Why can't we credit the teams of people that did all the work instead?


I think for some CEOs it is actually applicable, Bezos being one of them. He has built Amazon from scratch. Sure, he hired lots and lots of people to help him. But hiring the right people at the right moment is not easy and very, very important. Many startups fail at exactly that point.


I think Bezos is great and probably the right person but I also think Amazon was inevitable. If it wasn't Bezos it would've been someone else. Maybe not in 1994 but shortly thereafter. Amazon is not invincible either. They started amazon.cn in 1998 but they are now shutting down that division/business.


Did his wife fund the effort with him, but remain out of the spotlight? (I always wondered.)


So Bezos gave other people’s money to other people, they built something, and he deserves all the credit?


Yes, he deserves the majority of the credit. Why do you think CEOs exist?


To take credit for things they themselves didn’t do


I think he deserves like 17% of the credit.


The person who gets the credit is the person who is not fungible in the process. We credit the architect with the building, not the bricklayer.

Amazon does not exist without Jeff Bezos having created it. The employees and the investment are the resources with which he built it.

No-one actually believes Jeff Bezos is running around himself doing all the labour.


>No-one actually believes Jeff Bezos is running around himself doing all the labour.

You might be surprised just how much a figurehead is credited in contribution, regardless of how much or little they did along the process. I imagine a significant portion of the population at large would assume, from such headlines and glances, that Bezos masterminded and pushed most of the process down to about 1-2 levels of organizational hierarchy.

The reality is, he put a business need in place (focus on metrics and quantification) and let C levels percolate the concepts around the organization until a strategy emerged. He may have signed off on it but likely someone below him did and ran with it. I'm not sure emergent organizational successes should be credited to whomever set a few initial conditions unless the entire process can be outlined into how those successes emerged and were shaped--but of course they are frequently credited to someone who simply got lucky or had piles of money to throw at a problem with the right people in the room.

While it's sometimes good someone decided to throw piles of money at a given problem, their goal was not inherently to solve the problem. It also doesn't mean they solved the problem. Lots of people throw lots of money at very important problems and get no credit or sometimes have no success.


I think you underestimate the role of a leader in building an organization like Amazon.

The things you handwave away - putting a business need in place, letting C levels percolate the concepts until a strategy emerged, they are all evidence of effective leadership.

It's wickedly difficult to set the initial conditions for emergent organizational success. Simply discounting Bezos by saying that he threw piles of money at the problem with the right people in the room is misunderstanding what his job is. His job is to get the right people in the room - identifying the right people is very hard. The more difficult job is listening to what they say. Casting aside your own ego and confronting difficult data. Most leaders have too little courage and too much ego to be able to do this effectively.

If you'd like insight into how important having the right leader can be then I'd recommend starting by reading the book Good to Great by Jim Collins or at least the chapters on leadership.


If you're going to bring in "Good to Great" then you have to consider the fact that some of his case studies (Wells Fargo, Fannie Mae) went down the tubes because of what Collins postulated made them successful (overfitting, unscientific).


Maybe? But you haven't made a case supporting that's why they went down the tubes.


Eh, this is just a feature of humanity. It’s easier to grasp things when we put a face on them.

Caesar didn’t literally conquer Gaul by himself, Napoleon didn’t conquer Europe by himself, Renaissance painters didn’t paint works by themselves, etc. etc. To use the metaphor in Steve Jobs, they played the role of conductor.


>One database contains transcriptions of all 31,082 interactions my family has had with the virtual assistant Alexa. Audio clips of the recordings are also provided.

>The 48 requests to play Let It Go, flag my daughter’s infatuation with Disney’s Frozen.

So does Amazon tag voices to specific people too? Seems like a no brainer to start collecting data on a little kid if you can, legality aside. I wonder.

Also, if I set up a Pi that just continually spams Alexa with random voice-generated questions, will they ever figure it out and stop saving those recordings?


I think Alexa can be customized to recognize specific voices now but that's a recent feature.


I haven't come to a personal conclusion on how I feel about the morals of it all, but you have to marvel at the feat of engineering.


Marvel at the feat of operations, not the engineering. The techniques used to process this information over the years have mostly been quite banal. Look behind the scenes and much of it is just batch processes written in old school languages (Perl, Java, even C) reading text files from one folder, filtering or enhancing them, and writing them out to another. I once taught a PM how to use a Perl hash to do a join instead of nested for loops and she spent the next year feeling godlike because she could do this stuff in O(n) instead of O(n^2).
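
For anyone who hasn't seen the trick, here's a minimal sketch of the difference, in Python rather than Perl (a dict plays the role of the Perl hash). The record layout and function names are made up for illustration, not anything from Amazon's actual pipelines.

```python
from collections import defaultdict


def nested_loop_join(orders, customers):
    """Compare every order against every customer: O(n * m) comparisons."""
    joined = []
    for order_id, customer_id in orders:
        for cust_id, name in customers:
            if customer_id == cust_id:
                joined.append((order_id, name))
    return joined


def hash_join(orders, customers):
    """Build a lookup table once, then probe it per order: O(n + m) on average."""
    names_by_id = defaultdict(list)
    for cust_id, name in customers:
        names_by_id[cust_id].append(name)

    joined = []
    for order_id, customer_id in orders:
        # .get() avoids inserting empty entries for unknown customer ids
        for name in names_by_id.get(customer_id, []):
            joined.append((order_id, name))
    return joined


# Hypothetical sample records: (customer_id, name) and (order_id, customer_id)
customers = [(1, "alice"), (2, "bob"), (3, "carol")]
orders = [(100, 2), (101, 1), (102, 2), (103, 9)]

print(hash_join(orders, customers))
```

Both functions return the same rows; the second just does one pass over each input instead of rescanning the customer list for every order.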


> I once taught a PM how to use a Perl hash to do a join instead of nested for loops and she spent the next year feeling godlike because she could do this stuff in O(n) instead of O(n^2).

Loved this.


Correct I think that would be a better way to put it. None of the individual building blocks seem very technically challenging, but to consistently enforce it at a company wide scale seems quite cool.


What’s a PM, prime minister?


Because of the context, I'd assume "Project Manager", basically a low-level manager of sorts.


At Amazon it's "Program Manager"; in this case she was really a TPM, or "Technical Program Manager". TPMs are effectively serial project managers, often charged with shepherding complex projects that require the efforts of more than one business or development group. Dealing with the data produced by those projects sometimes fell on them, since it was difficult enough to persuade the developers to write the primary functionality.


I haven't come to a personal conclusion on whether it's even a feat of engineering. I'm impressed by Amazon's handling of logistics, and I do think consumer data helps Amazon know when and where products are most likely to sell, perhaps even upsell customers or get customers to agree to a steeper markup, but beyond that, I don't think actual predictive models of consumer behavior have really been achieved here. I've never gotten a sense that Amazon intuits my consumer habits, and I've made a lot of purchases over the years. I don't think Amazon has made a dime on storing timestamps for when I turned a page on a Kindle or learned anything from the music I listened to. I've never bought a single recommended product. On a scale of 1-10, with 10 being Delos Incorporated perfectly modeling user behavior (accurately predicting how I'll react before I even see the products they place in front of me) and 1 being Wall-E making blocks of trash, I'd say this data collection is a 2.5 at best.


I would personally define efficacy of predictive models to be a scientific achievement rather than engineering, but that might just be a vocab discrepancy.

Do you think it’s possible to determine how good their modeling is externally? I would assume we’d need an extensive peek behind the curtains to come to a conclusion.



