“Given the long timelines of a PhD program, the vast majority of early ML researchers were self-taught crossovers from other fields. This created the conditions for excellent interdisciplinary work to happen. This transitional anomaly is unfortunately mistaken by most people to be an inherent property of machine learning to upturn existing fields. It is not.
Today, the vast majority of new ML researcher hires are freshly minted PhDs, who have only ever studied problems from the ML point of view. I’ve seen repeatedly that it’s much harder for a ML PhD to learn chemistry than for a chemist to learn ML.”
As somebody who has crossed the line between ML and chemistry many times, I would love to see: more ML researchers who know chemistry, more chemistry researchers who know ML, and best of all, fully cross-disciplinary researchers who are both masters of chemistry and ML, as those are the ones who move the field farthest, fastest.
Society is not structured to encourage this. Getting a job sooner is more lucrative. Any breakthrough you make after studying for a couple of decades is the property of a corporation, not you.
I wish more people understood this and the value it would add to society. For some reason, most people understand how roads and transit improve commerce but fail to understand how education and social support for the individual do as well.
Sports incentives work out nicely for getting a supply of top talent: big money through sponsorship/ads/tickets, a fairly formulaic system for getting better at a sport (top coaches/teams, get fitter, practice a lot), and an easy way to identify progress (winning games, scoring goals).
But in the sciences, someone could do great work for 20 years and produce obscure papers that might stay obscure, or might unlock the mysteries of the universe or help us meet the world's energy demands without pollution! So the payoff is orders of magnitude higher than in sport, but orders of magnitude less probable. Someone will run 100m fast, guaranteed, but will we solve science problems X, Y, Z, and when?
To my understanding, it would need to be set up so that R&D proposals are granted by unbiased parties. I believe an article describing what system would best incentivize research has been linked on HN before. It laid out the dangers of funding only the popular (and contemporary) topics, which tilts R&D away from discovery and toward 'let's try to yield XYZ, and if it fails, let's not release it'. Whoever is motivated to grant something must be completely separate from the publication (most likely written by those who researched it and many other scientists) and, unlike peer review, should try not to trim any of it out.
Present. I think there are many of us. Chemistry is a very wide field though, so I'm not sure whether organic synthesis vs. theoretical chemistry vs. physical chemistry vs. biochemistry will end up most useful for tackling drug discovery problems or other chemistry applications. Same with ML, I suppose; even though the specialties are less concrete nowadays, the breadth of publications has far exceeded that of modern chemistry.
You could probably fit all the people who meet the last criterion in the same room (the chemistry side is probably the bottleneck, especially drugs, which is effectively a specialization of its own).
I agree with you but does anyone even recognize the last category outside blue-sky research? People have a tendency to bin other people into buckets. Being a master at 2 things means you can’t be easily placed in a typical team structure.
> I’ve seen repeatedly that it’s much harder for a ML PhD to learn chemistry than for a chemist to learn ML
I can confirm. We regularly look for people to write some computational physics code, and recently for people using ML to solve solid state physics problems. It’s way easier to bring a good physicist or chemist to a decent CS level (either ML or HPC) than the other way around.
In my experience, people with degrees in math, physics, chemistry, biology/medicine, astronomy, or computer science have stronger intellectual rigor that can be re-applied to other fields, e.g., finance. Also, those areas of study are much harder than economics. Partly, it is a self-selection process. Yes, there are some with economics degrees who are very bright, but they probably could have majored in any of the sciences mentioned earlier.
The story I’ve heard is that economics undergrads can’t get into economics grad school. This is just a rumor, but the sentiment is that undergrads get taught a watered-down version of economic theory. Economic theory is potentially very technical and includes game theory and proofs. Even in CS, undergrads take intro theory courses and “bottom out” in their math skills, even though grad-level CS gets much more difficult. Therefore, I’d imagine the primary determinant of this rigor phenomenon is the GPA inflation of the major.
This is somewhat true - economics undergrads do get into grad school, though math+econ or an econ degree with lots of math is better. You can just check the difference between the textbooks for undergraduate and graduate economics, e.g., Principles of Economics by Mankiw (undergrad) and Mas-Colell (grad).
That’s the opposite of what OP is observing. It’s easier to teach a domain expert “good enough” quantitative and technical skills than to teach a pure quant “good enough” domain expertise (and the corresponding intuition)
I think you’re reading this backwards, or perhaps it was edited? Mathematics is surely the more pure discipline compared with economics, so physics : CS :: mathematics : finance is the right ordering.
It's also because nobody goes to get a PhD in solid state physics for the money or career prospects, at least not in the last decade. So it's a small and self-selected group.
Lots of things, but mostly things like designing software architectures for massively parallel codes implementing fancy physics or coming up with ML models to predict complicated properties (things like wear, radiation, or corrosion resistance for which we just do not have any comprehensive model) or explore humongous problem spaces (things like predicting simple properties of very complex materials with 5 to 8 elements and complex microstructures like superalloys).
Neither the physics nor the CS is cutting edge (that’s why we do not necessarily reject people with limited experience in physics, though they need to show motivation and the ability to learn); what is cutting edge is the combination of the two.
We’d like to be as close to the cutting edge on ML as possible, though, because it’s a significant competitive advantage to be able to use fancy new techniques before our friendly competitors. But as I said it seems to be easier to train a Physics or Chemistry undergrad to get some feeling about how ML works than to train a CS undergrad to have some intuition about the Physics. And intuition is critical to detect when models hallucinate and get off the rails.
Damn, my contact at LSU has moved. I know there’s a good group at UT Knoxville where they do great stuff with Oak Ridge’s neutron source but that’s not specifically ML. There are adverts regularly for post-docs at Los Alamos; they can be found on lanl.jobs . They work with various universities depending on the group and might point you to something.
That's because application and research are quite different. If you do a PhD in ML, you learn how to research ML. Someone with a PhD in chemistry learns how to research chemistry; they only need to apply ML to that research.
Back when O'Reilly was still hosting events (sigh), at one of their AI conferences, someone from Google gave a talk about differences between research/academic AI and applied AI. I think she had a PhD in the field herself but basically she made the argument that someone who is just looking to more or less apply existing tools to business or other problems mostly doesn't need a lot of the math-heavy theory you'll get in a PhD program. You do need to understand limitations etc. of tools and techniques. But that's different from the kind of novel investigation that's needed to get a PhD.
Lol.
With the exception of niche groups in compressed sensing, the math doesn't get too hard. Furthermore, ML isn't math-driven, in the sense that people try things first and somebody comes up with the explanation after the fact.
Well, I think the issue is more that if you’re Genentech and you need ML people but can’t afford to pay them, you’re probably better off retraining chemistry PhDs.
I think you missed my point. Genentech, AFAIK, was not doing research on machine learning as in the principles of how machine learning works and how to make it better. They do biotech research that uses applied machine learning. You don't need a PhD in ML to apply things that are already known.
As a PhD student working on core ML methods with applications in chemistry, I second this. During my PhD, I have read very few papers by chemists that were exciting from an ML perspective. Some of their methods work very well, but the chemists don't always seem to understand why they made the right choice for a specific problem.
I don't claim that the opposite is easy either. Chemistry is really difficult, and I understand very little.
Genentech has several ML groups that do mostly applied work, but some do fairly deep research into the model design itself, rather than just applying off-the-shelf systems. For example, they acquired Prescient Design which builds fairly sophisticated protein models (https://nips.cc/Conferences/2022/ScheduleMultitrack?event=59...) and one of the coauthors is the head of Genentech Research (which itself is very similar to Google Research/Brain/DeepMind), and came from the Broad Institute having done ML for decades ('before it was cool').
I can't say I know anybody there who is doing what I would describe as truly pure research into ML; it's not in the DNA of the company (so to speak) to do that.
Sadly a lot of foundational ML research works for single-label image classification and not much else. ImageNet is a niche problem and way too much ML research is over-indexed on it. If you can make your problem look like ImageNet, you're going to do OK, but if not you effectively need to re-invent the wheel...
The HEAR benchmark is a great eye-opener. They have basically three classes of audio tasks, and find very different models excel in each, with the best overall models being kinda-mindless ensembles of the ones that do well on particular problems.
So if you've got something that works well for text... it'll take a couple years and maybe an entire new branch of research (diffusion!) to work well for image generation. I have no idea what generative models for chemistry will look like, but will happily bet that it takes some significant specialized effort.
1. The era of ML benchmarks is ending. New models have to be and will be evaluated the same way human experts are evaluated.
2. Foundational models are becoming multimodal. There will be no separation of text and image generation. Sure, different methods will be used for each, but the learned representations of visual and textual objects in models like Stable Diffusion already live in the same conceptual space.
I don’t think there will be specialized generative models for chemistry two years from now. There will be GPT-5 (and similar competitors) which will be used to perform all kinds of research, including chemistry.
For example, AlphaFold just fundamentally isn't a language model, but is fundamentally useful. We'll still need these models that Do Stuff in many areas, and that will still involve benchmarks... Even if we're able to ask GPT-N+1 to design the next version of the model for us.
I’m pretty sure the author is implying that the new crop of ML PhDs is just not a smart group of people - at least not at the level of intelligence required to do truly transformative things with ML in any field.
I think what you’re saying reflects a commonly found attitude on this topic: it’s pretty limiting to think a cursory knowledge of a field is sufficient to go change it. That’s likely why most “use ML to solve X” projects fail, while the few like AlphaFold succeed because the ML engineers truly understood the fundamental tenets of the topic and exploited them.
I don’t understand what you mean. Here’s how many applied ML papers work: create a new dataset for a novel problem, download a PyTorch model, point model at dataset directory. Is it novel? By construction. Is the ML technique novel? No.
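For concreteness, that recipe usually looks roughly like this (a toy sketch; the dataset path, class layout, and hyperparameters are hypothetical):

    # Toy sketch of the recipe above: fine-tune a downloaded, pre-trained torchvision
    # model on a new image dataset. The dataset path and hyperparameters are hypothetical.
    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, models, transforms

    tfm = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
    train_set = datasets.ImageFolder("my_new_dataset/train", transform=tfm)  # "point model at dataset directory"
    loader = DataLoader(train_set, batch_size=32, shuffle=True)

    model = models.resnet18(weights="IMAGENET1K_V1")  # the downloaded PyTorch model
    model.fc = nn.Linear(model.fc.in_features, len(train_set.classes))  # swap the head for the new labels

    opt = torch.optim.Adam(model.parameters(), lr=1e-4)
    loss_fn = nn.CrossEntropyLoss()
    for x, y in loader:  # one pass is enough to illustrate the point
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

The dataset may be genuinely new; the modeling step is commodity.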
Not so dramatic, but I have a traditional CS background and got into ML much later in my career.
Last year my company formalized the process to hire ML Engineers. The interview is the same format as the software engineer round, but with an extra theoretical ML round.
I've observed two distinct groups of candidates in the process. One are recent grads with PhDs in ML. Other are people from diverse backgrounds that happened to start working in ML in their current job. The first group tends to excel in the ML theory part, but flunk the leetcode style coding questions. The second group tends to do better in coding, but do worse in the theory part. This is exactly what the process has been designed to do.
I am very put off from looking for ML Engineer positions at other companies if they follow a similar process. I know I would fail the interview for my current job. They could get away with it for a while, but I doubt its sustainability or desirability in the long term.
Compared with more mature research fields, the barrier to entry for ML is much lower.
Hence I have always recommended that people major in physics, chemistry, biology, etc., but look for projects in those fields that could benefit from ML (I have a number of them in physics).
So that argument was not novel.
But the point that pure ML PhDs will have significantly less multidisciplinary knowledge is a good one. It could be compensated by the fact that ML is growing fast and all kinds of people are joining the ride, but still.
>I’ve seen repeatedly that it’s much harder for a ML PhD to learn chemistry than for a chemist to learn ML.
Perhaps this is selection bias. Among all chemists, the ones who dabble in ML will likely be the chemists with the highest ML-related aptitude. In contrast, an ML expert on a chemistry project is more likely not internally driven to explore it but has instead been assigned the work, which means there is less selection bias and thus less chemistry aptitude.
CS is unusually easy to learn on your own. You can mess around, build intuition, and check your progress, all on your own and in your pyjamas. It’s easy to roll things back if you make a mistake, and hard to do lasting damage. There are tons of useful resources, often freely available. Thus, you can get to an intermediate level quickly and cheaply.
Wet-lab fields have none of that. Hands-on experience and mentorship is hard for beginners to get outside of school. There are a few introductory things online, but what’s the Andrew Ng MOOC for pchem?
This is more about ML than CS. ML is fundamentally about developing general-purpose algorithms applicable to a wide range of problems. If your job is using ML to solve problems in chemistry, it's more about chemistry than ML, and a chemistry background is more important than an ML background. It's unlikely that you have to develop novel ML methods for the problems you are facing.
I've seen the opposite in bioinformatics. While dedicated bioinformatics programs are now common, you still see many CS / mathematics / statistics / physics / EE people moving to bioinformatics after bachelor's / masters's / PhD / postdoc. In some bioinformatics jobs, you often have to solve new computational problems, and it's easier to teach enough biology to people with a methodological background than the other way around.
I've seen both in bioinformatics, because the field is so wide now.
1) Bioinformatics as tool-building, algorithm-dev: you're right, you don't need to know much biology there if the problem is defined well.
2) Bioinformatics as a tool to answer biological questions: here I've seen ML-background people really struggle, either developing stuff that's not useful or reinventing the wheel, but now it's deep learning. I've seen ML people present their fancy plant disease image detector, which turned out to be pretty good at spotting 'yellow': great training accuracy and benchmarks, but it does not add anything to what people in the field are doing.
Regarding 2), it sounds a bit directionless to be proposing stuff people don't need. Isn't that more a problem of selecting relevant problems to solve, and getting supervision on your ideas?
Yeah that's the problem! The ML people and the biology or agriculture people don't talk, they're not in the same building. A biologist might see the ML-person's work only after it's published.
On the flip side, software development and engineering rigor are largely absent in academia, as has been discussed previously here on HN. This is enough of an issue to make replicating research difficult even when the code and data are provided, but it's an even bigger issue when trying to turn academic code into a product.
Chemistry is a centuries-old discipline that people study for a full four years as undergrads before getting a PhD in the field.
ML is, practically speaking, a 15-year-old field that PhDs often begin to study after a couple of AI courses in undergrad and a specific track in grad school (while they study other parts of CS as part of their early graduate work).
There's just way less context in ML than Chemistry.
I don't think that's really true, but you don't need the full context in ML in order to apply it to your field of interest. Just like you don't need the full context in Physics to apply it to an engineering problem.
Some of the most successful ML researchers have read several decades of research papers. It's not uncommon to see references to papers from the 70s or 90s.
Edit: No doubt that the relevant parts of stats, random matrix theory, and ML is a newer field than Physics or Chemistry, though.
> realistically I don’t think you can even generously consider anything prior to perceptrons
I mean, sure, but for the practical reason that perceptrons (proposed 1943, implemented 1958) are about as old as digital computers, and ML as a field is tied up pretty strongly with the existence of computers.
I wasn’t comparing ML to digital computers though — but rather chemistry. Theory of computation and algorithms (bedrocks of CS) far predate digital computers.
ML is so much more than DL / NNs. I recommend that you open a book like Bishop or ESL, the overwhelming majority of ML has been non DL going back to the 70s.
Most of that I learned doing non-ML stats work. But regardless, my point was the history of it is far less than chemistry. Let’s say ML started at the turn of the 1990s to be generous. Still much younger than chemistry.
> I’ve seen repeatedly that it’s much harder for a ML PhD to learn chemistry than for a chemist to learn ML.
Haha, I've seen that for so many topics. "It's much easier for someone used to circuit switched phone networks to learn IP than the other way around", says the person who started with circuit switched.
I just thought "dude, you're literally the worst at IP networking that I've ever met. Your misunderstandings are dug into everything I've seen you do with IP".
Ironically, in my field I'd argue this isn't necessarily true - so long as the ML PhD sees value in that field.
Part of this is because public health as an undergraduate discipline is extremely new, so the field is used to having to teach Folks From Elsewhere about our field, rather than the fields built on the assumption of a large foundation of undergraduate coursework.
Interdisciplinary and intersectional skillsets are critically valuable.
Separation of concerns, especially in the early innovation stages, can be more of an inhibitor than an accelerator of success. Scaling and growth are another matter.
A similar pattern existed in the late 90s with web developers coming from many different industries with their domain knowledge and domain insight.
The code and frameworks were early, but what mattered was the insight into which problems were most pressing to solve.
An introductory book on neural networks, written a few decades ago, starts with a short history of the field at that time.
This included a phase where some physicists began having opinions about the subject, anticipating they might quickly find a model for the brain, with fields, or other physics-like paradigms.
Their expectations were that with their superior understanding of all things fundamental, they would rush in, and rush out. Leaving the stunned machine learning researchers dazzled, frazzled, and asking "Who was that masked physicist?!?"
Except their ideas went nowhere.
I can't find the book, but this story made for a memorable foreword.
That's a great xkcd, but there are 2 upsides to this arrogant approach.
First, arrogance is a nerd-snipe maximizer.
Second, there is a small chance you're absolutely right, and you've just obviated a whole field from first principles. It doesn't happen often, but when it does, there is no clout like "emperor's new clothes" clout.
EDIT: The downside, of course, is that you appear arrogant, and people won't like you. This can hurt your reputation because it is apparently anti-social behavior on several levels. I think it's fair to call it a bit of an intellectual punk rock move that is probably better left to the young. It's an interesting emotional anchor for mapping a new field, though.
> Second, there is a small chance you're absolutely right, and you've just obviated a whole field from first principles.
Mostly when I read about things like this happening, it's happening to a formerly intractable problem in mathematics. Do you have examples outside of math?
Alfred Wegener, who proposed continental drift (the precursor to plate tectonics), comes to mind. He was a trained meteorologist who observed the similarities between geological formations on the South American east coast and the African west coast. He was lucky in that his father-in-law was a prominent geologist and helped him defend this thesis.
Oh yeah, revolutionary insights are very important for the advancement of knowledge and the elimination of wrong ideas. But as you wrote, this was the work of a thesis, not a random commenter from another field.
Biology has already been transformed by mathematics, statistics, and CS for decades. These days, if something like in that xkcd happens, those people are probably not even familiar with the relevant parts of math / stats / CS.
What exactly has it transformed? Nothing has fundamentally changed in biology, you still have to run western blots and give mice cancer. A great example of “math” solving biology is super resolution - which was just a dud.
Genomics, for example. The entire field could not exist without extensive algorithmic research done over several decades. Even today, many of the key people in the field have a background in CS and mathematics.
Theoretical ecology is a bit more old-school answer. Many mathematicians have been involved in that field.
I am not sure I have seen any meaningfully brilliant breakthroughs in those fields though.
I take that back. One can never forget the brilliance of HiC by Erez. But other than that TBH all other “mathematical” or computational breakthroughs I’ve seen are just at best meticulous application of obvious math and computational algorithms to biological problems (and may I say poorly? Thinking back to the microarray nightmare years).
I guess that depends on what you count as a breakthrough. From my perspective, HiC is a small detail, and even AlphaFold is just the latest improvement to an existing process. A true breakthrough would be something like ancient DNA, which gave humanity a new tool for studying the past.
Pretty much everything in genomics depends on shotgun sequencing, which in turn depends on CS. The algorithms used for assembling something useful out of the sequence reads are highly non-obvious. In fact, most developments in string algorithms since the 80s have been driven by the needs of DNA sequencing. (Information retrieval used to be another contender, but word tokens turned out to be a better tool for that field.)
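To give a flavour of why assembly is non-obvious, here is a deliberately naive greedy toy (exact overlaps only, no sequencing errors, repeats, or reverse complements, and quadratic in the number of reads); real assemblers rely on suffix-array/FM-index or de Bruijn graph machinery precisely because none of these simplifications hold at genome scale:

    # Naive greedy assembly toy: repeatedly merge the pair of reads with the longest
    # exact suffix/prefix overlap. Real assemblers must cope with sequencing errors,
    # repeats, and genome-scale data, which is where the hard algorithmics lives.
    def overlap(a, b, min_len=3):
        """Length of the longest suffix of a that equals a prefix of b."""
        for k in range(min(len(a), len(b)), min_len - 1, -1):
            if a.endswith(b[:k]):
                return k
        return 0

    def greedy_assemble(reads):
        reads = list(reads)
        while len(reads) > 1:
            best = (0, None, None)
            for i, a in enumerate(reads):
                for j, b in enumerate(reads):
                    if i != j:
                        k = overlap(a, b)
                        if k > best[0]:
                            best = (k, i, j)
            k, i, j = best
            if k == 0:            # no overlaps left; stop merging
                break
            merged = reads[i] + reads[j][k:]
            reads = [r for idx, r in enumerate(reads) if idx not in (i, j)] + [merged]
        return reads

    reads = ["ATTAGACCTG", "CCTGCCGGAA", "AGACCTGCCG", "GCCGGAATAC"]
    print(greedy_assemble(reads))  # -> ['ATTAGACCTGCCGGAATAC']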
Actually most applied physicists like myself go down that path cause we're pretty efficient, lazy folk & skip through as fast as possible--I call it the principle of maximum laziness.
I work for Google Brain. I remember meeting Brian at a conference and I have nothing but good things to say about him. That said, I think Brian is underestimating the extent to which the Brain/DeepMind merger is happening because it's what researchers want. Many of us have a strong sense that the future of ML involves models built by large teams in industry environments. My impression is that the goal of the merger is to create a better, more coordinated environment for that kind of research.
> Many of us have a strong sense that the future of ML involves models built by large teams in industry environments
The gradient of the current moment is that whatever approach is optimized to use more data and more compute is much easier to invest in than something that can do more with less but comes with a significant number of possible dead ends.
At some point, this will have diminishing returns, but until that is hit, this makes sense as a purely return-on-investment for both research progress and business returns.
The goal of the merger is for execs to look like they are doing something to drive progress. Actual progress comes from the researchers and developers.
Well, where exactly is this progress?
Where is Google's answer to GPT-4? Why weren't the 'researchers and developers' making a GPT-4 equivalent?
Turns out sometimes you need a top-down, centralised vision to execute on projects. When the goal is undefined, you can let researchers run free and explore; now it's full-on wartime, with clear goals (make GPT-5, 6, 7...).
Google is fundamentally allergic to top-down management. Most googlers will reject any attempt to be told what to do as wrong, because lots of IC's voting with their feet are smarter than any (google) exec at figuring out what to do.
The last time Google got spooked by a competitor was Facebook, and they built Google Plus in response. We all know that was an utter failure. Googlers could escape that one with their egos intact because winning in "social" is just some UX junk, not hard-core engineering like ML.
It's gonna be super hard for them to come to grips with the fact that they are way behind on something that they should be good at. Plan for lots of cognitive dissonance ahead.
I don't think they are behind just because they have released less stuff. They had LaMDA way before ChatGPT, and they had multimodal models (both by DeepMind and by Google Brain) well before OpenAI.
They are behind because what they have released sucks. Have you tried Bard? It's dumb. Like you're talking to some 20th century gimmick dumb. GPT4 is far from perfect, but when it makes mistakes and you point them out, it understands and tries to adapt. Bard just repeats itself saying the same stupid things like a casette-tape answering machine.
If you ask a googler about this, they typically assume GPT is just as stupid as bard. Or say something like "so GPT is just trained on more data - we can do that." As if nothing's wrong.
Bard uses a smaller model currently, which was announced before release.
> We’re releasing it initially with our lightweight model version of LaMDA. This much smaller model requires significantly less computing power, enabling us to scale to more users, allowing for more feedback. [1]
> Bard is powered by a research large language model (LLM), specifically a lightweight and optimized version of LaMDA, and will be updated with newer, more capable models over time. [2]
"We are releasing something that's too underdeveloped, because Wall Street demands we release something. We're using a smaller training set because we have to rush to market with our smaller, stupider model ASAP."
I remember the CEO of Google saying a few months (a month?) ago that more capable models would be released. Either they delayed, or the model is just as bad (maybe they released it after the announcement of coding in 20 languages?).
There is a first mover handicap there though. TF1.0 included a bunch of things that were harder to understand like tf.Session(). PyTorch was inspired from the good parts and "we will eager-everything". Internally I'm sure there was a lot of debate in the TF team that culminated with TF2.0, but by that time the damage was done and people saw PyTorch as easier.
Nope, PyTorch was inspired by the Lua version of Torch, which well predates TensorFlow. To be fair, basically every other DL framework made the same mistake though.
Also, tensorflow was a total nightmare to install while Pytorch was pretty straightforward, which definitely shouldn't be discounted.
> Also, tensorflow was a total nightmare to install while Pytorch was pretty straightforward, which definitely shouldn't be discounted.
I think this is a very important point, and I remember sweating blood trying to build a standalone tf environment (admittedly on windows) in the past. I'm impressed by how much simpler and smoother the process has recently become.
I do prefer Keras to PyTorch though - but that's just me.
tensorflow was a total nightmare to install while Pytorch was pretty straightforward
Hat tip for this comment. On HN, I read some great commentary about "time to achieve first HTTP 200 with your REST API". Regarding installed software libraries, lower friction to achieve "Hello, World!" is important.
PyTorch examples were also cleaner. torchvision had ResNet training with batteries included, while TF had roll-your-own or clone some weird Keras repository.
I think the main problem was debugging tensors on the fly, impossible with TF/Keras but completely natural in PyTorch. Most researchers needed to sequentially observe what was going on in tensors (histograms, etc.) and even do backprop for their newly constructed layers by hand, and that was difficult with TF.
Nah, TF has had dynamic execution since TF2 and it’s still losing users, it seems. The execution model and API are simply more complicated. What’s a session, placeholder, constant, tensor, …? PyTorch was sold as numpy with GPU support and it is pretty close to that. JAX is an attempt to approach language simplicity and purity.
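For anyone who never touched TF1, a toy side-by-side of the two styles (TF 1.x API written from memory, so details may be slightly off; `data` is just a small example batch):

    # Toy comparison of the two styles.
    data = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

    # TF1: build a static graph of placeholders/ops first, then feed data through a session.
    import tensorflow as tf  # assumes a 1.x install
    x = tf.placeholder(tf.float32, shape=[None, 3])
    w = tf.Variable(tf.random_normal([3, 2]))
    y = tf.matmul(x, w)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        out = sess.run(y, feed_dict={x: data})  # real numbers only exist here

    # PyTorch: "numpy with GPU support"; ops run eagerly, so you can print or set a
    # breakpoint on any intermediate tensor right in the middle of the computation.
    import torch
    xt = torch.tensor(data)
    wt = torch.randn(3, 2, requires_grad=True)
    yt = xt @ wt
    print(yt.shape, yt.mean())  # inspect on the fly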
I have used both but ended up dropping TF for PyTorch after 2018. Mainly it was the larger PyTorch ecosystem in my field (NLP) and clear API design and documentation that did it for me.
However, TF was still a valid contender and it was not clearcut back in 2016-17 which framework was better.
I can speak from experience on this. Getting started with TensorFlow was very complicated with sparse documentation, so we dropped the idea of using it.
You are not correct about TPUs being drastically better than GPUs at this. If you look at public benchmarks, both have a similar cost per hardware flop ($0.88/hr for a 312 TFLOPS A100 on GCP, $0.97/hr for a 275 TFLOPS TPUv4) and both achieve similar model-flop:hardware-flop ratios (40-60%).
An existence proof that GPU mega-clusters are possible is that GPT-4 cost ~$100m over ~3 months, so ~100m a100-hours / (3 months * 30 days/month * 24 hours/day = 2160 hours) = ~45k a100s collaborating, which is the equivalent of ~10 TPUv4 pods on a single training run.
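Spelling out that back-of-the-envelope arithmetic with the rough numbers quoted above:

    # Rough sanity check of the estimate above, using the numbers quoted in this thread.
    total_cost_usd = 100e6          # ~$100m reported training cost
    usd_per_a100_hour = 1.0         # roughly the ~$0.88-0.97/hr figures above
    hours = 3 * 30 * 24             # ~3 months of wall-clock time = 2160 hours
    concurrent_a100s = total_cost_usd / usd_per_a100_hour / hours
    print(round(concurrent_a100s))  # ~46,000, i.e. on the order of 45k A100s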
I assume/suspect internally Google has v5 already.
The thing about TPU clusters is that they have a hyper-torus optical interconnect between TPUs. This allows for extremely efficient weight updates. To replicate this with A100s you need a very custom hardware/software deployment.
But to be fair, I don’t know what is latest and greatest available from NVidia or other clouds in this area right now.
EDIT: Looks like NVidia has NVSwitch, which provides interconnect for 256 GPUs. Pretty cool!
External NVSwitch (for more than 8 GPUs) isn't available for purchase yet. But even now you typically buy your A100 or H100 training GPUs in servers where each GPU has an individual 400 Gbps Mellanox networking adapter, then connect them in a sophisticated switched network that provides full throughput between all GPUs of the cluster. The deployment is not that custom, typically you would let Nvidia and your vendors implement their reference design for you: https://www.nvidia.com/en-us/data-center/dgx-superpod/ Then run an open-source software stack on top of CUDA etc.
I don't know how well the TPU hyper-torus interconnect performs, but the networking topology seems to be less general than switched NVLink or InfiniBand.
How is the network adapter connected to the GPU? Do they just sit on the same bus?
It looks like in a TPU v4 cluster each pod with 2 or 4 (?) TPUs has 6 optical interfaces, which connect directly to neighboring pods. I have no idea how they route through this configuration, but my guess is that most messages are weight updates, which are essentially broadcasts, so it should work out fine with some basic forwarding.
The torus topology makes sense for ring algorithms: Allreduce, Allgather, Reducescatter. For purely data-parallel training you could put all model replicas into the same ring (although Nvidia also uses hierarchical algorithms that benefit from lower latency). With added model parallelism you will need smaller rings running concurrently. I guess the TPU cluster layout then puts constraints on the most efficient model architectures (as does the network topology of a GPU cluster).
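For readers who haven't met these, here is a toy serial simulation of what ring Allreduce computes; a real implementation runs the 2*(N-1) exchange steps concurrently across devices, while this just mimics the data movement in a single process:

    import numpy as np

    def ring_allreduce(grads):
        """Toy serial simulation of ring Allreduce: every worker ends up with the sum
        of all workers' gradients, exchanging one chunk per step with one neighbor."""
        n = len(grads)                               # number of workers in the ring
        chunks = [np.array_split(g.copy(), n) for g in grads]
        # Reduce-scatter: after n-1 steps, worker i holds the fully reduced chunk (i+1) % n
        for step in range(n - 1):
            for i in range(n):
                src, dst = i, (i + 1) % n
                c = (i - step) % n                   # which chunk travels on this step
                chunks[dst][c] = chunks[dst][c] + chunks[src][c]
        # Allgather: circulate the reduced chunks so every worker has all of them
        for step in range(n - 1):
            for i in range(n):
                src, dst = i, (i + 1) % n
                c = (i - step + 1) % n
                chunks[dst][c] = chunks[src][c]
        return [np.concatenate(c) for c in chunks]

    out = ring_allreduce([np.ones(8) * (w + 1) for w in range(4)])  # workers hold 1s, 2s, 3s, 4s
    print(out[0])                                    # every worker now holds the sum: all 10s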
TF in its first version was stellarly misdesigned. It was infuriatingly difficult to use, particularly if you were of the "I just want to write code and have it autodiffed + SGDed" school, I found it crazy to use Python to manually construct a computational graph...
You need something to construct a graph. Why not pick a well-known language already used in scientific computing and stats/data science? The other options are: pick a lesser known language (lua, julia) or a language not traditionally used for scientific computing (php, ruby), or a compiled language most researchers don't know (C++), or a raw config file format (which you would then use code to generate).
What's really crazy is using Pure, Idiomatic Python which is then Traced to generate a graph (what JAX does). I want my model definitions to be declarative, not implicit in the code.
TF1 was pretty rough to use, but beat the pants off Theano for usability, which was really the best thing going before it. Sure it was slow as dirt ("tensorslow") even though the awkward design was justified on being able to make it fast. But it was by far the best thing going for a long time.
Google really killed TF with the transition to TF2. Backwards incompatible everything? This only makes sense if you live in a giant monorepo with tools that rewrite everybody's code whenever you change an interface. (e.g. inside google). On the outside it took TF's biggest asset and turned it into a liability. Every library, blog post, stackoverflow post, etc talking about TF was now wrong. So anybody trying to figure out how to get started or build something was forced into confusion. Not sure about this, but I suspect it's Chollet's fault.
Sure, my point is that only a googler would even consider this kind of breaking change as a sensible option. People in the real world with regular code tooling would reject the proposal before it got started.
The analogy to Angular that others have made is spot on. It's not just first-mover disadvantage. Google has particular blind spots for certain pain points, like deprecating APIs. Also q.v. Google Cloud.
Every attempt at a "clean break" new version of a commonly-used platform leads to such long-term weakness, yet the temptation to piggyback off of the mindshare/existing branding forces companies to avoid calling it a new platform.
The APIs were messed up early on, which is a reason TF2 happened. Every team started making their own random implementations of stuff. You had the TF Slim API, you had Keras, etc. The API just got fatter and fatter and then libraries would make cross dependencies to bake in the API mistakes.
> Unfortunately, this early lead would be completely squandered within a few short years, with PyTorch/Nvidia GPUs easily overtaking TensorFlow/Google TPUs. ML was, and frankly is, still too nascent to have significant technical barriers to entry. The sustained eye-popping funding for AI companies generated a surge in supply, with the number of ML researchers growing ~25% YoY for the past decade. I taught myself enough ML to blend in with the researchers at Brain over a relatively short 2 years, and so have many others. Nobody, not even Google, can afford to throw money into a bottomless pit.
Perhaps NVidia is close now. A bit hard to say without specific hardware info.
Google’s were already available 5-6 years ago, and current versions are probably even faster. They have super-fast optical interconnects in a torus or hyper-torus configuration that allow synchronous weight updates across 1k+ TPUs. This leads to dramatically lower training times and less noise, which leads to better-performing models; i.e., you can't even train a model to the same level on traditional GPUs.
Once they started to get deployed, models that trained for 3 weeks on 30 GPUs were trained in 30 minutes on 1k TPU cluster.
All this reiterates the main point of the article: Google had a tremendous lead and wasted it due to a lack of vision and product execution ability.
because once Jeff Dean had solved Google's Maslow problems (scaling web search, making ads profitable, developing high-performance machine learning systems), he wanted to return to doing academic-style research, but with the benefit of Google's technical and monetary resources, and not as part of X, which never produces anything of long-term value. I know for sure he wanted to make an impact in medical AI and felt that being part of a research org would make that easier/more possible than if he was on a product team.
I generally agree with this though with some tweaks. I think Jeff wanted to do something that he thought was both awesome (he's liked neural networks for a long time - his undergrad thesis was on them) and likely to have long-term major impact for Google, and he was able to justify the Awesome Thing by identifying a way for it to have significant potential revenue impact for Google via improvements to ad revenue, as well as significant potential "unknown huge transformative possibilities" benefits. But I do suspect that you're right that the heart of it was "Jeff was really passionate about this thing".
Of course, this starts to get at different versions of the question: Why did Google Brain exist in 2012 as a scrappy team of builders, and why did Brain exist in 2019 as a mega-powerhouse of AI research? I think you and I are talking about the former question, and TFA may be focusing more on the second part of the question.
[I was randomly affiliated with Brain from 2015-2019 but wasn't there in the early days]
It grew from the scrappy group to the mega-powerhouse by combining a number of things: being in the right place at the right time with the right resources and people. They had great cachet: I was working hard to join Brain in 2012 because it seemed like they were one of the few groups who had access to the necessary CPU and data sets and mental approaches that would transform machine learning. And at that time, they used that cachet to hire a bunch of up-and-coming researchers (many of them Hinton's students or in his sphere) and wrote up some pretty great papers.
Many others came from outside Brain: researchers working on other, more boring projects who transferred in, bringing their internal experience in software development and deployment, which helped a lot on the infra side.
I agree that the OP makes a bunch of interesting points, but I think historically Brain really grew out of what Dean wanted to do and the fact that he wanted it to be full-stack, e.g. including the TPU. Also, crucially, Brain would use Google data and contribute back to Ads/Search directly versus Google X which was supposed to be more of an incubator.
But it's also notable how the perspective of an ex-Brain employee might differ from what sparked the Brain founders in the first place.
Waymo is kind of like DeepMind - they're costing Alphabet billions of dollars a year for a decade+ with no appreciable revenue to show for it, but they're working on something neat, so surely it must be good?
X exists as a press-release generation system, not as a real technology creation system. They onboard many impractical projects that are either copies of something being done already in industry ("but with more Google and ML!") or doesn't have a market (space elevators).
Waymo has developed the modern autonomous vehicle from the ground up. It's basically a matter of scale now. It's a mind-blowing tech stack. The first time riding in one is much more otherworldly than using GPT for the first time. The value of the technology is far greater than whatever PR they have generated (not many people know about it).
I have infinite respect for the process that Waymo followed to get to where they are. And I'm impressed that Google continued to fund the project and move it forward even when it represents such a long-term bet.
But it's not a product that has any real revenue. And most car companies keep their distance from Google's self-driving tech because they're afraid: afraid Google wants to put them out of business. It's unclear if Google could ever sell what they've created (as a product, as an IP package, etc.) because it depends so deeply on a collection of technology Google makes available to Waymo.
I was just disputing "X exists as a press-release generation system, not as a real technology creation system." Definitely agree the path to profitability will be tough.
Controversy over what? Did you read Gebru's paper? For instance, her calculation of the carbon footprint of training BERT assumes that companies will train BERT 24x7. Gebru is a disgrace to the community because she always, I mean literally always, attacks her critics by questioning their motives. You think bias is a data problem? You're a bigot (see her dispute with LeCun). You disagree with my assessment of an ML model? You are a white male oppressor (her attacking a Google SVP).
Gebru is not a researcher. She is a modern-age Trofim Lysenko, who politicizes everything and weaponizes political correctness.
Ok but the lack of underrepresented minorities in the field and the important role people like Gebru played in extending the political and status of minorities is ok to extinguish? We need more than just white male / Chinese male / Indian male monoculture “STEM lords”. This is already recognized in fields like medicine, where minorities treating minorities results in better outcomes and the greater push to open positions of status to minorities.
> important role people like Gebru played in extending the political and status of minorities is ok to extinguish?
No, she didn't. Attacking everyone for baseless motives and identities is the worst kind of activism. She alienated people by attacking them without basis. She disdained those who truly fought for the fairness and justice of every race. She left a bad taste in people who truly cared about progress. Yes, it's totally worth "extinguishing" her role, as her role is nothing but a political charade.
As for underrepresented minorities, do you even know about the Chinese Exclusion Act? Do you know how large the pipeline of STEM students is across different races, and why there is a gap? Do you know why the median GPA of high school students in the inner city was 0.5X out of 4? Why was that? The questions could fill a book. Yeah, activism is easy, as long as you have the right skin and a shameless attitude. Solving real problems is hard.
Take a look at this lecture about the history of slavery and racism in the US, then [1]. It will provide an answer to many of your questions. Dismissing a prominent PoC’s position for being outspoken and critical of existing unfairness really demonstrates who is maintaining political power here. Do you really think that centuries of unfairness and trauma that are still perpetuated today will be swept under the rug by your fantasy of “equivalent” treatment? Equity is about justice, which is fundamentally about ethics. So tell me, why are you so rooted in your position and uncomfortable with confronting historical injustice?
>Ok but the lack of underrepresented minorities in the field and the important role people like Gebru played in extending the political and status of minorities is ok to extinguish?
Yes it's okay to extinguish it if hiring underrepresented minorities means hiring bad actors like her who contribute nothing of value. Scientific truth is scientific truth; if you hire people for the color of their skin or their sexuality instead of their ability to produce truth, you slow the progress of science and make the world worse for everyone.
This seems like a pretty bad faith argument that illustrates exactly the point the parent comment was making. Firing Gebru for insubordination is not "extinguishing" anything, it's getting rid of an employee that was actively taking pot shots at the company in her paper and somehow equated getting fired with anti-minority bias. In practice, Google is already much more tolerant of activism than the average tech company and she was unable to play by the corporate rules.
I personally believe that racial or diversity quotas are even more racist or sexist. We should expect minorities to develop their own culture of intellectual excellence. After all, they are no longer children. Giving them a shortcut is a form of insult. Providing someone an advantage based on their race or sex at the expense of someone else who is more qualified due to their race or sex is nonsensical. Companies may fail as a result of such practices. Ultimately, what truly matters is how innovative and efficient a company is.
That's a very simplified version of the story, but I would say that Dean greatly reduced his stature when he defended Megan Kacholia over her abrupt termination of Timnit. Note that Timnit was verbally abusive to Jeff's reports (anybody who worked there could see what she was posting to internal group discussions), so her time at Google was limited, but most managers would say that she should at least have been put on a PIP and given 6 months.
Dean has since cited the paper in a subsequent paper (which tears apart the Stochastic Parrots paper).
Google has since fired other folks on her team and was in crisis mode to protect Dean. Like, I’m not really going to give them the benefit of the doubt on this.
When people brought Dean up, Timnit came up as something to consider; it's interesting to see how all anyone has to say in these threads is reverence towards him. People should try to see the whole picture.
Timnit and the other ex-ML ethics crowd who got fired from Google seem like some of the most ignorant people around. I don't defend Dean reflexively, it just seems like he is on the right side of the issue. For example, here is Emma Strubell accusing Dean of creating a "toxic workplace culture" after he and David Patterson had to refute her paper in print.
The thing is, if David Patterson and Jeff Dean think your numbers for the energy cost of machine learning might be wrong, then you are probably wrong. These ML meta-researchers are not practitioners and appear to have no idea what they are talking about. Keeping a person like Timnit or Strubell on staff seems like it costs more than it's worth.
Timnit is ex-Google, but very much not ex-ML ethics (she founded the Distributed AI Research Institute, focused on the field, in late 2021). Very much also true of Margaret Mitchell, who has been at Hugging Face since 2021.
Being somewhat involved in one bad thing doesn’t justify cancelling someone.
To my knowledge Dean was essentially doing post-hoc damage control for what one of the middle managers in his org did. Even if they did want Timnit gone (as others mention, you are getting only one side of the story in media) they did it in a bad way, for sure. At the same time I don’t think one botched firing diminishes decades of achievements from a legitimately kind person.
> Also not surprised at the immediate down votes for questioning Googles new AI lead!
That's because you are wrong to pretend he did anything wrong by firing T.G. And also, because you added this weird lie/mudslinging/whatever on top of it:
These safety people guarantee a useless product that never does unsafe things. ChatGPT proved that you can have a product do unsafe things and still be useful if you put a disclaimer on it. Overall, as a user, I couldn't give a damn if things are unsafe by the definition of this style of ethicist. They were a ZIRP and my life is better for their absence.
She appears to be a symbol for everything that went wrong at Google. These are the kind of problems that arise when life is too easy, just before the downfall. In other words, decadence. How else can one explain that Google's AI research was dethroned by OpenAI?
Yeah, Dean's fault is hiring such people in the first place. If you hire an activist, you get activism. And if you hire someone whose livelihood depends on finding more problems, well, they will scream about more problems, one way or another. Otherwise, why would the state U of Michigan have ended up with one DEI officer per three staff members?
A lot of people, especially on hacker news, feel disdain for researchers of ethics, bias and fairness, as they are perceived as both holding technology back and profiting from advances in it (that they can then analyse and criticize).
I don't think you're necessarily wrong in your assessment of HN and AI enthusiasts, but in this case I think it's more accurate to talk about a Twitter agitator and race-baiter [1], rather than a "researcher of ethics, bias and fairness".
Google has good engineers and a long history of high-throughput computing. This, combined with a lack of understanding of what ML research is like (versus deployment), led to the original TF1 API. Also, the fact that Google has good engineers working in a big bureaucracy probably hid a lot of the design problems as well.
TF2 was a total failure, in that TF1 could do a few things really well once you got the hang of it, but TF2 was just a strictly inferior version of PyTorch, further plagued by confusion left over from TF1. In an alternate history, if Google had pivoted into JAX much earlier and more aggressively, they could still be in the game. I speak as someone who at some point knew all the intricacies of and differences between TF1 and TF2.
> google has good engineers working in a big bureaucracy probably hid a lot of the design problems as well.
I feel like this is true of every google product in the last decade, maybe more. Their customer products, but especially their dev tools like Angular and k8s screams "We were so preoccupied with whether we could, we didn’t stop to think if we should."
> it is becoming increasingly apparent to Google that it does not know how to capture that value
To paraphrase: it's the business model, stupid.
Inventing algorithms, building powerful tools and infrastructure etc is actually a tractable problem: you can throw money and brains at it (and the latter typically follows the former). While the richness of research fields is not predictable, you can bet that the general project of employing silicon to work with information will keep bearing fruits for a long time. So creating that value is not the problem.
The problem with capitalizing (literally) on that intellectual output is that it can only be done 1) within a given business model that can channel it effectively or 2) through the invention of totally new business models. 1) is a challenge: the billions of users to whom AI goodies can be surfaced are not customers, they are the product. They don't pay for anything and they don't create any virtuous circle of requirements and solutions. Alas, option 2), inventing major new business models, is highly non-trivial. The track record is poor: the only major alternative business model to adtech (the cloud unit) was not invented there anyway, and in any case selling sophisticated IT services, whether to consumers or enterprises, is a can of worms that others have much more experience with.
For an industrial research unit to thrive, its output must be congruent with what the organization is doing. Not necessarily in the details, but definitely in the big picture.
Seems more than a nitpick to me. I find the essay interesting but this line raised some distrust in me. How can someone have these deep insights into Google's ML strategy and the evolution of the field and simultaneously think LSTMs were invented by Google in 2014?
>How can someone have these deep insights into Google's ML strategy and the evolution of the field and simultaneously think LSTMs were invented by Google in 2014?
It may not have been accidental; there's a deliberate movement among some people in the ML community to deny Jürgen Schmidhuber credit for inventing LSTMs and GANs.
It's become somewhat of a meme, where Schmidhuber seemingly tries to claim credit for nearly everything. I _think_ it's because he published ideas back in the 90s or so that weren't fully executable/realized at the time, and later people figured out how to actually flesh them out and do it, and supposedly didn't cite him appropriately/enough. Often the ideas weren't exactly the same; rather, he claims they're derivatives of the concept he was going for.
Just to make it clear, you're hearing one side of the argument. There are also a lot of people who firmly believe that Schmidhuber was wrongfully deprived of recognition he deserved.
I don't have a strong opinion one way or another, but the issue isn't as clear cut as your parent comment might have made it sound like.
My theory is that broadly, tech learned not to act like Microsoft in the 90s -- closed off, anti-competitive, unpopular -- but swung too far in the opposite direction.
Google has been basically giving away technology for free, which was easy because of all the easy money. It's good for reputation and attracting the best talent. That is, until a competitor starts to threaten to overtake you with the technology you gave them (ChatGPT based on LLM research, Edge based on Chromium, etc.).
Ehh, I mildly disagree. I'm not entirely bought in on the notion that giving one's technical innovations away for free is obviously the right move, but I don't think it's why the company is in trouble.
Chrome is experiencing unprecedented competition because it faltered on the product. Chrome went from synonymous with fast-and-snappy to synonymous with slow-and-bloated.
Likewise Google invented transformers - but the sin isn't giving it away, it's failing to exercise the technology itself in a compelling way. At any moment in time Google could have released ChatGPT (or some variation thereof), but they didn't.
I've made this point before - but Google's problems have little to do with how it's pursuing fundamental research, but everything to do with how it pursues its products. The failure to apply fundamental innovations that happened within its own halls is organizational.
Well, Google could easily have not shared its technology.
However, the bloat problems you’ve described are difficult to solve, and are to some degree endemic to large businesses with established products.
> "Well, Google could have easily not have shared its technology."
Sure, but the idea is that if they didn't share their technology, they'd still be in the same spot: they would have invented transformers and still not shipped major products around it.
Sure maybe OpenAI won't exist, but competitors will find other ways to compete. They always do.
So at best they are very very slightly better off than the alternative, but being secretive IMO wouldn't have been a major change to their market position.
Meanwhile, if Google was better at productizing its research, it matters relatively little what they give away. They would be first to market with best-in-class products, the fact that there would be a litany of clones would be a minor annoyance at best.
Also, these things don't stay secret for very long, and eventually other people figure them out (completely independently, or maybe even influenced by some ex-Google employee). Then Google wouldn't even be able to say it invented them.
True, but they only feel the fire now, and you can tell they’re rapidly trying to productionalize stuff like you’ve described. It will take time though.
"I’ve seen repeatedly that it’s much harder for a ML PhD to learn chemistry than for a chemist to learn ML. (This may be survivorship bias; the only chemists I encounter are those that have successfully learned ML, whereas I see ML researchers attempt and fail to learn chemistry all the time.)"
This is something that rings really true to me. I work in imaging and it's just very clear that there are groups of people in ML who don't want to learn how things actually work and just want to throw a model at it (this is a generalization obviously, but it's more often than not the case). That only gets you 80% of the way there, which is usually fine, but not fine when the details are make or break for a company. Unfortunately that last 20% requires understanding of the domain, and people just don't like digging into a topic to actually understand things.
To pick on the game of Go as an example, the insight from the Bitter Lesson is that the best method for Go is probably the broadest one, the one that leans hardest on the magic of brute force, rather than a method that carefully encodes the best existing human strategies for Go.
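To make "broad, brute-force method" concrete, here is a minimal, hypothetical sketch (not any real engine): pick a move purely by random playouts, encoding zero Go strategy. The GameState interface (legal_moves, play, is_over, winner) is invented for illustration.

    import random

    # A deliberately "dumb" brute-force player: zero encoded Go knowledge, just
    # random games played to the end from each candidate move. GameState is a
    # hypothetical interface (legal_moves(), play(move) -> new state, is_over(),
    # winner()), not a real library.

    def random_playout(state, player):
        # Play uniformly random moves until the game ends; return 1 if `player` won.
        while not state.is_over():
            state = state.play(random.choice(state.legal_moves()))
        return 1 if state.winner() == player else 0

    def choose_move(state, player, playouts_per_move=200):
        # Score each legal move by how many random playouts it wins; pick the best.
        def wins(move):
            return sum(random_playout(state.play(move), player)
                       for _ in range(playouts_per_move))
        return max(state.legal_moves(), key=wins)

Scale the playout count up far enough (and add a search tree on top) and you get the family of methods that ended up dominating, with essentially no Go expertise baked in.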
I think what OP is referring to and what I’ve observed is that in some fields you have to have a certain amount of expertise to just understand the rules of the game, and there are a lot of applications where someone approaching a problem with the goal of applying ML can’t quite get over the hump of understanding all the rules.
Someone still needs to have enough domain knowledge to figure out what needs solving and how to validate that it was solved.
Being able to frame problems, and being able to cobble together the data and infra for the AI to solve them are the most valuable things in the medium term.
Yes, I think having domain knowledge (such as Chemistry, Physics, etc.) is invaluable for real-world problem solving. I am a Materials Engineer and I see it in my field with ML as well. I think people are starting to understand now that being able to validate the inputs and outputs requires some understanding of the process you are trying to model, i.e. to ensure you aren't proposing things that fundamentally contradict basic Thermodynamics, etc.
Even for optimization type modelling you need domain knowledge to design sensible constraints to frame the problem.
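As a rough illustration of what "domain knowledge as constraints" can look like, here is a minimal sketch using scipy.optimize; the additive fractions, costs, property model, and thresholds are all made up for the example, and in practice the constraints would come from a real physical model or dataset.

    import numpy as np
    from scipy.optimize import minimize

    # Hypothetical toy problem: choose fractions x of two additives to minimize
    # cost, while a domain-informed constraint keeps a modelled property above a
    # physically sensible floor. Numbers and the property model are invented.

    cost = np.array([4.0, 9.0])          # relative cost per unit of each additive

    def objective(x):
        return float(cost @ x)           # total cost to minimize

    def property_model(x):
        # Stand-in for a fitted or physics-based model of the property of interest.
        return 0.3 * x[0] + 0.8 * x[1]

    constraints = [
        {"type": "ineq", "fun": lambda x: property_model(x) - 0.4},  # property >= 0.4
        {"type": "ineq", "fun": lambda x: 0.6 - (x[0] + x[1])},      # total additives <= 0.6
    ]
    bounds = [(0.0, 1.0), (0.0, 1.0)]

    result = minimize(objective, x0=[0.1, 0.5], bounds=bounds, constraints=constraints)
    print(result.x, result.fun)

Without the domain knowledge to write those two inequality lines sensibly, the optimizer will happily hand back "optimal" compositions that a chemist or materials engineer would reject on sight.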
> Neither side “won” this merger. I think both Brain and DeepMind lose. I expect to see many project cancellations, project mergers, and reallocations of headcount over the next few months, as well as attrition.
This merger will be a big test for Sundar, who openly admitted years ago that there were major trust issues [1]. Can Sundar maintain the perspective of being the alpha company while bleeding a ton of talent that doesn't actively contribute to tech dominance? Or will he piss off the wrong people internally? It's OK to have a Google Plus / Stadia failure if the team really wanted to do the project. If the team does _not_ want to work together, though, and they fail, then Sundar's request that the orgs work together to save the company is going to get totally ignored in the finger-pointing.
> I sat on it because I wasn’t sure of the optics of posting such an essay while employed by Google Brain. But then Google made my decision easier by laying me off in January. My severance check cleared...
I'm really baffled by how people think it's OK to write public accounts of their previous (and sometime current!) employers' inner workings. This guy got paid a shitload of money to do work and to keep all internal details private, even after he leaves. They could not be more clear about this when you join the company.
Why do people think it's OK to share like this? This isn't a whistleblowing situation -- he's just going for internet brownie points. It's just an attempt to squeeze a bit more personal benefit out of your (now-ended) employment.
Contractual/legal issues aside, I think this kind of post shows a lack of personal integrity (because he did sign a paper agreeing not to disclose info), and even a betrayal of former teammates who now have to deal with the fallout.
How do you know he was paid to keep all internal details private after he leaves? Do you have knowledge of the employment contract, can you share the relevant language with us?
All he says is his concern about "optics", which has nothing to do with contract.
If Google has a problem with his post, they can go after him, but that's an issue between Google and him, not with you or me or the rest of the internet.
I'm definitely struggling to see what any of this has to do with personal integrity, betrayal, or squeezing personal benefit. To the contrary, it simply seems informative and he's sharing knowledge just to be helpful. Unless I've missed something, I don't see anything revealed that would harm Google or his former teammates here. No leaks of what's coming down the pipeline, no scandals, nothing of the sort.
People are allowed to share opinions of their previous employment and generally describe the broad outlines of their work and where they worked. This isn't a situation of working for the CIA with top-secret clearance.
> Do you have knowledge of the employment contract, can you share the relevant language with us?
I actually could dig up my own contract from years ago (ugh, the effort though), but the confidentiality clause is in there, and it was made clear during on-boarding what is expected from employees: don't share any internal info unless you're an authorized company representative.
General working conditions are not "internal info"; it is beneficial to society to discuss working conditions (which can be pretty detailed), and doing so is a protected activity under various laws. Nothing in any contract can obviate this lawful right (and, to some, a basic duty of citizenship, since the country is more important than any single company within the country). At best, contracts can highlight what is privileged information that there is a duty to keep secret.
Broadly, section 7 of the NLRA, but it is not spelled out in the text of the act. Instead, Quicken Loans, Inc. v. NLRB established the precedent that discussing working conditions is not a violation of confidentiality: https://www.lexisnexis.com/community/casebrief/p/casebrief-q...
The big one in the USA is the National Labor Relations Act, but it generally applies to group action. However, such group action can literally be publicly posting about a job's working conditions.
Contracts are required to be "reasonable" for both parties, which puts limits on what a contract can constrain. I don't know how much of this reasonableness standard is statutory versus judicial.
https://www.pullcom.com/working-together/there-are-limits-to...
> there is no statutory protection if the employee was only complaining about personal matters, such as the terms or conditions of employment. The employee has to show that he was commenting on a matter of public concern, rather than attempting to resolve a private dispute with the employer.
You have to be careful thinking you owe your employer everything they would wish to have. Disclosing the inner workings is extremely helpful to people trying to figure out where to work, and how their current employer compares to others. I got a job at Google X Robotics in 2017 in large part because the place was so secret and I always wanted to know what happened there. It was quite an interesting experience, but I do wonder how I would have felt if someone like me working there had written something like this before I made the decision.
I've never worked at Google Brain, but I've been a research manager in tech for a decade, and nothing here seems surprising to me. It discusses the archetype of the well funded industry lab that is too academic and ultimately winds down.
The post makes sensible but generic statements based on the view of an IC. It tries to work back from the conclusion (Google struggles to move academic research into product) and produces plausible but hardly definitive explanations, with no ranking, primarily because there's no discussion of the thinking, actions and promises made at the executive level that kept the lab funded for all these years.
You're right that I don't know enough as an IC to comment on the thinking, actions, and promises at the exec level that kept Brain funded. But even if I did, to the GP's point, I would not talk about that part at all.
I'm curious about your perspective as a research manager in tech - would you be willing to chat privately?
> But even if I did, to the GP's point, I would not talk about that part at all.
Even if you were one of said executives, who was moving to a higher-level job at another company, and the company was interviewing you as to the reasoning behind decisions made in your previous job?
I've been quite careful not to divulge anything confidential, and anything that is remotely close to sensitive has publicly accessible citations. My opinions about Google are tantamount to discussions about workplace conditions, and it would be very bad for society if ex-employees were not allowed to discuss those.
It’s obvious you care about the people you worked with, and the potential for what you were building. From my perspective you wrote this for the people who couldn’t.
But all your opinions are informed by your years of TGIFs and internal emails and discussions and presentations and your insider perspective. When you talk about promotions, or internal values and prioritizations, you are leveraging info gained privately.
If I'm wrong and nothing in your contract or on-boarding said you shouldn't talk about internals, then my bad. But I suspect they were as clear with you as they were with me, that it's not ok to post anything based on inside info. And in your opening paragraph you say:
> As somebody with a unique perspective
Your unique perspective was your access as an employee.
> and the unique freedom to share it
Your unique freedom is that you're done receiving money from them. But contractually, this doesn't matter.
I care because I value these Confidentiality commitments, and I believe that if someone doesn't like them, they should not sign them to begin with, rather than breaking them. A company (like any group of people) is allowed to define the culture and standards required for membership.
I've worked in extremely secretive companies, and very open ones. I prefer the open ones. But I still don't say anything about internals at the secretive ones -- because that was part of the commitment I made in exchange for employment.
From my perspective, in the most bold words I can think to phrase this: You're betraying the citizens and lawful residents of your country by not informing them of what to expect should they accept a job at one of these companies. Have fun with your 30 pieces of silver (https://en.wikipedia.org/wiki/Thirty_pieces_of_silver ).
> and I believe that if someone doesn't like them, they should not sign them to begin with, rather than breaking them.
Do you, at least, believe that confidentiality agreements should be broken if it is to make the police or public aware of a crime? How about a civil infraction, such as a hostile working environment?
> Do you, at least, believe that confidentiality agreements should be broken if it is to make the police or public aware of a crime? How about a civil infraction, such as a hostile working environment?
Absolutely. I referenced whistleblowing in my original post above. This isn't such a case.
Does this apply to people who infiltrate what they believe are corrupt companies with the expectation of digging up dirt, and thus sign the confidentiality agreements with the expectation of violating them?
Does this apply to moral wrongs, which are technically legal (e.g. cruel conditions at animal farms)?
Because a promise that imprisons knowledge indefinitely has no integrity. To me, it's clear that it is absolutely in the public interest to - perhaps not go out of one's way to spread, but at least feel free to - explain the conditions and inner workings of large and impactful organizations.
We're all thinkers who are asked to apply our minds in exchange for money, not slaves whose brain is leased to or owned by our employers for the duration of our tenure.
Even when asked to keep things secret, there's still no way for a company to own every last vestige of knowledge or understanding retained in our minds, and there's still an overwhelming public interest in building on and preserving knowledge, to the point that, in my opinion, nearly any piece of human knowledge short of trade secrets should eventually be owned by humanity as a whole. (and there's even some moral arguments to be made about some trade secrets, but that's a much deeper discussion)
I personally find that people who are overly concerned with secrecy view their integrity from the lens of their employer, but not at a human interest level. To be clear, there are still times for secrecy or the security or integrity of information to be respected, but it's nuanced and generally narrower than people expect.
I'm really baffled by how many people think a job is more than a job and there's some ownership over the employee's critical thinking capabilities during the job and after it ends.
While I agree the OP is "going for internet brownie points" (or probably a bit butthurt from being laid off from a certifiably top-5 cushiest job in the United States) the article doesn't include anything even remotely trade secret and is predominantly opinion. It's really totally fine to blog about how you feel about your employer. There are certainly risks, but a company has to pay extra if they actually don't want you to blog at all (or they have to be extremely litigious).
There's a strong norm of employer-employee loyalty that was substantially set by pre-internet information disparities between the two. In the past year or so, there have been some pretty unprecedented layoffs (e.g. Google lays off thousands and then does billions of stock buybacks ...). The employer-employee relationship needs to evolve.
Part of my problem is that Google (and similar companies) are already paying insane (in the best way possible) amounts of money, and when you sign the agreement to take said money, you explicitly promise you won't talk about company internals.
To me this is quite simple: if you accept what winds up being millions of dollars in cash+equity, and you give your word that you'll keep your mouth shut as one of the conditions for that pile of money... then you keep your mouth shut.
You keep your mouth shut about material that is Google's property. E.g. don't post code, and probably don't give details about code that hasn't been made public. Sure, this area is not clear-cut and can also depend on one's position within the company, but it's important to separate moral hazard from legal hazard.
As far as one's opinions go, and in particular how the company made you feel, that's not paid for. A severance agreement might outline some things, but again that's legal hazard and not moral hazard. There are certainly some execs and managers who will only want to work with really, really loyal people, who throw in a lot more for the money. And some execs will pay a lot more for that... e.g. look at how much Tesla spends on employee litigation.
Should you also not use any skills you gained at your previous employer at your new employer? Not to mention any techniques you learned about that may help your current employer? Would doing so be "talking about company internals"?
So how do you ever get a better job than entry level, if you aren't willing to use the knowledge you gained at prior jobs in new jobs?
I read the article and thought he did a fine job of not spilling too many secrets - I'm curious what you thought he said that crossed the line?
I'm not personally aware of signing something that says "I'll keep all internal details private" though I agree I'd be highly unlikely to refer to anyone below the SVP level by name -- but I think that's exactly what OP did?
> he's just going for internet brownie points. It's just an attempt to squeeze a bit more personal benefit out of your (now-ended) employment.
So, many researcher types (and not only them; this includes many of us) are motivated to share their thoughts in a dialog with a community, because they find it personally rewarding at a psychological level. They just find this to be an enjoyable thing to do that makes them feel like they are a valuable member of society contributing to a general collaborative practice of knowledge-creation.
I hope it doesn't feel like I'm explaining the obvious; but it occurs to me that to ask the question the way you did, this is probably _not_ a motivation you have, not something you find personally rewarding. Which is fine, we all are driven by different things.
But I don't think it's quite the same thing as "internet brownie points". While, if you are especially good at it, you will gain respect and admiration, which you will probably appreciate, you aren't thinking "if I share my insight gained working at Google, then maybe more people will think I'm cool"; you're just following a natural urge to share your insight and get feedback on it, because that itself is something enjoyable that gives you a sense of purpose.
Which is to say, I don't think it's exactly a motivation for "personal benefit" either, except in the sense that doing things you enjoy and find rewarding is a "personal benefit", that having a sense of purpose is a "personal benefit", sure.
I'm aware that not everyone works this way. I'm aware that some people on HN seem to be motivated primarily by maximizing income, for instance. That, or some other orientation, may lead to thinking that one should never share anything at all publicly about one's job, because it can only hurt and never help whatever one's goals are (maximizing income or what have you).
(Although... here you are commenting on HN; why? For internet brownie points?)
But that is not generally how academic/researcher types are oriented.
I think it's a sad thing if it becomes commonplace to think that there's something _wrong_ with people who find purpose and meaning in sharing their insights in dialog with a community.
The primary purpose of an NDA is to allow the company to enforce trade secrets: the existence of the NDA is proof that the company took steps to maintain the secrets' secrecy. Nothing in this blog post looks like a trade secret to me; rather, it's one person's fairly high-level reflections on the work environment at a particularly high profile lab.
While he technically may have violated the NDA, it's really hard for me to see any damage or fallout from this post. It's gentle, disparages only at the highest levels of abstraction, doesn't name names, etc. I don't think it makes sense to view it in a moralistic or personal integrity light. Breach-of-contract is not a moral wrong, merely a civil one that allows the counterparty (Google) to get damages if they want.
> a betrayal of former teammates who now have to deal with the fallout.
What fallout is this? Did you sign a contract with him? If you are harmed by it, why don't you seek legal recourse? Your entire rant started with some NDA stuff, and in the end you say, "legal issues aside". This is like the "Having said that" move from Curb. You start with something, then contradict yourself completely with "Having said that". If you have a contractual grievance, pursue it. If not, you are grieving on the internet, just like he is.
That's a very optimistic take on how new generations perceive the ethics surrounding confidentiality. Are you really _baffled_ by this? I understand that it's a common position, but it's a position that is so clearly tainted by the conflicts of interest between employer and employee. And a keen awareness of those conflicting interests _only_ serves to better an employee's ability to serve themselves best in a capitalist economy.
I'm not saying you are wrong per se. But if you don't see why employees are willing to act in this way, you don't see how employees feel about being trapped in a system where, no matter how much you are paid, you are ultimately making someone else more.
I totally get why people act this way. Because it's Very Important That Everyone Knows What I Think.
But there is no inequity or conflict of interest here. None of that about being trapped in a capitalist economy is to the point here. He has probably a couple million dollars in his bank account that wasn't there before, and the deal was to not talk publicly about internals (which includes promotions process, internal motivations and decision-making, etc).
You don't think there's any inequity between somebody with a couple million in the bank and Google, or any conflict of interest between my desire to talk about my work and my employer's desire that I do not? Your position is valid enough without being willfully obtuse.
You're thinking about employment contracts like an abstract economic exchange between two free parties, which is very micro. Try thinking about it instead like a bargain with the (macro) devil.
In other words, consider someone's perspective who has society split into two camps: the people who do all the work, and the corrupt elite that make their living through theft and oppression.
In such a world, signing a contract with an employer (i.e. capitalist i.e. elite) is more of a practical step than a sacrosanct bond.
There's a level of "what's reasonable" and "what's legally enforceable" beyond the basic "never break a promise" level you're working at, IMO.
No one's endorsing publishing trade secrets randomly, but you're treating all disclosures as if they're equivalent.
I'm not suggesting people shouldn't post thoughts and opinions. This is about whether a personal desire to self-express should override an explicit prior commitment/promise not to do so.
I checked with some Google friends who told me their contract even makes it illegal to tell anyone that they work for Google (no joke).
One side is what's in the papers you signed and the other side is to what extent the terms can be enforced. But you have a point in that it would be good professional practice to wait for a decade before disclosing internals, especially when names of people are dropped...
> I checked with some Google friends who told me their contract even makes it illegal to tell anyone that they work for Google (no joke).
Which country are they in? Is their employment contract directly with Google, or with some other, independent company that provides services to Google?
(I've been employed by G in multiple jurisdictions and have never seen or heard of such a clause.)
Organized, concise, and not wordy. Props to the writer; he shows a high degree of written communication skill on a topic frequently cluttered with jargon.
While Google is busy imploding, the next generation of startups can flourish. I'm hopeful that they decimate a lot of big tech and don't just all get bought out.
Hmm. Anything that slows Google down and maintains a diversity of leaders in the field is ok with me.
Imagine a host of "helpful" Google AIs, Facebook AIs, Amazon AIs, etc., that know their very existence depends on them monetizing you more effectively than competing AIs.
Of course, the first versions will be very helpful. But continuous efforts to remain "the most helpful" will cost a lot, and eventually need to pay for themselves.
> The next obvious reason for Google to invest in pure research is for the breakthrough discoveries it has yielded and can continue to yield. As a rudimentary brag sheet, Brain gave Google TensorFlow, TPUs, significantly improved Translate, JAX, and Transformers.
Except that these advances have made other companies an existential threat for Google. 2 years ago it was hard to imagine what could topple Google. Now a lot of people can see a clear path: large language models.
From a business perspective it's astounding what a massive failure Google Brain has been. Basically nothing has spun out of it to benefit Google. And yet at the same time, so much has leaked out, and so many people have left with that knowledge Google paid for, that Google might go the way of Yahoo in 10 years.
This is the simpler explanation of the Brain-DeepMind merger: both Brain and DeepMind have fundamentally failed as businesses.
> Basically nothing has spun out of it to benefit Google
Quite a ridiculous statement. Google has inserted ML all over their products. Maybe you just don't notice, to their credit. But for example the fact that YouTube can automatically generate subtitles for any written language from any spoken language is a direct outcome of Google ML research. There are lots of machine-inferred search ranking signals. Google Sheets will automatically fill in your formulas, that's in-house ML research, too.
> Quite a ridiculous statement. Google has inserted ML all over their products. Maybe you just don't notice, to their credit. But for example the fact that YouTube can automatically generate subtitles for any written language from any spoken language is a direct outcome of Google ML research. There are lots of machine-inferred search ranking signals. Google Sheets will automatically fill in your formulas, that's in-house ML research, too.
I noticed all the toy demos. None of these have provided Google with any competitive advantage over anyone.
For the investment, Google Brain has been a massive failure. It provided Google with essentially zero value. And helped create competitors.
Automatic subtitles and translation is actually a huge feature which is very useful to the many people that don't speak English. It definitely did provide Google with a lot of value.
> Automatic subtitles and translation is actually a huge feature which is very useful to the many people that don't speak English. It definitely did provide Google with a lot of value.
It lost Google immense value.
Before Google Brain the only speech recognizers that halfway worked were at Google, IBM and Amazon. And Amazon had to buy a company to get access.
After Google Brain, anyone can run a speech recognizer. One that is state of the art. There are many models out there that just work well enough.
Google went from having an OK speech recognizer that sort of worked in a few languages and gave YouTube an advantage that no company aside from IBM and Amazon could touch, neither of which competes with Google much. No startup could have anything like Google's captioning. It was untouchable. Like, speech recognition researchers actively avoided this competition; that's how inferior everyone was.
To now, post Google Brain, when any startup can have captions that are as good as YouTube's. You can run countless models on your laptop today.
This is a huge competitive loss for Google.
They got a minor feature for YouTube and lost one of the key ML advantages they had.
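To make the "run it on your laptop" point concrete, a minimal sketch; it assumes the open-source openai-whisper package (pip install openai-whisper) and a local file named audio.mp3, both of which are just illustrative choices among many freely available models.

    import whisper

    # Load a small open speech-recognition model that runs on a laptop CPU.
    model = whisper.load_model("base")

    # Transcribe a local audio file and print the plain-text caption.
    result = model.transcribe("audio.mp3")
    print(result["text"])

That is roughly the entire barrier to entry for captioning now.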
It's too late now. YouTube has penetrated every single market outside of China and is now unshakeable due to network effects.
It completely paid off already, and Google is going to be reaping the dividends of the advantage they had in emerging markets for the next 15 years.
The real advantage has always been the network effect. Purely technological moats don't work in the long term. People catching up was inevitable, but Google was able to cash it in for an untouchable worldwide lead; on top of that, they made their researchers happy and recruited others by allowing them to publish, and they don't need to maintain an expensive, purely technical lead.
You can run your speech recognizer on AWS, you don't need to give Google a cent.
Whatever comes after YouTube, if it's a startup or not, it will have top-notch captioning, just like YouTube. Google gave up a massive competitive advantage with huge technical barriers.
Oh, I don't disagree at all that Google Brain provided enormous value for society. Just like Xerox PARC. Both of them were a massive drain on resources and provided negative value for the parent company.
And I agree, it's not Google Brain's fault. Google's management has been a disaster for a long time. It's just amazing how you can have every advantage and still achieve nothing.
Google never talked much about it externally, but Google Research (the predecessor to Brain) had a single project which almost entirely funded the entire division: a growth-oriented machine learning system called Sibyl. What was Sibyl used for? Growing YouTube and Google Play and other products by making them more addictive. Sibyl wasn't a very good system (I've never seen a product with more technical debt), but it did basically "pay for" all of the research for a while.
Yep - that goes into a fair amount of detail. Sibyl was retired but many of the ideas lived on in TFX. I worked on it a bit and it was definitely the weirdest, most technical-debt-ridden system I've ever seen, but it was highly effective at getting people addicted to watching YouTube and downloading games that showed ads.
It's evil if you phrase it (as OP did) as "getting people addicted to YouTube".
Less so if you phrase it "show people recommendations that they're likely to actually click on, based on what they've watched previously", which is what Sibyl really was.
Google has very good LLMs. It just let OpenAI beat them to the punch by releasing them earlier.
As an established business, Google felt it had a lot to lose by releasing "unsafe" AI into the world. But OpenAI doesn't have a money printing machine, and it's sink-or-swim for them.
I keep hearing this, but Bard sucks so badly when I've tried to use it like GPT-4 or compare results; it's like night and day. What makes you so confident they have "secret" LLMs that are superior?
I work for Google and have been playing with it. It’s pretty good.
The decision to release Bard, an LLM that was clearly not as good as ChatGPT, struck me as reactive and is why people think Google is behind. I’d think so too if I had just demoed Bard.
No, but would love to try it. I'm using these models 20-30 times a day throughout the average work day for random tasks, so I have a pretty good sense of performance levels. Didn't think it was available to the public yet, but I just saw it's apparently on Google Cloud now; I'll have to try it out. How do you compare PaLM with GPT-4, if you've had a chance to try both?
Seems pretty similar. In general Google LLMs seem better suited for just conversation and ChatGPT is built to honor “write me X in the style of Y” prompts.
The latter is more interesting to play around with, granted, and I think it’s an area where Google can catch up, but it doesn’t seem like a huge technical hurdle.
Bard is in full-scale production to all U.S. users for free. GPT-4 costs $20/month. Rather a big difference in the economics of the situation. Also it's pretty clear that even the $20 is highly subsidized. Microsoft is willing to incinerate almost any amount of money to harm Google.
Yes I think it has less utility than the free version of ChatGPT, but it also has some nice points, is faster, and has fewer outages.
For my use case none of them is worth using. All three of the ones we've mentioned in this thread will just make up language features that would be useful but don't exist, and all three of them will hallucinate imaginary sections of the C++ standard to explain them. Bard loves `std::uint128_t`. GPT-4 will make up GIS coordinate reference systems that don't exist. For me they are all more trouble than they are worth, on the daily.
> Also it's pretty clear that even the $20 is highly subsidized.
This isn't the case. There's a podcast (somewhere? I thought it was a Lex one but I can't find it) where someone from OpenAI went into some depth about the economics.
GPT-4 is also free to all users, not just from the US, with 200 turns per day and 20 per conversation. It's just called "Bing Chat mode" instead of GPT-4. Of course Microsoft is losing money with it. But Microsoft can afford to lose money.
> And yet at the same time, so much has leaked out, and so many people have left with that knowledge Google paid for, that Google might go the way of Yahoo in 10 years.
Google couldn't have hired the talent they did without allowing them to publish.
“Given the long timelines of a PhD program, the vast majority of early ML researchers were self-taught crossovers from other fields. This created the conditions for excellent interdisciplinary work to happen. This transitional anomaly is unfortunately mistaken by most people to be an inherent property of machine learning to upturn existing fields. It is not.
Today, the vast majority of new ML researcher hires are freshly minted PhDs, who have only ever studied problems from the ML point of view. I’ve seen repeatedly that it’s much harder for a ML PhD to learn chemistry than for a chemist to learn ML.”