The Machine Learning Job Market (evjang.com)
265 points by sebg on April 25, 2022 | 259 comments



> The most important deciding factor for me was whether the company has some kind of technological edge years ahead of its competitors. A friend on Google’s logging team tells me he’s not interested in smaller companies because they are so technologically far behind Google’s planetary-scale infra that they haven’t even begun to fathom the problems that Google is solving now, much less finish solving the problems that Google already worked on a decade ago.

^ this is increasingly a choice in your career -- 'where can you go to solve big problems', and 'big problems are increasingly complex'

scale is real, and tools matter. you can spend your whole project burn at the wrong company building something that you could buy somewhere else, or which already exists at a competitor

slight grain of salt here is that G's logging system, from my perspective as a gcp user, is slow as balls and the UX is the incarnation of scroll jank. and also this (very good) article led to the outcome of the author building soft hands happy-ending robots


I think this reasoning is flawed. Joining Google does not mean you get to solve interesting problems, because you will most likely contribute incrementally to a vast body of work. You want to build the next-gen database? You have to come up with something way better than BigTable and Spanner. You want to build a queue service? You've got to come up with something way better than Google's PubSub, which optimizes itself all the way down to the level of the RPC protocols. You want to build a machine learning framework? Have you checked out TensorFlow and JAX, and in particular their ecosystems? You want to build a file system? Are you sure you can overcome all the organizational inertia that Google has built on Colossus over the years? You want to build a 10X more productive framework for data processing? Are you sure you can beat Dataflow or the FlumeJava ecosystem inside Google?

The point is, Google is a mature company. Mortals like most of us don't get to break ground in a technically advanced but mature company like Google. Instead, we find fast-growing new problems to solve, to hone our skills, and to get to scale.

P.S., I personally know a number of prominent professors who used to work on the Borg project just to optimize for a few percent in gains. It's deep and interesting work, but nonetheless hard for mortals like me to get much out of.

That is, it's a more sure bet to work for a baby Google than to work for a middle-aged Google.


Yes, or work on things that are fundamentally at odds with Google's business model, or on systems that in some way are difficult for Google to do because they have company-specific constraints or legacy systems.


It’s definitely a tradeoff. I personally have gone through alternating phases where I wanted to be responsible for a huge piece of a small pie (startup) versus a tiny, tiny piece of a huge pie (big company). There are pros and cons to both. The big slice of the small pie feels cooler and more satisfying to my ego, but in the absolute sense the tiny piece of the big pie is probably making a bigger difference. A 1% improvement at some of these companies means north of billions of dollars and improving hundreds of millions of people's lives.


The absolute numbers may feel like you are improving the lives of billions, but a 1% increase in some metric that gives Google billions rarely helps millions and often hurts.


This might be true in some cases, but given the level of generality being discussed, how can you justify the “rarely” and “often”?

I agree that sometimes our actions have unintended consequences. These are often hidden effects, which makes it easier for someone to overlook, even when they are trying to do good. But that’s just a generic idea - the specifics matter.


On the other hand, there are a lot of real problems that real people actually deal with that just need a logistic regression to save a million bucks here and there. I like that space more.


This describes 95% of machine learning at FAANG+; unfortunately nobody likes to talk about it. Context: I work at a FAANG.


God, and it's such a snooze-fest.

Writing TFX code at Google is like having your soul-sucked through your rear-end! Imagine TF1 with all the broken APIs, but now it's all distributed! Fun.


> ... having your soul-sucked through your rear-end!

I laugh-snorted reading that! I am stealing that phrase.


Probably not the best analogy. Sounds like a good time.


Probably 99%. The people doing next-level stuff are largely being humored so the FAANGs can hire more early 20s grunts who think they're going to be working on ML (they won't be) or even given the time of day by leads on such projects (ditto). You give some famous professor a high-6 or low-7 figure salary in a position where his job is to publish papers--or not, because no one who matters in the company will ever read them--but you get incrementally more effort out of the youngsters who think they've got a chance (and they don't) of working on something more exciting than Jira tickets.

I don't think the FAANGs will ever come up with AGI and I am glad for that. If the private sector gets there first, I am going to "accidentally" die of hypothermia on a hiking trip because I will do anything not to live in their world.


This space isn't sexy to write about, there's no fame and glory in it.



Far sexier and larger than most care to admit.


If you have a few minutes, can you list a few of these problems? Just curious here!


I’m a consultant. I do not work in tech. Typical opportunities look at a decision that gets made many, many times. This includes systems where everything gets treated the same despite some 80/20 kind of situation, which is a lot of them. Lots of older businesses have these kinds of setups where stuff is run on gut instinct or decent-enough but risk-averse rules. Don’t think crazy neural net image recognition whatever. Really just look at what the business spends a lot of money on and think “could they do that smarter?”

A common thing I do is say company X has a fleet of Y assets. They repair them every N years. A good solution would be to predict which ones need repairs. Do those ones more often. Do the healthy ones less often. Pay more attention to the ones that are valuable.

Better outcomes, millions less in spending. You probably don’t even need a live model in prod. Just a semi-annual manual export to Excel for some planner guy who’s been keeping the schedule for 2 decades.
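A minimal sketch of what that can look like (file names, columns, and the one-year failure label are all hypothetical):

  import pandas as pd
  from sklearn.linear_model import LogisticRegression

  # Hypothetical export from the client's asset system: one row per asset,
  # labeled with whether it failed within a year of the snapshot.
  history = pd.read_csv("asset_history.csv")
  features = ["age_years", "hours_run", "past_faults", "asset_value"]
  model = LogisticRegression(max_iter=1000).fit(history[features], history["failed_within_1y"])

  # Score the current fleet and hand the planner a ranked list twice a year.
  fleet = pd.read_csv("current_fleet.csv")
  fleet["risk"] = model.predict_proba(fleet[features])[:, 1]
  fleet.sort_values(["risk", "asset_value"], ascending=False).to_csv("repair_priority.csv", index=False)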


I've heard about this, in different contexts. What it mainly comes down to is that incremental improvements can have massive impacts when you can apply them at a scale available at FAANG. I first read about this outside the context of machine learning, but it certainly would apply here.

For those of us who don't work at such scale, can you (maybe with a little fuzziness to avoid telling too much about an internal project) give a few examples of the kind of projects where a fairly simple model can have a 1M+ impact?


Here's a ~5 minute talk with 5 such examples (where relatively simple ML models made a 1M+ impact at a FAANG) :) https://youtu.be/zyOEOd1HkSY?t=946 Happy to talk about more details if you message me through my profile!


Thank you for the link! They were all interesting, and yes, all the result of having a high scale. For anyone curious and thinking about watching the video (I recommend watching it), the topics were 1) should you immediately re-run a failed ad payment (getting paid vs transaction costs/flagged for repeated billing), 2) should you send an IM immediately after a login failure (cost of text message vs possibility user will give up and not reset password), 3) should you fetch data for pre-loading in a web page (higher engagement with page vs cost of unnecessary loading), 4) video upload quality, 5) taking screen real estate for less commonly used UI features.
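To make example (1) concrete, the decision boils down to a small expected-value check; the numbers below are invented, and in practice the success probability would come from a simple model:

  def should_retry_payment(p_success, amount_usd, transaction_cost_usd=0.30):
      # Retry only if the expected recovered revenue beats the fixed retry cost.
      return p_success * amount_usd > transaction_cost_usd

  # Even a 2% chance of recovering a $40 charge clears a $0.30 transaction fee.
  print(should_retry_payment(p_success=0.02, amount_usd=40.0))  # True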

Interesting examples, and yes, they're all the kind of thing that might not justify the effort for an ML model (and might not have enough data to train) for a small website or operation, but can easily justify the cost and effort when you have a huge number of transactions.

On another note, this is why I often like lightning talks. So many people think that what they're doing falls below the threshold for what is an interesting presentation, when in fact it's the most relevant thing a lot of people will see at a conference.


Wow, of those five I'd only call 3) not evil, maybe 1). (based on the video, where the twisted reasons for them are explained.)


I’m a consultant. I actually do a very different kind of thing. Yeah, big tech hyper-optimizes content so that a tiny boost in engagement improves things for a billion users by a tiny amount, so the net effect is huge. I’m skeptical of it tbh. I think it ignores emergent effects and externalities over time.

What I refer to for my work is the low-hanging fruit. Old problems that businesses solve with manpower or overly generic rules. Something where just a little clarity can help them hone their efforts on the 80/20 of it all. I made a slightly more detailed post in an adjacent response.


Would you kindly tell us what types of projects make for such nice successes?


See adjacent responses


This has become increasingly important to me too. I am employed by a small (2-4 engineers at any time) company and I'm often disappointed because we're just so far behind in manpower & technical expertise that we have to dramatically reduce the scope of any problem we want to tackle.

On the other hand, I also worry about getting sucked into the bureaucracy of FAANG sized companies & not having any accountability or agency over what I work on. (I realize this is a sweeping generalization of FAANG, but some of my peers have had this experience even a few years into their jobs)


I'm surprised. Apart from DALLE, I haven't seen any AI approach that's off limits for 4 highly motivated people with 3090 GPUs.

At that compute level, you should be able to at least replicate SOTA in optical flow, structure from motion, speech recognition, text to speech, translation, text summary, sentiment analysis, image classifications, image segmentation, and of course playing video games or optimizing processes with reinforcement learning.

I mean thanks to KiCad even custom sensor hardware is cheap these days.

Can you give more details about what you tried to do and why that wasn't possible?


Pretty much everything in the high end large language model area is off limits to people without access to a supercomputer (we're talking hundreds of A100s or several $100k in cloud computing equivalent). Open Source efforts like BigScience may open up downstream tasks for normal people, but the forefront of this research is no longer accessible to individuals.


You might be surprised to hear that the KenLM language models that are used for speech recognition are actually trained on-disk using CPU. With a €149 monthly bare metal server, I could train my own LM on OSCAR DE and EN.
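For anyone curious, a rough sketch of the typical KenLM workflow (paths and the n-gram order here are placeholders, not necessarily what I used): counting runs on disk via the lmplz command-line tool, and the trained model can then be queried from Python:

  # Shell, CPU only (n-gram counting spills to disk):
  #   lmplz -o 5 -S 40% -T /tmp < corpus.txt > model.arpa
  #   build_binary model.arpa model.binary

  import kenlm

  model = kenlm.Model("model.binary")
  print(model.score("this is a test sentence", bos=True, eos=True))  # log10 probability
  print(model.perplexity("this is a test sentence"))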

Where I do agree with you is that transformer-style text generation models in the billion parameter range are off-limits for hobbyists. But that's only a tiny part of the useful applications of AI. And you can train them with gradient checkpointing, it's just 100x slower than what Google can do.
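A minimal PyTorch sketch of that gradient checkpointing trade-off (layer count and sizes are arbitrary): activations inside each segment are recomputed during the backward pass instead of being stored, cutting memory at the cost of extra compute.

  import torch
  from torch.utils.checkpoint import checkpoint_sequential

  model = torch.nn.Sequential(*[torch.nn.Linear(1024, 1024) for _ in range(24)])
  x = torch.randn(8, 1024, requires_grad=True)

  # Split the 24-layer stack into 4 checkpointed segments.
  out = checkpoint_sequential(model, 4, x)
  out.sum().backward()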


KenLM is not a neural network but a purely statistical n-gram model, so it's no surprise that it would be faster on a CPU in many cases. However, as soon as you have to deal with noisy data, KenLM gets blown out of the water by DL architectures like LSTMs and, more recently, Transformers. There's a reason why purely statistical models have seen very little progress in the last 10 years (KenLM was published 11 years ago), and that reason is that this "noise" is basically just a consequence of the central limit theorem applied to data with a huge amount of nuance - much more than any human-coded feature vector could ever account for.


If you want to train a StyleGAN, for instance, at 1024px image sizes to acceptable quality, you need either a lot of GPUs or a lot of time, or both.


That's an oddly specific example, because I was in fact training my own StyleGAN and using it to sell T-shirts around 2018 as indivicia.com. A GTX 1080 Ti was good enough for 300 DPI A4 prints (roughly 3500px on the longer edge): I just trained a regular low-res StyleGAN model and then a styled 4x upscaler. Execution was hand-coded multicore C++ and took about a second per user upload.


Very cool usage of StyleGAN! I think it’s a great idea, but maybe allow poster prints? I wouldn’t put a lot of these on a shirt, but family pics or nature photos would be nice to have stylized.

But back to my original point: training StyleGAN takes a lot of resources at higher resolutions. For something like thispersondoesnotexist on StyleGAN 3 you not only need lots of quality data, you also need many cards for many days.


Ah I wasn’t being clear, you’re actually correct when it comes to ML (which we use from time to time). My comment on the lack of manpower was mostly in regards to pure programming output or research efforts.


I see this argument a lot and it always feels like an over-literal interpretation.

Let me ask you a question -- what proves your ability to work on scale as an engineer? Is it the scale of the problem, or the scale of your solution?

Outside of "Google scale" logging is a nearly trivial, solved problem. If the only thing that creates the "challenges of scale" is your data cardinality, guess what. You're only solving problems of scale in the most literal (and maybe trivial) sense. Working on that at Google isn't going to prepare you for how to architect a platform for a startup that scales from 0 to 1 million active users over night without breaking a sweat.

I am biased in that I care about the latter kind of scale and effectively couldn't care less about the former, because the latter is generally an existential problem to have, while the former is a nice problem to have.


Not every problem is one of scale. And with the advent of serverless, problems relating to scalability will be largely abstracted away from 99% of developers in the future and will become more of a niche knowledge domain, just as the inner workings of the OS are largely not well understood by most developers.

Obviously the principles and theory behind scalability are still important for properly structuring your app, but there won't be many novel problems to solve, and the architecture choices will become increasingly obvious as time goes on.


Well, I'd say they won't fully be abstracted away because devs will still have to pay for scalable systems (hw + sw) with dollars.

Granted, they won't have to think about designing a solution, but they still won't have the computational power which can be afforded by larger companies (and cloud computing is ridiculously expensive), unless they have a lot of money.


Yes, Google has a quadrillion internet search queries, but that only helps if you're interested in solving problems that require a quadrillion internet search queries.


You are talking about 2 commodities that can easily be upgraded: logging and UX. Hell, I wouldn’t even waste too many precious resources on improving those things past a point.

Google has massive data and scaling advantages that can never be duplicated or fixed by smaller companies.


>G's logging system, from my perspective as a gcp user, is slow as balls and the UX is the incarnation of scroll jank

does GCP use G's internal tools?

Inquiring minds would like to know


> you can spend your whole project burn at the wrong company building something that you could buy somewhere else, or which already exists at a competitor

Oh look it's me. I'm sitting here building a worse version of TestComplete and Ranorex for automated testing of a Windows desktop application.

Looks good on my resume though.


This post feels like the author is insecure about his position and wants to establish some validity. Having gone through it all, it feels delusional at best.

The glorified pattern matching can only take us so far. You know it's working as long as there is a pattern. I wouldn't call it a general intelligence per se. There is no "juice" in these algorithms.

If we use these tools, we can immediately see where they fail and where they do not. These are just new tools in the software engineer's toolbox.


> The glorified pattern matching can only take us so far.

This argument is becoming less and less convincing year by year. We're amazed that things we were sure couldn't be done are actually done.


This argument has been ongoing, on and off, since the 1960s. Show me a research paper.


The 1960s were 60 years ago; we're talking about AI, a field that has moved fast in the last 15 years.


Definitely! There was a time when compilers were part of AI research. Now they are just another tool. Same with DL: they are amazing tools. We need them and they provide value if used correctly.

I just don't want to call it "intelligent" and use this as a basis for defining "intelligence." We can call them something else. It's learning to do a specialized job as intended and in an "intelligent" manner, but it is not intelligence. Even a small ant is more intelligent than our current AI systems; though ants aren't sophisticated and can't perform human tasks, they are more intelligent than any AI system.

I hope that made sense.


> I just don't want to call it "intelligent" and use this as a basis for defining "intelligence." We can call them something else.

TL;DR We're also mostly brute forcing our way to discoveries. We're not that smart.

People too are relying on cultural handouts, maybe most of our intelligence is also "something else". Before electricity was discovered we had superstitious ideas about electrical phenomena. Before germ theory was discovered we were getting sick and dying like animals, helpless. Not so smart, even though it was a life and death situation for us.

It's easy to be "intelligent" when you're given the solutions beforehand by culture. ML learns from the same culture, like 99.99% of us who can't discover new things even to save our lives. And many of our discoveries are a gradual work of trial and error, we don't go directly to the target but stumble/brute force our way to it.

There was a news story recently titled "Elegant Six-Page Proof Reveals the Emergence of Random Structure". The funny part is how the authors stumbled onto the amazing solution after many, many unsuccessful trials by the whole math community. Not a great sign of intelligence when you have to rely on chance so much and so many fail before one succeeds.

This tells me we're also mostly doing "something else". Intelligence means solving novel problems with few attempts, not spamming our attempts to death until something comes out. ML research looks more like spamming than intelligence too.

You know what else looks like spamming? Evolution. It's a blind search process brute forcing the problem of self replication for billions of years. It created us and everything else in one run but it's not very intelligent, it just spams a lot.


Pattern matching can solve everything, if given enough storage and training data. Memorizing trillions of sentences is basically what makes GPT-3 amazing.

You're absolutely correct that pattern matching AIs won't ever be truly intelligent. But then again, many humans also never exceed what can be simulated with good pattern matching. And an AGI household robot only needs to be as smart as the maid that it's replacing.

I'm optimistic that pure pattern matching will get us to usable AGI.


> Pattern matching can solve everything, if given enough storage and training data.

There's never going to be training data for "how things are going to be next year". A lot of large scale systems involve emergence [1]: patterns which previously were not visible suddenly appearing. I think even today's AI can do things that are a bit beyond pattern matching (learning to learn, etc.), but pure matching as such is inherently limited.

[1] https://en.wikipedia.org/wiki/Emergence


I would be surprised if AI predictions for "how things are going to be next year" were worse than expert human predictions at the 90th percentile. I mean, most trends for next year will already be around this year; they'll just be too weak to notice.


I believe pattern matching is an important part, but intelligence comes from how you organize these patterns and relate them to one another. E.g. you can learn pictures of a dog by pattern matching, but you can't learn whether a dog can beat up a bear; if there's a bear outside, a human just knows not to let the dog out.

What we need is a pattern detector + the ability to create basically infinite ANNs (or be able to multitask on them) + an event loop that takes input feeds (from cameras, microphones, etc.), does some kind of reasoning and then pushes to its output feeds (wheels, etc.).

I think you use pattern matching to extract unique objects, and store each object as a node with its own simple neural net + long-term storage that holds only pictures of this object plus a dataset about it, e.g. how often you see it. You then organize them into an object hierarchy. Each new object is compared against all other objects we've stored using their pattern matchers; the higher the output, the more weight we give their "connection." Each object is made up of sub-objects, so it sits at the top of its own tree as well; you can run this pattern finder on the dataset of individual objects itself, and if you find new objects the tree recurses. You can then check these objects against existing ones, etc.

A general intelligence does this constantly, in real time. Then it’s a quick algorithm.

1. Have I seen this object before

2. No, but it shares characteristics with animals (an object that groups together all things that look like animals.)

3. It’s much larger than my dog, and I’ve seen large animals attack small ones more often than not.

4. My dog is also a dick, and attacks other animals more often than not

5. It’s probably a threat

Just scaling modern compute won’t get you there unless you’re willing to dedicate a few orders of magnitude more energy than a human being to do so. You need a completely different, distributed, architecture if you are going to be able to compare billions of objects against billions of objects every time you see something new and in real time.

Machine Learning is great but it’s only the learning part. Intelligence is reasoning about multiple things in relation to one another, not detecting a pattern. You might trick yourself into thinking you’re getting there because pattern matching is powerful, but it’ll get you to the intelligence of a microbe at best. Even then you need something that’s driving the actions.


What gave you the impression of insecurity and validation-seeking?


AGI is very unlikely to happen within the next 50 years. Dangerous limited AI exists now and it's going to get worse. I don't worry about malevolent AI, because we don't even know what consciousness is, nor the limits of a (presumably) nonconscious entity's attempts to emulate intelligence.

I worry quite a lot about what malevolent humans using enhanced technology (note that most things in technology, once accomplished, cease to be called AI) will do. Authoritarian states and employers can already learn things about you (that may or may not be true) that no one should be able to know from a basic Google search. This is going to get worse before it gets better, and if corporate capitalism is still in force 50 years from now, we will never achieve AGI in any case because we will be so much farther along our path to extinction.


> I don't worry about malevolent AI, because we don't even know what consciousness is, nor the limits of a (presumably) nonconscious entity's attempts to emulate intelligence.

It doesn’t need to be malevolent. You’re made of atoms and if the AI has uses for those atoms goodbye you. There’s no reason to believe consciousness has any impact on the ability to maximize an objective function, i.e. try for a goal.


The post is not only too braggadocious for my taste, but some of the figures quoted are highly unlikely. I would personally not work for someone with this kind of ego, but there are many such people in positions of power.

This article is representative of an attitude I'm seeing around the tech industry, and if this is indeed the level of "confidence" in the Bay, I don't think that's a good sign.


Eric Jang is top ML talent, these numbers are accurate. I work in ML and have followed his work for years


He claims to be solving general intelligence in 20 years. Your advocacy is not enough to convince me.


General intelligence has been 20 years away since the 60s, along with fusion power and a bunch of other things.

In marketing, they say 5 years when it's actually 20 years away.


In my view, AGI is farther away than fusion.

We know how to do fusion. We know the physics behind it. We haven't yet figured out how to build profitable fusion plants, and we probably won't for a long time, if for no other reason than improvements in fission--modern fission plants are the best.

When it comes to AGI, we have no clue. It's a constantly moving target, because our conceptions of intelligence evolve. Most things that were once "AI" became "non-AI" solved problems after we got good at them using a couple key insights, e.g. that image processing could be sped up with CNNs due to the existence of a topology on the inputs. We still have no idea what makes us tick, and moreover there is not a strong economic incentive to replicate all of our intelligence... although, of course, automation will continue and that itself will be disruptive enough.


I disagree with part of this. Nature has proven that general intelligence can be achieved in a compact, energy-efficient form: humans.

Has nature ever proven that nuclear fusion can be sustained at human scale?

I would bet that AGI comes first.


I think they're both sort of in the same place.

We're fairly certain it can be done given unbounded resources, we have some idea of the principles involved, but then there's a rather significant element of "draw the rest of the fucking owl" between where we are and where we imagine we could go.


He says AGI could happen in 20 years, not that he will single handedly manifest it into existence. That seems like a reasonable timeline given the field's current pace and may even be conservative.


If he's actual top talent as opposed to a poseur who's good at self-promotion, he should stay in academia for his own sake because he'll be crushed in the corporate world. Actual high IQ people get clobbered in corporate, while OKR-ing charlatans climb the ranks effortlessly... yes, even at FAANGs.


First I've heard of him tbh.

I'm not aware of anything he's accomplished but can see the delusion. ML people seem to think the output of their work is not mediocre. Yeah, you bred monkeys till something resembling Shakespeare appeared with some reproducible consistency, and it is better than something someone can code - but that's an incredibly low bar.

Acknowledge that we're still very much in the stone age of AI and what we're doing is large-scale analytics at best.


Oof


Have you seen his previous employment though?

He's been exposed to enough corporate work: https://www.linkedin.com/in/evjang/

If he doesn't like this place, he can just make another post like this and I am sure ML startup CEOs and ML division heads will be flooding his inbox.


Academia will be difficult for him as he does not have a PhD.


What makes you think academia is any different?


It probably isn't as bad as the corporate world, but I'd be curious to hear why you think academia is impure in this way.


Still.. do you think being a top talent in ML guarantees success for your own company, for example? I think there are a lot of valuable skills to have, being an expert in X is just one of them.


Which figures do you think are unlikely?


A lot of opinions and unverifiable statements (this and this company is X years ahead of everyone), and the whole piece is essentially about one person's job market. Skip


I’m interested in understanding the ML job market for traditional software developers.

Does the opportunity exist to transition into any particular ML roles then grow from there?


Especially if you have Python experience, then yes the opportunity definitely exists.

For example, when I hire MLEs (which I am doing now if anyone wants to apply - supportlogic.io) I am willing to look at people who are solid Python/backend engineers and who have been "ML adjacent" or who we believe could learn the ropes of ML enough to contribute. The stronger an engineer, the more flexibility we have on ML knowledge. Some ML engineering is task-specific, but a lot of it is automation, data engineering, and improving data scientist code (for which you do need ML experience).

I've found it's a lot easier to teach an engineer enough DS/ML fundamentals to do ML Engineering than it is to teach a data scientist engineering skills. A lot easier...


Interesting. Honestly to me Python and backend engineer are effectively orthogonal skillsets though. I would expect any decent programmer to pick up Python in about a week... (slight exaggeration but you get the point).


Pick up to what point?


I've been seeing a lot of Data Engineering/Platform roles that support ML without requiring past ML experience. How much future lateral movement would be available to you will vary widely, but this would be a fairly easy inroad.


You need 3-4 years of intense study to transition from software engineering to ML. It's different enough.


Sure, there are many sides to ML. One is the data science bit, curating data, picking a good model. Adjacent to this is research into new models or training methods.

The other side is deploying it efficiently, and that becomes a more routine software engineering problem. Fundamentally you have some code that you want to run as fast as possible on the cheapest hardware you can feasibly use. Large companies like Google have the luxury of splitting this out into several distinct roles - from pure researchers (people publishing papers), to people who train models for business purposes (eg the Google Lens, computational photography, Translate), to people who optimise the ML library code underneath, to people who build out the end user application with the ML model as a black box service.

Most of those people don't need to know much ML, but the exposure can help you transition into a more ML focused role.


"FAANG+similar : Low 7 figures compensation (staff level), technological lead on compute (~10 yr)"

I don't know where OP is getting these figures from, but I doubt that FAANGs offer 7-figure comps to Staff-level people. It's probably more in the higher 6-figure level (400K - 600K).


The author is a skilled research scientist in a very competitive space with some high-profile publications (e.g. Gumbel Softmax). He is absolutely an outlier, but not a unicorn -- AI researchers with good publications and reputation will attract a lot of interest from companies with lots of money to spend. Low 7-figures for an ~L7 research scientist with competing offers from FAANG research labs is not crazy.


"Staff" is L6 though (at least at Google, afaik).


People are conflating SWE (or "research scientist" in name) bands with research scientist bands at labs like Brain. This guy is an outlier.

"You can only be level X with compensation Y after Z YOE" is one of the greatest infohazards in tech.


Fwiw, the pay bands at Brain are the same as the pay bands everywhere else, with the sole exception of stock grants in initial offers.


That probably matters a lot.


Currently on the job market in the AI space in the Bay Area - 400k to 600k is senior level at FAANG + similar. Low 7 figures at staff wouldn't shock me (although I don't have any actual data on that)


Can we pause and admire a data scientist drawing conclusions while being utterly unconcerned with underlying data? ;)


that's ok, they are bayesian


Pay definitely doesn't increase by 600k+ as you go from senior to staff. Yikes. Are people larping on HN?


ML/AI initial offers can be fairly inflated (https://aipaygrad.es/)


They're paying for a mascot. It's a marketing expense. $1.2 million per year is nothing when one considers the increased ease of hiring young grunts who do the ugly but necessary work, and who do it well because they think that if they pay their dues, they'll be picked to do the upper-caste "interesting" work, which of course they won't.

In those jobs, you don't even have to show up (although most of them do, in order to stay relevant but also because these ex-academics tend to be very driven people). You're getting paid well for letting the company say you work there, because this makes it easier for the rest of the company to fill out their chain gangs of early-20s Jira jockeys who'll be used for three years and then PIP-raped because "yellow zone".


OP is probably in the best situation to judge this since they likely had competing offers or at the very least know peers with competing offers at FAANG+ staff level.


I'd be interested in a credible source for this as well, even though ML work pays more. From what I know from people in the industry it's 6 not 7 figures.



Looking at levels.fyi, it seems there are at least several companies paying 7 figures to ML engineers with less than 10 years of experience, although the majority are at 15+.

https://www.levels.fyi/Salaries/Software-Engineer/Machine-Le...


If you drill down, you'll find out that those 7 figures for positions lower than principal engineer are lopsided towards stocks, and are likely a reflection of luck-of-the-draw stock appreciation relating to when RSU grants were issued.
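Rough arithmetic for that effect, with invented numbers: the same grant looks very different on levels.fyi depending on where the stock went after the grant date.

  base_salary = 300_000
  grant_value_at_offer = 1_200_000             # 4-year RSU grant, valued at the grant-date share price
  price_at_grant, price_today = 100.0, 300.0   # a lucky 3x run-up after the grant

  annual_vest_today = (grant_value_at_offer / 4) * (price_today / price_at_grant)
  print(base_salary + annual_vest_today)       # ~1.2M/yr reported, from what was a ~600k/yr offer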


600k is a senior engineer's wage.


No, it's not. If you get lucky with stock market movements, that might be your total annual comp, which is not your wage. I made close to what you're claiming at senior level, at the recent peak of stock market insanity, but I'd stick to the dictionary definition of "wage."


This all day. Sure if you joined FB in March 2020 your total comp today will be a multiple of that 700k, but FB is not giving out 600k packages today to e5s ( and e5 imo is staff+ at many lower tier companies ), more like 450k all in compensation.


People are hopping on this guy for saying he’ll solve agi in 20 years, but I’m already laughing at him thinking he’ll be creating value with humanoid robots in 1 year.


I was expecting something along the lines of how AI/ML was considered a sexy career path, but there are very few jobs available and high competition for those jobs. So, as a result, you end up with the only available jobs as data engineering/ML ops/backend that supports ML teams. I am happy for this author and their success, but they clearly are not representative of the majority of the people in the ML job market.


I think I have a decent CV, with quite a bit of experience for a master's student. I have been searching for a job in MLE for a bit now with very little to show for it, as I am either getting no responses, or responses claiming that they are looking for more experienced people, particularly those with experience in a particular stack.

In all honesty, after 6 years of studying, with 4 of those years studying ML, to be told that I lack experience with some particular stack as the ~~excuse~~ reason for rejection feels like a slap in the face.

And all that ignoring everything that expects 3+ years experience for entry level positions.


I suspect they are looking for experienced ppl because they don't have one and don't exactly know how to manage ML and what to do with it and hope someone else will come and show them.


Start building projects on your own with the most common tooling. Having something to show cuts down barriers.


Reading comments like this is a bummer. I am currently doing my masters with a focus on NLP. At this point I'll be pretty happy to get a job even if it just boils down to deploying ML models. If even for such a role companies expect years of experience or a PhD, I don't know how I can even get a job in the first place.


I sometimes interview candidates for ML engineering roles, and let me tell you most of them have trouble with basic concepts. It's great when I find someone fluent, for a change.


All these fields have that problem. There are people who are great at aping the signals (and even beating publication metrics) but really don't know anything. When they show up on a job, they get put into a PM (or actual management) role to limit the damage they can do. They learn enough lingo to be able to convince 90% of people they know what they're talking about (congrats on being in that other 10%, though) and will move up forever. Meanwhile, people who actually know their shit are usually too busy actually being smart to play the games necessary to get themselves hired in a hyper-competitive job market. So it goes.


Basic concepts referring to what precisely?


I can only speak for Boston area, but I've been on teams who regularly welcome those with only 1-2 years of experience (even if it wasn't that relevant, as long as they had relevant schooling). I don't know if you consider that positive or negative evidence, but there it is.


I have been in a similar position with NLP - either you already have a lot of experience (maybe even a PhD) or you're a mathematician or statistician. Otherwise you're left out.


Funny enough, my experience is NLP and DeepRL.


Your best chance might be to apply at small companies. They often have data and want to take advantage of it but haven't so far. Of course they're not doing any research or are Google, Amazon, etc. but hey.


You have to be doing something wrong or applying for the wrong types of jobs (lots of MLE jobs are just glorified DBA jobs at certain companies).


That is why I am doing a Post Doc


May I ask what you're working on?


On causality

There is this interpretation of Bayesian networks in which the parents are the true causes of a node, and to predict what happens if you change a variable you remove the edges from its parents to that variable from the network. I then study what you can do with that method.
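A toy illustration of that graph surgery (variable names and probabilities are made up): forcing a node ignores its parents, which is exactly the "remove the incoming edges" step, and sampling then lets you compare the observational and interventional distributions.

  import random

  def sample(do=None):
      # Ancestral sampling over a toy graph: rain -> sprinkler, rain -> wet_grass, sprinkler -> wet_grass.
      # Forcing a node via `do` ignores its parents, i.e. cuts its incoming edges.
      do = do or {}
      rain = do.get("rain", random.random() < 0.2)
      sprinkler = do.get("sprinkler", random.random() < (0.01 if rain else 0.4))
      wet_grass = do.get("wet_grass", rain or sprinkler)
      return {"rain": rain, "sprinkler": sprinkler, "wet_grass": wet_grass}

  n = 10_000
  p_obs = sum(sample()["wet_grass"] for _ in range(n)) / n
  p_do = sum(sample(do={"sprinkler": True})["wet_grass"] for _ in range(n)) / n
  print(p_obs, p_do)  # P(wet_grass) observationally vs. under do(sprinkler=True)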


where are you located?


Europe. I am open to relocating pretty much anywhere in Western Europe or Scandinavia so my horizons are pretty open.


So central or eastern Europe ... yeah, that's gonna be tough!


No, actually in Scandinavia.


As a fellow Scandinavian I was in a similar position as you in 2018, except that I was looking for an MLE job with a (software) robotics background (because besides ABB there was no robotics interest there).

Tl;dr I moved to Japan and worked in an ML(-ish) job. Once you start working it becomes remarkably easier to not be scoffed at for lack of experience (even if you learned very little in that job).

Job security is high in Scandinavian countries and as a consequence hiring is very risk-averse. For risky/not-so-established jobs such as ML positions, they'll be actively looking for reasons NOT to hire you.


That was my pre-conception as well. I know some people who have tried to switch from other engineering to ML/AI, some taking as much as a year off work, and the only successful ones already had a network in the Bay Area from their previous associations (prestigious university/employer).


Yep, at that level, there is a ton of competition. The self-driving industry is almost like the video game industry and actually has a ton of employees who previously worked in video games. They are definitely much better compensated in ADAS but the work culture is similar (anecdotally so YMMV).

ML is the sexy wing of the tech industry, so it tends to attract the people who are willing to put in the hours (for interviews as well as towards work).


This is because there isn't really much demand for genuinely innovative ML in the tech world. It's a third-circle nice-to-have, not part of the core business.

It enables the business to say they invest in R&D (tax writeoffs, marketing) and it also makes it easier to hire the grunts who'll do the scut work, thinking they'll one day be working on something more interesting (which they won't be).

Competition for bullshit "data science" jobs isn't that tight. For genuine research that actually matters, it is, and that's because most of these positions exist as recruiting tools (you're getting paid to let the company say you work there) and that's always going to be a thin ledge to try to perch on.


The OP actually started his career as a ML engineer (or SWE even) and transitioned into research without a PhD. Can you elaborate on your "which they won't be" comment?


The fact that it is possible to jump from MLE to a research role in Deep Learning means that this pathway is accessible to anyone who is a decent SWE and has some math skills. But there are much fewer research positions. So there is an element of luck and being at the right place at the right time.

Today, the competition for research positions in a "sexy" STEM field (think quantum computing, black holes, DL, RL) is quite high. DL just happens to be a field where it is possible to get into research without investing too much time while also making a lot of money, so this is a dream job for many.


After how many years of Facebook, Instagram, youtubers, and LinkedIn do people still not get it? This is the internet. How on earth can you verify the veracity of the author's statements? Everyone is discussing this as if a blog post someone writes about themselves is an accurate portrayal of reality. It might be, but it might not. I'm not saying they're being dishonest, though they may be, but they are definitely not incentivized to give you an accurate portrayal of their life. This is Instagram for IT nerds and there is no way they have a Badonkadonk that big.


To me, outside of this world, it feels like when one of the guys at a shop just like ours wins the lottery a few cities over. Everyone's just dreaming big for the fun of it.


Ah, but it's much more fun when everyone hosts their own blog.


- This really isn't representative of the ML job market because the author is such an outlier.

- The fact that it isn't representative is what makes the article an interesting read.

- The fact that they claim to have a plan for solving AGI in 20 years really detracts from their credibility.


As someone who is in a somewhat similar position as the author (looking for senior ML roles), I found this part enjoyable:

> I’m not like one of those kids that gets into all the Ivy League schools at once and gets to pick whatever they want.

Followed by "FAANG + similar" and a deluge of options. Also, I feel like their message is pretty liberal with using future projections and implying it to be the present. For instance, the author has 6 years of experience with 2 at the senior level. This is pretty far from "staff level" (at 1M+ compensation, I think this is L8) which they imply is/was an option at a FAANG company. I don't doubt that in 5 years they would be at that level, but they almost certainly did not get offered a "staff" position at a FAANG.


It's ambitious bordering on delusional. They're also doing the dirty trick of putting "2016 - 2022 Senior Research Scientist at Robotics at Google" on their resume even though they've been in the senior position only since 2020. Like, dude, you're doing great, your resume doesn't need any more artificial pumping up. Or I guess it does if you're aiming for those positions that are kind of out of reach.


I don't think that is unusual to list the latest level on a resume. I'm certainly not going to dedicate space on a resume to list time ranges for every promotion.


So if this person had been promoted to staff level in 2022, they could have changed their resume to "Staff Research Scientist, 2016-present" and that would be okay with you? Because it seems deliberately misleading to me.

It's different if the level isn't represented in the title - if they went from one band to another but the title was the same, I don't see a problem with putting down something like "Software Engineer, 2016-present" without wasting space on each promotion.


In general, yes, I think that would be okay. I think it would be a mistake to create separate sections for each level. Overall your achievements within a single company should not be organized chronologically, but by what you want to show off. This may still be mostly chronological as you take on more responsibility/leadership.

Now, I don't object to adding a line like: "promoted twice from Software Engineer 2 to Staff Software Engineer" or whatever, which I think is a good middle ground (and I would put this as the very last, least important entry for that company)


But it undeniably misleads the reader into thinking this person has held senior responsibilities since 2016, which is outright false, and may instill in the reader more confidence than is due.


I'll add another data point in opposition. I expect to see only the latest title for each company on a resume and my resume is organized the same way. It's practical, not deceptive.


As a hiring manager, I don't assume that. When there's a single title over a long range of time, I assume it's a terminal title. I look to the details for the position to see what kinds of work they've done. In the interview, I'll dig into trajectory and experience at various levels.


As a hiring manager I have learned that some people are trying to be deceptive that way and can thus no longer assume.

Thus I have to make sure either way and probe a lot unfortunately. Did they hold the title for the last 2 months and are jumping soon after? Will they do the same here? I want to know about and see the progression. There are situations where it is sort of irrelevant but in others it is detrimental if I have to probe.

If you are say in year 5 of your career and at senior level at just one company I want to know if you were a junior when hired out of college, were super awesome and made intermediate after one year and have been senior since year 2.5. You can show that to me right on the CV by listing it individually. If you just put the end title and that's it I will assume that you made senior this week and are trying to jump ship. This doesn't mean we can't figure it out together in the interview if it gets to that stage. But it sets a certain tone and connotation for the entire conversation. A bias to overcome.


Then how am I to communicate to you, the hiring manager, that I held significant responsibilities for a longer period of time (eg 6 years) than an applicant who ducked out the moment their new title kicked in (eg 6 weeks)?

How much does that matter to you, anyway?


It's clear on his LinkedIn. It's not unusual to list the latest on a resume.


These outliers are rare but they do exist.

I knew someone at Google who was hired as L3 straight out of college (as all non-PhDs are) and got promoted once a year to L6 (Staff) so 3 years. He got promoted to L7 2 years after that.

It's a rare combination of talent and the right circumstances but it does happen.


I tried to hint at this by using quotes, I don't doubt that L6 is possible. But, please elucidate, are there L6s at Google making "low 7 figures"? From levels.fyi, there are no such reports. The average is about half and matches what I know from other companies. Those that are approaching 7 figures have at least a decade of experience. Anyway, the pay he describes is much closer to L8.


The only way L6s are making "low 7 figures" is because of large movements in the share price after they were given grants.

Low 7 figures is realistic for L8s without the share price noticeably appreciating. With discretionary grants some L7s may squeak into that club. But L6? No way.


You can see some examples here https://aipaygrad.es/


Are those SWE paybands or a Research Scientist pay band?

That might matter.


Ironically, he graduated from Brown University, an Ivy League school.


And he worked in one of the most exclusive teams, Google Brain.


Kids who graduate from Ivy League schools get a lot of job offers at once and get to pick whatever they want.


FAANG staff in 5-6 years out of school is not impossible. I know a couple. They are significant outliers in terms of focus, dedication (i.e. hours worked), and raw intelligence. If I had to guess, I'd say 1 in 30 from the population of Google-level engineers.


> They are significant outliers in terms of focus, dedication (i.e. hours worked), and raw intelligence.

As someone who has worked at FAANG for 5 years right out of school, getting to staff is less about raw intelligence and more about being lucky with working on projects that did not get canned and finding supportive managers. Friends much smarter than me have not had good growth purely because they were unlucky with initial team assignment and PA / reorgs cancelling their projects.


That is true, but one "young staff" I know was initially assigned to a dead-end project. She made it her life's mission to find something more interesting and promotable, and succeeded. But there's definitely luck involved.


Is there a large org where commitment/getting things done is more important/valued than social skills/network/luck? I.e. is there a fair model to measure an individual's contribution?


On the plus side, the article accidentally produces the most useful definition I've seen of AGI. If you just define AGI as the the union and convergence of all hard problems, then you can just say "I'm working on AGI" to let people know you're doing the smartest, hardest thing, without sweating any details.


> - The fact that they claim to have a plan for solving AGI in 20 years really detracts from their credibility.

I see they're going the cold fusion route.


The problem with AGI is that it is people like this who are developing it.


There are some groups defining AGI in a way where a 20 year timeline is aggressive but not impossible. The real question is how is AGI being defined by the author.


Can someone with only 6 years of experience make credible predictions about things 20 years in the future?

I'm around 15 years of experience, and my appreciation for my own lack of knowledge and ability to make predictions still grows with every year.


I’ll bite on the above loaded question: Define experience. However you do, it should at least include work, education, and life in general, given that such experiences relate to the context or situation.


Yeah, it was a bit snarky, but still an honest question. In some domains, like astronomy or geology or climate change, I guess it seems reasonable to make predictions 100 or 1,000 years in the future based on available data rather than experience. In other domains, like politics or economics or finance, it seems like there would be much more value in having worked through a bunch of election and business cycles over the years. I'm not sure where computing and AI sits on that spectrum. I can imagine a young AI researcher failing to realize that some approach was tried and found to be a dead-end 30 years ago.

On the definition of experience, I agree that education and life experience counts. I said "around 15" years for myself because the definition is fuzzy. I got some very specific career preparation and training in college, so that sort of counts, and I probably spend more personal time than many of my peers learning about relevant history and current events.


As a relatively unremarkable data scientist/machine learning engineer of about 5 years, I've been keeping an eye on DS/ML positions as they tend to give a sense of what is important to companies in that space, although I'm not actively looking for a new role. More and more positions seem to require Ph.D. credentials even for non-senior roles, even though modern DS/ML tooling doesn't require it.

If I ever left my job I might have to quit DS/ML and do something else entirely.


Nah, you wouldn't have to quit. If you've got 5 years experience, even on non-cutting edge projects, the PhD won't matter. Sure, you won't be able to get any job you want, but there are lots of ML jobs that list a PhD requirement that will nevertheless jump at the chance to hire someone with practical experience.


Hiring managers are more generous, but the HR people screening the resumes aren't.


That'll only be a problem if you're up against a significant number of candidates that have both experience and a PhD. HR will certainly filter resumes based on something as clear-cut as a degree when they can. But they can't just filter everything out and tell the hiring manager he's SOL. So you don't have to check all the boxes as long as you check enough of them and you're competitive with the other candidates.

At the same time, nobody is going to do all that well if they just apply online and cross their fingers. You need some kind of human contact, either through an introduction to an insider through your network, or through a recruiter of some kind. It takes a bit of time to develop the relationships, but it's quite doable and worth it, even for introverts. Best to start before you're interested in changing jobs.


DE / ML ops / software engineer (data) is in many ways the new DS. Lots of greenfield projects and less competition.


"Crypto community has weird vibes"

that's what I call an understatement.


One person's weird vibes is another person's great vibes.


> Low 7 figures compensation (staff level)

Odd choice of level, since the author worked at one of these companies and was not at that level, certainly did not make 7 figures.


They spent 2 years at the senior level at one FAANG. Why would they switch to another for anything less than the staff (senior + 1) level?

(Not saying anyone "deserves" that or that's how it should be, but that's just how it is here in the valley.)


and not just generic senior SWE level.


Or you can go into academia to work on the really hard problems on a 5 figures salary.


I appreciate the post in the sense that it is an insightful perspective that he didn’t have to share. If you are elite it is difficult to talk about your options, pay or way of thinking without it coming across as nothing but hubris to the rest of us.


Not sure why this gets so much hate. I think he's a bit too optimistic about AGI prospects. But in his position these are all interesting and reasonable options.

I'm a bit sceptical on this 10,5,1 whatever year ahead metric he pulled from wherever.

Interesting read regardless. My opinion about the next few years is that most value will come from finding your niche, creating Datasets, iterating and building your ML model (a bit like he wrote but without this AGI...)


"Product impact is even slower than robotics due to regulatory capture by hospitals and insurance companies."

The author apparently does not understand regulatory capture and is throwing around catch phrases to sound smart. Regulatory capture would imply that healthcare encounters less regulation than it should due to influence over the relevant government agencies. This should increase product impact and reduce time to market, the opposite of what he suggests.


Doesn't the author's sentence mean: hospitals and insurance companies have coopted regulators for the benefits of their own businesses, at the detriment of medical device companies (developing AI)?

I think the author's point still stands.


Possibly that was the intended meaning, but the point is still invalid. Hospitals want more cool gadgets. They want to be able to treat more conditions and charge more for it. If anything, the FDA is a constant annoyance to a healthcare provider because it hamstrings them from providing care. This is why so many patients are enrolled in clinical trials, to get care ahead of the FDA approval time frame.


Right, the FDA is what's wrong with American healthcare. Not the insurance agencies or the for profit hospitals making as much money as possible. The lack of single payer healthcare is one of America's greatest failings on that end.


I work in theoretical machine learning, and what I don't like about this piece is how casually the author assumes that what got us to the present state (hacky approaches where you converge to something "good enough"; only for very simple models, e.g. a feedforward net, do you have some theoretical guarantee, and even there open questions remain) will also get us to "AGI".

I believe that is unlikely. Here's a metaphor I often like to use when I speak to engineers doing empirical ML work: in ancient times people built a lot of nice structures, such as pyramids, cathedrals, etc. By trial and error many rules of thumb were devised, and they more or less worked. But it's safe to say things like earthquake-resistant skyscrapers and modern bridges cannot be built without deep theoretical insights into structural mechanics. These are highly optimized, intricate structures. The same is probably necessary to build highly optimized, intricate models that deliver what we now consider to be "AGI" - but then again, the world of ML is full of surprises.


You forgot to write that crypto startups compete with publicly traded FAANGs on both cash and non-cash compensation, and there is no liquidity issue whatsoever with the non-cash they pay you in. Vesting schedules are more competitive than at FAANG.

And the publicly traded crypto companies compete with FAANG on compensation too.

Non-crypto startups are the only ones sitting in the doldrums, hung out to dry right now.


This doesn't sound quite right to me; Coinbase pays well, but looking at levels.fyi it's not FAANG money; is there someone else you have in mind?

Would love to hear more about the startups; I tend to turn down such opportunities far before we talk non-cash comp.


Levels.fyi shows it's FAANG money at all levels; I have it listed alongside Google, Amazon, and Facebook right now. What did you compare that seemed different? (They're not reporting consistent 7 figures for any of them.) Did you see something more granular?

Outside of publicly traded crypto companies, you need to talk to a third-party recruiter in that space.

Solana Labs, for example, one of many, was paying engineers $650,000 back in 2019-2020 (and still is) to write mostly in Rust. Compensation was ~$200k cash and $1.6-$2 million in Solana tokens vesting over 3-4 years with a 1-year cliff. Solana tokens were $0.10 back then, so those engineers are sitting on like $100 million+ as Solana trades at $100/sol now, down from $250/sol.

For more typical results: companies that pay in crypto only have a few employees, so giving them all a few million dollars in their much smaller, less successful crypto still lets them liquidate close to the notional value they started with, derisking your time and coming out ahead in general. A "tiny" crypto is still like a $30 million market cap. Even the $300 million market cap ones are considered tiny. Market depth / liquidity is usually enough to support a few million dollars of periodic employee sell pressure.
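
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python using the figures quoted in this comment; the grant size, grant-time price, and current price are taken from the comment itself and are assumptions, not verified data:

    # Back-of-the-envelope value of a token grant, using the numbers quoted above.
    # All inputs are assumptions taken from this comment, not verified figures.
    grant_usd_at_hire = 1_600_000   # low end of the "$1.6-$2 million" token grant
    price_at_grant = 0.10           # SOL price when the grant was struck, per this comment
    price_now = 100.0               # SOL price quoted above ("$100/sol now")

    tokens_granted = grant_usd_at_hire / price_at_grant    # 16,000,000 tokens
    value_now = tokens_granted * price_now                  # paper value if fully held

    print(f"tokens granted:    {tokens_granted:,.0f}")      # 16,000,000
    print(f"paper value today: ${value_now:,.0f}")          # $1,600,000,000

Held in full at the quoted price, the low-end grant works out to roughly $1.6B on paper, which is presumably where the "multiplying 1.6M by 1000" reply further down comes from; vesting cliffs, partial sales along the way, and liquidity limits would pull a realistic figure down considerably.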


It's important to keep in mind that working at the next Solana Labs, Alameda Research, etc. is roughly equivalent in probability to getting drafted into the NBA. That is to say, there aren't a lot of cases where it happens.


I did say that

> For more typical results,


so you are saying there's a chance these solana engineers are sitting on a billion in crypto?


collectively? yeah, sure. this is absolutely probable in any organization of that valuation/marketcap, the most interesting thing here is just how fast crypto organizations can accrue and extract value.

tech sector is fast, crypto subsector is like an order of magnitude faster. it's similar to tech employment in the 90s, where there was fast vesting (mostly due to quick exits), liquidity at super low valuations that then rose extremely quickly, and attractive compensation. the main difference now is that the valuations are much, much higher. you can tap in sometimes/often at very low valuations - of the token - and also ride them up all the way to billions in valuation very quickly. if they solve a market need (within the crypto space) then they attract value very quickly; sometimes that market need can just be the entertainment coming from hype, but most times it's bandwidth, since there is periodically not enough blockspace to go around.


I'm just multiplying 1.6M by 1000, assuming a Solana engineer held on to every token they were granted.


Ah, nice, it's possible. Yeah, earning crypto has always been a better way to make a lot of money quickly than trying to buy and trade cryptos, because there is no financial risk to your pre-existing capital.

Team and advisor allocations have been exactly this, and have been my best trades. Vesting grants for employees can be lucrative too. Oftentimes these are also discounted relative to whatever price any buyer can get. So things amplify very quickly, and there are fewer ways to lose.


probably, that's what tends to happen when you literally make money


This post reeks of someone who doesn't 'need' a job and has made enough money already. Good for OP to go after 'exciting problems' rather than the mundane 'will I make rent' worries.


I don't think many machine learning folks working for FAANG are worrying about such things.


Well, who knows... maybe some overworked fellow is.


Author is going to Halodi Robotics.

I always thought that human-shaped robots are a terrible form factor. Why limit yourself to the awkward design that 3.77 billion years of evolution accidentally landed on?


There is a school of thought in robotics/AI that believes embodiment is necessary (or at least the fastest way) for us to get to abstract thought. Embodiment can really span the gamut of meanings, but there are definitely researchers who believe humanoid robots are the best path toward that goal.

If you have a more specific goal in mind, e.g. solving a small set of industrial/commercial use cases, that changes the calculus dramatically.


Perhaps because that form factor has heavily informed civilization's current challenges?


It really sounds like the author's long-term goal is AGI. I believe he said something about how being at a company with a 5-year head start on human-shaped robots would give him the data advantage he believes is needed for AGI.


Why do you think it is an accident? I thought evolution is an adaptation mechanism. If anything I'd say we've got a pretty cool form factor (peak human form, not like me who is out of shape lmao).


The human body has been optimized for a very complex objective function, and in a very different environment from a robot's. If you specify what the robots are doing, and the set of constraints like power source, size, weight, etc., the optimal design is unlikely to be humanoid.


I used the word accident to emphasize that there is no design or designer. I agree that evolution is adaptive and often creates optimal solutions.


A human-shaped robot is not a technical solution; it is a feature.


>In the future, every successful tech company will use their data moats to build some variant of an Artificial General Intelligence.

It's rare that an article loses my faith in the first sentence.



Off topic a bit - "Halodi Robotics (company he's joining) intends to produce thousands of humanoid robots by 2023"

Are humanoid robots just around the corner? Musk claims Tesla will have a "prototype" humanoid robot this year. I dismissed that as Elon hype, but have I missed this coming?


The only thing you need for a humanoid robot is for it to be human shaped and have some electronics inside of it. Doesn't mean it has to even remotely be human in interaction.


The robots on their site look pretty clunky to me, not to say they aren't doing smart things.


As someone who intends to go to medical school and pick up programming (+ math skills) to potentially work at the intersection of ML + healthcare, I find the regulatory hurdles described here discouraging. Not sure if it really is worth the effort to study tech on top of medicine. Are there any people with experience in ML + healthcare/medicine, or who know of startups that are making great strides in this realm?


I used to work in the healthcare vertical. While there are regulatory hurdles, they are there for good reason. Move fast and break things does not work in this industry and will get you fired.

You will have better luck working with one of the larger companies who have a good history with the FDA, and more importantly, have good relationships with hospitals and physicians. They are aware of the time and resources it takes to push something out. Pay will not be FAANG level or anywhere close, but they usually have great work culture and WLB.


Thanks for the comment. Do you have any feedback for a future physician interested in the intersection of ML + medicine (it does not strictly need to be "healthcare")? Would learning ML and all the other things one needs to know be helpful (a huge time investment is needed here)? Ultimately, I would be interested in leveraging medical knowledge in a startup capacity.


As a traditional physician, I imagine your value add would still be your domain expertise. The physicians I worked with were purely consumers of AI tech and had a say in the design of the software. But AI and more generally software is becoming an important part of medicine so I guess it couldn't hurt to learn some basic ML.


When I was an MD/PhD student in 2017-2018, there were only a handful of labs specializing in applied medical ML and computational biology. Since leaving to work as an ML eng/backend eng, I have been surprised by how the relationships between academics, practitioners, and investors in pure software contribute to a positively reinforcing loop. On the medicine side, the academics lean towards software skepticism, the investors make fewer and safer bets to compensate for historically lower margins/growth/return multiples compared to pure software, and the subset of practitioners get paid orders of magnitude less and have correspondingly less engineering development. The differences affect everything from managerial quality and skill/career development to location flexibility, upward mobility, and product scope and impact.


Speed of iteration is all that matters.

That is why all the big players like Google no longer push things forward, even with the best engineers and PhDs and so much money.

That's why I think Optimus at Tesla will crush all the other robotics platforms.

On the positive side, the success of Optimus will help startups get funding or get acquired by corporations that want a slice of the newly proven market.


You are confusing speed of iteration with speed of announcements. There is a lot of stuff that happens, like research and process building, at a big "slow" company like Google that Tesla doesn't even realize is needed yet. Tesla makes a stream of wildly optimistic announcements that makes it feel like it's closing the gap with Waymo for example, but there's no evidence that it is


The iterations happen way faster too. Look at Munro's teardowns of Teslas; he has never seen a tenth of that rate of change.

Look at what SpaceX accomplished. Look where OpenAI is given the time it's been operating. Look at Tesla's rate of production increase.

It's going to be very hard to compete with Tesla at this point. So many resources, so many bright engineers, all the knowledge in manufacturing, all the training on vision, etc.


I guess the difference is you think they have more resources and more bright engineers, and I'm quite certain they don't. I'd guess Tesla has several times fewer engineers working on FSD than Waymo, making much lower salaries, and they are spread across large parts of the stack that Waymo can just tap into Google for.


No, the main difference is the mindset of the place, the culture. Nokia and Blackberry didn't overtake Apple. Why? Culture.

But before, Tesla had more limited resources; now it's almost a non-issue.


Say what you will about the OP and the claims he makes in his career announcement... I was struck by how casually the author mentioned that academia is behind various private companies. Call me a romantic, but in an ideal world academia is just as close to the cutting edge of knowledge as private R&D laboratories.


In the future, every successful tech company will use their data moats to build some variant of an Artificial General Intelligence.

Is that what one gets paid 7 figures for - to go out in public and claim they "know" things like this with a straight face?


If you're interested in AGI take a look at my effort currently in development. It focuses on natural language understanding: https://lxagi.com


I'm curious what the author's take on scale.ai is since he was considering building an ml-ops / data labeling company and they seem to be the most successful startup in that space.


See you in Rome


That's some godawful typography for such a smart guy.


"Technological Lead Time" - I have thought about this before. I wonder if anyone actively tracks it for different domains.


sounds like you are coming up with excuses not to start a company


Interesting that Tesla gets its own row in the pro/con table while the faangs get lumped together.


Why is this interesting? Tesla is... not a FAANG company?

Not only is there no 'T' in FAANG, but the industry/product is completely different.

Or maybe the author just wanted to make a joke about coffee. Who knows.


Not being an internet-based company might also be a factor.


If Musk is to be believed their Teslabot is going to be in a class by itself.


If Musk is to be believed, Flint would have clean water.


There's no T in FAANG.


I don't want to derail the conversation, but OPs career path really stood out to me.

He graduated in 2016, worked at Google in the Bay Area, and now is joining a startup at a VP level.

I graduated in 2008, obtained a PhD in 2014 from a no-name EU university, worked at odd companies for a while, and joined a FAANG 4 years ago as a mid-level developer, where I am still ATM.

Looking at this disparity I wonder what could be possible explanations:

* OP is a beast and has grown very quickly in a short time.

* I'm particularly inept and I'm growing very slowly.

* Working in the right conditions (e.g. Bay Area, Big Tech, right team) can greatly accelerate your growth.

* Startups have a big title inflation.


Don't despair. I work for a FAANG and have previously worked at startups. Title inflation at startups is a huge factor. In fact, titles are not equivalent between any two companies. I have seen startup CTOs (even series A) transition to senior engineer IC roles at a FAANG.

As you identified, location is the next big factor. If you are still in Europe, my advice is to leave or to start your own company there. If you are working for primarily US-based companies in Europe, there will always be a limit to the level of exposure you get to leadership and to how fast you can rise up the hierarchy.

Finally, don't discount Eric's profile. Through some combination of his public profile and professional work, he's established a reputation and following. That is just as important as any hard engineering work in securing a senior/leadership role.


Hey thanks for the comment :)

I'm glad someone validates my belief that "location matters". I moved to the US 2 months ago after 2 years in Covid visa limbo working remotely for a US team. Settling down has been extremely painful so far, but I hope it's worth the effort in the long run.


Yeah, and the way the author presents themselves speaks volumes too. There is a real pride in this essay. You can see how the author casually drops big names and insights as if they were facts.

Why does valley culture make it seem like everything is possible and anything innovative can happen soon? The innovation in AI really seems to be made on a thin line of engineering and compute. It doesn't happen overnight. It requires some people working through and through to pull it off. These days it requires collective contribution.


> The innovation in AI really seems like it is being made on a thin line of engineering and compute.

This perfectly echoes my own thoughts. The advances being trumpeted in AI are functions of hardware advances that allow us to have massively overparameterised models, models which essentially 'make the map the size of the territory'[0], which is why they only succeed at a narrow class of interpolation problems. And even then nothing useful. That's why we're still being sold the "computer wins at board game" trope of the 90s, and yet somehow also being told that we're right on the verge of AGI.

(OK, it's not only that. There's also a healthy amount of p-hacking, and a 'clever Hans effect' where the developer likely-unconsciously intervenes to assure the right answer via all the shadowy 'configuration' knobs ('oversampling', 'regularisation', 'feedforward', etc). I always say: if you develop a real AI, come show me a demo where it answers a hard question whose answer you - all of us - don't already know.)

[0] Or far larger, actually. Google the 'lottery ticket hypothesis'.


Eh, if you boil all research in AI/ML down to the binary of "AGI or bust," then sure, everything is a failure.

But, if you look at your smartphone, virtually every popular application the average person uses--Gmail, Uber, Instagram, TikTok, Siri/Google Assistant, Netflix, your camera, and more--all owe huge pieces of their functionality to ML that's only become feasible in the last decade because of the research you're referencing.


Sorry, I should have been clearer. I obviously concede that stuff like applying kNN over ginormous datasets to find TV shows people like, or doing some matrix decomposition to correlate ('recognise') objects in photographs, is obviously useful in the trivial sense. It has uses. It wouldn't exist otherwise. I was more thinking on a higher level, about whether it has led to any truly epochal technological advances, which it hasn't.

Machine learning / neural nets also (like I said) get to claim credit for a hell of a lot of things which are just products of colossal advances in hardware – simply of its becoming possible to run statistical methods over very very large '1:1 scale' sample sets – and not due to a specific statistical technique (NN) which is not remotely new and has been heavily researched for about 40-50 years now.
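
As a concrete illustration of the "kNN over ginormous datasets" pattern described above, here is a minimal sketch using scikit-learn; the embeddings, catalogue size, and feature dimensions are all made-up stand-ins, not anything from the article:

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    # Hypothetical item embeddings: 10,000 TV shows, 64 features each.
    rng = np.random.default_rng(0)
    show_embeddings = rng.normal(size=(10_000, 64))

    # Build a nearest-neighbour index over the catalogue.
    index = NearestNeighbors(n_neighbors=6, metric="cosine").fit(show_embeddings)

    # "People who liked show 42 also liked...": the nearest neighbour is the
    # show itself, so drop it and keep the next five as recommendations.
    _, neighbours = index.kneighbors(show_embeddings[42:43])
    recommendations = neighbours[0][1:]
    print(recommendations)

The point being made above is that this kind of thing only became practical at scale because of hardware and data, not because the underlying statistical technique is new.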


These are engineering marvels! This is engineering at its finest. Applied math at its finest. So it's not a failure.

The way people hype AGI/AI/ML and whatnot undervalues the actual effort behind these remarkable feats. There is so much effort being made to make this work. Deep learning works when it is engineered properly. So it is just another tool in the toolbox!

Look at how the graphics community is approaching deep learning. They already had sampling methods, but with MLPs (NeRFs) they are using them as a glorified database. So it's engineering!
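
To illustrate the "glorified database" point: stripped of positional encoding and volume rendering, a NeRF-style model is just a small MLP overfit to a mapping from coordinates to colour and density, then queried like a lookup table. A minimal sketch in PyTorch, with toy data standing in for real posed-image supervision:

    import torch
    import torch.nn as nn

    # Coordinate MLP: maps a 3D position to RGB + density.
    model = nn.Sequential(
        nn.Linear(3, 256), nn.ReLU(),
        nn.Linear(256, 256), nn.ReLU(),
        nn.Linear(256, 4),
    )
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)

    # Toy "scene": sample points and their target RGB + density values.
    # In a real NeRF these targets come from posed images via volume rendering.
    points = torch.rand(1024, 3)
    targets = torch.rand(1024, 4)

    for step in range(200):  # deliberately overfit, i.e. memorize the scene
        loss = ((model(points) - targets) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Querying the trained MLP at arbitrary coordinates is effectively a smooth,
    # compressed database lookup over whatever the network memorized.
    rgb_sigma = model(torch.tensor([[0.5, 0.5, 0.5]]))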

I want to underscore that AI/ML/DL research requires groundbreaking innovation not only in algorithms but also in hardware and software engineering.


I disagree, there are plenty of amazing advancements in the last 2 years you can't write off like that (especially Instruct GPT-3 and Dall-e 2). For example I have worked on a ML project in document information extraction for 4 years, and recently tried GPT-3 - it solved the task zero shot.
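
For readers who haven't tried it, zero-shot document information extraction with GPT-3 can be as simple as describing the fields you want in the prompt. A minimal sketch using the pre-1.0 openai Python client and the completions API of that era; the model name, fields, and document below are illustrative assumptions, not the parent's actual setup:

    import openai

    openai.api_key = "YOUR_API_KEY"  # placeholder

    document = "Invoice #1234 from Acme Corp, dated 2022-03-01, total due $5,400."

    prompt = (
        "Extract the invoice number, vendor, date, and total from the document "
        "below and return them as JSON.\n\n"
        f"Document: {document}\n\nJSON:"
    )

    # Zero-shot: no labelled examples, just an instruction plus the document.
    response = openai.Completion.create(
        model="text-davinci-002",
        prompt=prompt,
        max_tokens=128,
        temperature=0,
    )
    print(response["choices"][0]["text"].strip())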


> show me a demo where it answers a hard question whose answer you - all of us - don't already know.

For that, we need artificial comprehension, which we do not have. Artificial comprehension (the ability to reduce systems to their base components, then virtually operate on those base concepts to define what is possible, to virtually recreate physical working systems, virtually improve them, and have those improvements be physically realizable) is probably what will finally create AGI. We need a calculus for pure ideas, not just numbers.


I'm not really sure what you mean. This seems to be another instance of the weirdly persistent belief that "only humans can understand, and computers are just moving gears around to mechanically simulate knowledge-informed action". I may not believe in the current NN-focussed AI hype cycle, but that's definitely not a cogent argument against the possibility of AI. You're confusing comprehension with the subjective (human) experience of comprehending something.


> 'make the map the size of the territory'[0], which is why they only succeed at a narrow class of interpolation problems

I take it you have not seen the recent Dall-E 2 results? Clearly that model is not just working on a narrow space.

See https://openai.com/dall-e-2/ and the many awe-inspiring examples on Twitter


Tech has a bad habit of conflating comp/prestige with skill. I have no doubt the OP is quite good at what they do, but you not being where OP is does not therefore imply you don't have skill.

Unfortunately the tech world is not really a meritocracy.

When I look at my own circle of technical people, the most incredible ones in terms of pure technical ability are divided between working at FAANG making 500k+ and working at relatively unknown companies making ~200k or less. One of the most mindbendingly brilliant people I know is working in relative obscurity, known very well only among other people who are top in the field, but their resume looks very ordinary compared to their behind-the-scenes contributions to major projects.

Managing a career in tech is largely independent from technical skills and abilities. I have met a shocking number of people making lots of money at prestigious institutions that are "meh" as far as technical ability goes (of course there's some great ones as well), and have met plenty of brilliant people working relative obscurity.

The success is largely a function of both background (Brown does beat a "no name EU university") and personal desire to have a prestigious career. There is a lot of self promotion going on in this piece, in fact the OP has already convinced you that they might be just a wildly better person than you. If they can convince you they are this amazing, then they also can convince the leadership team at a start up. But do recognize that their skill demonstrated so far is only in convincing you of this.


There is more to being a great tech employee than just being 'brilliant' at the hard skills. Soft skills are just as important, and they play a role in why I have been promoted more than peers whose skills surpass mine ten-fold. Some people also don't want to be in management.

We all have different trajectories and choices. This comment makes it seem like if you aren't a technical wizard then you might as well be useless. This is not reality.


Probably a combination of 1, 3 and 4. There are tons of talented people even at the top but the stars have to align to achieve more than your expected value.

PhD -> low-level dev -> FAANG mid-level is nothing to scoff at so you're doing pretty well.


Also, the author grew up in the Bay Area (early exposure to how the SV ecosystem works), went to an Ivy League school (opens doors to top internships), landed those internships, and got a master's. All those things, but especially the internships, can help fast-track your early career.

I’d take some things here with a grain of salt, like “low 7 figures compensation (staff level)” at FAAN (can eliminate G because they’re not likely to hire him back at L+1 immediately after he quits). ML is still somewhat hot, but 7 figures is an outlier for staff-level comp.

AAPL: ~$450k https://www.levels.fyi/company/Apple/salaries/Software-Engin...

AMZN: ~$600k https://www.levels.fyi/company/Amazon/salaries/Software-Engi...

FB: ~$600k https://www.levels.fyi/company/Facebook/salaries/Software-En...

Nobody there is reporting $1M+ offers for staff level. While I’m sure it’s happened, it’s pretty far outside the staff pay band (excluding equity gains during the 2020-2021 market run-up, which are sadly behind us) and would be a truly exceptional offer even in the current climate. That, plus the fact that it sounds like he didn’t get many formal offers (“I did not initiate the formal HR interview process with most of them”) and wasn’t pitting offers against each other, makes me skeptical.


Without knowing the details, it is mostly #4, with a good dose of #3, and potentially a decent amount of #1.

Basically, yeah, small/not-yet-massive startups have insane overinflation in titles. Had plenty of former college classmates who became "VPs" or "staff engineers" at super small startups a couple years out of college. Getting plenty of recruiter messages on linkedin myself for "staff engineer" positions at random startups, despite me not even being a senior at a FAANG yet, and only being about 4.5 years out of college.

Another thing is, no matter how smart or hard working you are, being in the right place at the right time is extremely important. It won't help much if you lack skills, but being in the right place at the right time is like a force multiplier on your skills and the work you do. Which is partially why most of the big opportunities are still heavily concentrated in a few geographic spots (despite there being no tangible technical need for that).

Don't beat yourself up over it, titles don't mean that much. You are able to start a one-man-shop LLC and call yourself a VP, a director, or whatever else you want. The real question is, with that title, are they being compensated as much as you are? If they decide to quit and get a job at a "regular" tech company after, will that VP title translate into anything more than an L4/L5? Just some food for thought.


If it makes you feel any worse: a fresh college grad on my team at a midwest company, one month into his first job ever, just got an offer from Google making more than I am now. I have over a decade of experience and endless Cloud Architect certs (all 3 clouds) as well as a background in finance.

Right place, right time + talent + willingness to take risk.


I'm the same deal as you. I don't really have any problems with it. I worked at a startup and titles are much fancier but you don't get to ship to a billion users. Also the infrastructure can suck real bad.

Competing within a giant company for perf ratings feels like school and I'm over it. But the other parts of the job are great.


Probably 1 + 3 + 4. He didn't just work at Google, he worked at Google Brain. And with only a Bachelor's, so I assume he is both very smart and got very lucky.


He got a BS+MS from Brown, I think.


I was a VP at a startup (~200 people) before I was 30, without a PhD. It was all BS, and I had fewer manager qualifications than a FAANG line manager. I got lucky.

It's clear from the blog post that the author is in the same boat. They lament CEOs not having time to do research but took a VP position. An actual VP doesn't have time to do research, so they're clearly not an actual VP. So they're likely a tech lead with an inflated title.


true


A few comments:

1) The PhD carries a huge opportunity cost.

2008-2014 is 6 years of time; for me, it was the delta between starting my career as a junior engineer and becoming a tech lead at a hot unicorn which let me pivot to a CTO role at a small startup.

2) Academic credentialism has real effects.

This guy did a CS degree at an Ivy in the US. He has been set up for commercial success in the US tech industry through a halo effect you cannot access unless you gained access to that institutional grooming at the same age. By choosing to do that PhD in the EU (and at a no-name university at that), you forfeited that access.

In my experience, while the effect of this goes down over time, it has extremely strong launch + early compounding effects.

3) Risk tolerance can work to your benefit or against it.

You are working at a FAANG which is the safest and most cash lucrative option. In all likelihood, you have a great WLB and now a great blue chip brand on your resumé. However, the cost of this is that you're generally not going to get access to projects or culture that, by virtue of your participation, set you on an extremely steep growth path.

To get access to that, IMO, there's no real alternative to achieving strong outcomes working at a startup. Of course, that can be hard to do -- how do you figure out which ones are future winners, and how do you get them to let you come on board? I have no great answer other than early-career trial and error (accepting that some of it will work out poorly, and uncomfortably so).

I wouldn't say that "OP is a beast" per se, but it's much more likely that they have been groomed (working in the right conditions) in ways that you may not have been. And yes, startup titles are not comparable to big-company titles. It's apples and oranges.

The company he joined is a Series A startup, so absolutely an early stage company where, whether you're VP or CXO, you're functionally going to be doing a player/coach role at most, with tons of strategy baked in. But I wouldn't call that inflation, per se. Sure, it's not the equivalent of being an experienced people leader and executive at a big corporation manning a giant organization at its helm. But you are oftentimes in charge of significantly more responsibility and do not have bureaucratic friction and slow pace to hide behind. Doing a startup is just different. It's insanely risky, overall has poor risk-adjusted rewards, and often is a magnet for shady characters. But if you can filter the wheat from the chaff, you get access to the best career opportunities available, bar none.


> Working in the right conditions (e.g. Bay Area, Big Tech, right team) can greatly accelerate your growth

This is the answer. I have grown more in ~7 years* of random SFBA startups than I did in the previous 13 years of my career in Europe. Just because the kind of startup that's a dime a dozen over here is a once-in-a-lifetime opportunity back home.

To put this contrast into numbers: in 2021, during the pandemic, while "SFBA is dying" was the meme, the Bay Area raised as much startup investment as all of Europe.

*I wasn't as career aggressive as I could've been, mostly for visa-related reasons.


Everyone starts from a different position. I don't think it's worth letting yourself get irritated or depressed by others' success. Just try to change the position you're at to a better one.

And... of course startups have title inflation ;)


Machine Learning expert is the new Web Developer, so expect the former title to become diluted very quickly.


How about Anthony Levandowski?


I know a former Amazon engineer. After working at Amazon as a mid-level engineer, he co-founded his own startup in Mexico as CTO.

It's a startup... titles in a 50 people organization don't compare to 50,000 people organization titles.

I'm sure you can go and be a VP at a startup too, if that's what you want to do. Just go and network at Incubator, Investor, & Entrepreneur events/meetups/organizations, and come up with an idea & customers, then execute and try to get customers on board... rinse and repeat.


This guy is another Dan Luu. Insufferable posts.


Ouch



