Ask HN: Has anyone successfully pivoted from web dev to AI/ML development?
154 points by ent_superpos 6 months ago | 97 comments
I am currently working as a senior full-stack web software engineer. I have a BSc in Computer Science, and on my own, I've been learning more about AI/ML/deep learning. I really enjoy working with it, and I'd love to find a way to work on AI stuff professionally. The problem is that I've been working as a web developer professionally for about 10 years now, and I have no idea how I would pivot to more of an AI/data science role.

Does anyone have an experience of making this transition? As a web dev, I am senior level, but I'm sure I'd have to start from scratch on some things in the AI space. At least I have a good foundation of programming in general, math, and computer science.




Do you want to train models from scratch, or do you want to build cool things on top of AI models?

If the former, I suggest digging into things like the excellent Fast AI course: https://course.fast.ai/

If the latter, the (relatively new) keyword you are looking for is likely "AI Engineer" - https://www.latent.space/p/ai-engineer

There's an argument that deep knowledge of how to train models isn't actually that useful when working with generative AI (LLMs etc) - knowing how to train or fine-tune a new model is less useful than developing knowledge of the other weird things you have to figure out about prompting, evals and using these models to build production-quality apps.


This is the right question. There are relatively few people doing "Capital A Capital I" work building and training generative language/image models. Especially now that people have realized that calling the OpenAI API is going to work better than dedicating a team to tuning your own BERT model.

So some options are:

* Building ML models for traditional applications - forecasting, ranking, recommending, all of that. Data Scientists and ML Engineers haven't gone away.

* Working as an "AI Engineer" building on top of existing APIs and models. Very hip, but also in flux - I couldn't tell you whether that role will still be considered rare & valuable in a few years time, or what skills will be core to it.

* ML Ops, engineering work for building & serving AI & ML models. It's always good to be selling shovels.

I would try to get into a team you're interested in as a SWE and then upskill or pivot. In my experience this is more effective than trying to completely reskill and sell yourself as an unproven prospect for MLE or AI Scientist work. AI/ML teams still need software, and in fact many of the best researchers are not great software engineers.


> Especially now that people have realized that calling the OpenAI API is going to work better than dedicating a team to tuning your own BERT model.

At 100x the operating cost.

If you have any kind of scale, using a model that's appropriately sized to your task is going to be better.


> If you have any kind of scale

I think that's the key also in terms of team size, if you don't have a dedicated team to take care of training / running a model then it makes sense to use an existing API and pay for it until you have enough scale.


That is a fair point.

Lots of times building the inefficient thing to prove market value is better than building the scalable thing that doesn't solve anyone's problems.


Gemma2 might be good enough for a drop-in option as well. The benefit for those who built on ChatGPT API is that as more open weight models get released, they can silently migrate their API usage to self-hosted models … ChatGPT allowed them to prove product-market fit early!


I want to agree with this, but after reading the whole article, I have no idea what the skills of an AI Engineer are.

Why is that not the job of a software/product engineer?

> none of the highly effective AI Engineers I named above have done the equivalent work of the Andrew Ng Coursera courses, nor do they know PyTorch, nor do they know the difference between a Data Lake or Data Warehouse

they're explicitly not trained in ML/AI. any software engineer can write a good prompt, call an API, and deploy that on an http server.

why is that not just software/product engineering?


An AI engineer wires up APIs to each other and returns the result as JSON, the same process as any other web dev.

Like any other web dev job there are differences across domains that can make it valuable to hire someone with past experience in your particular industry (in this case LLMs), but the only reason this gets a brand new title and others don't is hype.


So much less complicated than your average web developer job in 2024?


If you're only doing backend AI work, yeah, probably.


It basically IS just software/product engineering. An AI engineer is a product engineer who's ahead of the curve from everyone else in terms of building on top of LLMs and similar.

It might not even still be a speciality in a few years time. Right now though there's a lot of depth to the field that people who aren't focusing on it are missing out on.


The fact that in 2024 you can still get a sub-1-hour, applicable reply from the creator of one of the top web frameworks ever is everything that makes HN amazing.


One thing I've encountered more than once in the industry right now is that a lot of companies want to hire "AI Engineers", but they task their staff Very Serious Data Scientist with handling the interview. Inevitably, all the questions will be minutiae about different ANN designs and training processes. And without fail, no one involved will be actually dealing with that stuff on a day to day basis.


I agree with your overall point. I'll add that the Fast AI course really does focus on building practical things, and I suggest it for anyone who wants to learn the basics of neural networks, even if you'll never train one from scratch.


This is an extremely good insight and piece of advice. Already we're starting to see what looks very similar to a "frontend" and "backend" kind of development model similar to the previous tech generation.

While there are advantages to going "full-stack" in this analogy, most people focus on one or the other.


I looked at the latent.space site and it didn't seem to be updated/maintained. What I would like is a reputable playlist/roadmap of things to consume to go from Senior Engineer to AI Engineer.

For the foundation you need, what, a primer on linear algebra + calculus + stats + probability? At a minimum you have to know the language. Then, do you need anything of classical AI? Anything of classical ML? Do you even need to have any knowledge of Deep Learning other than that it exists? Or transformers? Or do you just need a bunch of tutorials on how to use GPT APIs, structuring and managing prompts, etc.?

I know all this information is out there but has anyone linked it together? If so, is it paywalled? I did a quick search on AI Engineer and before I knew it a site was asking me for 20k+ and 1.5 years of my life. I already have a master's in CS, so I assume I can make faster strides and do it for less. Can anyone advise?


It's not too uncommon. I started off working with Angular and Java. But I studied math.

It depends on what type of role you want. If you'd be happy building the application layer and doing prompt engineering, just build applications that call LLM APIs.

If you want a research position at the top labs, the interviews really are actually passable by people without PhDs. They are really focused on having strong fundamentals. I've seen people make this leap but it can be years of preparation. Like actually reading textbooks, implementing low-level details like backprop, re-implementing papers, and doing non-trivial personal projects. Essentially, you're self-studying a Masters degree. Blog about it. Post about it here. I've found that people who make this transition just generally love learning.
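
As a rough idea of what "implementing low-level details like backprop" can look like in practice, here is a minimal sketch. It is a toy example, not taken from any particular textbook or interview: the one-hidden-unit network, the tanh activation, and every function name are assumptions made for illustration. It derives the gradients by hand with the chain rule and checks them against a finite-difference estimate.

```python
import math


def forward(params, x, y):
    """Toy network: h = tanh(w1*x + b1), pred = w2*h + b2, loss = (pred - y)^2."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)
    pred = w2 * h + b2
    loss = (pred - y) ** 2
    return loss, (h, pred)


def backward(params, x, y):
    """Hand-derived gradients of the loss, applying the chain rule layer by layer."""
    w1, b1, w2, b2 = params
    _, (h, pred) = forward(params, x, y)
    dpred = 2.0 * (pred - y)      # dL/dpred
    dw2 = dpred * h               # dL/dw2
    db2 = dpred                   # dL/db2
    dh = dpred * w2               # dL/dh
    dz = dh * (1.0 - h * h)       # tanh'(z) = 1 - tanh(z)^2
    dw1 = dz * x                  # dL/dw1
    db1 = dz                      # dL/db1
    return [dw1, db1, dw2, db2]


def numeric_grad(params, x, y, eps=1e-6):
    """Central finite differences, used only to sanity-check backward()."""
    grads = []
    for i in range(len(params)):
        up = list(params)
        down = list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((forward(up, x, y)[0] - forward(down, x, y)[0]) / (2 * eps))
    return grads


def sgd_step(params, x, y, lr=0.05):
    """One plain gradient-descent update on a single (x, y) example."""
    grads = backward(params, x, y)
    return [p - lr * g for p, g in zip(params, grads)]


if __name__ == "__main__":
    params = [0.5, -0.3, 0.8, 0.1]
    print("analytic:", backward(params, 1.5, 0.7))
    print("numeric: ", numeric_grad(params, 1.5, 0.7))
```

The gradient check is the useful habit here: if the hand-derived and numeric gradients disagree, the derivation has a bug, and that is much easier to spot on four parameters than on a real model.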


AI problems turn into data problems. The happiest and best compensated people I know in that area have gone into data engineering, because data engineers are the ones selling shovels in this gold rush.


Could you please elaborate a bit more? I am your typical web-dev/frontend/backend developer, don't really know much about what is involved in data engineering.

Do you mean collecting, cleaning data? Or setting up databases (if yes, how is it different from me managing my employer's databases, except for size)?

In other words, what does a data engineer do all day?


Several data engineers that I know in this space (AI for business applications) are doing the usual ETL + mapping pipeline related work. Nowadays a lot of them are also having to deal with unstructured data such as textual reviews of products, service quality feedback, policy documents. Data engineers are developing the pipelines for chunking, vectorization, ingestion into a RAG pipeline, and for LLM training and fine-tuning. So it's still collecting, cleaning data, but quite different at the next level of detail, in my opinion.
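
To give a feel for the chunking and vectorization step, here is a minimal sketch under stated assumptions. The function names, the record layout, and the chunk sizes are hypothetical, and `toy_embed` is only a stand-in: a real pipeline would call a learned embedding model and write to an actual vector store rather than return a list of dicts.

```python
import hashlib
import math


def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of `size` words; neighbouring chunks share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def toy_embed(chunk, dims=64):
    """Stand-in for an embedding model: hashed bag-of-words, L2-normalized."""
    vec = [0.0] * dims
    for word in chunk.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def ingest(doc_id, text, size=50, overlap=10):
    """Build records that could be loaded into a vector store (here, a plain list)."""
    return [
        {"doc_id": doc_id, "chunk_index": i, "text": chunk, "vector": toy_embed(chunk)}
        for i, chunk in enumerate(chunk_words(text, size, overlap))
    ]


if __name__ == "__main__":
    review = "The delivery was late but the support team resolved it quickly " * 5
    for record in ingest("review-1", review, size=12, overlap=3):
        print(record["chunk_index"], record["text"])
```

The overlap exists so that a sentence cut at a chunk boundary still appears whole in at least one chunk. Most of the real work described above (cleaning, deduplication, handling different document types) happens before this step.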


Thank you. How does one get into this field? Web dev isn’t that exciting anymore


The data engineers I know started out as data scientists. They discovered that their ability to make models that did anything useful was hampered by low-quality or absent data. Then they discovered that there was a lot more job satisfaction (not to mention much better pay) in making useful data available than there was in clicking the "train" button over and over in slightly different configurations.


> How does one get into this field?

Network and meet someone who will give you, a web developer (a profession that is heavily stigmatized in the industry), a chance in a very competitive field.


I didn't realize web devs are stigmatized in the industry, as a web developer


They are not. There is a stigma associated with the ecosystem I think.

People who can create beautiful and user-friendly interfaces are extremely valuable and respected.

But my experiences trying to set up a typescript project and the unbelievable number of frameworks and crazy things to navigate (mjs/cjs) made me yearn for the pains of the Python ecosystem again.


Web developers are, front end engineers are not.


I would give the exact opposite advice - data engineering is one of the least rewarding or respected career paths.


If you find that this is the case in your organization, watch out. It's a sign that you're trying to build performative models that aren't grounded in reality. Depending on your organization's goals, this may take a long time to catch up with you.


Data engineering feels like the DevOps of AI/ML. Important, but you have to have the right interests for it.

This comes from someone doing web dev after studying data science, so I don't know how well it reflects reality as I've never worked as a Data Eng. myself


> Data engineering feels like the DevOps of AI/ML.

I wouldn't label it that way, AI/ML has its own ops and devops which is very distinct from data engineering.


A large organization that can pay stable wages over time is going to look for formal credentials when making a hiring decision. Further from that kind of (big, bureaucratic) group are quickly assembling and then dissolving entities, in other words "nice work if you can get it", with far less stability. Another term for this is "the Wild West" environment. There are perhaps some applicable analogies to the way Hollywood makes movies, but with far less grounding.

In the Hollywood example, from what I know, small "tiger teams" assemble with fundamentals, then quickly farm out the sexy work to disposable contracting firms, who then hire even more disposable people with various skill levels. In other words, lots of fun and excitement but also lots of workplace abuses and low stability. Over time, Hollywood formed unions for a surprisingly large number of roles (like writers) because the real truth of business is not pretty. Needless to say, Silicon Valley has moved very quickly, and the Hollywood stories are not exactly applicable.


Yes, an AI team was created in our company to bring it in house, and I was invited mainly to integrate it into the web app, and to do some ops work. A year later and I'm fine tuning models, building datasets, working with PyTorch. Much different than webdev, not as rewarding sometimes, more unknowns, longer feedback cycles. The main issue is getting enough data quality and quantity, which can be a grind. Happy to have taken this opportunity though. Endless things to learn.


How good are you in math? Do you know linear algebra and calculus?


AI, ML, and data science are all different things. And there are different types of jobs in each of those categories.

If you want to apply AI, there are lots of really useful projects that are just calling the Anthropic or OpenAI API for the AI part. Or replicate.com image models etc. That wasn't the case a few years ago before we had the general purpose models. I have been doing a lot of those types of projects and I don't have a machine learning background.

There are ML Ops jobs that don't require a lot of machine learning knowledge.

There are ML researcher jobs that are just training LLMs, which are more practical than theoretical.

To do novel machine learning research or at least significant variations of popular neural network architectures, I think that is the only thing that really requires years of study. But I think there is a very large gap between that type of work and web development. Which is why I was very happy to see the progress in general purpose models.


Totally possible. About 30-40% of the people in our AI team don’t have a formal AI background. Especially with LLMs a lot of work has shifted towards “data literate software engineering”. We call them AI engineers / AI developers. Good development skills are very transferable to those roles.

Feel free to reach out if you’re in the EU (email in profile), we’re hiring. Also happy to give some pointers on how to approach these conversations.


I'm always skeptical of these roles. Very often they do not involve anything AI related which is why a formal AI background is not required.


Depends on your perspective. I think it’s like how front-end development likely means you work _with_ React, not that you’re writing front end frameworks from scratch. In a similar way AI devs in our team work with LLMs, but they don’t create them from scratch.


The difference is the frontend role would require a real understanding of React, but these "AI dev" roles don't require a real understanding of AI.


Exactly.


This post shows why programming as a career overall sucks. Sure it’s great if you really enjoy programming. However, staying relevant to earn a decent living your entire life is difficult.


I would argue this post (and the majority of resultant comments) have demonstrated that programmers staying relevant isn't as difficult as it seems. They were curious about how to pivot, had a forum in which they could ask, and in minutes started getting a wealth of practical, actionable advice from folks who have done the same or similar. The theme so far is that those programming skills aren't obsolete just because you want to change the vertical you work in and learning materials to help you achieve that goal are abundant.


Right, it’s a lot easier to shift than it would be to go into an entirely different kind of law, or become a different specialty of doctor.


Some people like learning new things. If you go into tech you should know that it's a career where you basically have to retrain every 5 years. But in return for that, you get high wages and low barriers to entry. If you're someone who enjoys learning new skills, this is a profession tailor made for you.


How do you know what to retrain in?


Follow your interests.


I prefer the framing of "the intersection of things you enjoy doing, things you're good at, and things that other people find useful". There are unhappy failure modes if any of the above are missing. The high-flying corporate lawyer who's a miserable divorced alcoholic is what you get when you're good at something others find valuable that you don't enjoy. The underperformer who gets fired from every job they like is when you enjoy doing a job others find useful but aren't very good at it. The broke Millennial guitarist who followed their musical passions is what you get when you enjoy doing something you're good at but nobody else finds it useful.

If you follow your interests, though, it at least guarantees you'll be interested in what you're doing.


At least in the U.S. it's been one of the greatest careers possible imo by most objective measures - money, comfort, working conditions etc. I'm saying "has been" because job security plummeted in the last year or two and I'm not sure if it's even going back to what it was before.


Maybe in what we think of as traditional tech, but there is enormous job security working as a developer in government or government adjacent organizations. You obviously take a pay hit, but also don’t have to live in an area with a huge cost of living relative to CA, NYC, etc.


Ignoring industries built on regulatory capture / credentialism gatekeeping like law and medicine [by the way, even those both have continuing education requirements], are there actually exceptions to this?

Plenty of careers just go away. Might as well pick one where you can stay relevant by picking up incremental/adjacent skills continuously.


Electricians and plumbers have it pretty good in this regard. I don't see that being replaced anytime soon and there's not all that much to learn and a lot less changing on you.

Maybe we'd get bored though.


Not as sure about electricians, but plumbers at least pay for it with their body. Even though PPE can reduce the strain, there's still a reason why these trades haven't just shot through the roof. Bad knees, bad backs, respiratory dangers, etc.

Trades in general are fraught with physical perils for the unaware.


Very true, though, those jobs also present significant occupational hazards, unlike software where the biggest threat you face is a sedentary lifestyle.


I’m a 15-year SWE. If I sat at a desk or in an office/cube all day I’d lose my marbles.

The software I write and ship also requires me to be able to grab a socket set, take panels off things, fish out DB9 connectors, run wiring, etc. Testing also happens in heavy snow in the winter and high heat in the summer. Oh, lots of travel as well.

I’ve been interested in finding a different job, I’m just worried I would get very bored very quickly, tying together api calls or stitching together libraries.


Embedded software?


Kind of, yeah. Not in the traditional sense, it’s about 80% x86_64.


> the biggest threat you face is a sedentary lifestyle

Which is a serious threat, to be clear. It's just more easily mitigated.


It's absolutely a serious threat, but being easier to mitigate is the crux of the distinction I'm making.


i personally love the expectation of constant learning, growth and innovation in our field

but yes, anecdotally, compared to all of my friends and family, i don't know any profession with those same expectations. to name a few - market researchers, psychologists, primary school teachers.


I think it's the opposite, there is a growing bubble of interest in hiring for AI stuff. In 5 years time your RAG pipeline model training career is going to pop and become my new `brew install ai-thing`. If I need image recognition/generation or LLMs, I'll call openai APIs the way one might call stripe or Spotify APIs. Don't trust them? Claude. Don't trust anyone? Self-hosted with RAG will be good enough without model training and easy to use by then.


I am a web dev and I think I’ll stay relevant to earn a decent living as a web dev until I retire (it’s hard to predict 20 years ahead, but definitely the next 5-10 years at least).

It seems the author just wants to change their career, not necessarily because they won’t be able to earn money if they don’t.


I don't think web development is becoming irrelevant anytime soon.


I enjoy letter-writing. So should I go on writing physical letters to everyone today? Moving with technology is essential, not just for developers, but for the public as well. The trap is that we have to do it even if we don't want to. E.g.: your neighboring country has nukes and you don't.


> However, staying relevant to earn a decent living your entire life is difficult.

You either stay relevant (or close to relevant) or lose your job. Unfortunately, this is the field in the 2020s.


This is one of my favorite aspects. I love learning new things. I love being in a field where there are new things to learn every decade or so.


The fundamentals of how computers work are essentially unchanged since the 60s or 70s. If you have a strong foundation (typically a CS or CE degree), picking up new stuff shouldn't be particularly difficult.

I mean, I just had a PR merged in a language I had literally never used before. It took me five minutes to pick up the basics. Sure it would take much longer to be fully productive, but it would be a comfortable transition.


Not sure if my experience is relevant, but I did a couple of internships in web dev during my bachelor's degree in CS and quickly realized it wasn't for me. I then did a master's and now a PhD in medical imaging where I extensively use machine learning (design and train my own models, doing both supervised and RL) but I wouldn't say I am a researcher in AI/ML.

Because I am still in the academic process, I had the opportunity to take a couple of classes on the subject. Three books that I would recommend going over to make sure your foundation in ML and mathematics is solid are

-Pattern Recognition and Machine Learning by Christopher Bishop

-Mathematics for Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong

-Deep Learning by Goodfellow, Bengio and Courville

All three are legally available online in some form. I can't say I have any experience in finding a job related to ML though.


There is so much in the ML world that's being built in the open which you can use as a foundation for your learning. Some folks here already mentioned Jeremy Howard's fast.ai course which is absolutely a great place to start for anyone that a) has motivation to learn DL, and b) already knows how to code.

Beyond that, there are hundreds of open source projects you can fork to start building intuitions around what the inner loop of AI dev project cycles looks like. You'll be surprised at how much of your web dev skills remain relevant in these projects, particularly in UI-related tasks. Data scientists and folks of similar ilk default to notebooks, gradio, streamlit et al. to ship interfaces for their experiments. You though have the ability to do that on your own if you choose to (sometimes a notebook is enough), which can be a valuable differentiator for you as a candidate if you also have all the other skills needed to be productive in this space.

My own background is in distributed systems with some full stack and embedded work mixed in over the years. I started tinkering with ML projects back in 2012 when I first discovered AlexNet and resources were far more limited. I was still able to get productive relatively quickly even though most of what I built wasn't really applicable to my work in a practical sense on day one. Where my background became relevant was when I needed something approximating an MLOps pipeline for data processing, training, and eval. Most of the code you're writing for that isn't really specific to ML; it's nearly identical to CI/CD systems but with the obvious infrastructure caveats native to ML workloads.

Nowadays though, especially if you're intrepid/resourceful, there is so much more learning material by comparison and much of what you work on can likely augment your day-to-day web dev tasks as well if you're creative enough.


Yes. Use your CS and math background to get a job in Operations Research. Use the OR techniques to do V&V on applications of AI/ML models.

I.e., Verification that the application/model does the job correctly and Validation that the app/model is appropriate for the business case.

This is not exciting but it pays well and people with the skill set are extremely needed.


Could you kindly give some resources to read more about this area? Sounds fascinating


I made this transition. I am developing a project in the area of computer vision -- with intensive use of drones -- to serve Brazilian agribusiness. At the moment the challenge is to gather a dataset of images of the livestock breeds and crop varieties most produced and cultivated in the region, for detection, classification, monitoring, counting. The field of work is new and promising here. From a technical and professional point of view, it wasn't particularly difficult to make the transition, because I have an extensive background in data analysis and science. The difficult part is working on sales with a new type of customer.

From where I stand Computer Vision seems like a really good area to start in machine learning. Good luck!


Here’s a thread also asking about this from last week: https://news.ycombinator.com/item?id=40797858


For me, the shortest path I found from full-stack Web Dev to full-on AI Rockstar was this one YouTube video: https://www.youtube.com/watch?v=Yhtjd7yGGGA which clarified for me, in just 40 minutes, exactly how to 1) leverage my knowledge of Postgres into getting my hands dirty tokenizing a huge repository of any content into pure magical numerical data inside a vector db, 2) apply my years of experience building local site search UIs into the actual nuts & bolts of transforming a search query into an effective AI prompt, and 3) do all that on a cheap little EC2 server without a single GPU in sight! The video is over a year old now (!) and there are more recent updates and similar how-to's elsewhere, but this short, practical, and honestly amazing story of a guy crushing on the perfect first use case for AI on a website spoke directly to my web dev soul, and really demystified all the AI hype for me, so much more so than the firehose of AI-wrapper content ever could!
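
For anyone wondering what "transforming a search query into an effective AI prompt" amounts to, here is a minimal sketch. It makes no claim about what the video itself does: the record layout, function names, and prompt wording are all assumptions for illustration, and the vectors are tiny hand-made stand-ins for real embeddings. With Postgres, the ranking step would normally be pushed into the database (for example via the pgvector extension) instead of being done in Python.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, records, k=3):
    """Return the k stored records whose vectors are most similar to the query vector."""
    ranked = sorted(records, key=lambda r: cosine(query_vec, r["vector"]), reverse=True)
    return ranked[:k]


def build_prompt(question, passages):
    """Wrap the retrieved passages and the user's question into a single prompt string."""
    context = "\n\n".join(f"[{i + 1}] {p['text']}" for i, p in enumerate(passages))
    return (
        "Answer the question using only the numbered passages below. "
        "If the answer is not in them, say so.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )


if __name__ == "__main__":
    # Hypothetical site content with made-up 3-dimensional "embeddings".
    records = [
        {"text": "Returns are accepted within 30 days.", "vector": [1.0, 0.0, 0.0]},
        {"text": "Shipping takes 3 to 5 business days.", "vector": [0.0, 1.0, 0.0]},
        {"text": "Gift cards never expire.", "vector": [0.0, 0.0, 1.0]},
    ]
    hits = top_k([0.9, 0.1, 0.0], records, k=2)
    print(build_prompt("What is the return policy?", hits))
```

The resulting string is what would be sent to whichever LLM API the site uses. The site-search experience carries over directly: retrieval quality decides most of the answer quality.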


>full-on AI Rockstar

What does your daily work look like? What is your salary?


One workable pathway is to join a team working on AI products as a full-stack engineer. AI practitioners notably are less inclined to do frontend or even backend work, even more so when talking about SWE best practices and moving code to production. So there is plenty for you to offer given your 10 years tenure in dev.

Then learn by symbiosis, while having AI on your resume :)

Feel free to get in touch if you want to chat more about this.


> I'd love to find a way to work on AI stuff professionally

Working on AI can mean many different things.

If you're looking to pivot into a more research-y position in DS and/or AI research, I'd suggest getting an advanced degree in these fields.

If you're talking more about ML engineering, there're full stack software engineer positions at some startup AI companies that require an assortment of skill sets such as web development, MLOps, and sometimes a bit of data engineering. You could look into these roles. Alternatively, since ML engineer is still an emerging position at a lot of organizations, some do not require prior experience but instead focus more on the candidate's portfolio. Create some projects and build a strong portfolio, and you might have a good chance.


Hi all. Wow! I'm blown away by the wealth of responses this post has gotten. I truly appreciate all of you sharing that knowledge. I especially appreciate the separation of disciplines that have been explained.

I think I would probably fit more into the AI Engineer category since I would have to do a lot more study for AI research, but I do enjoy trying to use existing models and libraries to accomplish tasks. I can also create toy models myself (I actually built a ConvNet in PyTorch to detect popup dialogs on my screen and alert me), but I'm nowhere near good enough to create entirely new novel approaches or architectures.


This is essentially what I’ve done, having started 15 years ago with Ruby on Rails and now leading development of most ML-related applications at my current job. I’ve been interested in ML since 2017 or so, just found it interesting.


Successfully pivoting means learning and shipping things yourself on GitHub and talking about it on your own YouTube journal.

Specifically with AI/ML the urge might be to start from scratch but I think there might be enough tooling to start where you are with web, build solutions using existing AI tech, start customizing it and going deeper and deeper.

Makes for a natural story and journey too. It’s completely choose your own adventure so you can direct the vector of your path where you want and learn & build in that direction.


I agree with your general advice about shipping stuff, starting from stuff you'd want an LLM or agents to do for you. Journaling is good too.

I'm not sure if I agree with you on the tooling end. If you're talking about tooling as in calling OpenAI APIs vs local llamaing/fine tuning your own stuff, I definitely agree with you. If you're talking about going with abstractions like LlamaIndex and Langchain, I strongly disagree. Honestly, I'm almost entirely against all abstractions outside of something like LiteLLM.

Langchain, for instance, abstracts the entire learning process away from developers, which honestly translates into few communicable skills after building apps with it.

Pretty much all of my learning in this space was from building a product head first and figuring out how to give myself the tools to get me there. You pick up a ton along the way. For instance, you learn a ton from trying to figure out how to build your own text splitter, chunker, and document extractors. Building from scratch is what makes you useful to companies. Understanding the why and how behind the tools is the difference between developing just another chatbot versus developing hugely value-driving solutions for businesses. There's diminishing demand for the former, tremendous demand for the latter.


Appreciate your thoughtful reply - the best input that can be given is general anyway, leaving each of us to find a way to apply it back to our own world :)

That's fair about your concern about the tooling. In case you're interested, happy to share a bit more since I come from a web background too a long time ago, but quietly kept my skills at least passable for hobby learning.

Knowing how the tools work is relative to how fast they're evolving, the impossibility of keeping up, let alone finding an application for it (shipping).

I spent pretty much 8-10 hours a day last year keeping up, and it felt like 1 month worth of normal tech progress was happening in 1 week, and stepping away quickly caused a gap.

One viewpoint I opened up to was solving a problem with a web app and adding more and more AI tooling, first starting with an API and then going beyond. That is critical because there's a new programming language.

Prompting. Chain of thought, all that stuff. I remember playing around with GPT and having conversations and getting much better results quietly for a very long time until a CoT paper came out and I scratched my head.

All to say, trust your ability to learn and get to it. The lake of all this stuff is becoming a sea and much bigger very quickly, but the tooling is improving. If I tried to learn everything from scratch, it would have been harder 2 years ago than now.

Now, I agree that tying oneself too much to one framework is risky. Frameworks in AI are early, and don't have a shelf life of years... maybe 6-18 months. Langchain is even starting to fall out of favour for a few others that are simpler from the results of just building and shipping.

You do pick up a ton along the way. I have known Python for a long time and am glad I kept it around instead of taking best career practice and floating into management and leadership and getting off the code. The combination of business / problem solving experience in the real world helps a lot to balance out what to do.

I'm not sure I really agree with folks looking down on ChatGPT "wrappers"... web apps are in a lot of ways database wrappers, then. The value is in putting computers to work for people to make their lives better. So I guess part of it is deciding what this all means for you, and basing your participation on it.

Lots of places and ways to grow intrinsically, extrinsically and also make a difference in a way that means something to you.


Oh yeah, I did not mean to summarily dismiss “wrappers”, a word I have a strong distaste for. The thought there was that I have seen lopsided interest in AI from companies looking to deploy it versus normies looking to use it for any of their daily tasks.

> The combination of business / problem solving experience in the real world helps a lot to balance out what to do.

The pace at which this space moves is exhausting and exciting, but the thing that makes it exhilarating is that the newness makes it possible for anyone to have an original thought on how to generate better solutions. The problem solving experience stuff is definitely a big contributor to that. It’s fun figuring out something cool yourself, kind of a secret, and then seeing an inevitable paper or langchain tool 6 months later reflecting similar thinking.

I think we’ve progressed along this space in a similar way. The learning path probably is different now than it was two years ago on account of how much catch up there is to do. On top of that, things can get stressful when you benchmark yourself against well-funded teams.

Your last several sentences resonate with me because that meaning keeps the stress at bay. I’ve always loved automation as a means of empowerment in everything I’ve done, but in the prior two decades, it all felt like a toy hobby. LLMs seem close to being able to unlock that potential for everybody, and I find that incredibly energizing.


zooming out, the AI space is filled with insiders trading well-paid opportunities among others with very specific credentials.. in a high stakes and heavily constricted network of networks..

contrast this to an open market.. for example I make a poster for a grocery store in my town, that poster is well-liked and I have the rights to reproduce it, or make a similar new one. In every town there could be such activity, and for larger towns, many people could do that activity. That naturally scales for participation, with the transactions of pay and consumption in an open market environment.

The AI space seems much more like building large projects in tight teams with serious resource requirements. The end products are more varied than most people realize, but there is a common thread of replacing skilled humans in jobs with some kind of automation, or extracting value from humans with monitoring and some kind of enforcement. In other words, really contrasting to the open markets ideas.

Honestly I cannot be enthusiastic about putting the words "job" and "AI dev" in the same sentence. The real-world dynamics appear to be coalescing into high powered, competing silos, with a side-effect of replacing jobs in some cases.


I've mentored senior software developers transitioning to ML - everyone comes from slightly different background and experience. Feel free to schedule an intro call to discuss your specific situation https://cal.com/studioxolo/intro.


I've gone back and forth between them, but it helps that I have a Physics PhD so I am not intimidated by the math.

I got my PhD in 1998, did a postdoc in Germany for a year, came back to the states, started doing remote work and consulting projects for web sites, worked on the arXiv preprint server for a few years, then worked on a pretty wide range of projects for pay and for side projects until I got interested in using automation to make large image collections on my own account circa 2008 or so.

I had a conversation with my supervisor that called into question whether I could ever be treated fairly where I was working and then two days later I got a call from a recruiter who was looking for a "relevance architect" which had me work for about a year and a half for a very disorganized startup. Then I got called by another recruiter who needed somebody to finish a neural network search engine for patents based on C++, Java and SIMD assembly.

After that I tried to start a business to develop a next-generation data integration tool and did consulting projects, and learned Python because customers were asking for it. When I gave up on my own business I went to work full-time as a "machine learning engineer" for a startup that was building something similar to the product I had in mind. That company was using CNNs for text, I had previously worked for one using RNNs, that summer BERT came out and we realized it was important but not quite so important.

After that I wound up getting a more ordinary webdev job where I can actually go to an office, I still do ML and NLP-based side projects though.

Funny enough I am working on text analysis projects now that I first conceived of 20 years ago, I think technologically some of them could have worked but they work so much better now with newer models.

---

My take is that the average 'data scientist' is oriented towards making the July sales report, not making a script that will make the monthly sales report. If you want to get repeatable results with ML it really helps to apply the same kind of organizational thinking and discipline that we're used to in application development. Also I believe getting training data is the bottleneck for most projects: I mean, if you have 5000 labeled examples and a 20 year old classification model you might get a useful classifier, you can get a much better classifier with a two year old model with little more work, or you can try a model out of last week's arXiv paper and spend 10-100x the effort, risk complete failure, and probably add 0.03 points to your ROC.

If you don't have those 5000 examples on the other hand all you can do is download some model from huggingface and hope it is close enough to your problem to be useful.

My spurt of doing front-end heavy work built up my UI skills so I have done a lot of side project work towards building systems that let people label data.
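The "July sales report" versus "a script that will make the monthly sales report" distinction can be sketched in a few lines. This is only an illustration under assumed data: the row layout `(date, product, amount)` and the name `monthly_sales_report` are hypothetical, not anything from a real codebase.

```python
from collections import defaultdict
from datetime import date


def monthly_sales_report(rows, year, month):
    """Total sales per product for the given month.

    `rows` is an iterable of (date, product, amount) tuples; this layout
    is an assumption made for the example.
    """
    totals = defaultdict(float)
    for day, product, amount in rows:
        if day.year == year and day.month == month:
            totals[product] += amount
    # Sort by product name so repeated runs produce identical output.
    return dict(sorted(totals.items()))


# Made-up sample data for illustration only.
rows = [
    (date(2023, 7, 3), "widget", 120.0),
    (date(2023, 7, 19), "gadget", 80.0),
    (date(2023, 7, 25), "widget", 30.0),
    (date(2023, 8, 2), "widget", 55.0),
]

print(monthly_sales_report(rows, 2023, 7))
print(monthly_sales_report(rows, 2023, 8))
```

The month is a parameter rather than something baked into a one-off notebook, so next month's report is a rerun instead of a rewrite.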


How are you financially?


Could be better, could be worse.

I am living in rural upstate NY close to a college town and bought a 70 acre farm for a fraction of what a small ranch house costs in the Bay Area. My wife teaches people to ride horses, her business is probably the most profitable horse operation I know of because (1) it was bootstrapped up from 2 horses to 8 and (2) we never bought large amounts of heavy equipment; the average horse barn has a big-ass truck and a horse trailer, we can pay somebody to haul a horse somewhere for a fraction of the monthly payment on a big-ass truck. We have two houses on the lot and usually get rental income from two units in the second house (though we are renovating it now and I'm using it to gentle a feral cat right now).

I've had some terrible years where I ran up $20k of AWS bills and had no income, I did run up a balance on my HELOC which I later paid off. The good years paid for the bad years, I consistently put money into retirement funds except for the absolutely worst years. I always had health insurance except for a summer that I worked at a web dev shop that didn't offer it.


Interesting, thank you for sharing! I have about 9 acres and have been seriously considering getting horses, but the big-ass truck and trailer has been a point of hesitation for me. I did not know you could pay somebody to haul them somewhere when needed!


If you go this route you should make friends with local farmers who have the equipment and can help you out with little random things. If you want to go to shows or visit distant trailheads you need the trailer but we have an area of trails on our farm that I’d compare to Het Vondelpark in Amsterdam and can ride out on the road to some nice dirt roads.


I made the transition a few years ago by going back to school for my MS. This allowed me to re-enter the internship pipeline.


I pivoted from "full stack" to a software and infra engineer on an AI team. There was some luck involved, but I believe it helped that I'd taken some courses. I am not one of the data scientists, but work very closely with them. This seems like a good path to full time data science.


As other commenters have noted, there are different kinds of ML/AI archetypes out there:

- "Real AI Scientist/Applied Scientist/Researcher" aka you do actual training/fine tuning of bleeding edge models. Very hot right now but competition is incredibly intense. Probably you need a PhD or some serious experience to compete. Get ready to do a bunch of independent learning if you're serious about this.

- "Fake AI Scientist/Applied Scientist/Researcher" - You work in a big corporation or maybe a confused startup who wants to staff out some internal AI teams but doesn't really have true expertise in the area. Maybe if you really knock it out of the park something you build will provide real customer value...one day.

- "Real AI-ML Engineer" Scientist work under a different name, or deployment/infra for custom models. More approachable than Real Scientist work but probably more focused on engineering chops, C++, CUDA, etc. Similar to Real Scientist in that you need to have some actual legit skills.

- "Fake AI-ML Engineer" Calling the OpenAI API and massaging the output into something that is possibly valuable but more likely is just an "AI Feature" on top of an existing application that provides real customer value.

- "Non-AI ML Engineer" You work with traditional ML like xgboost, probably in the financial world, and don't really interact with any of this stuff, unless your boss asks you to create a new AI Feature. You can now put this on your resume and hope to get a Fake AI-ML Engineer job, if you want to.

- "Real Data Science" this role is trending toward inhabiting more of a BI/Analytics space. IMO in Big Tech they are getting more serious about the stats/probability background here. In some ways I think this would be more difficult to upskill on than Deep Learning math, if you're starting without a math background.

- "Fake Data Science" Kind of a dying role, this is like "I just learned how to do pandas and scikit learn and I'm creating linear regressions for boomers". Honestly still some alpha here if you are a product-focused person in the right org. But maybe the title here should be more like Data Analyst++

Hope this helps. Me myself I'm a Non-AI ML Engineer who is pretty screwed, because if you search ML Engineer now everyone wants you to know PyTorch.


For PyTorch: I have not learned it myself yet, but I am planning to, and I have heard that this is a good course for it: https://www.freecodecamp.org/news/learn-pytorch-for-deep-lea...


Does anyone have advice for engineers working on distributed systems problems pivoting to MLOps/AI Infrastructure?


Conversely, I'm an ML guy who has to do webdev to make ends meet. Tips for me?


This should be in the Who Wants To Be Hired thread


Woah, out of the cringepan and into the cringefire!


To begin, start hacking on open source AI projects, put them on your github, build up that track record. Get involved in various FOSS AI communities and collaborate with folks there [1]. The opportunities to transition to AI jobs will follow.

[1]:https://reddit.com/r/localllama



