Machine learning isn't comparable to software development. It is a statistical modelling exercise. This is like asking why advertising is hard - if a non-expert wades into a different domain, they may find it has different challenges from the ones they are used to! This is just a specific case of the things that analysts routinely deal with.
The major challenges in this youthful field of machine learning are building appropriate hardware and making it work. That, so far, has kept it the domain of the software engineer. As the situation continues to stabilise this is going to become the playground of statisticians and analysts.
Or to put it another way - if you compare any field to software engineering, the problem is that other disciplines have a much harder time debugging things. Software is almost unique in that debugging is cheaper and quicker than building things right the first time.
> Machine learning isn't comparable to software development. It is a statistical modelling exercise.
It's neither of the two. Machine learning isn't comparable to any other human endeavor because in many cases, much more value comes out of the models than (seemingly) goes in.
LLMs, for example, are punching way above their weight. The ideas underlying their software implementations are extremely simple compared to the incredibly complex behavior they produce. Take a neural network that can be explained to a bright high schooler, add a few more relatively basic ML concepts, then push an unfiltered dump of half the Internet into it, and suddenly you get a machine that talks like a human.
Obviously I'm simplifying here, but consider that state-of-the-art LLM architectures are still simple enough that they can be completely understood through a 10-hour online course, and can be implemented in a few hundred lines of PyTorch code. That's absolutely bananas considering that the end result is something that can write a poem about airplanes in the style of Beowulf.
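To make that concrete, here's a minimal sketch of the kind of building block such a course covers: a single causal self-attention layer in PyTorch. The dimensions and names are made up for illustration; this is not the code of any real LLM, just the flavour of it.

```python
# A toy causal self-attention layer -- illustrative sizes, not any production model.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalSelfAttention(nn.Module):
    def __init__(self, d_model=64, n_heads=4):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)   # project to queries, keys, values
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x):                            # x: (batch, seq_len, d_model)
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # split into heads: (batch, n_heads, seq_len, d_head)
        q, k, v = (z.view(b, t, self.n_heads, self.d_head).transpose(1, 2) for z in (q, k, v))
        att = (q @ k.transpose(-2, -1)) / self.d_head ** 0.5
        mask = torch.triu(torch.ones(t, t, device=x.device), diagonal=1).bool()
        att = att.masked_fill(mask, float("-inf"))   # causal mask: no peeking at future tokens
        out = F.softmax(att, dim=-1) @ v
        out = out.transpose(1, 2).reshape(b, t, d)   # merge heads back together
        return self.out(out)

x = torch.randn(2, 10, 64)
print(CausalSelfAttention()(x).shape)  # torch.Size([2, 10, 64])
```

That's roughly the level of machinery involved; the rest is stacking such blocks, adding feed-forward layers and normalisation, and throwing enormous amounts of data and compute at it.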
Lots of problems have very simple solutions. And progress often means finding simpler solutions over time.
But coming up with those solutions, and debugging them, is what's hard.
For comparison, have a look at how pistols got simpler over the last two hundred years. Look at the intricate mechanism of the P08 Luger (https://www.youtube.com/watch?v=9adOzT_qMq0) and compare it to a modern pistol of your choice. (And the P08 Luger is already a pretty late invention.)
Or have a look at modern Ikea furniture, which can be assembled and understood by untrained members of the general public. But designing new Ikea furniture is much harder.
The heuristic I use for distinguishing between statistical modelling, machine learning and AI is based on feature engineering and model specification (a rough code sketch follows the list):
- Statistical modelling: Manual feature engineering, manual model specification (y = ax + b)
- Machine learning: Manual feature engineering, automated model specification (y = ax + b or y = ax^2 + b, I don't care, the algorithm should figure it out).
- AI: Automated feature engineering (e.g. CNN), automated model specification
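To make the distinction concrete, here's a minimal sketch of the first two tiers, assuming scikit-learn and a made-up toy dataset; the third tier is only noted in a comment, since learned features need a real deep-learning setup.

```python
# A toy contrast between the first two tiers of the heuristic above, using
# scikit-learn. The data, features and candidate models are made-up examples.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(200, 1))
y = 0.5 * x[:, 0] ** 2 + 1.0 + rng.normal(scale=0.2, size=200)

# Statistical modelling: I pick the feature (x^2) and the model form (y = a*x^2 + b).
manual = LinearRegression().fit(x ** 2, y)

# Machine learning: I still hand over a feature I engineered (raw x), but a
# search decides which polynomial degree to use -- "the algorithm should figure it out".
pipeline = make_pipeline(PolynomialFeatures(), LinearRegression())
search = GridSearchCV(pipeline, {"polynomialfeatures__degree": [1, 2, 3]}, cv=5)
search.fit(x, y)
print(search.best_params_)

# AI (in this heuristic): the features themselves are learned, e.g. a CNN's
# convolutional filters -- omitted here, since that needs image data and a
# deep-learning framework.
```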
IDK about this. The model y = ax + b is not fully specified: the parameters a and b still have to be chosen by optimization. Now add regularization so that some of those parameters are shrunk to 0 (lasso) and you have "automated model specification", where only some parameters are kept and the others are discarded.
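For instance, here's a small sketch of that lasso effect with scikit-learn (toy data and an arbitrary alpha, purely for illustration):

```python
# With L1 regularization, some coefficients are driven exactly to zero,
# which amounts to discarding parts of the model.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Only the first two features actually matter in this synthetic data.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

lasso = Lasso(alpha=0.1).fit(X, y)
print(lasso.coef_)  # coefficients on the irrelevant features end up at (or very near) 0
```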
And furthermore, the models are always chosen from a predefined hypothesis set, so there can never be truly automated specification.