
The #1 thing that makes it ‘hard’ in real life is that nobody wants to make training and test sets. So we have 50,000 papers on the NIST digits but no insight into ‘would this work for a different problem?’ (Ironically the latter might have been exactly what academics would have needed to understand why these algorithms work!)



You’re not acknowledging MNIST-1D and many other datasets (including the massive segmentation dataset Meta released with SAM). Read the literature before lecturing the community.


We still don't have enough data, and people are still wasting their time trying to extend algorithms instead of making better training data.

I've worked on a dozen ML projects, two of them before AlexNet came out, and I've never gone wrong by spending 80% of my time creating a dataset specific to the problem and then using whatever algorithm is top dog right now.

Labelled data is king.


Personally I am happy to use a model that isn't quite "top dog".

I have a classification task where I can train multiple models to do automated evaluation in about 3 minutes using BERT + classical ML. The models are consistently good.

Sometimes you can do better by fine-tuning the BERT model on your training set, but a single round takes 30 minutes. The best fine-tuned models are about as good as my classical ML models, but the results are not consistent, and I haven't developed a reliable training procedure. If I did, it would probably take 3 hours or more, because I'd have to train multiple models with different parameters.

Even if fine-tuning got me 82% AUC instead of 81%, I'm not sure it would be worth the trouble. And if I really needed a better AUC (the number I live by, not the usually useless accuracy and F1), I could develop a stacked model based on my simple classifier, which shouldn't be too hard given the rapid cycle time the simple approach makes possible.
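
For the curious, a minimal sketch of a "BERT + classical ML" pipeline like the one described above, assuming sentence-transformers and scikit-learn are installed; the encoder name and the toy texts/labels are placeholders, not my actual task:

    # Sketch: frozen BERT-style embeddings + a classical classifier,
    # scored by AUC on held-out data. Toy data only.
    from sentence_transformers import SentenceTransformer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    # Placeholder corpus: swap in your real texts and 0/1 labels.
    texts = [f"good example {i}" for i in range(20)] + \
            [f"bad example {i}" for i in range(20)]
    labels = [1] * 20 + [0] * 20

    # Embed once with a frozen encoder; no fine-tuning involved.
    encoder = SentenceTransformer("all-MiniLM-L6-v2")
    X = encoder.encode(texts)

    X_train, X_test, y_train, y_test = train_test_split(
        X, labels, test_size=0.25, random_state=0, stratify=labels)

    # A classical model on top of the embeddings trains in seconds,
    # which is what makes the rapid iteration cycle possible.
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    print("AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))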

My favorite arXiv papers are not the ones where people develop "cutting edge" methods, but the ones where people take a run-of-the-mill problem and apply a variety of run-of-the-mill methods. I find that people like that frequently get results like mine, so they're quite helpful.


The issue is the obsession with benchmark datasets and their flaky evaluation.


What else could you do to test it, besides "it works for me" and "this test said it's good at talking"?


No; this is routinely cited in introductory remarks these days, but it ignores some practical aspects of the competitive context, among other things.


What is "this"?


> The #1 thing that makes it ‘hard’ in real life is that nobody wants to make training and test sets.


Would there be enough of a financial incentive to do so? Seems like a prime startup opportunity.


>> Seems like a prime startup opportunity.

Sometimes it's just ... hard. Apply some thought maybe before blindly parroting "profit!"

Reporter: "Why is it hard to cure cancer?". Crowd: "Would there be enough of a financial incentive to do so? Seems like a prime startup opportunity!"

Reporter: "Why is it hard to end World poverty?". Crowd: "Would there be enough of a financial incentive to do so? Seems like a prime startup opportunity!"

Reporter: "Why is it hard to build a warp engine?". Crowd: "Would there be enough of a financial incentive to do so? Seems like a prime startup opportunity!"

Reporter: "Why is it hard to wipe your ass using the left hand?". Crowd: "Would there be enough of a financial incentive to do so? Seems like a prime startup opportunity!"

You get the idea...


A cure for cancer would be terribly profitable. For a while.


> Reporter: "Why is it hard to cure cancer?". Crowd: "Would there be enough of a financial incentive to do so? Seems like a prime startup opportunity!"

What you want to optimize for is not the expected profit but the profit at some quantile of its probability distribution; say, the payoff that only the best 3%, 5%, 10%, or even 20% of all possible outcomes exceed. If you pick the best 3%, then with 97% probability an attempt to cure cancer won't make you enough money to be worth the risk, so the financial incentive is not there.
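
A toy simulation of that quantile view (all probabilities and dollar amounts are invented for illustration): a moonshot with a 0.1% success chance can have a larger mean profit than a modest business and still show a loss at the 97th percentile:

    # Judge ventures by a profit quantile rather than the mean.
    # All numbers below are made up.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000  # simulated outcomes per venture

    # Moonshot: 0.1% chance of a $100B payoff, else lose the $50M invested.
    moonshot = np.where(rng.random(n) < 0.001, 100_000e6, -50e6)

    # Modest B2B product: profits roughly lognormal around $5M.
    modest = rng.lognormal(mean=np.log(5e6), sigma=0.5, size=n)

    for name, profits in [("moonshot", moonshot), ("modest", modest)]:
        # 97th percentile: only the luckiest 3% of outcomes exceed it.
        print(f"{name}: mean ${profits.mean():,.0f}, "
              f"97th pct ${np.quantile(profits, 0.97):,.0f}")

The moonshot wins on the mean (roughly $50M vs. $6M here) but sits at a $50M loss at the 97th percentile, while the modest business is still comfortably profitable there.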

TL;DR: Financial incentives do matter, but they are structured differently from how many people think.


There is plenty of money in it, but you need to sell B2B and to enterprise. That is not fun, and as such no one is doing it.

Put another way: if I were trying to do a startup in this space, I'd spend 50% of my budget on marketing, 25% on a third-world data-labelling sweatshop, 20% on data pipeline engineering, and 5% on sexy ML stuff.


I was just thinking the same, but I'm skeptical.

When researchers want to publish a paper, are they going to pay extra money and take on extra difficulty to do it? No, they'll just use whatever toy environment is free or already established and get that paper published!


I believe that Scale.ai was founded to do exactly this.



