
From what I have seen, most of the jobs that LLMs can do are jobs that didn't need to be done at all. We should turn them over to computers, and then turn the computers off.



They're good at processing text. Processing text is a valuable thing that sometimes needs to be done.

We still use calculators even though the profession we used to call "computer" was replaced by them.


But this is where reliability comes in again. Calculators are different: the output is correct as long as the input is correct.

LLMs do not guarantee any quality in the output even when processing text, and in my opinion their output should be verified before being used in any serious application.


> Calculators are different since the output is correct as long as the input is correct.

That isn't really true.[0] Whether a calculator is accurate enough for the subject matter is something that does need to be considered in some use cases.

LLMs also have accuracy considerations, and although it may be to a different degree, the subject matter to which they're applicable spans a broad range of acceptable accuracies. While some textual subject matter demands a very specific answer, some doesn't: for example, there may be hundreds or thousands of ways to summarize a text that would all be accurate enough for a particular application.

0: example: https://www.reddit.com/r/calculus/comments/upjdn4/why_do_all...


I think your point stands, but your example shows that anyone using those calculators daily should not be concerned. Those who need precision to 6+ decimal places for complex equations should know not to fully trust consumer-grade calculators.

The issue with LLMs is that they can be so unpredictable in their behaviour. Take the following prompt, which asks GPT-4 to validate the response to "calculate 2+3+5 and only display the result":

https://beta.gitsense.com/?chat=6d8af370-1ae6-4a36-961d-2902...

GPT-4o mini contradicts itself, which is not something one would expect on a task we consider extremely simple. However, if you ask it to validate the response to "calculate 2+3+5", it will get it right.

https://beta.gitsense.com/?chat=43221de5-bff6-487a-8c0f-48ca...

Adding "and only display the result" threw GPT-4o mini for a loop; examples like this should give us pause.
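
If you want to reproduce this kind of cross-model check yourself, here's a minimal sketch. It assumes the OpenAI Python client with an OPENAI_API_KEY in the environment, and the prompts are paraphrases of the linked chats, not the exact ones:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask(model, prompt):
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    task = "calculate 2+3+5 and only display the result"
    answer = ask("gpt-4o-mini", task)

    # Ask a second model to validate the first model's response.
    verdict = ask(
        "gpt-4",
        f'The task was: "{task}"\n'
        f'The response was: "{answer}"\n'
        "Is the response correct? Answer yes or no, then explain briefly.",
    )
    print(answer)
    print(verdict)

Dropping the "and only display the result" clause from the task string gives you the second, better-behaved comparison described above.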


Well, not every tool is a hammer and not every problem is a nail.

If I ask my TI-89 to "Summarize the plot in Harry Potter and the Chamber of Secrets" it responds "ERR"! :D

LLMs are good text processors, and pocket calculators are good number processors. Both have limitations, and neither is good at problem sets outside its design strengths. The biggest problem with LLMs isn't that they are bad at a lot of things; it's that they look like they are good at things they aren't good at.


I agree LLMs are good at text processing, and I believe they will obsolete jobs that really should be obsolete. Unless OpenAI, Anthropic and other AI companies come up with a breakthrough on reliability, I think it will be fair to say they will only be players and not leaders. If they can't figure reliability out, it will be Microsoft, Amazon and Google (distributors of diverse models) that benefit the most.

I've personally found it extremely unlikely for multiple good LLMs to fail at the same time, so if you want to process text and be confident in the results, I would run the same task across 5 good models; if a supermajority of them agree, you can be confident it was done right.
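
To make the voting idea concrete, here's a minimal sketch. The model names and the ask() helper are placeholders for whatever providers you use, and exact-match voting only makes sense for tasks with a short canonical answer; free-form outputs like summaries would need a softer comparison:

    from collections import Counter

    # Placeholder list -- swap in whichever "good" models you have access to.
    MODELS = ["model-a", "model-b", "model-c", "model-d", "model-e"]

    def ask(model, prompt):
        # Placeholder: call the given provider's API and return its text output.
        raise NotImplementedError

    def supermajority(prompt, models=MODELS, threshold=0.8):
        answers = [ask(m, prompt).strip() for m in models]
        best, votes = Counter(answers).most_common(1)[0]
        if votes / len(models) >= threshold:
            return best
        return None  # no supermajority -> flag for human review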


Neither are humans; that's why we have proofreaders and editors. That doesn't make them any less useful. And a translator will not write the exact same translation for a text longer than a couple of sentences; that does not mean translation is a dead end. Ironically, it's LLMs that made translation a dead end.



