"Pretty neat, but definitely watch out for hallucinations."
We'd never hire someone who just makes stuff up (or at least keep them employed for long). Why are we okay with calling "AI" tools like this anything other than curious research projects?
Can't we just send LLMs back to the drawing board until they have some semblance of reliability?
Not sure if this was posted as humour, but I don't feel that way. In today's world, where I certainly would consider taking the blue pill, I'm having a blast with LLMs!
They've helped me learn stuff incredibly fast. I find them especially useful for filling gaps in my knowledge and exploring new topics in my own way and language, without needing to wait for an answer from a human (which could also be wrong).
Why does it feel to you that "we are entirely inside the bubble"?
In the early days of ChatGPT, when it seemed like this fun new thing, I used it to "learn" C. I don't remember anything it told me, and none of the answers it gave me were anything I couldn't have found elsewhere in different forms - heck, I could have flipped open Kernighan & Ritchie to the right page and gotten the answer.
I had a conversation with an AI/Bitcoin enthusiast recently. Maybe that already tells you everything you need to know about this person, but to hammer the point home, they made a claim similar to yours: "I learn much more and much better with AI". They also said they "fact check" things it "tells" them. Some moments later they told me "Bitcoin has its roots in Occupy Wall Street".
A simple web search tells you that Bitcoin was conceived a full two years before Occupy. How can they be related?
It's a simple error that can be fact-checked simply. It's a pretty innocuous falsity in this particular case - but how many more falsehoods have they collected? How do those falsehoods influence them on a day-to-day basis?
How many falsehoods influence you?
A very well-meaning activist posted a "comprehensive" list of all the programs that were to be halted by the grants and loans freezes last week. Some of the entries on the list weren't real, or weren't related to the freeze. They revealed they had used ChatGPT to help compile the list and then gone down it one by one to verify each entry.
Even with such meticulous attention to detail, incorrect information still filtered through.
I guess the real learning happens outside the AI, here in real life. Does the code run? Sure, it's on my local and not in production, but I would never have had the patience to get "that new thing" working without AI as an assistant.
Does the food taste good? Oops, there are a few too many vegetables here; they are never gonna fit in this pan of mine. Not a big deal, next time I'll be wiser.
AI is like a hypothesis machine. You're gonna have to figure out if the output is true. A few years ago, testing any machine's "intelligence" was done pretty quickly, and the machine failed miserably. Now the accuracy is astonishing in comparison.
> How many falsehoods influence you?
That is a great question. The answer is definitely not zero. I try to live with a hacker mentality, and I'm an engineer by trade. I read news and comments, which I'm not sure is good for me. But you also need some compassion towards yourself. It's not like ripping everything open will lead to salvation. I believe the truth does set you free, eventually. But all in one's own time...
Anyway, AI is a tool like any other. Someone will hammer their fingers with it. I just don't understand the hate. It's not like we're drinking any AI kool-aid here. It's just like it was 30 years ago (in my personal journey): you had a keyboard and a machine, you asked it things and got gibberish. Now the conversation has just started to get interesting. Peace.
>They've helped me learn stuff incredibly fast. I find them especially useful for filling gaps in my knowledge and exploring new topics in my own way and language
and then you verify every single fact it tells you via traditional methods by confirming them in human-written documents, right?
Otherwise, how do you use the LLM for learning? If you don't know the answer to what you're asking, you can't tell if it's lying. It also can't tell if it's lying, so you can't ask it.
If you have to look up every fact it outputs after it does, using traditional methods, why not skip to just looking things up the old fashioned way and save time?
Occasionally an LLM helps me surface unknown keywords that make traditional searches easier, but they can't teach anything because they don't know anything. They can imagine things you might be able to learn from a real authority, but that's it. That can be useful! But it's not useful for learning alone.
And if you're not verifying literally everything an LLM tells you.. are you sure you're learning anything real?
I guess it all depends on the topic and levels of trust. How can I be certain that I have a brain? I just have to take something for granted, don't I? Of course I will "verify" the "important stuff", but what is important? How can I tell? Most of the time the only thing I need is a pointer in the right direction. Wrong advice? I'll know when I get there, I suppose.
I can remember numerous things I was told while growing up, that aren't actually true. Either by plain lies and rumours or because of the long list of our cognitive biases.
> If you have to look up every fact it outputs after it does, using traditional methods, why not skip to just looking things up the old fashioned way and save time?
What is the old-fashioned way? I mean, people learn "truths" these days from TikTok and YouTube. Some of the stuff is actually very good; you just have to distill it based on what you were taught at school. Nobody has yet declared LLMs a substitute for schools, maybe they soon will, but neither "guarantees" us anything. We could just as well be taught political agendas.
I could order a book about construction, but I wouldn't build a house without asking a "verified" expert. Some people build anyway and we get some catastrophic results.
Levels of trust: it's all fun and games until it gets serious, like deciding what to eat or doing something that involves life-threatening physics. I take it as playing with a toy. Surely something great has come out of just a few pieces of Lego?
> And if you're not verifying literally everything an LLM tells you.. are you sure you're learning anything real?
I guess you shouldn't do it that way. But really, so far the topics I've rigorously explored with ChatGPT, for example, have been covered better than your average journalism. What is real?
Saying you need to verify "literally everything" both overestimates the frequency of hallucinations and underestimates the amount of wrong found in human-written sources. e.g. the infamous case of Google's AI recommending Elmer's glue on pizza was literally a human-written suggestion first: https://www.reddit.com/r/Pizza/comments/1a19s0/my_cheese_sli...
> without needing to wait for an answer from a human (which could also be wrong).
The difference is that you have some reassurance that the human is not wrong - their expertise and experience.
The problem with LLMs, as demonstrated by the top-level comment here, is that they constantly make stuff up. While you may think you're learning things quickly, how do you know you're learning them "correctly", for lack of a better word?
Until an LLM can say "I don't know", I really don't think people should be relying on them as a first-class method of learning.
"Occasional nonsense" doesn't sound great, but would be tolerable.
Problem is - LLMs pull answers from their behind, just like a lazy student in an exam. "Hallucinations" is the word people use to describe this.
Those are extremely hard to spot - unless you happen to know the right answer already, at which point - why ask? And those are everywhere.
One example - recently there was quite a discussion about LLMs being able to understand (and answer) base16 (aka "hex") encoded text on the fly, so I went on to try base64, gzipped base64, zstd-compressed base64, etc...
To my surprise, the LLM got most of those encodings/compressions right, decoded/decompressed the question, and answered it flawlessly.
But with a few encodings, the LLM detected base64 correctly, got the compression algorithm right, and then... instead of decompressing, made up a completely different payload and proceeded to answer that. Without any hint that anything sinister was going on.
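If anyone wants to reproduce this kind of probe, here's a rough Python sketch of the kind of payload I mean (placeholder question, not my exact prompts), which also computes the ground-truth decoding so you can check the model's claimed plaintext against what's actually in there:

    import base64
    import gzip

    # The question we want the model to recover and answer.
    question = "What is the capital of Finland?"

    # gzip-compress, then base64-encode - the "gzipped base64" pipeline described above.
    payload = base64.b64encode(gzip.compress(question.encode("utf-8"))).decode("ascii")
    print("Paste this into the chat and ask the model to decode it and answer:")
    print(payload)

    # Ground truth: reverse the pipeline yourself, so a made-up "decoding" is easy to spot.
    decoded = gzip.decompress(base64.b64decode(payload)).decode("utf-8")
    assert decoded == question
    print("Correct decoding:", decoded)

If the plaintext the model claims to have recovered doesn't match the ground truth, you've caught it inventing a payload rather than decoding yours.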
We really need LLMs to reliably calculate and express confidence. Otherwise they will remain mere toys.
I think as these things get more integrated into customer service workflows - especially for things like insurance claims - there's gonna start being a lot more buyer's remorse on everyone's part.
We've tried for decades to turn people into reliable robots, now many companies are running to replace people robots with (maybe less reliable?) robot-robots. What could go wrong? What are the escalation paths going to be? Who's going to be watching them?
It's given you some information and now you have to seek out a source to verify that it's correct.
Finding information is hard work. It's why librarian is a valuable skilled profession. What you've done by suggesting that I should "verify" or "proofread" what a glorified, water-wasting Markov chain has given me now entails me looking up that information to verify that it's correct. That's...not quite doubling the work involved but it's adding an unnecessary step.
I could have searched for the source in the first instance. I could have gone to the library and asked for help.
We spent time coming up with a question ("prompt engineering"! hah!), we used up a bunch of electricity for an answer to be generated and now you...want me to search up that answer to find the source? Why did we do the first step?
People got undergraduate degrees - hell, even PhDs - before generative AI.
Look up the tweet from someone who said "Sometimes when coming up with a good prompt for ChatGPT, I sometimes come up with the answer myself without needing to submit".
Verifying information is an order of magnitude easier than compiling it or synthesizing it in the first place. Prompt engineering is an order of magnitude easier still. This is obvious to most people, but apparently it needs to be said.
An entire day of generating responses with ChatGPT uses less water and energy than your morning shower. You seem terribly concerned about signaling the virtues of abstaining from technology use on behalf of purported resource misuse, yet you're sitting at a computer typing away.
You're not a serious person, and you're wasting everyone's time. Please leave the internet and go play with rocks in a cave.
Sometimes you don't need sources to verify something is correct; it's something you can directly verify. To reduce it to the easiest version of this: I ask for code to do something, it writes me code, I run my unit test, it passes, my time is saved!
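As a toy illustration (slugify here is a made-up stand-in for whatever the model produced, not anything specific), the test is the arbiter regardless of where the code came from:

    import unittest

    # Pretend this came back from the LLM; we never have to trust it,
    # because the tests below either pass or they don't.
    def slugify(title: str) -> str:
        return "-".join(title.lower().split())

    class TestSlugify(unittest.TestCase):
        def test_basic(self):
            self.assertEqual(slugify("Hello World"), "hello-world")

        def test_collapses_whitespace(self):
            self.assertEqual(slugify("  Foo   Bar "), "foo-bar")

    if __name__ == "__main__":
        unittest.main()

No external source needed: the property I care about is checked directly.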
For other things, it depends, but if I'm asking it to do a survey I can look at its results and see if they fit what I'm looking for, check the sources it gives me, etc. People pay analysts/paralegals/assistants to do exactly this kind of work all the time expecting that they will need to check it over. I don't see how this is any different.
I don't think the library/electricity responses are serious, but to move on to the point about degrees... people also got those degrees before calculators, before computers, before air travel, before video calls, before the internet, before electricity, yet all of those things assist in creating knowledge. I think it's perfectly reasonable to look at these LLMs/chat assistants in the same light: as a tool that can augment human productivity in its own way.
I'm interested to hear more about how you can verify information without a source. What are you looking at when you search for the verification, exactly?
You can use them for whatever you like, or not use them. Everyone has a different bar for when technology is useful. My dad doesn't think EVs are useful due to the long charge times, but there are others who find it fully acceptable.
This doesn’t make LLMs worthless, you just need to structure your processes around fallibility. Much like a well designed release pipeline is built with the expectation that devs will write bugs that shouldn’t ship.
Yeah, I used to hire people, but then one of them made a mistake, so now I'm done with them forever, they are useless. It is surely not I, who directs the workers, who cannot create a process that is resistant to errors; it's definitely that all people are worthless until they make no errors, as there truly is no other way of doing things than telling your intern to do a task and having them send it directly to the production line.
LLM are "great" in some use cases, "ok" in others, and "laughable" in more.
Some people might find $500 worth of value, in their specific use case, in those "great" and "ok" categories, where they get more value than "lies" out of it.
A few verifiable lies, vs hours of time, could be worth it for some people, with use cases outside of your perspective.