
> Where is this precise detail coming from exactly? I suggest it's from memorization.

Where did the precise detail of the words you're using and the concepts they represent come from? Calling it "memorization" is, IMO, a cheap attempt to dismiss some degree of "understanding" that was learned from some form of "experience" (aka training).




This is a pretty standard technical term in machine learning, and it isn't necessarily pejorative but rather a description of behavior. Memorization is different from generalization in that it doesn't reflect some sort of synthesis from learned knowledge, but repeats something from the training data directly. It usually indicates overfitting and is distinct from responding appropriately to something that wasn't specifically trained on, which is generalization.


But by the standard machine learning definition (performance on a held-out train/test split), LLMs generalize extremely well. This isn't even in doubt.
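For concreteness, that definition in code (a toy sketch with scikit-learn on a classical model, nothing LLM-specific; the dataset and classifier are just illustrative). A large gap between train and test accuracy is what memorization/overfitting looks like under this definition:

    # Toy sketch: "generalization" = performance on held-out data,
    # "memorization"/overfitting = large train/test gap.
    from sklearn.datasets import load_digits
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_digits(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

    clf = DecisionTreeClassifier().fit(X_tr, y_tr)
    print("train acc:", clf.score(X_tr, y_tr))  # ~1.0: the tree has memorized
    print("test acc: ", clf.score(X_te, y_te))  # lower: the gap is the overfitting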


They do generalize. The claim is that the fine details are not the result of generalization, but of repeating training data verbatim. That seems consistent both with my intuitive understanding of neural networks and with the behavior I've observed, so I'm inclined to agree. So what does that mean? It means that while LLMs can produce impressive output, the most impressive results that people are touting probably have a significant amount of verbatim training data in them. In other words, they're good, but not as good as they seem to be.
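Roughly the kind of check that claim implies (a toy sketch; the strings are made up, and a real test would need access to the actual training corpus, which we don't have):

    # Toy sketch: look for long verbatim n-gram overlaps between model output
    # and a (tiny, made-up) training corpus. Long shared n-grams suggest
    # verbatim repetition rather than generalization.
    def ngrams(text: str, n: int = 8):
        words = text.split()
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    corpus = "the quick brown fox jumps over the lazy dog near the river bank today"
    output = "he said the quick brown fox jumps over the lazy dog near the river"

    print(ngrams(corpus) & ngrams(output))  # any 8-gram hit is unlikely to be chance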


Some of the most impressive things I've seen LLMs do are to take in a grammar, in a format that does not strictly conform to any specific variant of formal grammar notation, then generate output that conforms to that grammar, and reason about why and how it conforms.

Most people would struggle immensely with a task like that even if handed a textbook on the subject, and no training data happens to contain text in or about a language governed by the grammar of random nonsense I fed in.
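Conformance is also easy to verify mechanically. A minimal sketch, with an invented stand-in grammar (not the actual nonsense grammar I fed in):

    # Toy grammar:  S -> "zorp" A "!"    A -> "blee" | "blee" A
    # conforms() checks whether a candidate string is in that language.
    import re

    def conforms(s: str) -> bool:
        return re.fullmatch(r"zorp(?: blee)+ !", s) is not None

    print(conforms("zorp blee blee !"))  # True
    print(conforms("zorp !"))            # False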

There are areas where their reasoning is really awful. Ironically, that's often when they seem most human-like. E.g. I just had a lengthy "argument" with ChatGPT comparing the theoretical computational power of Markov Decision Processes vs. Turing Machines under various assumptions about the decision maker in the MDP, and its reasoning was riddled with the kind of logical fallacies I could very well see from a high school student confronted with the same comparison, armed with a Wikipedia-level understanding of either and without sufficient depth to reason about how different aspects of one can be made to model the other.

But there are plenty of areas where you can get them to produce good results where the "fine details" could not possibly be repeated verbatim from the training data, because they didn't exist prior to the conversation.


Mmm, the most impressive thing I see LLMs do is take a piece of unstructured input and transform it in some way: summarize it, extract information as JSON, etc. This wouldn't be possible if they were repeating training data verbatim, since it works on novel inputs.
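For example, the kind of transformation I mean (a minimal sketch, assuming the OpenAI Python SDK; the model name, the extracted fields, and the e-mail text are all just illustrative):

    # Toy sketch: extract structured JSON from a novel, unstructured input.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    email = "Hi, this is Dana Reyes. Can we move Friday's demo to 3pm? Thanks!"
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        messages=[
            {"role": "system",
             "content": "Extract sender_name and requested_time as a JSON object."},
            {"role": "user", "content": email},
        ],
    )
    print(resp.choices[0].message.content)  # e.g. {"sender_name": "Dana Reyes", ...}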


A quick example of what I mean is the art gallery Sora video; the individual paintings are not specified in the prompt, I'm pretty sure. Where did they come from?


That seems obvious - they were generated. Here’s an example with text:

Prompt:

Write a short story that is comprised of a boy reading poems he wrote to his deceased loved mom, include snippets of several of the poems.

Response:

In the soft glow of the evening, with the sun dipping below the horizon, casting long shadows over the quiet cemetery, a young boy named Eli walked slowly between the rows of tombstones. He carried a small, worn notebook clutched against his chest, filled with scribbles and verses that only his heart could decipher. Today marked a year since his mother had passed away, and in his solitude, he had found refuge in writing poems that he wished he could share with her.

Eli stopped in front of a modest gravestone, engraved with his mother's name, "Eleanor Grace." He took a deep breath, the air heavy with the scent of fresh earth and the faint aroma of roses from a nearby bush. Opening his notebook, he began to read aloud, his voice barely above a whisper, as if he feared disturbing the peace of those resting around him.

"The world spins on, a dizzy blur, But here I stand, thoughts of you stir. Your laughter, a melody in my mind, Leaves traces of joy you've left behind."

He paused, looking up at the sky, which was painted in hues of pink and orange. Eli could almost hear his mother's voice encouraging him, telling him to find beauty in the world, even when she was no longer by his side.

Turning the page, he continued:

"In dreams, I roam vast fields of green, Your hand in mine, a tranquil scene. Yet, as dawn breaks, and I awake, I face the world, my heartache to take."

Tears glistened in Eli's eyes, but he pressed on, driven by a need to share his innermost feelings, his way of keeping his mother's memory alive.

"Among the stars, I search for you, In every sparkle, your essence I view. A guiding light in the darkest night, Leading me home, making everything right."

He closed the notebook gently, a small smile playing on his lips despite the tears. Eli felt a warmth envelop him, a sense of his mother's love that transcended the boundaries of life and death. He whispered a thank you to the evening breeze, imagining it carrying his words to wherever she might be.

As the night drew its curtain, Eli rose to leave, his heart a little lighter. He knew he would return, again and again, to share his poems, his fears, and his dreams with his mother. In this sacred space, amidst verses of longing and love, he found a connection to her that time could not erase.

Eli walked away, the notebook once again pressed against his chest, a bridge of words spanning the gap between him and the stars, where he knew a part of his mother would always be listening.



