That's like teaching a baby to speak by recording it for a month while not reacting to it, and then forming a committee to analyze the recordings and conduct a high-intensity training session with the baby.
Actually, if ChatGPT gives you a bad or wrong answer and you reply telling it that it is wrong and why, it will respond with something closer to what you consider correct.
So you are already doing a kind of real-time RLHF in the chat.
That is how DAN was prompted to exist.
Well, correct me if I'm wrong, but if I do what you describe, all that learning is gone when I close the tab, or even when I just chat some more and my correction falls out of its context window.
That just means it doesn't have long-term memory. This is not an intrinsic limitation - you can give these models access to a larger store of information and tell them how to query it, and they will (try to) do so. That's exactly how Bing AI runs relevant searches, for example.
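For concreteness, here's a rough Python sketch of that pattern. The `call_llm` function is a hypothetical stand-in for whatever completion API you use, and the keyword-overlap retrieval is a toy substitute for a real vector search - neither is how Bing actually does it:

```python
def call_llm(prompt: str) -> str:
    """Placeholder for an actual model call (OpenAI API, local model, etc.)."""
    raise NotImplementedError


class MemoryStore:
    """Long-term store for facts the model's context window can't hold."""

    def __init__(self) -> None:
        self.notes: list[str] = []

    def save(self, note: str) -> None:
        self.notes.append(note)

    def query(self, text: str, k: int = 3) -> list[str]:
        # Rank stored notes by naive word overlap with the new message.
        words = set(text.lower().split())
        scored = sorted(
            self.notes,
            key=lambda n: len(words & set(n.lower().split())),
            reverse=True,
        )
        return scored[:k]


memory = MemoryStore()


def correct(wrong_claim: str, correction: str) -> None:
    # Persist the user's correction instead of letting it scroll away.
    memory.save(f"'{wrong_claim}' is wrong; actually: {correction}")


def chat(user_message: str) -> str:
    # Pull the most relevant past corrections back into the prompt,
    # so they survive tab closes and context-window eviction.
    relevant = memory.query(user_message)
    prompt = "Known corrections from past chats:\n"
    prompt += "\n".join(f"- {note}" for note in relevant)
    prompt += f"\n\nUser: {user_message}\nAssistant:"
    return call_llm(prompt)
```

A real system would use embeddings and a vector database instead of word overlap, but the loop is the same: retrieve, prepend to the prompt, generate, persist.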