From their Table 3, "the aha moment": can someone explain why the re-evaluation step is worth an "aha"? It looks like it simply repeats the initial step in exactly the same way.
I think the "Aha" is that the RL caused it to use an anthropomorphic tone.
One difference from the initial step is that the second time around, the context includes the initial step and the aha comment. It is, after all, just doing token-wise LLM prediction.
OTOH, the RL process means that it has potentially learned the impact of statements that it makes on the success of future generation. This self-direction makes it go somewhat beyond vanilla-LLM pattern mimicry IMHO.
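To make the context point concrete, here is a toy sketch of an autoregressive loop. It is not the paper's model or training setup: `generate` and `toy_next_token` are hypothetical names, and the "model" is a hard-coded rule standing in for a learned policy. It only shows that an identical-looking second attempt is conditioned on a longer context that now contains the first attempt plus the "wait" remark.

```python
from typing import Callable, List


def generate(prompt: List[str],
             next_token: Callable[[List[str]], str],
             max_new: int) -> List[str]:
    """Toy autoregressive loop: each new token is predicted from the full context so far."""
    context = list(prompt)
    for _ in range(max_new):
        context.append(next_token(context))
    return context


def toy_next_token(context: List[str]) -> str:
    # Hypothetical stand-in for a model: its output changes once "wait"
    # has appeared anywhere in the context it is conditioned on.
    if "wait" in context:
        return "recheck"
    return "step"


# First attempt: the context holds only the prompt and the tokens generated so far.
first_pass = generate(["solve"], toy_next_token, 2)

# Second attempt: the context now includes the first attempt and the "wait" token.
second_pass = generate(first_pass + ["wait"], toy_next_token, 2)
```

In this toy the predictor never changes between passes. Only what it is conditioned on does, and that is enough to change what comes out.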
My experience with Claude as a therapist is that it's consistently more useful than the human therapists I've met (well, maybe I haven't met a good human therapist yet). And I can be completely honest and decide how much context I want to share.
Do you notice less censorship yourself in their instruction-tuned model, or is there a public reference that showcases it? (I haven't used a Mistral model before.) It's interesting if a major LLM player adopts a different safety/alignment goal.
I just hope people don't claim "model X supports a Y context window" when the evaluation is done on "Needle in a Haystack" only. It creates so much unnecessary hype.
There exist programs to export them to markdown or what have you. Dunno how well they handle embedded media. I do a lot of copy-pasting screenshots or embedding PDF pages… or entire pdfs.
[edit] “why screenshots?”
1) To record gui workflows, walkthrough-style.
2) to record whole screens of values from guis while preserving formatting perfectly (think: cloud dashboard vital stats screens for various resources)
3) to record short message exchanges from ephemeral messaging with all the formatting intact with zero extra effort. (Think: feature discussion in a periodically-cleaned chat channel; I can always turn it into text later if I need to, recording with screenshot is fast)
4) plus now that it’s almost as easy and reliable to copy-paste from images as from regular text, on macOS and iOS, why not?
Can someone verify: was he really the architect of A/B testing at early Facebook? It’s undoubtedly one of the worst Pandora’s boxes in internet history, but I do have respect for whoever created it; it’s probably the largest-scale application of statistics ever.