Hacker News

My first interpretation is that this is jazzed-up chain-of-thought. The results look pretty promising, but I'm most interested in this:

> Therefore, after weighing multiple factors including user experience, competitive advantage, and the option to pursue the chain of thought monitoring, we have decided not to show the raw chains of thought to users.

Mentioning competitive advantage here signals to me that OpenAI believes their moat is evaporating. Past the business context, my gut reaction is that this negatively impacts model usability, but I'm having a hard time putting my finger on why.



>my gut reaction is this negatively impacts model usability, but i'm having a hard time putting my finger on why.

If the model outputs an incorrect answer due to a single mistake or incorrect assumption in its reasoning, the user has no way to correct it: they can't see the reasoning, so they can't see where the mistake was.


Maybe CriticGPT could be used here [0]. Have the CoT model produce a result, and either automatically or upon user request, ask CriticGPT to review the hidden CoT and feed the critique into the next response. This way the error can (hopefully) be spotted and corrected without revealing the whole process to the user.
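The loop described above could be sketched roughly like this. Both model calls are stubs here (real code would call the CoT model and a CriticGPT-style reviewer over an API), and the stub behavior is invented purely to make the control flow runnable:

```python
# Sketch of a critique-and-revise loop: generate with hidden CoT, have a
# critic review the hidden CoT, feed only the critique into a second pass.
# All model behavior below is stubbed/hypothetical.

def generate_with_cot(prompt: str) -> dict:
    # Stub for the o1-style model: returns a visible answer plus a hidden
    # chain of thought. Pretend it fixes its arithmetic once a reviewer
    # note appears in the prompt.
    if "Reviewer note" in prompt:
        return {"answer": "x = 3", "hidden_cot": "2x = 6, so x = 3"}
    return {"answer": "x = 4", "hidden_cot": "2x = 6, so x = 4"}  # mistake

def critique(hidden_cot: str) -> str:
    # Stub for a CriticGPT-style reviewer that gets to see the hidden CoT.
    return "Arithmetic error: 2x = 6 implies x = 3, not 4."

def answer_with_review(prompt: str) -> str:
    first = generate_with_cot(prompt)
    review = critique(first["hidden_cot"])
    # Only the critique is fed back; the raw CoT never reaches the user.
    revised = generate_with_cot(f"{prompt}\n\nReviewer note: {review}")
    return revised["answer"]
```

The key property is in `answer_with_review`: the hidden chain of thought flows only model-to-critic, so the error can be spotted and corrected without ever surfacing the reasoning to the user.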

[0] https://openai.com/index/finding-gpt4s-mistakes-with-gpt-4/

Day dreaming: imagine if this architecture takes off and the AI "thought process" becomes hidden and private much like human thoughts. I wonder then if a future robot's inner dialog could be subpoenaed in court, connected to some special debugger, and have their "thoughts" read out loud in court to determine why it acted in some way.


> my gut reaction is this negatively impacts model usability, but i'm having a hard time putting my finger on why.

This will make it harder for things like DSPy to work, which rely on using "good" CoT examples as few-shot examples.


yeah I guess base models without built-in CoT are not going away, exactly because you might want to tune the CoT yourself. If DSPy (or something similar) evolves to do what OpenAI did with o1, that will be quite powerful, but we still need the big foundation models powering it all

on the other hand, if cementing techniques into the models becomes a trend, we might see a variety of models, one per technique beyond CoT, for us to pick and choose from without needing to guide the model ourselves. Then what's left for us to optimize is the prompts for what we want, plus the routing that combines those models into a nice pipeline

still, the principle of DSPy stays the same: have a dataset to evaluate against, let the machine trial-and-error its way through prompts, hyperparameters and so on, just switching between different techniques (possibly automating that too), and get measurable, optimizable results
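That principle can be boiled down to a few lines. This is only a toy sketch, not DSPy's actual API: the "model" is a stub whose behavior is invented for illustration, and the optimizer just picks the best-scoring prompt variant on a fixed eval set:

```python
# Minimal sketch of the optimize-against-a-dataset loop: score each
# candidate prompt on a held-out eval set, keep the winner.

ANSWERS = {"2+2": "4", "3*3": "9"}  # toy eval data

def run_model(prompt: str, question: str) -> str:
    # Hypothetical model: pretend the step-by-step variant gets the
    # arithmetic right and the terse variant doesn't.
    if "step by step" in prompt:
        return ANSWERS.get(question, "unknown")
    return "unknown"

def evaluate(prompt: str, dataset: list[tuple[str, str]]) -> float:
    # Measurable result: fraction of eval questions answered correctly.
    correct = sum(run_model(prompt, q) == a for q, a in dataset)
    return correct / len(dataset)

def optimize(candidates: list[str], dataset: list[tuple[str, str]]) -> str:
    # Trial and error over prompt variants, judged only by the metric.
    return max(candidates, key=lambda p: evaluate(p, dataset))
```

Swapping in different techniques (CoT, critique loops, and so on) just means swapping what `run_model` wraps; the evaluate-and-select skeleton stays the same.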


The moat is expanding with usage. The moat is also leading and advancing faster than anyone can catch up: you always have the best model, with the best infrastructure and low limits.



