The output of most LLMs is stochastic. The core model is given a sequence of tokens and outputs a ranked set of candidate next tokens, each with a “confidence”. Then there is normally a filtering and search stage, where those ranked tokens are fed back into the LLM to get more ranked tokens, building a short probability tree. I.e. if we pick the top N ranked tokens and feed each one back in, each of those tokens results in its own new set of N ranked tokens.
By looking at that tree some basic filtering is done: picking the branch with the highest summed confidence, or the branch with the fewest repeated tokens, or the fewest tokens that echo the input tokens, or, more often, some combination of the above plus a random choice weighted by the summed confidences.
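Here is a rough Python sketch of what that tree-plus-filtering stage can look like. Everything in it is made up for illustration: `top_n_next_tokens` is a toy stand-in for the real model (a fixed function of the context, like frozen weights), the tiny vocabulary and scoring are invented, and this is not any particular library's API.

```python
import math
import random

def top_n_next_tokens(context, n=3):
    """Stand-in for the real model: a deterministic function of the context
    ("fixed weights") that returns the n highest-confidence next tokens."""
    vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
    seed = sum(len(t) for t in context)
    logits = [((seed * 31 + i * 7) % 11) / 11.0 for i, _ in enumerate(vocab)]
    total = sum(math.exp(l) for l in logits)
    ranked = sorted(((w, math.exp(l) / total) for w, l in zip(vocab, logits)),
                    key=lambda t: t[1], reverse=True)
    return ranked[:n]

def expand_tree(context, depth=3, n=3):
    """Build every length-`depth` branch from the top-n tokens at each step,
    returning (tokens, summed confidence) for each branch."""
    branches = [([], 0.0)]
    for _ in range(depth):
        new_branches = []
        for tokens, score in branches:
            for tok, conf in top_n_next_tokens(context + tokens, n):
                new_branches.append((tokens + [tok], score + conf))
        branches = new_branches
    return branches

def pick_branch(branches, context, repeat_penalty=0.5):
    """Filter the tree: penalise repeated tokens and tokens already in the
    input, then make a random choice weighted by the adjusted scores."""
    weights = []
    for tokens, score in branches:
        repeats = len(tokens) - len(set(tokens))
        echoes = sum(1 for t in tokens if t in context)
        weights.append(max(score - repeat_penalty * (repeats + echoes), 1e-6))
    return random.choices(branches, weights=weights, k=1)[0]

context = ["the", "cat"]
branches = expand_tree(context)
print(pick_branch(branches, context))  # the weighted random choice can differ run to run
```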
That is how an LLM with completely fixed weights, which is every LLM, can be given the same input multiple times and still produce different outputs.
So to answer your specific question, it can “change its mind”. Every token produced creates a new opportunity for the stochastic output filters to pick a new path through all the possible outputs.
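To make that concrete, here is a small follow-on sketch using the hypothetical helpers above: only the first token of the chosen branch is committed, the tree is rebuilt, and the weighted random choice happens again, so the path can change direction at every step.

```python
def generate(context, steps=5):
    """Generate token by token: after each token is committed, the tree is
    rebuilt and the stochastic choice is made again."""
    out = list(context)
    for _ in range(steps):
        branches = expand_tree(out)
        chosen_tokens, _ = pick_branch(branches, out)
        out.append(chosen_tokens[0])  # commit only the first token of the chosen branch
    return out

# Same fixed "weights", same input, run twice: the random choice at each
# step means the two continuations can differ.
print(generate(["the", "cat"]))
print(generate(["the", "cat"]))
```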