Models like GPT-3 get turned into models like ChatGPT through RLHF (reinforcement learning from human feedback). The base model is first fine-tuned further on prompts written in the style we'd like it to respond in, typically

User: Question
Bot: Response

This data is handcrafted, or adapted from places like Stack Exchange.
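Below is a minimal sketch of that fine-tuning step, assuming a small GPT-2 checkpoint as a stand-in for the much larger base model, and a couple of illustrative Q&A pairs rewritten into the User/Bot template; the exact data and format used for ChatGPT are not public.

```python
# Sketch: supervised fine-tuning on User/Bot formatted prompts (assumptions noted above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments

model_name = "gpt2"  # stand-in for a large base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Handcrafted / adapted Q&A pairs (e.g. from Stack Exchange), rewritten into
# the User/Bot template the model should learn to imitate. Purely illustrative.
pairs = [
    ("How do I reverse a list in Python?",
     "Use slicing: my_list[::-1], or call my_list.reverse() to do it in place."),
    ("What does HTTP 404 mean?",
     "The server could not find the requested resource."),
]
texts = [f"User: {q}\nBot: {a}{tokenizer.eos_token}" for q, a in pairs]

class PromptDataset(torch.utils.data.Dataset):
    """Tokenized User/Bot examples for the standard causal-LM objective."""
    def __init__(self, texts):
        self.enc = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
    def __len__(self):
        return self.enc["input_ids"].shape[0]
    def __getitem__(self, i):
        item = {k: v[i] for k, v in self.enc.items()}
        item["labels"] = item["input_ids"].clone()  # predict the next token everywhere
        return item

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="sft-demo", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=PromptDataset(texts),
)
trainer.train()
```

After this supervised step, the model imitates the User/Bot format; the RLHF stage then further tunes it against a reward model trained on human preference rankings.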