How does the training procedure smooth the garbage out?

thomashop · on May 17, 2024

Through regularization, data augmentation, the choice of loss function, and gradient-based optimization, which together push the model toward patterns that recur across the data and away from overfitting to noise in individual examples.

bigfudge · on May 17, 2024

It’s not obvious how any of those would do anything but better approximate the average of a noisy dataset. RLHF might help, but only if it’s not done by idiots.
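
To make the "average of a noisy dataset" point concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than anything from a real training pipeline: the "model" is a single constant fitted by gradient descent on squared error, and `TRUE_VALUE`, the noise levels, and the 30% garbage fraction are hypothetical. It suggests that this kind of training converges to the mean of the labels, so zero-mean noise washes out while systematically wrong labels pull the fit toward the garbage.

```python
import random
import statistics


def fit_constant_mse(labels, lr=0.1, steps=200):
    """Fit one constant prediction by gradient descent on mean squared error."""
    pred = 0.0
    for _ in range(steps):
        # Gradient of mean((pred - y)^2) with respect to pred.
        grad = 2 * sum(pred - y for y in labels) / len(labels)
        pred -= lr * grad
    return pred


rng = random.Random(0)
TRUE_VALUE = 1.0   # hypothetical "clean" answer
N = 2000

# Case 1: labels are the true value plus zero-mean noise.
zero_mean = [TRUE_VALUE + rng.gauss(0, 1) for _ in range(N)]

# Case 2 (assumed for illustration): 30% of labels are garbage centered on 5.0.
biased = [
    (5.0 if rng.random() < 0.3 else TRUE_VALUE) + rng.gauss(0, 1)
    for _ in range(N)
]

fit_zero = fit_constant_mse(zero_mean)
fit_biased = fit_constant_mse(biased)

print("zero-mean noise: fit =", round(fit_zero, 3),
      "label mean =", round(statistics.mean(zero_mean), 3))
print("biased garbage:  fit =", round(fit_biased, 3),
      "label mean =", round(statistics.mean(biased), 3))
```

In both cases the fit lands on the label mean. Whether that mean is close to the truth depends entirely on whether the garbage is unbiased, which is the crux of the objection; the sketch says nothing about what regularization, augmentation, or RLHF do in a real model.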