
Removing outliers is a common practice in data science. A quick search for "why remove outliers" turns up plenty of discussion about when it should and should not be done; I encourage you to read some of it to get familiar with the arguments.

In this specific case, I suppose they could have added some motivation for their decision, but I don't think they did it in bad faith. They only had 36 samples, and a 3 SD event occurs roughly once in ~300 cases: this hints that the data point might have been erroneous, or that some other factor was at play (e.g. earlier exposure to programming concepts, even disguised as something else).
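
A quick back-of-the-envelope check of that number (a minimal sketch, assuming a normal distribution; the figures are illustrative, not taken from the study):

    # Two-sided probability of a 3 SD event under a Gaussian, and the chance
    # of seeing at least one such event among 36 samples (illustrative only).
    from scipy.stats import norm

    p_3sd = 2 * norm.sf(3)            # ~0.0027, i.e. roughly 1 in 370
    n = 36
    p_any = 1 - (1 - p_3sd) ** n      # ~0.09, i.e. about a 9% chance in 36 samples
    print(p_3sd, p_any)

The two-sided figure is closer to 1 in 370 than to 1 in 300, but the order of magnitude is the same.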

If you are interested in the study, you can find the data and the script in the linked repository: https://github.com/UWCCDL/ComputerWhisperers

The range for the learning rate seems wrong, though. If you check the data, there is no data point with 2.0: https://github.com/UWCCDL/ComputerWhisperers/blob/master/Com...
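
For anyone who wants to verify that locally, a minimal sketch (the file name and column name below are placeholders, not the repository's actual identifiers):

    # Hypothetical check: "data.csv" and "learning_rate" are placeholder names,
    # not the actual file/column in the repository.
    import pandas as pd

    data = pd.read_csv("data.csv")
    print(data["learning_rate"].min(), data["learning_rate"].max())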




> They only had 36 samples, and a 3 SD event occurs roughly once in ~300 cases: this hints that the data point might have been erroneous

Or it hints that the distribution of learning rates is not Gaussian. When there's an "n-sigma" event, it's usually much more likely that the model is wrong than that the event is that rare.
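
One way to see how much the assumed model matters (a minimal sketch; the Student's t with 3 degrees of freedom is just an arbitrary stand-in for a heavy-tailed distribution, not a claim about the actual data):

    # Probability of landing 3 standard deviations from the mean under a
    # Gaussian vs. a heavy-tailed Student's t(3), both scaled to unit variance.
    from scipy.stats import norm, t

    df_t = 3
    sd_t = (df_t / (df_t - 2)) ** 0.5     # sd of a t(3) variate, ~1.73
    p_gauss = 2 * norm.sf(3)              # ~0.0027 (about 1 in 370)
    p_heavy = 2 * t.sf(3 * sd_t, df_t)    # ~0.014  (about 1 in 70)
    print(p_gauss, p_heavy)

Under the heavy-tailed model the same "3-sigma" observation is roughly five times more likely, so a single extreme point says as much about the distributional assumption as about the point itself.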



