If you're interested in a more scientific treatment of the topic, the post links to a technical report that gives the numbers in detail. This post is instead an attempt to explain the topic to a more general audience, so digging into the weeds there wouldn't be very useful.
From: ‘Keep training it, though, and eventually it will learn to insert the None test’
To: ‘Keep training it, though, and eventually the probability of inserting the None test goes up to xx%’
The former is just horse poop; we all know LLM output varies a lot from sample to sample, so a flat "it will learn to insert the None test" hides what is really a shift in probability.
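To make the suggested framing concrete, here's a minimal sketch of what reporting that probability would involve: sample the model repeatedly and count how often the None test shows up. Everything here is hypothetical; `sample_completion` is a stub standing in for the real model call, and the 70% rate is made up so the script actually runs.

```python
import random

def sample_completion(prompt: str) -> str:
    # Stub standing in for the real model call; simulates a model that
    # inserts the None test about 70% of the time (made-up number).
    return "if value is None:\n    return" if random.random() < 0.7 else "return value"

def contains_none_test(completion: str) -> bool:
    # Naive string check for the None guard; a real check would inspect
    # the generated diff or AST.
    return "is None" in completion

def p_none_test(prompt: str, n_samples: int = 1000) -> float:
    # Estimate P(model inserts the None test) as the hit rate over samples.
    hits = sum(contains_none_test(sample_completion(prompt)) for _ in range(n_samples))
    return hits / n_samples

if __name__ == "__main__":
    print(f"P(inserts None test) ~= {p_none_test('fix the crash'):.1%}")
```

A real measurement would also want error bars over the samples, but the point stands: the claim becomes a number you can check instead of a flat assertion.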