I think the idea here is that, in general, simple coherent properties of numbers tend to show themselves on smaller numbers more readily than on larger ones.

It would be quite unusual for a problem stated in such simple terms to require a solution so far out. That improbability is then used to infer how large such a number could be: eventually you hit a point where the number encodes more information than is necessary to solve the problem, at which point no larger number could be a solution.
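
As a small illustration of a simply stated property giving itself away early, here is a rough sketch using Euler's polynomial n^2 + n + 41, which is my own choice of example and not something from the question. The function names are hypothetical. It just searches upward for the first n where the value is not prime.

```python
def is_prime(n: int) -> bool:
    """Trial division; fine for the small values used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def first_counterexample(limit: int):
    """Smallest n in [0, limit) where n*n + n + 41 is composite, else None."""
    for n in range(limit):
        if not is_prime(n * n + n + 41):
            return n
    return None


first = first_counterexample(1000)
print(first)  # 40, since 40*40 + 40 + 41 = 1681 = 41 * 41
```

The claim "this is always prime" is short to state, and its first failure is about as small as the constant in the statement. That is only one data point for the heuristic, not evidence that it always holds.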

This is basically how induction works anyway: you produce a base case and an algorithm, and infer that the information they contain meets the needs of the problem. Any number that would encode more information than that is irrelevant (the base case and algorithm are already sufficient), and you have an inductive proof.
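
To make "a base case and an algorithm" concrete, here is a minimal sketch using the sum 1 + 2 + ... + n, which is my own example and the function names are hypothetical. The loop is the base case plus the step, and the comparison against n(n+1)/2 is only a finite sanity check, not a proof.

```python
def triangular(n: int) -> int:
    """Sum 1 + 2 + ... + n, built from a base case and a step."""
    total = 0  # base case: the empty sum for n = 0
    for k in range(1, n + 1):
        total += k  # step: extend the result for k - 1 to the result for k
    return total


def closed_form(n: int) -> int:
    """The formula an inductive proof would establish for every n."""
    return n * (n + 1) // 2


# Finite check only: the two agree on every n tried here.
checked = all(triangular(n) == closed_form(n) for n in range(200))
print(checked)
```

The point is that nothing about n = 1000 needs to be written down separately: the base case and the step already determine the answer there, which is the sense in which larger numbers add no new information.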