"... how do they know that these various nodes are correlated with one another ..."
The OP alludes to this:
"... and the fact that someone runs such a large network fraction of relays “doing things” that ordinary relays can not do (intentionally vague), is enough to ring all kinds of alarm bells."
... and the OP is "intentionally vague".
I am also very interested to know how they correlated them and what interesting behavior they exhibited ...
> Some of KAX17's relays initially had used that email address in their ContactInfo but soon after these relays were setup the email address got removed from their configuration.
They don't need correlation. Operating a large number of nodes gives them a high enough probability that all three hops of a circuit are under their control, and in that case they can observe the entire route.
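To put rough numbers on that claim, here is a back-of-the-envelope sketch. It assumes each hop is picked independently and that the operator's share of selection probability at each position is known. The function names and the example shares are hypothetical, not figures from the article, and real Tor path selection (bandwidth weighting, long-lived guards, family and subnet restrictions) is more involved than this.

```python
def p_full_path(guard_share: float, middle_share: float, exit_share: float) -> float:
    """Chance that a single circuit uses the operator's relays for all three hops.

    Assumes the three picks are independent, which is a simplification.
    """
    for share in (guard_share, middle_share, exit_share):
        if not 0.0 <= share <= 1.0:
            raise ValueError("each share must be a probability between 0 and 1")
    return guard_share * middle_share * exit_share


def p_at_least_one(p_single: float, circuits: int) -> float:
    """Chance that at least one of `circuits` independent circuits is fully controlled."""
    if circuits < 0:
        raise ValueError("circuits must be non-negative")
    return 1.0 - (1.0 - p_single) ** circuits


if __name__ == "__main__":
    # Illustrative shares only (assumption): 10% at every position.
    single = p_full_path(0.10, 0.10, 0.10)
    print(f"single circuit: {single:.4%}")
    print(f"at least one of 1000 circuits: {p_at_least_one(single, 1000):.2%}")
```

With those made-up 10% shares, a single circuit is fully controlled about 0.1% of the time, but the chance of it happening at least once grows quickly as the number of circuits goes up. That is why a large relay fraction matters even when the per-circuit odds look small.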