I assume YC is presented with a glut of choices, so there is probably a triage stage that runs something like this:
1) nonsense (say 60%)
2) maybes (say 20%)
3) positives (say 20%)
Of the positives, YC can only accept a certain number due to resource constraints. Say, for the sake of argument, that those constraints allow it to accept half of them, or 10% of the original group.
If you then follow the 10% that got accepted alongside the 10% that did not, treating the latter as a control group, I expect both groups to do equally well; in other words, the 'hit rate' should work out roughly the same in both groups.
YC will find that it has invested in some duds, that most of its investments are so-so, and that there is a very limited number of take-offs. In the other group the ratios will be very similar.
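To make the comparison concrete, here is a rough sketch of how you could check whether two such groups really have the same hit rate. It uses a plain two-proportion z-test, which is my choice of method and not something anyone in this thread has claimed YC does. The applicant count and the hit counts are made-up placeholders, not real YC data, and the function name is hypothetical.

```python
import math


def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Return (z, two_sided_p) for H0: both groups share one hit rate."""
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled rate under the null hypothesis that the groups do not differ.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # No hits at all, or only hits: the groups cannot be told apart.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Triage arithmetic from the comment above, with a hypothetical pool size.
applicants = 1000                    # assumption, for illustration only
positives = applicants * 20 // 100   # the 20% judged to be 'positives'
funded = positives // 2              # half accepted: 10% of the original group
control = positives - funded         # the other 10%, rejected for lack of room

# Placeholder outcome counts; substitute real tracked results here.
funded_hits, control_hits = 5, 5
z, p = two_proportion_z_test(funded_hits, funded, control_hits, control)
print(f"funded={funded} control={control} z={z:.2f} p={p:.3f}")
```

With groups this small and take-offs this rare, the normal approximation is shaky, so an exact test would be the safer choice on real numbers.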
One way to estimate what jacquesm is referring to might be to keep a record of which participants were marginal applicants and which were clearly in. Since you have limited places, the in-out divide should be arbitrary. You can then measure only those on the 'in' side of it.
*If you numbered the top applications, you might be able to tease out some interesting conclusions about your ability to tell things in advance.
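As a sketch of what numbering the applications could buy you: if each funded application got a rank at decision time and some outcome score later, a rank correlation between the two says how well the early ordering predicted the result. Spearman's coefficient is my pick here, and the function names are hypothetical. No data is included because none is given in this thread.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / math.sqrt(var_x * var_y)


def spearman(application_ranks, outcome_scores):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(application_ranks),
                   average_ranks(outcome_scores))
```

If rank 1 means "best application" and a higher score means a better outcome, a strongly negative coefficient would mean the early ordering held up, and a value near zero would mean it told you very little.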
But do you keep tabs on your 'control group' or do you part ways?
The statistics would be very interesting.
I assume that plenty of projects that did get off the ground would have done so regardless of YC investing in them, and plenty of projects that YC invested in that have tanked would have done so anyway.
The existence of an effect does not even imply that it is positive, though I would assume that to be the case if there is one. It might even be negative, though it would take a fair amount of research to prove that one way or the other.
Do you stay in touch with the (potential) founders that you reject, and do you keep track of their eventual success or failure?
Edit: And if you do keep track, how many of the ones that you rejected have succeeded vs. the ones that you have funded?
Did you mean variation rather than correlation?