I'm participating in this project, as are at least a couple other HNers (surprising to no one, huh?).
I won't go into details about it because it's not my place to do so at this point (maybe after).
But I will confirm that for every response you give, you're required to enter a percentage estimate of likelihood. For example, you'd enter 90% or 72.212% or whatever on whichever question you're responding to. So there's a potential mechanism for further ranking of participants beyond the binary. The voting mechanism itself is more complicated, but again, I'll leave discussion for when it's over.
The interesting thing about this is that it raises the [theoretical, in the absence of information about the weighting/ranking systems] possibility that professional intelligence analysts' relative underperformance on this measure is less about misidentifying high/low-probability events more often than amateur "superforecasters" do, and more about professionals making the systematic error of overconfidence in the evidence they have when weighting their estimates - e.g. the amateurs may be more likely to pick 50% on events where there genuinely isn't enough information to forecast, and less likely to assign single-digit probabilities to events which no available evidence suggests are likely but which nevertheless happen. In other words, it sounds highly plausible that if you asked simple binary questions about expected outcomes, both groups would give almost identical answers and usually be correct, but the professionals would be more confident on the occasions when both groups are wrong.
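To make that concrete (the tournament's actual scoring mechanism isn't public, so this is purely an illustration): the Brier score, the standard calibration metric in forecasting tournaments, is the mean squared error between the stated probability and the 0/1 outcome, lower being better. Two forecasters who give identical binary answers on every question can still score very differently, because a confident miss costs far more than a hedged one. The forecast numbers below are made up for the example.

```python
def brier(forecasts, outcomes):
    """Mean squared error between stated probabilities and 0/1 outcomes.

    Lower is better; 0.25 is what always answering 50% would score.
    """
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

# Five questions. Both forecasters pick the same side every time
# (identical binary answers), and both are wrong on the last event.
outcomes = [1, 1, 0, 1, 0]  # what actually happened

confident = [0.95, 0.95, 0.05, 0.95, 0.95]  # "professional": strong confidence
cautious = [0.70, 0.70, 0.30, 0.70, 0.60]   # "amateur": hedged toward 50%

print(brier(confident, outcomes))  # approx 0.1825
print(brier(cautious, outcomes))   # approx 0.1440
```

The confident forecaster gains only a tiny edge on each correct call (0.0025 vs 0.09 per question) but the single confident miss (0.9025 vs 0.36) swamps it, so the hedged forecaster wins overall - which is exactly the pattern that would make "superforecasters" look better despite identical binary judgments.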
If this is the case, then it's reasonable to assume the CIA's statisticians would have done the analysis and know that's the reason these "superforecasters" are better - though I doubt it.
I guess the reverse is also possible: professional intelligence analysts may systematically tend towards overcaution, picking numbers towards the middle of the range either out of a desire not to look silly or because they're more aware of the policy implications of their estimates. But subjectively I'd assign that a lower probability.