A sample size of 100 is fairly reasonable, isn't it? The worst-case standard deviation is 5%, and 100 is a small enough number that you can take the time to have humans carefully inspect each account to actually determine whether it's a bot or not. So 100 is about the sample size I'd expect them to use here. Not sure why that would need to be a secret.
(The key thing is that the sample has to be truly random, which is usually very hard to do, but Twitter can easily pick 100 random accounts in their own database.)
I'm not sure that I follow. Why is the worst-case standard deviation 5%? And for which random variable?
I do vaguely remember the standard way of coming up with a reasonable sample size.
Assuming:
1) We want the commonly used confidence level of 95% (corresponding to a Z-score of ~1.96).
2) We want a margin of error of 1 percentage point (kinda reasonable, since their estimate is stated as a percentage without explicitly stating the margin of error).
3) We don't know anything about the expected percentage of bots a priori.
4) We ignore the error in determining whether an account is a bot or not.
Then, using n = Z^2 * p(1-p) / E^2 with the worst-case p = 0.5, we'll need a sample size of 1.96^2 / (2*0.01)^2 = 9604.
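A quick sketch of that arithmetic in Python (the 0.5 below is the worst-case proportion, since p(1-p) is maximized there):

    import math

    # Standard sample-size formula for estimating a proportion:
    # n = Z^2 * p*(1-p) / E^2, with p = 0.5 as the worst case.
    z = 1.96   # 95% confidence level
    e = 0.01   # desired margin of error: 1 percentage point
    p = 0.5    # worst-case proportion

    n = z**2 * p * (1 - p) / e**2
    print(round(n))  # -> 9604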
(See below for two answers about the math that gets to 5%.)
> We want a margin of error of 1 percentage point (kinda reasonable, since their estimate is stated as a percentage without explicitly stating the margin of error)
If the question they want to answer is "give me the exact % of bots on Twitter", then yes, you probably want a standard deviation of less than 1%. That would take many thousands of samples, as you say - much larger than sample sizes used in presidential elections, even.
OTOH, my guess is that the question for them is "are bots a big part of the Twitter population?" In that case, 97% of accounts being real versus 96% doesn't matter much, since both show bots are a very small group. (But maybe bots write a disproportionately large # of tweets?)
I don't think 100 is terribly reasonable. For a company with the size and resources of Twitter, it should be trivial to check several thousand.
I'm not sure why the worst-case standard deviation is 5%. Presumably the accuracy of the bot decision plays into it. But I would imagine a couple thousand would give a far more accurate answer.
The goal is to estimate the proportion of accounts that are bots. Let that number equal p. The variance of a single is-it-a-bot observation is p(1-p). The highest that can be is (0.5)(0.5) = 0.25, at p = 0.5. Then, the standard deviation is the square root of that, which is at most 0.5.
Now, we want to know the standard error for our estimate of the bot proportion. That is sqrt(p(1-p)/n). Suppose 50% of accounts are bots (I assume that would be very high), then our estimate of p would be 0.5 and our standard error with a sample of 100 would be 0.05. Hence, our 95% confidence interval is roughly 0.4–0.6 in the worst case (with a sample of 100).
If the proportion is under 0.1 (let's assume 0.05), then the standard error would be sqrt(0.05(1-0.05)/100) = 0.022. Our 95% confidence interval in this case would be roughly 0.01–0.09.
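A small Python sketch of those two cases, if anyone wants to play with the numbers (the helper name is just illustrative):

    import math

    def ci_95(p, n):
        # Standard error and approximate 95% confidence interval
        # for a sample proportion p estimated from n accounts.
        se = math.sqrt(p * (1 - p) / n)
        return se, (p - 1.96 * se, p + 1.96 * se)

    for p in (0.5, 0.05):
        se, (lo, hi) = ci_95(p, 100)
        print(f"p={p}: se={se:.3f}, 95% CI ~ ({lo:.2f}, {hi:.2f})")
    # p=0.5:  se=0.050, 95% CI ~ (0.40, 0.60)
    # p=0.05: se=0.022, 95% CI ~ (0.01, 0.09)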
These seem like large ranges to me. Hence, I would expect them to use a larger sample too.
The worst case is 5% because "is a bot" is a boolean variable. That has a standard deviation of at most 0.5 (the very worst case is 50% on value 0 and 50% on value 1, so the mean is 0.5, and the deviation can't be more than 0.5). And the standard deviation of the sampled proportion scales like 1 over the square root of the sample size, so 0.5 divided by sqrt(100) = 10 => 0.05 (5%).
For presidential elections it's common to see sample sizes of 1,000 or so, which have standard deviations of around 1.5%. That's better than 5%, and it makes sense since elections are often won by just a few %. But here, IIUC, the goal was to see whether bots are a big part of Twitter or not, and so the answer "there are fewer than 5% bots" is enough. That is, we don't care whether the % of bots is 3.5% or 4.7%.
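For concreteness, here's that worst-case 0.5/sqrt(n) scaling for a few sample sizes:

    import math

    # Worst-case standard deviation of the sampled bot proportion:
    # a single yes/no observation has sd at most 0.5, and averaging n
    # of them shrinks that by a factor of sqrt(n).
    for n in (100, 1000, 10000):
        print(n, 0.5 / math.sqrt(n))
    # 100   -> 0.05   (5%)
    # 1000  -> ~0.016 (~1.6%)
    # 10000 -> 0.005  (0.5%)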
I'm confused by your confusion. Pretty clearly, Musk, as acquirer, was made privy to the methodology Twitter uses to estimate and report the number of bot accounts. This methodology is a trade secret, and therefore was disclosed under an NDA. Therefore Musk cannot tell anyone the details (until he buys the company, then he can do whatever he wants).
People focused on the bot-checking approach because the NDA question is pretty dull, and whether he technically breached it or not depends primarily on how the paperwork was filed.
NDAs can cover material that is mundane, dull, or arbitrary. A password could be a secret, and it's all of those things. The number of users sampled could easily be a trade secret.
It's not like the NDA says "keep our secrets unless you think they're too boring to keep". It says "anything marked in the following way is a secret you cannot reveal." If Musk thought that was public knowledge, he could fight them in court. Of course, that would torpedo his "this is new information to me, I need to get out of the TWTR acquisition" claims.
I meant it was mundane in the sense that any new stats grad could have come up with this methodology.
If you give anyone 5 seconds to come up with a methodology to check for bots, they would probably come up with "yeah, let's sample 100 of our followers and check manually".
It's such an obvious, simple method that it surprises me Twitter can even call it a trade secret under NDA.
Here's an example of something more obvious. We have a new finance grad. He comes up with the methodology of "let's take all the corporate profit for the quarter, divide it by the outstanding shares, and distribute it". It's a bit simple, but it's fine. However, if you know that comes out to $7.22/share and announce it ahead of the official announcement, you would breach your NDA. It's just a stupid number, but it means something.
Similarly, the fact that Twitter used 100, instead of 1000 or 2700 gives you error bars. It means something.
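(To make that concrete, here are the worst-case 95% margins of error implied by those sample sizes, using the rough 1.96 * 0.5/sqrt(n) rule of thumb:)

    import math

    # Worst-case 95% margin of error implied by a given sample size
    for n in (100, 1000, 2700):
        print(n, 1.96 * 0.5 / math.sqrt(n))
    # 100  -> ~0.098 (about +/-10 percentage points)
    # 1000 -> ~0.031 (about +/-3 points)
    # 2700 -> ~0.019 (about +/-2 points)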
Semi-unrelated: a new stat grad would almost certainly not choose 100. They would feel compelled to run some math and come up with a different sample size that matched the results of some equations for power analysis, etc.
Twitter discussion is markedly different from HN discussion.
Therefore, I'd prefer discussing it here.
Your opinion is certainly valid, and so is mine.
If only there were some sort of democratised mechanism to settle this as a community, and a way to easily hide the threads one isn't interested in... I can only wish.