
> There should be no discernible patterns of any kind.

That's not a correct description of the model given.

The author writes "The slight drift from D to R mail-ins occurs again and again, and is likely due to outlying rural areas having more R votes. These outlying areas take longer to ship their ballots to the polling centers."

That is, the author says there should be a discernible pattern.

Note that this is a hypothesis which was created only after looking at the patterns in different states. While it appears to be true of some other states, there's no strong argument for why it should be true of all states and all vote counting methods.

Unfortunately, the author then doubles down by saying that since this is the expected pattern, and it wasn't seen in PA, something's suspicious, to the point of being "likely evidence of ballot backdating or manufacturing."

The correct interpretation, IMO, is that the random model, which was already tweaked in order to explain one data trend, might need more adjustments in order to explain other data trends.

For example, the author uses PA and WI as examples of outliers to the drift-to-R model. But we know that (quoting https://eu.usatoday.com/story/news/politics/2020/11/06/how-s... ):

> While most states began processing mail-in ballots before Election Day, others had laws preventing election officials from doing so. For instance, election officials in the swing states of Wisconsin and Pennsylvania requested the ability to begin processing earlier and the Republican-controlled legislatures in those states refused, all but ensuring the high volume of mail-in ballots would not be counted until after Nov. 3.

There's no reason to expect that a model trained on states with one vote counting model should be applicable to states with another vote counting model, much less be strong enough to justify a claim of ballot backdating or manufacturing.

Furthermore, I believe there's a reasonable argument that the "Republican-controlled legislatures in those states refused" in part to muddy the waters, and create anomalies which could be interpreted as fraudulent voting.

This potential confounding issue was also not included in the "deck of cards at a casino" model.
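
To illustrate the point about counting procedures, here is a toy sketch. The batch sizes and vote splits are invented for illustration, not taken from any real count, and `running_d_share` is a hypothetical helper. It only shows that the same ballots, counted in a different order, give a running total that drifts in the opposite direction while ending at the same place.

```python
def running_d_share(batches):
    """Return the cumulative D share after each batch is counted."""
    d_total = 0
    all_total = 0
    shares = []
    for d_votes, r_votes in batches:
        d_total += d_votes
        all_total += d_votes + r_votes
        shares.append(d_total / all_total)
    return shares


# Made-up toy batches as (D votes, R votes); not real data.
d_leaning = [(700, 300)] * 3
r_leaning = [(350, 650)] * 3

# Same ballots, two hypothetical counting orders.
d_first = running_d_share(d_leaning + r_leaning)
r_first = running_d_share(r_leaning + d_leaning)

print("D-leaning batches first:", [round(s, 3) for s in d_first])
print("R-leaning batches first:", [round(s, 3) for s in r_first])
```

The first order drifts toward R and the second drifts toward D, so a drift pattern learned under one counting procedure says little about what to expect under another.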




(actual question as you seem well informed)

What about the slow D trend for mail-ins? Could it be that the same 4 Nov spike got spread over more days? Because one would expect the very last ballots to arrive to trend R, the same as rural areas.


I am not well informed.

I just know that "assume a spherical cow" analyses - ones which assume everything is uniform - are suspect from the get-go. My suspicion was further supported when the model failed to include well-described factors which I think would reasonably cast doubt on such a simple model. And the commentary failed to discuss why those factors were rejected, preferring instead to jump right to a claim of fraudulent voting, which led me to conclude the author was even less informed than I.

A recurrent problem in data^Wscience is that it's easy to find false signals when trawling through data. As Feynman commented: "The first principle is that you must not fool yourself and you are the easiest person to fool."
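
A minimal sketch of that trawling problem, using only pure noise: generate many random series with no trend at all, and count how many still show a "significant" correlation with time. The function names, the series count, and the series length are my own assumptions for illustration; the threshold 0.444 is approximately the two-tailed 5% critical value of Pearson's r for 20 points.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def count_spurious_trends(n_series=1000, n_points=20, threshold=0.444, seed=0):
    """Count pure-noise series whose |r| against time exceeds the threshold."""
    rng = random.Random(seed)
    xs = list(range(n_points))
    hits = 0
    for _ in range(n_series):
        # Independent Gaussian noise: there is no real trend to find.
        ys = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        if abs(pearson_r(xs, ys)) > threshold:
            hits += 1
    return hits


hits = count_spurious_trends()
print(f"{hits} of 1000 noise-only series look 'significant'")
```

Roughly one series in twenty should clear the bar by chance alone, so looking at enough slices of the data will always turn up a few "anomalies".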



