Hacker News
AI model detects asymptomatic Covid-19 infections through phone-recorded coughs (news.mit.edu)
228 points by agsamek on Oct 31, 2020 | 74 comments



A few caveats...

1. They trained and tested on a balanced dataset, which is very unlike the data distribution this algorithm would see “in the wild”. Under real world prescreening conditions the data would likely be extremely unbalanced toward the negative class, and also be subject to drift over time.

2. They seem to have identified positive subjects through a questionnaire not via clinical chemistry diagnostics; so (a) it is unclear whether their training labels are correct, and (b) they may have completely missed the asymptomatic population.

3. As mentioned in another comment ca. 5000 patients and 250K samples is not a lot considering the size and diversity of the population(s) where this would be deployed.

Disclaimer: I gave the article the brief high level scan treatment so I could be wrong about any or all of these. Please correct me if I am mistaken.


So here's the PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=920...

It's a short 9-page paper, worth a read.

1. Given that the real-world distribution of positive/negative COVID cases is hugely imbalanced, having a balanced dataset would seem to be a form of random undersampling from the majority class. (Undersampling potentially discards useful data from the majority class, unless we can somehow determine that the discarded data adds no new information. In this case there's a lack of homogeneity in the majority class, which the paper itself points out, i.e. "there are cultural and age differences in coughs, future work could focus on tailoring the model to different age groups and regions of the world.")

2. In the abstract, the claim is:

"When validated with subjects diagnosed using an official test, the model achieves COVID-19 sensitivity of 98.5% with a specificity of 94.2% (AUC: 0.97). For asymptomatic subjects it achieves sensitivity of 100% with a specificity of 83.2%." [Reminder: sensitivity = True Positive Rate = TP/P, specificity = True Negative Rate = TN/N]

If you look at Table 1, the breakdown is 59% self-reported, 28% doctor's assessment, 13% official test.

3. 5320 patient data points is something (the train/test breakdown is 4256/1064, so the model was built on 4256 data points). It would depend on the assumptions, but on first glance (based on sample size calculators), it doesn't seem underpowered. That said, this assumes a homogeneous population. The dataset is likely (unintentionally but) systematically undersampling certain populations due to lack of reach.
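To make the prevalence point concrete: sensitivity and specificity are prevalence-independent, but the positive predictive value is not, which is why a balanced test set can look much rosier than real-world screening. A minimal sketch using the abstract's 98.5%/94.2% figures and an assumed (hypothetical) 1% deployment prevalence:

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value: P(actually positive | test positive)."""
    tp = sensitivity * prevalence            # true positives per person screened
    fp = (1 - specificity) * (1 - prevalence)  # false positives per person screened
    return tp / (tp + fp)

# Abstract's figures at an assumed 1% prevalence: most flags are false alarms.
print(f"PPV at 1% prevalence: {ppv(0.985, 0.942, 0.01):.1%}")
# At the balanced 50/50 split of the test set, the same model looks far better.
print(f"PPV at 50% prevalence: {ppv(0.985, 0.942, 0.50):.1%}")
```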


Thanks for pulling out the relevant sections!

What I worry about with the undersampling are the “difficult” cases such as other types of respiratory conditions and infections. How many COPD, rhinitis, chronic bronchitis, etc patients were there in the training data? It is precisely these patients the algorithm needs to perform well on as they are higher risk and / or likely to be most prevalent among the people who seek out this app.

I think the other big question is what advantages / disadvantages does this have compared to a questionnaire administered to someone who is experiencing symptoms of an upper respiratory infection?

That being said, this study is a significant academic achievement. The authors should be very proud of what they have done. There are real challenges to doing something like this that impose hard limitations and they did as well as anyone could without infinite resources.


> They seem to have identified positive subjects through a questionnaire not via clinical chemistry diagnostics

So subjects were aware of their (presumed) covid status when they coughed?


1. Isn't an issue. They run inference on a sample-by-sample basis. The network has no memory, so it won't expect a 50/50 distribution on the test set just because it's trained like that. Having a balanced distribution is exactly the right thing to do, because you do not want the network to be biased toward one class or the other for any given sample. If it were unbalanced, the network could achieve almost 0 training error by just predicting negative all the time. This is not what you want.


My main concerns with the imbalance are undersampling of the negative class data distribution relative to the positive class, and overestimating performance on the test splits. I can buy that you may want to train on a balanced dataset, but the testing condition should reflect the true case distribution as closely as possible.

I agree that you would not want to use only the class priors for prediction. However, I do not think it is clear that you would want to throw that information out. Also not sure that I agree with the statement that neural network has “no memory” of the prior class distribution. That is a strong claim to make about something as opaque as a neural net model.
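For what it's worth, there is a standard way to have it both ways: train on balanced data, then rescale the model's output probability by the ratio of deployment priors to training priors (a Bayes-rule "prior correction"). A hedged sketch with made-up numbers, not anything from the paper:

```python
def prior_correct(p_model, train_prior=0.5, deploy_prior=0.01):
    """Rescale a probability produced under train_prior to deploy_prior.
    Assumes the class-conditional likelihoods carry over unchanged."""
    odds = (p_model / (1 - p_model)) \
        * (deploy_prior / (1 - deploy_prior)) \
        / (train_prior / (1 - train_prior))
    return odds / (1 + odds)

# A confident-looking 0.9 under 50/50 training shrinks a lot at 1% prevalence:
print(prior_correct(0.9))
```

So the class-prior information need not be thrown out; it can be reintroduced after training as a post-hoc adjustment.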


They could have used all negative samples for testing (and even training, if they had done it better), yes. But once your test set is large enough, whatever that means, it's not that relevant anymore. They are "undersampling" anyway by not recording data from every human who is negative right now.

And no, it's not a strong claim to make. Of course the network learns the distribution of your training set. That's why you want it balanced. But during successive applications of inference the weights do not change, it has no state. So it cannot, for example, store that it just predicted 90% negative and now it would be time again for some positive prediction.


An easy problem to solve. Have the app keep the coughs until a test result is obtained then feed it into the model. Intermittently update the model used by the apps, done.


Easy in theory, but not in practice. For good reasons there are very strong privacy protections in place for medical records, and significant administrative barriers. And this is not even getting into the technical / infrastructure challenges.

Maybe feasible for a VC funded company with several million dollars and >20 FTEs. Less so for an academic lab with a few grad students and postdocs being paid with pocket lint.


I was thinking it could be self reported in-app. After using the app and saving coughs, if you get a test result you have the option of adding it with the test date.


Yeah, I think this makes sense if you're testing suspect cases, but it won't work for screening a random population. This is an important aspect.


One argument for not getting excited just yet:

This was not a blinded clinical trial. The subjects all knew whether they have COVID-19 or not and knowing how strong psychological effects can be, what's detectable in their cough might be their knowledge they're sick. The researchers even acknowledge in the paper that "sentiment" is a big part of how a forced cough sounds.

What's worrying is also how little of the data was from a diagnostic test (over half of "positive" samples were "self-diagnosed" COVID-19, whatever that means).

I don't think FDA or any other regulatory body would accept such an app as a screening tool without a proper trial being done.

If it works, that would be the most practical and coolest application of ML I've seen - but it still feels like something from the "too good to be true" category at the moment.


It’s just junk science and the title is false. They used an ML model to detect if a person knows they have a diagnosis through a fake cough into a phone app. Even then their results could quite possibly just be overfitting, even with the verification data set separated.


Notably this works on asymptomatic forced coughs. 17% false positive rate but 98.5% true positive rate. Extremely interesting.


Thanks for pulling out that false positive statistic. It is incredibly irritating when an article, even a press release type one like this, makes it a point to give an exact number for the true positive / false negative rate and then fails to answer the obvious other half of the question. It made me sneer "oh yeah? Well, a magic 8 ball that only ever says 'yes you have covid you're gonna die' would catch that remaining 1.5%; why aren't you doing that?"

But as usual, the fault is in the summary, not the research.


Based on the title, my first question was "If they are asymptomatic, why are they coughing?" I know there are many reasons you can cough (acid reflux, allergies, etc.), but it still seems that if it's detectable, it's not 100% asymptomatic. I'm splitting hairs (possibly wrongly?), yes, but to me asymptomatic means it's completely passive.


I had the same thought, but I think it still counts as asymptomatic. Symptomatic and detectable are two different things.

For example, it's possible to have early stage breast cancer or colon cancer but have no symptoms (yet). Which is why they do screenings to catch these early.


No, they ask you to cough; everyone can do that.


Asymptomatic doesn’t mean unaffected. It means there are no symptoms the patient is aware of and presents with. A large fraction of the asymptomatic cases on the Diamond Princess had pneumonia.


The 17% false positive rate could be a large problem here depending on the prevalence of asymptomatic infection.


If this really works, this could be the biggest news of the year. Assuming they get FDA approval for this app, we would suddenly have a free, instantly scalable, instantly available test that catches the vast majority of true positives. You could very easily set these up at all public places and require people to check before entry. If that happens you don’t even have to require people to self-test at home, they’ll do it voluntarily to know in advance whether they’ll be let in at their destination.

The final and most difficult step would be effective quarantine of infected individuals, some of whom are likely to try and go to work anyway etc.

But even if you assume nothing more than voluntary self-quarantine etc, I would expect this to drive R0 below 1 very quickly, as the vast majority of infected would stay home and thus cease to spread the disease.

Finally, if all of the above were to come true, I think this could go down in history as the first truly life-changing AI discovery, and potentially one of the biggest watershed moments in recent history.

Obviously we’re not there yet, but I am very optimistic and excited after reading the story.
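The "drive R0 below 1" intuition can be sanity-checked with a toy model: if a fraction of the infectious population is detected and actually isolates, effective reproduction scales down proportionally. All numbers below are assumptions for illustration, not from the article:

```python
def r_effective(r0, detection_rate, isolation_compliance):
    # Toy model: detected-and-compliant individuals stop transmitting entirely;
    # everyone else transmits as before.
    return r0 * (1 - detection_rate * isolation_compliance)

# e.g. assumed R0 = 2.5, 90% of infections caught by daily screening,
# 70% of those flagged actually staying home:
print(r_effective(2.5, 0.9, 0.7))  # below 1 under these assumptions
```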


It has almost a 20% false positive rate. If you have a class of 30 students, approximately 6 of them will get a false positive on Monday and lose the rest of the week just in case. By Friday, you will have only 10 students...

The next week you will have the same problem ...

The next week, people will start to ignore the test.

And this assumes the students only take one test per day. If they also get tested on the bus and at the cafeteria and the supermarket... the number of people without a false positive will be much lower.
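The compounding in the classroom example checks out; assuming independent 20%-per-day false positives (the assumption above), the expected head count follows 30 × 0.8^days:

```python
def expected_remaining(students=30, daily_fp=0.2, days=5):
    # Each day, a fraction daily_fp of the remaining students is (falsely)
    # flagged and sent home for the rest of the week.
    return students * (1 - daily_fp) ** days

print(round(expected_remaining()))  # roughly the "10 students by Friday" above
```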


The way to look at it is as a cheap instant filter, something like a bloom filter, that can protect the more expensive tests that take longer.

Your example assumes there's no hierarchy of available tests, and that this test is the only test there is.

What would really happen is those 6 false positives would be referred for a more accurate test. They might miss a day of school but not a week.

At the same time, your more accurate testing pipeline can now speed up thanks to Little's law. There's dramatically less pressure on the system and less backlog, so you have a second order effect that the more expensive slower tests also become cheaper and quicker.

But even if we gloss over all that, and we're only concerned about false positive rate, then this is still much better than no school at all, as in hard lockdown, which has a 100% false positive rate.

Finally, there's the lives saved because of earlier rapid detection and isolation, with corresponding relief for the health care system, leading to increased quality of care and resources available for more severe cases... and so on and so on.

A bloom filter can do wonders for a system, and if this test works it should do the same.
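The bloom-filter framing can be made concrete: with a cheap pre-screen in front of a confirmatory test, only the pre-screen positives consume the expensive capacity. A toy calculation with an assumed 1% prevalence and the abstract's asymptomatic figures (100% sensitivity, 83.2% specificity):

```python
def referral_fraction(prevalence, sensitivity, specificity):
    """Fraction of the screened population flagged by the cheap test
    and referred onward to the expensive confirmatory test."""
    return prevalence * sensitivity + (1 - prevalence) * (1 - specificity)

frac = referral_fraction(0.01, 1.0, 0.832)
print(f"{frac:.1%} of people referred for confirmatory testing")
```

Under these assumptions, the confirmatory pipeline sees under a fifth of the population instead of all of it, without (nominally) missing any true positives.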


Are the false positives IID, or are they correlated to each other in the same individual, i.e. a false positive will likely test false positive again, a true negative will likely never test false positive? This is the kind of critically important stuff in stochastic processing theory that medical-field statistics never seem to care to report on.

If it is the former (IID), and let's say P_D = 1.0, P_FA = 0.2, it is an extremely easy problem to solve: Just have each student take 3 tests each day, which will reduce the overall P_FA from 0.2 to 0.008. Or 4 tests for 0.0016.

If it is the latter, you will only lose 6 students for the whole week; you'll have 24 students left on Friday, not 10.
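The IID arithmetic above is easy to verify: if each of k independent tests must come back positive before someone is flagged, the false-alarm probabilities multiply (and with P_D = 1.0, detection is unaffected):

```python
def combined_pfa(pfa_single, k):
    # Flag someone only if all k independent tests are positive;
    # a true negative slips through unless every test false-alarms.
    return pfa_single ** k

print(combined_pfa(0.2, 3), combined_pfa(0.2, 4))
```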


Do you think the false positives are per-recording? In that case, you could just do a few more test coughs with the same person to double check, and get that false positive rate down.

If false positives are per-person, then your scenario won't happen. It'll be the same 6 kids for whom the test never works right.

So with all this in mind, you'll have to come up with appropriate norms around the results. You could call it "okay" vs "suspect" instead of negative and positive. Maybe there's a lowered-risk version of activities for people who are "suspect" that day. Maybe they don't go to the gym that day, maybe they sit in the isolated booth in the classroom, whatever. But then, they need to take a standard test that night to return to school the next day. Or as someone else mentioned, a rapid-test at the nurse's office.


Fall back to a rapid test then?


The FDA probably won't approve anything for ages but you could put the app out anyway?


Discussion on /r/MachineLearning with caveats here: https://www.reddit.com/r/MachineLearning/comments/jkrzlt/d_a...


Reading the headline, I got really nervous imagining how many people would assume that a negative on the cough app means they are COVID-free and then go out and infect everyone.

However, reading the article, it seems like the false negative rate is really low. It sounds like this could be an incredibly effective screening tool.


My wife has been following suspected COVID patients and has said since the beginning that their cough is different than normal. Makes sense that ML would work in this scenario.


This anecdote actually makes me a lot more confident the tool might work. The article sounded to me like they just threw something at the wall, and it's not that hard to get a model that looks good on paper when you take the liberties they did.

But if this study serves as a PoC to back up that real-world observation, then this is quite a promising approach!


I was coming off a cold from February when I got it in March. My latent cough was no different for the four days I had unusual symptoms.


The article claims that the difference is indecipherable.


What is the false positive rate, though?


Like 20% - but that doesn’t matter as much in terms of ensuring no sick people go out in public.


While people can be fairly dumb, they may be able to cope with a warning that a cough-listening device is not very accurate?


Relevant paper seems to be: https://ieeexplore.ieee.org/document/9208795

Looks incredible. Hopefully the weights and code are released quickly.


There's a reason they published this in an engineering journal, not a medical one. It's very cool, but in terms of practical use, it's in the realm of "maybe it's worth doing a real medical study to see if this works" - a valuable contribution, no doubt, but not really a contribution to diagnostics in itself (yet).

That said, I like that they're thinking outside the box on this one. A free digital test with a low false negative rate would be a game changer.


I agree. Covid-related work that is disseminated in "non-medical" venues most likely presents toy examples, which is okay as long as the authors clearly state the limitations and do not try to push their research directly to the public without getting medical experts involved.

Researchers with really strong results will for sure publish in medical journals, which typically implies a massive push in prestige (impresses the administration and funding providers).


This sounds extremely useful, if it's as accurate as it sounds - scaling this up to test literally billions of people a day would be a major factor in controlling the pandemic, and far easier than scaling physical tests.


I tried to propose this idea at the beginning of the pandemic to a bunch of pulmonologists, some also very active in research, but received no interest at all.

Anyway, happy that someone else is doing it.


Minor nitpick: if there is cough involved, then those infections are not really asymptomatic, right?


Per the article, it is a forced cough.


Also relevant: look at the Google Trends graph for "loss of taste", both over time and the geographic distribution:

https://trends.google.com/trends/explore?q=loss%20of%20taste...


Or it’s just a graph of how much media attention the “loss of taste” symptom receives.


Could not find examples of a "Covid-19 cough" on the WWW! Every link I clicked on was spammed to death and I couldn't find any sound files.

Does anyone have a link to a sound file that is a good example of a "Covid-19 cough"? I would appreciate it!


> A user could log in daily, cough into their phone, and instantly get information...

While this model is indeed extremely useful and interesting work, this seemingly casual quote gives new meaning to how unsanitary our phones really are/can be.


I'm happy that so many phones are waterproof these days, because now I can clean the thing more thoroughly.

Of course there's the issue of water getting into the charging port, but I've found that the air blown out of a laptop compiling a Node.js project works brilliantly here.


I am no expert, but isn’t a 200k sample a low number? I am worried about false positives.


Why? A false positive is actually not that big of a deal. You spend a day being really cautious, you get another test to confirm, and then it turns out that you didn’t actually have Covid and you feel amazingly relieved.

It’s a false negative they really need to worry about. Then you have people who are going around super spreading but telling everyone that it’s fine because they tested negative.


>> You spend a day being really cautious and you get another test to confirm,

It depends on whether there's a causal reason for detecting the same person as positive.


A lot of false positives would make this solution unreliable and would have the same effect as a lockdown. Also, a 70k-people sample could mean discrimination toward minorities; we aren’t all made the same.


People voting me down probably didn’t get my point. If a tool is inefficient, people won’t use it. A way to detect COVID that produced many false positives would undermine the spread of its use. Also, if the demographic sampling is not done properly, in theory you risk having underrepresented people get the wrong end of the stick. Imagine what this would mean in a country like the USA!


Lockdown is literally the goal. This is an example of a way to target such a lockdown instead of a blanket quarantine. If you can reliably get false negatives, you can clear people to go out.


"false negatives" => "true negatives"

If you can reliably get true negatives, you can clear people to go out.


Can't they use the data from people they test to further improve the model?


I think so. This is a good start for sure


Now that you or your employer have agreed to the terms and conditions of facetime/zoom/teams/webex/skype/etc, we now have an opportunity to identify and detai... market to coronavirus sufferers.


The way this is reported doesn't give that much information. It says it has a high true positive rate, but what is the false-positive rate? A test isn't useful if there is a high rate of these.


This sounds too good to be true. I want to see a double-blind test done ASAP.

I'm rather skeptical, but the potential upsides are so huge that it's worth rapid additional investigation.


Now we just need to install public safety microphones† in every home and public space to prevent the spread of disease.

You are not against public health are you citizen? You aren’t hiding any covid patients are you?

† AKA telescreens


Didn't Amazon and Google already install cloud connected microphones in a lot of houses? "Alexa, check my cough" ;-)


New feature coming soon via software update!


Where do you draw the line though?

If this works well, isn't it kind of anti-social to refuse to cough into a phone?

Keep in mind this isn't the only (or even the first) application of this sort of always-on ubiquitous surveillance.


Once this becomes widely available as a free app, people will figure out quickly how to cough in a way that gets the desired result.


If this works out this will go down as the most incredible application of ML ever.


This sounds like a bad usage of machine learning. Unless the Covid cough has a distinctly recognizable sound, there might just be too many false positives coming from this. Or even false negatives.


I remember back in March a lot of people (within the field) saying AI folk should stand back and let the medical/virology community deal with Covid. This sort of thing puts paid to that attitude, IMO.


Sounds like a job for the NSA


How do you select the features for a model like this?
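For context: the paper describes feeding MFCC-based spectral inputs into CNNs, so the network largely learns its own features rather than having them hand-picked. The classic hand-crafted alternative computes frame-level statistics such as short-time energy and zero-crossing rate; a pure-Python illustration (not the paper's pipeline):

```python
import math

def frame_features(signal, frame_len=400, hop=160):
    """Per-frame short-time energy and zero-crossing rate -- two classic
    hand-crafted audio features (25 ms frames / 10 ms hop at 16 kHz)."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / (frame_len - 1)
        feats.append((energy, zcr))
    return feats

# A 440 Hz test tone sampled at 16 kHz: steady energy, low zero-crossing rate.
tone = [math.sin(2 * math.pi * 440 * n / 16000) for n in range(1600)]
print(frame_features(tone)[0])
```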


What a perfect way to make the ongoing domestic spying seem like it's for "public safety".


I want an Amazon Echo skill to screen anyone before they come to my house :)

Seriously, this is incredible, and if it's verified to work it could be a game changer. The real thing we need to do is all get tested at once on the same day at the same time. Then those who are positive need to isolate for 2-3 weeks until they are negative again. That would completely reset us back to nearly zero. Then do this again 2-3 times and we could shove this demon back into the bottle.


That would be insufficient, the infection takes a couple days or more to become detectable.

But we could all cough into our phones every day.

What you describe is what South Korea has done a couple of times now with more traditional testing, it certainly works if you actually do it.


Yeah we would need to do daily testing too. The main problem is that (a) it takes 3-5 days for the infection to become detectable and (b) it takes 1-5 days to get your test results back. So for 4-10 days you are contagious and don’t know it. Instant testing like this that works before symptoms means you could detect the infection soon as it takes hold, significantly reducing the spread rate. If our traditional testing was all rapid we would be a lot better off. Well that, and leadership that isn’t solely concerned with its own reelection in a pointless bid for staying out of prison. Both would be good.



