Medical device engineer / founder here. I'd love for the Apple Watch to fulfill Steve Blank's aims here, but there are very real clinical problems that don't automatically go away because Apple is great at industrial and UI design. The false positive risk with their a-fib detection will cause thousands of patients to ask their healthcare providers for further tests, only to find out they were fine. This isn't unlike the issue where iPhones have been inundating 911 call centers with unintentional "butt dials".
This doesn't mean that Apple won't be successful in healthcare. It's just that the main challenge isn't in creating a nicer product. The challenge is in showing a net positive impact on patient outcomes.
False positives are not really an issue. Last I heard, 60% of the chest pain cases ERs see are false positives (constipation or something similar that doesn't require ER care). However, chest pain is sometimes the only symptom you get of a real medical emergency, so you go to the ER anyway.
There are also a number of people who die of heart attacks with no symptoms. If Apple can get even 1% of them to the ER that would be a huge win. Even if the false positive rate is 80%, that still leaves enough real positives that ERs will get used to telling people "This time it is nothing, but it is good you came in anyway because sometimes this is all the warning you get".
The real worry is false negatives - someone who has chest pain and decides not to go because the watch says all is okay. These people will die when a hospital could save them.
The time and other resources doctors spend on healthy people are time and resources that are not spent on sick people. There is a useful number, NNT (number needed to treat), that may help you understand how harmful the watch might be. https://www.ted.com/talks/daniel_levitin_how_to_stay_calm_wh...
There are big concerns about both false positives and false negatives. The one-lead ECG in the Apple Watch will very likely only detect afib. One-lead ECGs are not diagnostic for heart attacks or any other cardiovascular disease. If you're having a heart attack and your Apple Watch says you don't have a-fib, that would be a big concern.
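For anyone unfamiliar with NNT (number needed to treat): it is conventionally the reciprocal of the absolute risk reduction. Here is a minimal Python sketch of that arithmetic. The function name and the event rates are made up purely for illustration; they are not figures from the talk or from any Apple Watch study.

```python
import math

def number_needed_to_treat(control_event_rate, treated_event_rate):
    """NNT = 1 / absolute risk reduction, rounded up to a whole patient."""
    arr = control_event_rate - treated_event_rate  # absolute risk reduction
    if arr <= 0:
        raise ValueError("no risk reduction, so NNT is undefined")
    # Round first so floating-point noise cannot push the ceiling up by one.
    return math.ceil(round(1 / arr, 9))

# Hypothetical rates: 10% of untreated people have the event vs 5% of treated.
nnt = number_needed_to_treat(0.10, 0.05)
print(f"Treat or screen {nnt} people to prevent one event")
```

The smaller the risk reduction, the more healthy people have to go through the process for each person who benefits, which is the resource argument above in numeric form.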
The exploding ER / EMS system would like to have a word with you! Every niggle or discomfort, it seems, can generate a 911 call and a visit to the ER. Legally, EMS cannot diagnose in the field, so we send the bulk of people that call 911 into the ER for further evaluation... whether they really need it or not. I can tell you that most don't.
There is a reason discomfort can generate a 911 call. I have known people who ignored a discomfort and died a few hours later. If they had called 911 they would have lived.
Counterpoint: the healthcare system pushes people into emergency medicine for non-emergencies. About 10 years ago, when I was in high school, I got a cut on my chin that required stitches. My mom got an urgent appointment with my PCP later that day, who then told me to go to the ER, saying, "we don't even have a suture in this clinic". We were trying to save time and money and not burden the system, and instead shot ourselves in the foot.
The healthcare system, or at least the insurance side of it, actively tries to push patients to urgent care clinics instead of ERs for that type of problem. Most health plans have a phone number where you can call a nurse to ask whether you should go to the ER or not.
And the vast majority of primary care clinics do have sutures in stock.
A "heart attack" is the common term for a condition where part of the heart isn't receiving blood flow, and the tissue there is starting to be starved of oxygen (an infarction). That will show up on an ECG as elevation of a certain part of the tracing (the "ST" segment, between when the main part of the heart contracts and when it "resets"). In order to be clinically significant, you need to see that elevation in at least two adjacent leads, and each lead is only looking at a narrow "slice" of the heart, so you need several of them to be sure you're seeing the part that's involved (it's not at all uncommon for a heart attack to only show up on 2-3 leads (out of 12 on a typical ECG)).
A watch cannot provide more than one lead (and the lead it does provide is really useless for detecting ST elevation, since it's a view across the "top" of the heart, which is not an area that will be involved in the infarction). It is not physically possible to detect a heart attack from a watch.
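To make the "two adjacent leads" rule concrete, here is a rough Python sketch. It is illustrative only and not a clinical algorithm: the lead grouping is a simplification, the 0.1 mV threshold is a hypothetical single cutoff (real criteria vary by lead, age, and sex), and the function name and sample readings are invented.

```python
# Illustrative only, not a clinical algorithm.
# Assumed, simplified grouping of the 12 standard leads into adjacent sets.
CONTIGUOUS_GROUPS = {
    "inferior": ["II", "III", "aVF"],
    "lateral": ["I", "aVL", "V5", "V6"],
    "anteroseptal": ["V1", "V2", "V3", "V4"],
}
THRESHOLD_MV = 0.1  # hypothetical cutoff for "elevated"

def groups_with_st_elevation(st_mv):
    """Return the groups in which at least two leads reach the threshold.

    st_mv maps lead name -> ST-segment deviation in millivolts.
    Leads missing from the mapping are treated as not recorded.
    """
    hits = []
    for name, leads in CONTIGUOUS_GROUPS.items():
        elevated = [lead for lead in leads if st_mv.get(lead, 0.0) >= THRESHOLD_MV]
        if len(elevated) >= 2:
            hits.append(name)
    return hits

# Invented readings: a 12-lead tracing with elevation in the inferior leads only.
twelve_lead = {"I": 0.0, "II": 0.2, "III": 0.25, "aVF": 0.15, "aVL": 0.0,
               "V1": 0.0, "V2": 0.0, "V3": 0.0, "V4": 0.0, "V5": 0.0, "V6": 0.0}
# A single-lead device can only ever supply one entry.
single_lead = {"I": 0.3}

print(groups_with_st_elevation(twelve_lead))
print(groups_with_st_elevation(single_lead))
```

With only one lead recorded, the "at least two" condition can never be met, no matter how large the reading is.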
If you had asked me six months ago if a watch could detect atrial fibrillation, my answer would have been "sure, why not?". Just because one thing is possible with the advancement of technology, it doesn't mean all things are possible...
In 2020 alone we will collect more high-dimensional health data than from the beginning of history to the present day combined. Given the improving sensors on the wrist generating high-dimensional data (heart beat, ecg, perspiration, blood oxygen, motion, vibration, body temperature, etc), and a sample size of 100M+ people, is it not possible that there is some previously unknown signal in there that could be detected by deep learning?
I'm not saying there is, I just wonder if heart attacks might cause some discernible but very complex pattern visible in high-dimensional data that we haven't discovered yet, or if there is simply so much physical distance between the wrist and the heart that it drowns out all signal.
I've got a fair bit of domain knowledge and experience (in both the medical and software engineering aspects of this topic). I'm pretty comfortable taking that "risk".
"Medical grade" is hard to define, but you might actually be very surprised by how accurate this simple sensor could be in an era of deep learning AI.
Earlier this year, there was a paper published at AAAI (https://arxiv.org/pdf/1802.02511.pdf) that found that just using the sparse, noisy data from the AW sensors (non-continuous, noisy heart rate measurements and a handful of HRV estimates every day), they could diagnose diabetes, high cholesterol, high blood pressure, and sleep apnea with relatively high accuracy.
In fact, the diabetes diagnoses were comparable in accuracy to cheap lab tests specifically for diabetes. And even more surprisingly, the sleep apnea diagnoses could be made even if one doesn't wear the watch during sleep.
There are other recent papers showing extremely good accuracy in detecting rhythm abnormalities.
Deep learning can often magnify the power of cheap, simple sensors in ways that can in many cases seem unimaginable. Partly because of the power of multi-dimensional inference, and partly because the volume of data you get from wearing a device 24/7 helps to compensate for all the noise and sparsity.
And that was with the current gen AW sensors, which now are a generation behind some other consumer devices -- I'm sure the AW4 catches up (and can likely gather data like continuous HRV). Add in another data point like ECG, even if it's the simplest possible form of ECG, and I wouldn't be surprised if the diagnostic accuracy for many conditions is higher than some lab tests. Especially for transient rhythm abnormalities like transient afib, which might show up late at night after drinking, but not occur during a lab test, which probably makes detecting transient afib hard in a lab until a lot of cardiac changes have occurred.
Commenter /y/AlanYx continues in grandchild:
Yup. Humans also have difficulty thinking and reasoning multi-dimensionally, especially with conditional probabilities across those dimensions. So while the watch probably can't apply the "simplistic fill-in-the-blank style" of reading EKGs taught in the Dubin book that someone recommended above (because the sensors are just too simplistic and the data collected too limited), deep learning can see through the data in ways people often can't.
There's a good example of this in the DeepHeart paper I linked to -- the authors mention that no diagnostically predictive relationship between heart rate patterns during waking hours and sleep apnea was known prior to the work, likely because it is too complex to spot by just looking at heart rate graphs in the way humans do. (And in fact because a convolutional neural network is used, it's not easy right now to tease out in an explainable way to humans exactly what the computer is "seeing", although its specificity and sensitivity characteristics are known.)
All of the diseases mentioned (diabetes, high cholesterol, high blood pressure, and sleep apnea) are chronic conditions, and ideal candidates for the "gather a bunch of data over a large time window and see what we can see" approach.
A heart attack is a very acute condition, and doesn't give you the same luxury of looking at things over time. You have to get it right, and you only have one shot at it. When looking at chronic conditions you can build up confidence over time, and only raise the alert when you're very sure. With a heart attack, you have a window of minutes, to an hour or two. False positives are much more of a risk in that case, and the cost of false positives is magnified significantly when a screening tool is widespread.
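The "build up confidence over time" point can be sketched with repeated Bayesian updates. This is a toy Python example under strong assumptions: the readings are treated as independent, and the prior, sensitivity, and specificity are hypothetical numbers, not measurements from any device.

```python
def posterior_after_positives(prior, sensitivity, specificity, n_positives):
    """Probability of disease after n consecutive positive readings.

    Assumes each reading is independent given the true state, which real
    sensor data will not fully satisfy.
    """
    p = prior
    for _ in range(n_positives):
        true_pos = sensitivity * p
        false_pos = (1 - specificity) * (1 - p)
        p = true_pos / (true_pos + false_pos)  # Bayes' rule for one reading
    return p

# Hypothetical screen: 1% prior, 90% sensitivity, 90% specificity.
for n in (1, 3, 5):
    p = posterior_after_positives(0.01, 0.9, 0.9, n)
    print(f"after {n} positive reading(s): {p:.3f}")
```

A chronic condition lets you wait for several readings before alerting. An acute event effectively forces a decision after the first one, when the posterior is still dominated by false positives.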
I may be mistaken, but you seem to be extrapolating my comment to mean that I don't think we will see continued expansion of screening tools built into various wearables. That's not true at all. I think it's quite obvious that this is an area where we will see significant growth over the coming decade.
That doesn't change my stance that heart attacks are categorically different, and much harder to detect from a sensor isolated on the wrist. I think the most promising path to doing that is detecting elevated troponin levels in the blood, and while there is some work being done on accomplishing that noninvasively, it's still a long way from being readily available.
What about a combination of devices?
A watch on one wrist, a phone in the palm of the other hand, AR glasses on the nose and the side of the head, and (going to the extreme) maybe a necklace or a ring.
False positives absolutely are an issue. To be sure, they aren't as inherently dangerous as false negatives, but false positives can still waste time and money. And depending on what the followup for a false positive is, they may carry actual health risks as well.
It will be interesting to see, but they ran that study for a year and a half or more (listed in the article). Collecting data based on the old watch was probably an attempt to see how well they could detect afib just from the heart rate sensor.
Apple is a big rich company, but I imagine their lawyers wanted to make sure they were REALLY sure they weren’t about to screw themselves over with tens of thousands of lawsuits, above and beyond the FDA requirements.
You sure do put a lot of faith into a $500 watch. I routinely see $40000 cardiac monitors spit out false positives, i.e. acute MI or afib with RVR, on healthy patients in the field.
It’s not supposed to be in the same category. That’s why it doesn’t worry me. All they have to do is not generate too many false positives and maybe help a few people out.
It’s certainly not designed to replace a REAL device at a doctor’s office/hospital for diagnostic purposes.
And this watch is not supposed to be perfect, nor replace medical devices. If it works well it will be an added benefit to the user, and could alert users about heart issues (and has done so!)
I think the ratio of true to false positives is vastly more interesting and an indicator for how well the technology is performing and if the burden is reasonable or not.
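As a back-of-the-envelope illustration of that ratio, here is a small Python sketch. Every input is a hypothetical number chosen for round arithmetic; none of them come from Apple or from any study.

```python
def screening_outcomes(population, prevalence, sensitivity, specificity):
    """Return (true positives, false positives, positive predictive value)."""
    sick = population * prevalence
    healthy = population - sick
    true_pos = sick * sensitivity
    false_pos = healthy * (1 - specificity)
    ppv = true_pos / (true_pos + false_pos)  # share of alerts that are real
    return true_pos, false_pos, ppv

# Hypothetical: 1M wearers, 2% prevalence, 98% sensitivity, 99% specificity.
tp, fp, ppv = screening_outcomes(1_000_000, 0.02, 0.98, 0.99)
print(f"true alerts: {tp:.0f}, false alerts: {fp:.0f}, PPV: {ppv:.2f}")
```

Even with a test that looks very accurate on paper, a low prevalence means a large share of alerts are false, so the ratio depends as much on who wears the device as on the sensor.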
Working in the industry you already know this but FP vs TP rates are a fundamental issue in nearly any system like this (i.e. non diagnostic). You typically can’t get rid of them, so the trick is to balance it in such a way for net positive benefit as you note. This may mean some unneeded visits, balanced by more needed interventions that would otherwise be missed.
Apple Watch isn’t really any different here, except potentially in scale. While that does mean potential impact is large, there is no a priori reason to assume apple has got it wrong; it’s not that difficult to find the right people for this sort of project.
It would be very interesting to see an ROC curve ...
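For readers who haven't built one, an ROC curve just sweeps the alert threshold and records the false positive rate against the true positive rate at each setting. Below is a minimal pure-Python sketch on toy data. The scores and labels are invented, and the helper names are my own.

```python
def roc_points(scores, labels):
    """Return (false_positive_rate, true_positive_rate) pairs, one per threshold.

    Assumes labels contains at least one 1 and at least one 0.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing is flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Toy detector output: higher score means "more likely afib"; 1 = truly afib.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
print(curve)
print(f"AUC = {auc(curve):.3f}")
```

Picking an operating point on that curve is exactly the FP vs TP balancing act described above.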
> The question is are they are going to create millions of unnecessary doctors’ visits from unnecessarily concerned users or are they going to save thousands of lives? My bet is both – until traditional healthcare catches up with the fact that in the next decade screening devices will be in everyone’s hands (or wrists.)
True! But a similar debate has played out in the mammography space. While routine mammograms have undoubtedly saved thousands of lives, some researchers have concluded that the complications caused by false positives (e.g. infections after a biopsy) may make the net contribution of routine mammograms negative. In other words, any individual is more likely to be harmed by routine mammograms than helped.
I'm not saying that the healthcare features in the new Apple Watch are bad; just that it will be interesting to see what their net contribution to healthcare will be.
Isn't there a meaningful difference in this case between the consequences of false positives from routine mammograms, or, say PSA testing, compared to detecting Atrial Fibrillation?
False positives will always cause undue distress, and that's a factor worth considering, but the consequences of a false positive from a mammogram or PSA test could be an invasive biopsy, as you point out, or even unnecessary treatment. Wouldn't the consequences of a false positive AF detection typically be much less serious though, such as some additional non-invasive tests? There's a financial and psychological cost to that, but is it really comparable to a false positive indication for cancer, which would be much more difficult to follow up, especially weighed against the benefits of detecting undiagnosed conditions?
When I saw the product launch I shared your concerns about emergency services being inundated with calls from watches detecting falls, but apparently the chief executive of the US National Emergency Number Association isn't concerned. He said this about it:
“These are the real beginnings of exciting innovation. I don’t see, at least initially, an overwhelming number of false positives coming in. But only time will tell.”
In your example, what is the measure for determining that routine mammography is net-negative? By what measure is "any individual more likely to be harmed" determined?
I totally understand the caution medical boards use when crafting their recommendations. However, I don't think consumer access to medical screening equipment is similar. In fact, I think that empowering consumers and patients to be more in control of their medical decisions is vastly net-positive, and as the article points out, the healthcare industry is going to need to adapt to it.
> By what measure is "any individual more likely to be harmed" determined?
Do mammograms detect cancer? Yes.
Do mammograms lead to reductions in all cause mortality? Probably not, no.
There's a bunch of stuff where we use proxy measures when we should be using different measures (all cause mortality; days lost to disability; QALYs).
> However, I don't think consumer access to medical screening equipment is similar. In fact, I think that empowering consumers and patients to be more in control of their medical decisions is vastly net-positive, and as the article points out, the healthcare industry is going to need to adapt to it.
You get a scan. It shows a lump. What do you do? Most people say "get a biopsy", which is ok if we're saving lives or reducing days lost to disability, but if we're not doing those and we're causing harm then giving power to people really just means hurting them.
>where iPhones have been inundating 911 call centers with unintentional "butt dials".
This happened to me just a few months ago. I'm sitting in the passenger's seat of a car and, out of the blue, I get a call from a weird Caller ID which, of course, I hang up on. I get another call, this time from a regular looking number in the local area code. I reluctantly answer it (in spite of my friend driving telling me to hang up) and it's nearby emergency services. I'm of course totally confused and ended up hanging up on them when they start asking my name etc.
It turns out there's a setting you can change to make this harder to do by accident but I had no idea until I looked it up.
Has anyone actually had a chance to try this feature on the Apple Watch 4? All the demo videos so far keep saying the feature is coming, but no one has actually used it yet. I'm curious how frequently it produces false positive results and in what types of situations or activities.
Having said that, I love the direction Apple is taking with this product. Smart watches have been on the market for quite a number of years now, but they're pretty much restricted to being gimmicky devices, as most people don't find them very useful or have a real need for them. If Apple could achieve a breakthrough in blood glucose monitoring, then I strongly believe we will see mass adoption in the very near future.
Not to mention the health angle requires Apple to lock down the device even further to stay HIPAA compliant and keep data from prying eyes. Apple has pitted the FDA against the FBI.
Nothing in HIPAA would require Apple to lock down the device in ways that would inhibit the FBI. Those are entirely separate issues. For example, a key escrow scheme (which I don't support) could still be fully HIPAA compliant.