Statistics and CoV antibody testing
Posted: Tue Apr 21, 2020 2:27 pm
We are seeing a number of studies indicating that there are far more people around with antibodies than can be accounted for by known infection rates. There was one in Iceland a while ago and recently one in Santa Clara (CA) county. Before we all go off into 'herd immunity' wonderland, here's a pinch of salt.
Suppose that you have a test for antibodies. It's a very good test: it's 95% accurate. (That's typical for tests of this type. HIV-positive by immunoassay is presumptive and is confirmed by a second test. This used to be Western Blot but is now more usually RNA-based. And we have 30 years' experience looking for HIV.)
For very good public health reasons, medical tests are biased to give false positives rather than false negatives. It's generally thought better to treat non-cases than to let a carrier out into the neighborhood who might unknowingly spread the disease.
The test is therefore knowingly skewed towards the false-positive side. If it's "95% accurate" overall, maybe its sensitivity (how often it correctly flags someone who really has the antibodies) is 99.7%, while its specificity (how often it correctly clears someone who doesn't) is only 95.3%.
Suppose you have a population in which 0.2% of people have actually been infected with the virus. The US has, to date, something like 700,000 known cases in a population of 350 million, so that's about right. If you use this test to look for antibodies in 1000 people, it will correctly find them in the 2 people (0.2%) who have it or have had it. But it will also flag a further 47 people (100% - 95.3% = 4.7% of the other 998) as false positives - people who do not have it but appear to, based on the test.
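If you want to check the arithmetic yourself, here it is as a few lines of Python. The 99.7% and 95.3% figures are my illustrative numbers from above, not the published characteristics of any real assay:

# 1000 people, 0.2% true prevalence, run through the illustrative test above
prevalence  = 0.002   # fraction who really have (or have had) the virus
sensitivity = 0.997   # chance a real positive is correctly flagged
specificity = 0.953   # chance a real negative is correctly cleared
n = 1000              # people tested

true_pos  = n * prevalence * sensitivity              # about 2 genuine cases found
false_pos = n * (1 - prevalence) * (1 - specificity)  # about 47 healthy people flagged
print(round(true_pos), round(false_pos))              # prints: 2 47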
The immediate conclusion, particularly from those who (a) do not understand statistics and (b) want some good news (and let's call this subgroup of the population politicians, for want of a better word), is that there are something like 25 times the expected number of people out there bearing the antibodies (49 positives where only 2 were expected) and that we are well on our way to herd immunity.
Not necessarily.
Edited to add: I went looking on the web to see if anyone was raising the same concern about the Santa Clara study. (Google coronavirus statistics Bayesian - Bayes' theorem being the short name for this statistical phenomenon.) I found this piece by someone called Andrew, who is (a) not me and (b) actually knowledgeable about statistics.
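For anyone who wants the formula rather than the link: this is just Bayes' theorem. With the same illustrative numbers, the chance that a positive result reflects real antibodies works out to about 4% - in other words, most of the apparent positives could be false alarms. A rough sketch:

# Positive predictive value via Bayes' theorem, same illustrative numbers as above
prevalence, sensitivity, specificity = 0.002, 0.997, 0.953

# P(infected | test positive) = P(positive | infected) * P(infected) / P(positive)
p_positive = prevalence * sensitivity + (1 - prevalence) * (1 - specificity)
ppv = prevalence * sensitivity / p_positive
print(round(ppv, 3))   # about 0.041: roughly 24 of every 25 positives are false alarms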