In my previous post, I talked about a study published recently in Nature Medicine that reported the identification of a set of compounds whose concentration in blood might predict if a person would develop memory impairment (or even early Alzheimer’s disease) within the next few years.
My goal in writing that post was to describe how the people included in the analysis that had led to the identification of the compounds had been selected, in order to highlight two main points:
1) the results of the study applied to people who had met a particular set of criteria, and they were therefore not directly generalizable to the whole population,
2) the lipids used to develop the blood test were identified based on differences in blood concentration between relatively small numbers of people, making it all the more important to verify that these differences would also be seen in larger cohorts of people before claiming that a blood test had indeed been developed.
I then noticed that many news articles reporting the study results to the public not only gave the impression that an up and running blood test for Alzheimer’s disease was on the table, or would very soon be, but also often mentioned that the blood test would predict Alzheimer’s disease “with 90% accuracy”.
I can see two problems with giving out such a piece of information without elaborating on the subject.
1. What does it mean exactly?
The 90% accuracy refers to the test having a sensitivity of 90% and a specificity of 90%, as reported by the study’s authors. But are we all expected to know that, and to understand what sensitivity and specificity mean? Which raises the second issue:
2. The sensitivity and specificity of a test are not the same thing as its predictive value.
Again, are we all expected to know what the predictive value of a test is and how it can be figured out? If not, then stating that a given test can predict something with 90% accuracy without further explaining what it means can mislead the lay reader as to what the true predictive power of the test is.
Now, you may wonder, does it really matter if readers are told only about accuracy or if they are also told about predictive value? Well, yes, it does: accuracy alone is not enough to judge the usefulness of a screening test; its predictive value matters as well.
- Sensitivity and specificity
Let’s take the lipid-based blood test proposed by the authors of the Nature Medicine report as an example. They state 90% sensitivity and 90% specificity for their test. (Let’s assume that the results have been replicated in additional groups of people, and that the test can indeed identify people who will develop signs of memory impairment within the next three years.)
Imagine now that you have a time machine. You travel three years in the future and make a list of 100 people who show signs of memory impairment (but who did not three years before), and 100 people who don’t show any sign of memory impairment. You then come back to the present time, hunt down all of these people, and make them take the lipid-based blood test.
– Out of the 100 people who you know for sure will develop memory impairment, 90 will have a positive test (90% sensitivity).
There will therefore be 10 people for whom the test will be negative, despite the fact that they will develop the disease: those are false negatives.
– Out of the 100 people who you know for sure will not develop memory impairment, 90 will have a negative test (90% specificity).
There will therefore be 10 people for whom the test will be positive, despite the fact that they will not develop the disease: those are false positives.
In other words:
– the sensitivity of a screening test indicates the percentage of sick people who will be correctly identified as such (positive test)
-> the higher the sensitivity, the fewer false negatives
– the specificity indicates the percentage of non-sick people who will be correctly identified as such (negative test)
-> the higher the specificity, the fewer false positives.
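The time-machine example above can be written out as a few lines of arithmetic. This is only a minimal sketch: the variable names are mine, and the counts are the illustrative ones from the imaginary group of 200 people, not data from the study.

```python
# Counts from the imaginary "time machine" group (illustrative, not study data).
will_develop = 100       # people known to develop memory impairment
will_not_develop = 100   # people known not to develop it

true_positives = 90      # positive test, will develop impairment
false_negatives = will_develop - true_positives      # negative test, will develop it
true_negatives = 90      # negative test, will not develop impairment
false_positives = will_not_develop - true_negatives  # positive test, will not develop it

# Sensitivity: share of people who will get sick that the test flags.
sensitivity = true_positives / (true_positives + false_negatives)
# Specificity: share of people who will stay healthy that the test clears.
specificity = true_negatives / (true_negatives + false_positives)

print(f"sensitivity = {sensitivity:.0%}, specificity = {specificity:.0%}")
# prints: sensitivity = 90%, specificity = 90%
```

Note that both figures are computed within groups whose future status is already known, which is exactly what we do not have in real life.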
Now, time machines don’t exist (alas). We do not know who will get sick and who will not. All that we have is what the blood test tells us right here, right now. From there, we want to know whether we will develop the disease or not. And that is the question addressed by the predictive value of the test.
- Predictive value of a screening test
In a way, the predictive values of a test look at the flip side of the problem compared to the sensitivity and specificity. For example:
– in the case of sensitivity, we look at a group of sick people, and we want to know what percentage of these people will have a positive test
– in the case of positive predictive value, we look at a group of people with a positive test, and we want to know what percentage of these people will truly be sick.
In other words:
– the positive predictive value indicates the percentage of people with a positive test who are truly sick (in yet other words, it is the probability of being truly sick when the test is positive)
– the negative predictive value indicates the percentage of people with a negative test who are truly not sick (it is the probability of not being sick when the test is negative).
An important thing to know is that sensitivity and specificity are two intrinsic characteristics of the screening test: they will, for example, depend on technical aspects of the test, but they will not depend on how common the disease is in the population that is being screened (the prevalence of the disease). By contrast, the predictive value of the test does depend on disease prevalence (of course, it also depends on the sensitivity and specificity of the test).
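For readers who like formulas, this dependence on prevalence can be written down explicitly. The notation is my own shorthand, not the study's: p is the prevalence, Se the sensitivity, Sp the specificity, PPV the positive predictive value, and NPV the negative predictive value.

```latex
\mathrm{PPV} = \frac{Se \cdot p}{Se \cdot p + (1 - Sp)(1 - p)}
\qquad
\mathrm{NPV} = \frac{Sp\,(1 - p)}{Sp\,(1 - p) + (1 - Se)\,p}
```

In the PPV formula, the numerator is the share of the population that is sick and tests positive; the denominator adds to it the share that is not sick but tests positive anyway. The NPV formula does the same for negative tests. The prevalence p appears in every term, which is why the same test can have very different predictive values in different populations.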
Let’s come back to the example of the lipid-based blood test that can predict the development of memory impairment a few years in advance. This time, we will not look at our imaginary group of 200 people selected thanks to our travel in the future, but at the real population. In the Nature Medicine study, about 5% of the people recruited in the study (aged 70 or more) developed signs of memory impairment: let’s take that as an estimate of how common it is to develop memory impairment for someone over 70 and use it in our example.
If we look at 1,000 seniors going to see their family doctor and taking the lipid-based blood test, then:
– out of the 1,000 people, 50 will convert from normal memory function to memory impairment, and 950 will not (disease prevalence of 5%)
– out of the 50 people who will convert, 45 will have a positive test and 5 a negative test (the test has a 90% sensitivity)
– out of the 950 people who will not convert, 855 will have a negative test and 95 a positive one (the test has a 90% specificity)
– in total, out of the 1,000 tests, there will be 45+95=140 positive tests, but only 45 of these will be true positives.
As already said, the positive predictive value of the test represents the proportion of people with a positive test who will turn out to truly develop/have the disease.
-> In our example, the positive predictive value is 45/140, or 32%.
In other words, if your screening test is positive, the probability that you will develop memory impairment is about 1 in 3. In yet other words, about 2 out of 3 people with a positive test will not develop memory impairment.
The negative predictive value of the test represents the proportion of people with a negative test who will turn out to truly not develop/have the disease.
-> In our example, the negative predictive value is 855/(855+5), or 99%.
In other words, if your screening test is negative, there is less than a 1% chance (5 in 860) that you will develop memory impairment.
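The counting exercise above is easy to reproduce. The sketch below is a minimal illustration with a hypothetical helper function (`screening_counts` is my name for it, not something from the study); it assumes the 5% prevalence and the 90% sensitivity and specificity used in the example.

```python
def screening_counts(population, prevalence, sensitivity, specificity):
    """Split a screened population into the four possible test outcomes.

    Hypothetical helper for illustration; counts are rounded to whole people.
    """
    sick = round(population * prevalence)
    healthy = population - sick
    true_pos = round(sick * sensitivity)     # sick, positive test
    false_neg = sick - true_pos              # sick, negative test
    true_neg = round(healthy * specificity)  # not sick, negative test
    false_pos = healthy - true_neg           # not sick, positive test
    return true_pos, false_pos, true_neg, false_neg


# 1,000 seniors, 5% prevalence, 90% sensitivity, 90% specificity.
tp, fp, tn, fn = screening_counts(1000, 0.05, 0.90, 0.90)

ppv = tp / (tp + fp)  # probability of converting given a positive test
npv = tn / (tn + fn)  # probability of not converting given a negative test

print(tp, fp, tn, fn)                      # prints: 45 95 855 5
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")  # prints: PPV = 32%, NPV = 99%
```

Working with whole-number counts rather than probabilities mirrors the reasoning in the text: the positive predictive value is simply the true positives divided by all positive tests.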
Finally, it is worth noting that for any screening test (which has a fixed, built-in sensitivity and specificity), the less common the disease is in the population, the lower the positive predictive value of the test will be.
-> In our example of the Alzheimer’s blood test, if we now look at a younger set of people, we know that memory impairment will be less common (Alzheimer’s disease being associated with old age); let’s estimate that the prevalence is 1% in this younger group: the positive predictive value of our test (which is still “90% accurate”) will now only be about 8%, meaning that 92% of people testing positive will not develop memory impairment in the next few years.
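To see how quickly the positive predictive value moves with prevalence, here is a small sketch that keeps the test fixed at 90% sensitivity and 90% specificity and varies only how common the condition is. The function name is hypothetical, and the 20% and 50% prevalences are extra values I added purely for illustration; only the 1% and 5% figures come from the examples above.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Probability of truly being (or becoming) sick given a positive test."""
    true_pos = sensitivity * prevalence               # sick and test positive
    false_pos = (1 - specificity) * (1 - prevalence)  # not sick, test positive
    return true_pos / (true_pos + false_pos)


# Same "90% accurate" test, different populations.
# 1% and 5% are the prevalences used in the text; 20% and 50% are illustrative.
for prevalence in (0.01, 0.05, 0.20, 0.50):
    ppv = positive_predictive_value(prevalence, 0.90, 0.90)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.0%}")
```

With a 1% prevalence the result is about 8%, and with 5% it is about 32%, matching the figures worked out by hand. The positive predictive value only reaches 90% when half of the screened population is affected.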
- Accuracy, predictive value, and conclusion
To sum up, the lipid-based blood test proposed by the authors of the Nature Medicine study to predict memory impairment and Alzheimer’s disease before the appearance of symptoms is indeed 90% accurate, in that it is 90% sensitive and 90% specific.
However, it has a positive predictive value of only about 32% for people aged 70 or more (with the prevalence of the conversion from normal memory function to memory impairment estimated to be about 5% in that particular age group). This means that just one-third of people who test positive will indeed develop memory impairment or early Alzheimer’s disease within the next few years.
I now want to come back to a question I asked earlier in this post: why should we tell readers about accuracy and predictive value? (And this actually applies to any screening test, not just the one for Alzheimer’s disease that has been advertised lately.)
For one thing, people might see things differently if you only tell them “this test is 90% accurate” or if you also tell them something like “2 out of 3 positive tests will be false positives”.
More importantly, the predictive value of a screening test particularly matters if there is no other test that can be used to confirm or refute the results obtained (which is currently the case for a test that would propose to detect Alzheimer’s disease three years before symptoms appear). If, on top of that, there is no effective way to prevent the disease that is screened for, nor any way to cure it or even to slow down its progression, then the positive predictive value of a screening test that would predict disease matters even more, as you do not want to alarm people unnecessarily (though in that case one might also wonder about the usefulness of such a predictive test in the first place). Finally, the positive predictive value of a test for a given disease will also matter if the treatment available for this disease has considerable toxic side effects, or if subsequent tests to confirm the diagnosis involve potentially harmful procedures: in that case, you really want the positive predictive value to be high so as to avoid treating or testing someone who does not need it.
– On the hazards of significance testing. Part 1: the screening problem, on DC’s Improbable Science blog
(this post deals with how to figure out the predictive values of screening tests, and shows a useful schematic tree to quickly visualize the possible results of a screening test given disease prevalence, test specificity and sensitivity)
– Blood Test Accuracy Not Easily Measured, on MedPage Today
(this article also deals with predictive values and focuses on the case of the Alzheimer’s lipid-based blood test)
– Sensitivity and specificity, on Wikipedia