Published in Skeptical Inquirer Volume 43, No.1 (Jan/Feb 2019).
Translated to Portuguese here.
“Pink October” and “Blue November” are campaigns to promote awareness of breast and prostate cancer, respectively. In Brazil, as well as in other countries, the population is encouraged to undergo tests such as PSA for prostate cancer and mammography for breast cancer.
The main idea is called screening: testing healthy people to detect and treat diseases before symptoms appear, in order to increase the chances of cure and even to allow less aggressive treatment. A good example is cervical cancer screening, where the incidence of advanced cases decreased after the Papanicolaou test was introduced (Adegoke et al. 2012). However, studies that assessed the efficacy of screening for prostate or breast cancer show that in reality things are more complicated than those campaigns usually advertise.
In May 2018, the US Preventive Services Task Force (USPSTF) reviewed studies of PSA screening for prostate cancer (Fenton et al. 2018). Only two randomized controlled trials were of sufficient quality to assess the impact on mortality. The PLCO trial [1] did not show a difference in mortality. The ERSPC trial [2] showed that screening reduced prostate cancer mortality in men aged 55-69. Even in that case, it is far from simple. To avoid 1 prostate cancer death and 3 cases of metastatic prostate cancer, 1000 men aged 55-69 have to be screened every 4 years for 13 years. Of those, 27 men receive treatment (prostate surgery and/or radiation therapy). More importantly, the majority of those treated, 24 patients, receive aggressive treatment with no benefit, only the harms caused by the treatment itself. Despite screening, 5 men still die from prostate cancer. See Table 1 for complete estimates.
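The arithmetic behind these estimates can be made explicit. The following is a minimal sketch that uses only the figures quoted above; the function and variable names are illustrative, not taken from the USPSTF review.

```python
# Figures quoted in the text: estimates per 1,000 men aged 55-69
# screened every 4 years for 13 years.
MEN_SCREENED = 1000
DEATHS_AVOIDED = 1
MEN_TREATED = 27
TREATED_WITHOUT_BENEFIT = 24


def number_needed_to_screen(screened: int, deaths_avoided: int) -> float:
    """Men who must be screened to avoid one prostate cancer death."""
    return screened / deaths_avoided


def share_without_benefit(treated: int, without_benefit: int) -> float:
    """Fraction of treated men who bear the harms of treatment but gain nothing."""
    return without_benefit / treated


nns = number_needed_to_screen(MEN_SCREENED, DEATHS_AVOIDED)
share = share_without_benefit(MEN_TREATED, TREATED_WITHOUT_BENEFIT)

print(f"Number needed to screen per death avoided: {nns:.0f}")
print(f"Share of treated men with no benefit: {share:.0%}")
```

Under these figures, roughly nine in ten treated men receive surgery or radiation without any benefit.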
So a careful analysis of cancer screening must consider the harms of treatment. The USPSTF review found that, of those who undergo complete prostate surgery, 1 in 5 men develop urinary incontinence and 2 in 3 experience impotence. More than half of those who receive radiation therapy develop impotence and 1 in 6 develop bowel symptoms, including bowel urgency and fecal incontinence (Fenton et al. 2018).
The overall picture, that prostate cancer screening might cause more harm than good, is far from new. A 2013 Cochrane meta-analysis (Ilic et al. 2013) of five studies showed no reduction in mortality, and in 2012 the USPSTF recommended against screening regardless of age. Now, the USPSTF has concluded that the benefits and harms of screening for men aged 55-69 are balanced, and it recommends an individualized decision after careful consideration of the potential benefits and harms. For men aged 70 and older, the USPSTF recommends against screening. See the USPSTF decision aid.
Regarding breast cancer, the evidence might look more favorable to mammography screening, but again it is not as crystal clear as the campaign messages suggest. A 2018 JAMA article (Keating and Pace 2018) estimated that, of 10 000 women in their 50s screened annually with mammography for 10 years, 10 would avoid dying of breast cancer. But 940 women would have an unnecessary biopsy and 44 would be treated needlessly with surgery, radiation, chemotherapy, or hormonal therapy. Despite screening, 62 women would still die from breast cancer. Because breast cancer treatment has improved dramatically since the trials were conducted, the benefits of screening might be even smaller today than those reported in the trials (Keating and Pace 2018). To offer benefits while reducing the possible harms, the USPSTF recommends mammography every 2 years for women aged 50-74 and recommends against teaching breast self-examination.
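One way to read these numbers is to express each outcome per breast cancer death avoided. The sketch below does only that division, using the figures quoted from Keating and Pace (2018); the dictionary keys and function name are illustrative.

```python
# Estimated outcomes per 10,000 women in their 50s screened annually
# for 10 years, as quoted in the text.
DEATHS_AVOIDED = 10
other_outcomes = {
    "unnecessary biopsies": 940,
    "needless treatments": 44,
    "breast cancer deaths despite screening": 62,
}


def per_death_avoided(outcomes: dict, deaths_avoided: int) -> dict:
    """Express each outcome as a count per breast cancer death avoided."""
    return {name: count / deaths_avoided for name, count in outcomes.items()}


ratios = per_death_avoided(other_outcomes, DEATHS_AVOIDED)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} per death avoided")
```

Read this way, each death avoided comes with dozens of unnecessary biopsies and several women treated for a cancer that would not have harmed them.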
This situation is not restricted to breast and prostate cancer. Between 1975 and 2009, the incidence of thyroid cancer in the United States tripled, from 4.9 to 14.4 per 100 000. But the mortality rate remained constant at 0.56 per 100 000 (Esserman et al. 2014). A study in Finland detected thyroid cancer at autopsy in 36% of patients who had died of other causes (Esserman et al. 2014).
Autopsy studies of the prostate in patients who died of other causes also show a large disease reservoir (Sandhu and Andriole 2012). In autopsied men aged 60-79, prostate cancer was found in 14% to 77%. Astonishingly, prostate cancer was found even in autopsied men in their twenties, at a frequency of 8-11% (Sandhu and Andriole 2012).
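The thyroid figures show how rising incidence with flat mortality makes a cancer look less deadly without saving anyone. The sketch below uses the rates quoted above; the mortality-to-incidence ratio is only a crude proxy chosen for illustration, not a measure reported by Esserman et al.

```python
# Thyroid cancer rates per 100,000 in the United States, as quoted in the text.
INCIDENCE_1975 = 4.9
INCIDENCE_2009 = 14.4
MORTALITY = 0.56  # reported as constant over the period

incidence_growth = INCIDENCE_2009 / INCIDENCE_1975

# Crude proxy (an assumption of this sketch): deaths per diagnosed case.
deaths_per_case_1975 = MORTALITY / INCIDENCE_1975
deaths_per_case_2009 = MORTALITY / INCIDENCE_2009

print(f"Incidence grew {incidence_growth:.1f}-fold")
print(f"Deaths per diagnosed case, 1975: {deaths_per_case_1975:.1%}")
print(f"Deaths per diagnosed case, 2009: {deaths_per_case_2009:.1%}")
```

The ratio falls by about two thirds even though the same number of people per 100 000 die, which is what one would expect if most of the additional diagnoses are harmless.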
What these data describe is called overdiagnosis (Welch and Black 2010; Carter and Barratt 2017). Screening detects mainly non-lethal or harmless cancers: if it weren’t for screening, we would die of other causes without ever realizing that we also had cancer. Overdiagnosis is not a false-positive result, which is a positive test for which subsequent evaluation shows no sign of cancer. In overdiagnosis, the lesion detected actually meets the diagnostic criteria for cancer, but would not have been diagnosed in the absence of screening (Welch and Black 2010; Carter and Barratt 2017).
Overdiagnosis and its consequences are the main harm of screening. Since it is impossible at the time of diagnosis to differentiate harmless lesions from lethal ones, almost all cases are treated (Welch and Black 2010). Estimates suggest that between 20% and 60% of screen-detected prostate cancers are overdiagnosed (Fenton et al. 2018; Carter et al. 2015). For screen-detected breast cancers, the overdiagnosis estimate from trials is 19% (Keating and Pace 2018), while an analysis of screening programs reported 52% (Jørgensen and Gøtzsche 2009). So a few people might benefit from prostate and breast cancer screening, but more patients face the harms of aggressive treatment they did not need in the first place.
One of the premises of screening is that cancer has a linear progression, which would always allow detection before it becomes lethal. But this premise is outdated. Cancers are heterogeneous, with different progression rates (Figure 1) (Carter and Barratt 2017). Screening is more likely to detect cancers that grow slowly or would have regressed. Critically, the most lethal cancers, those that grow fast, are less likely to be detected by screening because they tend to cause symptoms between screening rounds (Carter and Barratt 2017).
Figure 1. Heterogeneity of cancer. Not all cancers have the same progression rate (Carter and Barratt 2017; adapted from Welch and Black 2010).
While the messages encouraging screening tests rarely mention overdiagnosis, they often come with claims like: “If diagnosed early, the chances of cure are 95%, but only 20% if detected at an advanced stage.” However, when overdiagnosis exists, the cure rate is biased: the number of patients who survive cancer increases “automatically” because patients with newly detected harmless cases are now classified as “cured”, even when screening offers no benefit. Ironically, the rise in cancer incidence and the inflated cure rates due to overdiagnosis might reinforce screening efforts, leading to even more overdiagnosis [3] (Brodersen et al. 2018).
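A toy calculation shows how this inflation works. All patient counts below are hypothetical and chosen only for illustration; they do not come from any study cited here.

```python
def apparent_cure_rate(deaths: int, truly_cured: int, overdiagnosed: int) -> float:
    """Share of diagnosed patients counted as 'cured'.

    Overdiagnosed patients never would have died of their cancer,
    so every one of them is counted as a cure.
    """
    diagnosed = deaths + truly_cured + overdiagnosed
    return (truly_cured + overdiagnosed) / diagnosed


# Hypothetical population: the same 600 people die in both scenarios.
DEATHS = 600
TRULY_CURED = 400

without_screening = apparent_cure_rate(DEATHS, TRULY_CURED, overdiagnosed=0)
with_screening = apparent_cure_rate(DEATHS, TRULY_CURED, overdiagnosed=1000)

print(f"Cure rate without screening: {without_screening:.0%}")
print(f"Cure rate with screening:    {with_screening:.0%}")
print(f"Deaths in both scenarios:    {DEATHS}")
```

In this sketch the cure rate rises from 40% to 70% while the number of deaths stays exactly the same, because the only thing that changed is the number of harmless cases added to the denominator.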
Another way the survival metric is biased relates to how many years the patient lives after diagnosis. Screening can only be effective if it detects disease earlier. Consider, for instance, patients who, without screening, are diagnosed because of symptoms at 70 years of age and die at 75. Consider also that the same patients would be diagnosed by screening at 65 and still die of cancer at 75. Described this way, screening looks beneficial: those who are screened survive 10 years after diagnosis, while those who are not survive only 5. Yet in both cases the patient dies at the same age; screening only made the diagnosis occur earlier, without actually extending life. This is called lead time bias (Raffle and Gray 2007).
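The example can be written out with the ages given above. This is a minimal sketch; the `Patient` class is an illustrative construct, not something taken from the cited sources.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    age_at_diagnosis: int
    age_at_death: int

    @property
    def survival_after_diagnosis(self) -> int:
        """Years lived after diagnosis, the quantity survival statistics report."""
        return self.age_at_death - self.age_at_diagnosis


# Ages from the example in the text.
diagnosed_by_symptoms = Patient(age_at_diagnosis=70, age_at_death=75)
diagnosed_by_screening = Patient(age_at_diagnosis=65, age_at_death=75)

# Lead time: how much earlier screening moves the diagnosis.
lead_time = (diagnosed_by_symptoms.age_at_diagnosis
             - diagnosed_by_screening.age_at_diagnosis)
extra_life = (diagnosed_by_screening.age_at_death
              - diagnosed_by_symptoms.age_at_death)

print(f"Survival without screening: {diagnosed_by_symptoms.survival_after_diagnosis} years")
print(f"Survival with screening:    {diagnosed_by_screening.survival_after_diagnosis} years")
print(f"Lead time: {lead_time} years; extra years of life: {extra_life}")
```

The whole difference in survival after diagnosis is accounted for by the lead time, and the extra years of life are zero.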
Because of these biases, survival statistics do not show the efficacy of screening. If screening works, the incidence of advanced cases must fall. After the introduction of breast and prostate cancer screening, an increase in the incidence of early cancers was expected. As the population ages, that should be followed by a compensatory decrease in advanced cancers, while overall incidence remains unchanged (Esserman et al. 2009). Note in Figure 2 that the incidence of early breast cancers increased significantly, while the incidence of regional cancers decreased very little and the rate of distant metastases remained stable. Interestingly, although breast cancer mortality is falling, the fall was larger in young women who were not invited to screening (Narod et al. 2015). In addition, breast cancer mortality decreased in a similar way around the world, even though the start of screening differed between countries (Gøtzsche 2015a). Similar observations can be made for prostate cancer. After screening was introduced, there was no significant decrease in invasive cases as expected, and different rates of screening and treatment in different regions were unrelated to prostate cancer mortality (Esserman et al. 2009). These trend analyses, while they do not show causality, indicate that screening leads to considerable overdiagnosis of early disease and that its impact on breast and prostate cancer mortality is small at best.
Figure 2. Age-standardized incidence of breast cancer by stage among White women in the United States, SEER 9, 1975–2011 (Narod et al. 2015).
The best approach to measuring screening efficacy is to use randomized controlled trials, like PLCO and ERSPC. Trials compare a screened group with a control group, looking for a reduction in deaths caused by the cancer being screened for, which is called cancer-specific mortality. It is the reduction in cancer-specific mortality that prompts the claim that screening “saves lives.” But as women overdiagnosed with breast cancer might receive radiotherapy, which increases mortality from lung cancer (Gøtzsche 2015b), screening could cause more deaths than the breast cancer deaths it averts. Since deaths caused by treatment are usually classified as deaths from other causes, cancer-specific mortality is biased in favor of screening. This bias is avoided by using overall mortality. What might be shocking is that cancer screening trials do not show a reduction in overall mortality. As Vinay Prasad and colleagues wrote in the BMJ (Prasad et al. 2016), “cancer screening has never been shown to save lives.” [4] Does screening increase deaths from other causes? We don’t know. Maybe it is just chance, since a trial needs millions of participants to detect a difference in overall mortality. Prasad and colleagues argued that such large trials are needed to know the effects of screening. In contrast, researcher Peter Gøtzsche thinks such trials would not be ethical, since a large number of people would have to be screened without knowing whether this will lengthen their lives, while it will make them less happy because of the psychological distress caused by false-positive results and overdiagnosis (Gøtzsche 2015a). Given the small, if any, benefit in mortality and the documented harms, Gøtzsche has stated that mammography screening would have been withdrawn from the market had it been a drug (Gøtzsche 2015b).
Other scientists, like Laura Esserman, think that we should focus on ways to make screening better. For example, she and colleagues suggested that the indolent lesions usually detected by screening should not be called cancer (Esserman et al. 2009). Since patients and physicians associate a cancer diagnosis with a lethal disease that causes suffering, renaming those indolent lesions might reduce needless treatment. This was first proposed almost ten years ago, but as late as August 2018 other scientists were still asking for those changes (Nickel et al. 2018). Esserman has also proposed moving to risk-based screening, which targets people at high risk of cancer. Testing whether risk-based screening can reduce mammography use without increasing advanced cancers is the objective of the WISDOM study (Esserman et al. 2017).
Meanwhile, the public needs to be properly informed. The prostate and breast cancer awareness campaigns must be used to tell the population clearly about the complexities of screening. This is very important: according to surveys, women overestimate the benefits of mammography screening by a factor of 10-200 (Wegwarth and Gigerenzer 2018). Also, as screening is often promoted as prevention, 68% of women in a survey wrongly believed that mammography reduces their chance of developing breast cancer (Domenighetti et al. 2003). As a perspective article in the New England Journal of Medicine (Biller-Andorno 2014) pointed out, “How can women make an informed decision if they overestimate the benefit of mammography so grossly?” This might be explained by doctors’ failure to communicate the risks of screening: in a survey of 300 US screening patients, 90% had not received information from their doctors about the possible harms of screening (Wegwarth and Gigerenzer 2018).
That’s not the whole story. A 2017 systematic review showed that doctors usually overestimate the benefits of screening and treatment while underestimating their harms (Hoffmann and Del Mar 2017). A survey of primary care physicians in the United States suggests that doctors misunderstand screening statistics: 76% of participating doctors were misled by the survival metric discussed earlier (Wegwarth et al. 2012). They wrongly thought that better 5-year survival rates among patients diagnosed by screening than among patients diagnosed by symptoms meant that the screening test saved lives. As asked in an article (Wegwarth and Gigerenzer 2018), “Why is risk literacy so scarce in health care?” The authors argued that the difficulty in assessing risks and benefits in health likely lies in how statistical information is presented, from biased reports in medical journals to the use of relative risk and misleading statistics by the media. And research shows that decision aids help patients be better informed about screening decisions (Stacey et al. 2014). The researchers beautifully concluded: “A critical mass of informed citizens will not resolve all healthcare problems, but it can constitute a major triggering factor for better care” (Wegwarth and Gigerenzer 2018).
1. PLCO: Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial
2. ERSPC: European Randomized Study of Screening for Prostate Cancer.
3. This has been called the popularity paradox: “The greater the harm through overdiagnosis and overtreatment from screening, the more people there are who believe they owe their health, or even their life, to the programme.” (Raffle and Gray 2007, 68).
4. Lung cancer screening with CT in heavy smokers reduced lung cancer mortality and overall mortality in a 2011 trial. Even though this is a case of screening in a high-risk group, Prasad and colleagues considered it the best evidence for overall mortality reduction in a cancer screening trial. However, as discussed by the authors, a 2013 meta-analysis for the USPSTF did not show an overall mortality reduction (Prasad et al. 2016).
Interesting popular readings
· Overdiagnosed: Making People Sick in the Pursuit of Health by H. Gilbert Welch, Lisa Schwartz and Steve Woloshin (2012);
· Mammography Screening: Truth, Lies and Controversy by Peter C. Gøtzsche (2012).