For the last two decades, critics of our largely private health care system have published paper after paper comparing U.S. health care to health care in other countries. Most of them tell the same story: the U.S. health care system costs more than any other health care system and produces the same or inferior outcomes.
In almost every case, researchers treat the data used to reach their conclusions as perfectly sound. Unfortunately, the data used to compare national health systems are generally far from comparable. Expenditure data are calculated using different accounting systems, vital statistics are logged using different definitions and coding customs, and prices are controlled and therefore uninformative, especially in government-controlled health systems.
The latest example of using dodgy data to portray the U.S. health care system as a failure comes to us via yet another study funded by the Commonwealth Fund. The claim this time is that the U.S. health care system is bad because it allows more people to succumb to “amenable mortality,” deaths that could have been prevented with “timely and effective health care services,” than health systems controlled by national governments. While the paper is an interesting academic exercise, the quality of the data used to produce its results is simply not up to the task of saying anything useful about the relative quality of medical care in different countries.
Nevertheless, recent headlines trumpeted the finding that the United States health care system performs poorly, presenting the results as unassailable fact in stories that were warmed-over versions of the Commonwealth Fund press release.
The paper reports that the decline in amenable mortality from 1997/1998 to 2006/2007 averaged 31 percent in the 16 countries covered. Ireland performed the best with a decline of 42.1 percent. The U.S. performed the worst with a decline of 20.5 percent.
The authors ignore their own previous cautions. The authors of the study, Ellen Nolte and Martin McKee, are coauthors of another study, a 2011 paper entitled “Measuring NHS Performance 1990-2009 Using Amenable Mortality: Interpret With Care.” Yet one thing they failed to do in the Commonwealth Fund study is interpret U.S. amenable mortality with care. They assert that “potential” explanations for the “poor” performance of the U.S. are “lack of universal coverage and high cost of care.” Rather than defend this hackneyed conclusion with any sort of evidence, they evade the issue by claiming that “the space available does not permit a detailed examination of the underlying reasons for its [the US’s] poor performance.”
The paper’s readers would have been better served had Nolte and McKee considered “potential” explanations that included the definitional differences and other known problems with “cause of death” statistics. In short, there are a number of reasons why neither this paper, nor amenable mortality in general, can be used for any sort of useful comparison of health system quality.
Definitions of amenable mortality are subject to large errors and typically have not been subjected to medical review. To begin with, it is clear that deaths from so-called amenable causes are affected by many factors beyond the control of any health system. These factors include differences in population income, education, risk-tolerance, behavior, migratory status, genetics, geography, and environment. As one would expect when so many factors are involved, researchers have been able to show only weak and inconsistent associations between amenable mortality and health care expenditure.
Amenable mortality is also difficult to define when disease incidence varies across countries. Nolte and McKee include epilepsy and diabetes as conditions amenable to health care. But mortality risks are higher, even with good care, for people who have conditions like epilepsy and childhood Type I diabetes. For diseases that have a higher mortality risk, variations in amenable mortality will occur if disease incidence varies even in two countries with identical health systems.
The worldwide incidence of childhood Type I diabetes varies from 0.6 per 100,000 in Korea to 35.3 per 100,000 in Finland. If one is going to seriously evaluate health systems based on deaths from Type I diabetes, it would seem that such a huge variation in incidence must be taken into account. If it is not taken into account, it is difficult to say whether changes in amenable mortality over time or across nations are due to differences in health care quality or differences in morbidity.
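The incidence point can be made concrete with a small sketch. It takes the two incidence figures quoted above and applies the same fatality share to both countries. The fatality share is an invented placeholder, not an estimate from any study, and the function name is hypothetical; the only purpose is to show that identical care can still produce very different death rates.

```python
# Hypothetical illustration: same quality of care, different disease incidence.
# The incidence figures are the ones quoted in the text. The fatality share is
# an invented placeholder and is assumed to be identical in both countries.

def deaths_per_100k(incidence_per_100k: float, fatality_share: float) -> float:
    """Deaths per 100,000 population implied by an incidence and a fatality share."""
    return incidence_per_100k * fatality_share

ASSUMED_FATALITY_SHARE = 0.01  # assumption, chosen only for illustration

korea = deaths_per_100k(0.6, ASSUMED_FATALITY_SHARE)
finland = deaths_per_100k(35.3, ASSUMED_FATALITY_SHARE)

print(f"Korea:   {korea:.3f} deaths per 100,000")
print(f"Finland: {finland:.3f} deaths per 100,000")
print(f"Ratio:   {finland / korea:.1f}x")
```

Whatever fatality share is assumed, the ratio of the two death rates equals the ratio of the incidence figures, roughly 59 to 1, even though the hypothetical health systems perform identically.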
Finally, the ability of a health care system to save people afflicted by disease largely depends on the people themselves because medical care cannot heal unless people seek care and comply with recommended procedures. Deaths amenable to medical care vary with education and social class even in relatively homogeneous populations covered by the same universal coverage system. In Norway, Hem et al. (2007) found that men aged 25-49 with a basic education were twice as likely to die from causes amenable to health care as men who went to college. In British Columbia, Wood et al. found that death rates from amenable mortality were associated with social class. In general, health is known to be associated with income, education, ethnicity, and genetic endowment. Even though these factors are known to affect death rates from various conditions, they are not modifiable by health care systems in free countries.
Estimates of those who died from “amenable” causes rely on death certificate data that are known to be wildly inaccurate. There is a large literature suggesting that death certificates do not provide good estimates of the number of people who die from the kinds of conditions generally considered “amenable” to medical intervention. Mathers et al. (2005) estimate that just 7 of the 16 countries analyzed by Nolte and McKee have high quality cause of death information when measured by timeliness, completeness, coverage, and the “sparing use of codes for ill-defined causes.” Those countries were Australia, Finland, Ireland, Japan, New Zealand, the United Kingdom, and the United States. In the U.S., comparisons of cause of death on death certificates with medical records suggest that both under- and over-reporting errors can be as high as 50 percent.
Lozano et al. (2001) suggest that ischemic heart disease mortality is commonly underreported by as much as 30 percent in Japan and France. This might explain why Nolte and McKee’s data show such large, and generally inexplicable, differences between U.S., French, and Japanese death rates from fatal heart disease. In 1997/1998 Japan had a death rate of 8.93 per 100,000 and France had a death rate of 12.39 per 100,000. The U.S. had a death rate of 39.5 per 100,000.
As ischemic heart disease is both a leading cause of mortality in relatively young people and is considered an “amenable” condition, misestimating it introduces an important source of error. Lloyd-Jones et al. (1998) compared the death certificates of 2,683 deceased Framingham Heart Study participants with the opinion of a three-physician panel and concluded that U.S. national mortality statistics, which are based on death certificate data, may overestimate the frequency of coronary heart disease by 7.9 percent to 24.3 percent.
Perinatal death, usually defined as infant deaths within a week of birth, also presents serious problems for Nolte and McKee’s comparison of death rates. The U.S. and Canada, which Guy et al. (2011) report as having the highest perinatal mortality in the higher income OECD countries, define a live birth as any baby who shows any signs of life. Because very low birth weight babies have a much higher probability of dying in their first few days of life, the U.S. definition of a live birth results in higher perinatal death rates that reflect definitional differences rather than health system quality.
Other countries may require that live births reach a certain birth weight or gestational age and classify the same very low birth weight baby as a fetal death or stillbirth. Nolte and McKee exclude stillbirths from their estimates and do not, as far as one can tell, include any adjustments for different national definitions of perinatal death.
The differing definitions can have large effects on perinatal death rates. Graafmans et al. (2001) estimate that lowering the threshold for a stillbirth to 24 weeks of gestation from 28 weeks of gestation raises the perinatal mortality rate by 15 percent. Reducing the birth weight requirement to 500 grams from 1000 grams increases the perinatal mortality rate by 17 percent. According to Lack et al. (2003), when Germany reduced the lower limit for birth weight for registration of fetal deaths from 1000 grams to 500 grams in 1994, its infant mortality rate jumped from 5.5 per 1,000 births to 6.6 per 1,000 births.
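To put the German figures on the same footing as the percentage estimates above, the jump can be restated as a relative change. The sketch below is plain arithmetic on the two rates quoted from Lack et al.; the helper name is hypothetical.

```python
def relative_change_pct(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0

# Rates per 1,000 births quoted in the text for Germany around the 1994
# change in the birth weight threshold for registering fetal deaths.
germany_change = relative_change_pct(5.5, 6.6)

print(f"Relative change in the reported rate: {germany_change:.1f} percent")
```

The reported rate rose by about 20 percent with the change in the registration rule, the same order of magnitude as the definitional effects estimated by Graafmans et al.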
Estimates of amenable mortality changed significantly between the two periods studied. The literature suggests that the change from ICD-9 to ICD-10 coding caused changes in estimated cause-of-death rates that were as high as 30 percent.
Nolte and McKee write that UK data for 1999 were “corrected for known discontinuities affecting counts of deaths coded as pneumonia and cerebrovascular disease in particular.” They further stated that “[i]n the U.S., implementation of ICD10 in 1999 led to discontinuities in counts of deaths coded as ischaemic heart disease (IHD) (I20-25) in particular. While these are corrected for 1999-2007 in data from the CDC, 1998 data were available in sufficiently disaggregated format from the WHO only and had to be corrected for consistency for IHD with the later CDC figures.”
The reason for the focus on adjusting U.S. IHD deaths is not clear. The Centers for Disease Control published comparability ratios that compared coding differences for various causes of death between ICD-9 and ICD-10. They suggest that the new coding system increased U.S. deaths from ischemic heart disease by 335, about 0.06 percent.
Other categories included in the Nolte and McKee amenable death definition but not mentioned as having particularly important discontinuities included septicemia deaths, which the switch to ICD-10 increased by 4,054, or 19 percent. Deaths from nephritis and nephritic syndrome increased by 6,186, about 26 percent. Overall, the changes were large enough that the U.S. National Center for Health Statistics warns that “[for] some causes of death, the discontinuity in trend can be substantial…considerable caution should be used in analyzing cause-of-death trends for periods of time that extend across more than one revision of the ICD.” In France, Meslé et al. (2008) note that it is “very difficult” to reconstruct consistent cause-of-death time series because France implemented automatic cause-of-death coding at the same time it transitioned to ICD-10.
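A comparability ratio is the number of deaths assigned to a cause under ICD-10 divided by the number assigned to the same cause under ICD-9 for the same set of deaths. The sketch below backs out the ratios implied by the increases quoted above. The baseline counts are back-calculated from rounded percentages, so they are approximations rather than published CDC figures, and the function names are hypothetical.

```python
def implied_icd9_count(increase: float, pct_increase: float) -> float:
    """Approximate ICD-9 count implied by an absolute and a percentage increase."""
    return increase / (pct_increase / 100.0)

def comparability_ratio(icd9_count: float, increase: float) -> float:
    """ICD-10 count divided by ICD-9 count for the same deaths."""
    return (icd9_count + increase) / icd9_count

# (absolute increase, percentage increase) as quoted in the text.
quoted = {
    "ischemic heart disease": (335, 0.06),
    "septicemia": (4054, 19.0),
    "nephritis and related": (6186, 26.0),
}

ratios = {}
for cause, (increase, pct) in quoted.items():
    baseline = implied_icd9_count(increase, pct)
    ratios[cause] = comparability_ratio(baseline, increase)
    print(f"{cause}: implied ICD-9 count ~{baseline:,.0f}, ratio ~{ratios[cause]:.4f}")
```

A ratio near 1, as for ischemic heart disease, means the revision barely moved the count, while ratios near 1.19 and 1.26 mean a trend that spans the revision will show a jump that has nothing to do with medical care unless the earlier counts are adjusted.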
Use of the European Standard population may bias U.S. age-standardized death rates. The calculation of the age-standardized death rates used by Nolte and McKee requires dividing the number of deaths from amenable conditions in a given age band in a given country by the population in that age band. Nolte and McKee use the European Standard population to provide a standard population estimate across countries. In most of the age bands under consideration, SEER reports that there are fewer people in the European Standard population age groups than in the current U.S. standard population. Without adjustment, standardizing raw U.S. deaths using the European Standard would increase U.S. mortality rates.
Though U.S. National Vital Statistics Reports note that the choice of standard population may make relatively little difference when one is looking at relative trends, it can make a difference in trends in some of the leading causes of death, especially when the age structure of the populations differs. When the U.S. changed from the year 1940 standard population to the year 2000 standard, the measured decline in deaths from heart disease from 1970 to 1995 was 30 percent using the 1940 standard and just 26 percent using the 2000 standard. The reason was that the 1940 standard had more people aged 25-64 years old than the 2000 standard, so the 43 percent decline in deaths in that age group receives less weight in the 2000 calculation. One way to determine how much difference the population difference makes in amenable mortality calculations would be to run the same study using the U.S. population standard.
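The sensitivity check suggested above can be sketched with the textbook method of direct age standardization: compute each age band's death rate, then average the rates using the standard population's share in each band as the weight. This is a minimal sketch, not Nolte and McKee's actual procedure. The deaths, populations, and both sets of weights below are invented for illustration and are not the European or U.S. standard populations.

```python
from typing import Sequence

def age_standardized_rate(
    deaths: Sequence[float],
    population: Sequence[float],
    standard_weights: Sequence[float],
) -> float:
    """Directly age-standardized death rate per 100,000."""
    if not (len(deaths) == len(population) == len(standard_weights)):
        raise ValueError("all inputs must cover the same age bands")
    total_weight = sum(standard_weights)
    rate = 0.0
    for d, p, w in zip(deaths, population, standard_weights):
        # Age-specific rate weighted by the standard population's share.
        rate += (d / p) * (w / total_weight)
    return rate * 100_000

# Hypothetical country with three age bands (young, middle, older).
deaths = [50, 200, 900]
population = [1_000_000, 800_000, 600_000]

# Two hypothetical standards: one weighted toward the young, one toward the old.
younger_standard = [0.5, 0.3, 0.2]
older_standard = [0.4, 0.3, 0.3]

rate_young = age_standardized_rate(deaths, population, younger_standard)
rate_old = age_standardized_rate(deaths, population, older_standard)

print(f"Younger standard: {rate_young:.1f} per 100,000")
print(f"Older standard:   {rate_old:.1f} per 100,000")
```

The same deaths and the same population yield different standardized rates under the two standards because the older standard gives more weight to the age band with the highest death rate. Rerunning a study with a different standard is a matter of swapping in the other weight vector.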