I've written previously about how the plausibility of results might affect the interpretation of new evidence from a trial. In particular, I argued that results that cannot fit in with a web of prior evidence must be viewed with skepticism.
Being overly credulous of results that are unlikely to be true is a common problem in medicine, but a less common and more severe problem is doctors' unwillingness to believe a new result that upsets past dogma, even when that dogma was built on low-quality evidence.
Historical disbelief in startling results
I discussed in the past the unreasonable attacks that I watched academic physicians make on trials showing that H. pylori causes ulcers. The classic example in my medical lifetime, though, is probably what occurred with hormone replacement therapy (HRT). Large, well-analyzed observational studies consistently found that in post-menopausal women, the use of HRT was associated with a reduced risk of coronary disease. Based on this, HRT became a central facet of recommended primary prevention in such women, despite the clear increased risk of breast cancer with HRT. The data showing about a 30% reduction in coronary events and reduced fractures with HRT were felt to more than offset the increased breast cancer risk.
Through the 1990s, more and more post-menopausal women were treated with estrogen. In 1998, the HERS trial was published in JAMA. It was a randomized trial of HRT, but in women who already had coronary disease. In HERS, HRT showed no benefit on coronary events and produced an increase in thromboembolic events and a nonsignificant trend toward higher overall mortality.
HERS was a startling result. If HRT was good for preventing coronary disease, you would have expected it to be beneficial in preventing the progression of coronary disease. We knew that all the evidence we had for HRT came from observational studies at risk of confounding, since healthier women were more likely to choose to take HRT. And yet the nearly universal reaction to HERS was to try either to figure out what was wrong with it or, more commonly, to explain how the observed effects of HRT for secondary prevention did not apply at all to HRT used for primary prevention. These arguments tended to lose track of how confidently the opposite had been predicted before the trial was conducted. Yet I watched as nearly all the academic primary care internists I knew continued to prescribe and recommend HRT.
When HERS was published, I stopped recommending HRT and started suggesting to women taking it that they might want to stop. I worked with one other physician who did the same and many who argued that HERS simply did not apply to their primary prevention patients. Then, in 2002, the results of the WHI trial were published after the trial was stopped for harm. This was a primary prevention trial of HRT and showed an increased risk of coronary disease. Word spread quickly, and HRT for primary prevention became a thing of the past, though for years after people still tried to explain away the results.
The present day
HRT and disbelief in an infectious cause of ulcer disease are really ancient history at this point, but the tendency of doctors to try to explain away randomized trial results that they should find startling and disturbing remains.
In May of 2011, the NEJM published a trial of giving IV fluids (either saline or albumin) to children in resource-limited settings in Africa who appeared to have septic shock. This trial included 3100 very ill children, and I have to think that the IRB approving it must have worried about the ethics of a control arm in which such sick children with impaired perfusion were not given fluid boluses; I would guess that very few clinicians thought there was genuine equipoise between the likelihood of clinical benefit and of harm from the intervention. The trial must have been intended as a prod to get local governments and international agencies to expend some resources to provide this minimal level of care for septic children.
Stunningly, the trial showed around a 45% increase in mortality in the children given fluid boluses. The researchers stopped the trial for harm and wrote a paper careful to confine its conclusions to resource-limited settings, while pointing out that the results raise some concerns for other settings. An accompanying editorial allowed for much greater concern about the generalizability of these results:
It seems clear that the results of this trial indicate that bolus-fluid resuscitation with either crystalloids or colloids in patients with compensated shock who do not have a clinical fluid deficit must be practiced with much greater caution than is now the case and with increased vigilance.
The editorialist goes on to argue for the urgent need for additional study around the questions raised by this trial.
The evidence we have for fluid resuscitation in children with compensated septic shock in resource-rich settings is entirely observational. Mortality rates are fairly low (in the 1% to 2% range), so an individual clinician would be unlikely to notice harm from a therapy that increased mortality by 45%: a 45% relative increase would move a 1.5% mortality rate to about 2.2%, a handful of extra deaths per thousand patients, invisible in any one practice.
I have heard and read multiple experts try to explain away the generalizability of the trial. A common thread in these arguments is the belief that the burden of proof lies with those who want to generalize from the results. They seem to think that because giving fluids is standard practice, the practice must be supported by high-quality evidence, but this just is not the case.
I expect that nearly everyone I have heard or read attacking this trial would never have thought the intervention could cause a 45% increase in mortality in Africa. That is because they believed the benefits they attributed to fluid resuscitation were generalizable. As such, they should both have been completely shocked by this result and have been extremely worried that the results generalize back from resource-limited settings to resource-rich ones. They should have examined the basic premises and evidence around fluid resuscitation, just the way the editorialist did.
Instead, predictably, people tried to figure out how to defeat the trial and prove it could not possibly apply to resource-rich settings. The main arguments seem to be either that many of these children had malaria, or that many of them had severe anemia, and that in both conditions there is a suspicion that fluids can be harmful.
Letters in response to the article make both of these arguments. This is odd, since the data presented in the paper show them clearly to be unreasonable attacks. The increase in mortality with fluid boluses was 59% in children with malaria and 43% in children without malaria, and these results were not significantly different from each other. The increase in mortality with fluid boluses was 71% in children with severe anemia and 31% in children without severe anemia, and again these were not significantly different from each other (and note that even if the real increase in mortality in children without severe anemia is lower than in those with it, and the trial simply lacked the power to show this, the point estimate is still a one-third increase in mortality). One of the letter writers points out that the increase in mortality in those without severe anemia was not statistically significant, but this is a completely unfair argument. Once an increase in mortality is seen in the group as a whole, you can always find subgroups that, simply by having fewer patients, show a similar increase in mortality that does not pass a given level of statistical significance.
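The point about subgroups and significance is easy to demonstrate with a little arithmetic. The sketch below uses made-up numbers, not the trial's actual data: a hypothetical trial with 1500 children per arm, and a subgroup one-fifth that size with exactly the same relative risk. The confidence interval for the whole cohort excludes 1.0, while the subgroup's interval, with the identical point estimate, spans it.

```python
import math

def rr_ci(deaths_tx, n_tx, deaths_ctl, n_ctl, z=1.96):
    """Relative risk with a 95% confidence interval (standard log-RR method)."""
    rr = (deaths_tx / n_tx) / (deaths_ctl / n_ctl)
    # Standard error of log(RR) from the four cell counts.
    se = math.sqrt(1/deaths_tx - 1/n_tx + 1/deaths_ctl - 1/n_ctl)
    lo = math.exp(math.log(rr) - z * se)
    hi = math.exp(math.log(rr) + z * se)
    return rr, lo, hi

# Whole (hypothetical) trial: 1500 per arm, 150 vs 110 deaths -> RR ~1.36.
print(rr_ci(150, 1500, 110, 1500))  # CI excludes 1.0: "significant"

# A one-in-five subgroup with the *same* relative risk: 300 per arm.
print(rr_ci(30, 300, 22, 300))      # CI now spans 1.0: "not significant"
```

Nothing about the treatment effect differs between the two calculations; only the sample size does. That is why pointing to a non-significant subgroup, after harm was shown in the whole cohort, proves nothing.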
All who were startled by the results of this trial should be much less convinced of the benefits of fluid resuscitation in compensated sepsis and should be wary of its use. They should be pushing for trials in resource-rich settings, and while awaiting those results should keep the possibility that administering fluids may increase mortality at the front of their minds however they decide to treat septic shock. Reasonable people may still choose to cautiously administer such fluids for compensated shock in resource-rich settings, but they should be anxious every time they do so.
The authors of this trial wrote in the paper of how surprised they were by the results and at one point say:
Although fluid boluses adversely affected the outcome, important survival gains, across all groups, may have resulted from training and implementation of triage, basic life-support measures, and regular observation.
When I was reading the paper, it wasn't until I read those words that I realized the impact of this trial not just on the children but on the researchers. There were around 70 excess deaths in the fluid bolus arms of the trial; 70 children who likely died because of the intervention. We can imagine the effect on parents and families, but this must also be a researcher's worst nightmare. This was an incredibly difficult study to carry out, and those who did so must have thought they were heading down a path that would ultimately save innumerable children's lives in resource-limited settings. Instead, perversely, the intervention killed around 70 children, and those same researchers are likely having trouble sleeping, comforting themselves only with the knowledge that the training from the study may have helped save some children overall. They deserve our honor for a trial well designed and executed, and our sympathy for what must be its emotional toll.
For me, I remember a man in his 30s who was admitted to my service in septic shock near the end of my internship. We gave him innumerable fluid boluses over the course of 12 hours as he inexorably moved towards death. When I think of my hardest moments as a doctor, telling his 13-year-old daughter in the middle of those 12 hours that her father was not doing alright has pretty much been at the top ever since that day. Now, after reading this trial, I can add to that the worry that those repeated fluid boluses didn't merely swell him with edema past the point of recognizability, but may actually have contributed to his death.