"Red herring" surrogate endpoints
Endpoints are the outcomes that are measured during clinical trials. Ideally trials should measure outcomes that are clinically important (important for patients) e.g. the impact of therapy on mortality or aspects of morbidity such as pain, functional status, ability to work or satisfaction with treatment.
Surrogate endpoints are phenomena that are believed to predict clinically important outcomes and are measured in their place. They are quicker to measure than clinically important endpoints. With an ideal surrogate endpoint, changes in the surrogate reliably predict changes in the clinically important endpoint. Unfortunately, many surrogate endpoints turn out to be misleading "red herrings". For example, flecainide was believed to be beneficial because it reduced arrhythmias (a surrogate endpoint). However, the CAST trial found that flecainide increased the death rate (a clinically important endpoint).
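The gap between a surrogate endpoint and a clinically important endpoint can be made concrete with a little arithmetic. The sketch below compares relative risks for both kinds of endpoint in a two-arm trial. All counts are invented for illustration and are not taken from CAST or any other real trial; the names `Arm` and `relative_risk` are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Arm:
    """Outcome counts for one arm of a hypothetical two-arm trial."""
    patients: int
    surrogate_events: int  # e.g. patients who still have frequent arrhythmias
    deaths: int            # the clinically important endpoint


def relative_risk(treated_events, treated_n, control_events, control_n):
    """Risk in the treated arm divided by risk in the control arm."""
    return (treated_events / treated_n) / (control_events / control_n)


# Invented numbers for illustration only; NOT data from any real trial.
drug = Arm(patients=1000, surrogate_events=100, deaths=60)
placebo = Arm(patients=1000, surrogate_events=400, deaths=30)

rr_surrogate = relative_risk(drug.surrogate_events, drug.patients,
                             placebo.surrogate_events, placebo.patients)
rr_death = relative_risk(drug.deaths, drug.patients,
                         placebo.deaths, placebo.patients)

# A relative risk below 1 favours the drug; above 1 favours placebo.
print(f"Relative risk, surrogate endpoint: {rr_surrogate:.2f}")
print(f"Relative risk, death: {rr_death:.2f}")
```

With these made-up counts the drug looks excellent on the surrogate (relative risk 0.25) while doubling the risk of death (relative risk 2.00), which is the pattern a "red herring" surrogate can hide.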
"Red herring" surrogate endpoints are surrogate endpoints which are believed to predict clinically important endpoints but do not.
"Red Herring (Non Sequitur)
Legend has it that a red herring was used when dogs were being trained to follow a criminal's scent. The fish was dragged over the scent of the "criminal" and the dogs were then taught to ignore this fresher, perhaps more interesting, scent and stay with the old scent. The red herring fallacy does the same thing: it presents facts that, although interesting, are not related to the conclusion.
Several years ago, an advertisement stated that cimetidine "reduces potential risks of prolonged acid suppression, reduces potential for acetaminophen toxicity, and (has a) lower risk of hepatotoxicity than ranitidine". On the surface, cimetidine appears to be safer than ranitidine. On further analysis, the facts, although perhaps true, have little bearing on the clinical decision to use one agent over the other. The first claim states that cimetidine doesn't decrease gastric acid secretion for as long as does ranitidine, and this shorter duration of action is somehow better. However, the clinical detriment to prolonged acid suppression has not been demonstrated. Though cimetidine theoretically may decrease the metabolism of acetaminophen to toxic agents, this action is of little practical use since an overdose of acetaminophen occurring in an individual already taking either cimetidine or ranitidine is extremely small. The third claim of lower hepatotoxicity is also irrelevant, since the risk with either agent is extremely small.
In all three appeals in this argument, the facts may be true but are irrelevant. This promotional campaign incurred the wrath of the FDA's Division of Drug Advertising and Labeling, but only after the claims were brought to the FDA's attention by the maker of ranitidine.
Red herring appeals usually occur when little difference exists among the available choices of therapy. This appeal preys upon our tendency toward wishful thinking."
- Shaughnessy et al (1994)
The example of Hypertension
The aim of therapy for people who have hypertension is to delay premature morbidity and mortality from complications. The complications include cerebral or coronary artery disease, heart failure, aortic aneurysm, and microvascular disease of the brain, kidney and retina.
Consequently, when evaluating antihypertensive therapies we need to focus on evidence for morbidity and mortality benefits.
Drug companies are allowed to promote drugs for hypertension if they lower blood pressure. Lowering blood pressure may, or may not, affect the clinically important endpoints: morbidity and mortality from complications. Lower blood pressure is used as a surrogate for the endpoints that matter.
Thiazides have been shown to reduce mortality. Calcium Channel Blockers remain unproven for reducing mortality due to uncomplicated hypertension. Consequently, promoters of Calcium Channel Blockers often use surrogate endpoints in their advertising.
The following examples show how surrogate endpoints may be misleading:
Flecainide
"Sometimes relying on 'surrogate endpoints' to measure a drug's effectiveness can be misleading and downright dangerous. Which is exactly what happened with the deadly drug flecainide. Designed to prevent cardiac arrests, it was widely prescribed during the 1980s for irregular heart beats long before it was properly tested.
As with the new drugs for high blood pressure, surrogate endpoints had been used to measure flecainide's risks and benefits. While the drug had been shown to settle irregular heart beats, no evidence existed from long-term trials showing the drug could actually prevent cardiac arrests and death. It was assumed that if the drug reduced irregular heart beats in people then it must prevent heart attacks. But this assumption was fatally flawed.
After a number of years, questions were raised about how the drug was actually affecting patients, and top cardiologists in the United States belatedly set up a large trial to run at 27 centres around the country. As soon as results started to flow in, however, a horrifying picture emerged: the drug was in fact causing the very same problems it was designed to prevent. The trial called the Cardiac Arrhythmia Suppression Trial, or CAST, had found that rather than save lives, two drugs in particular - flecainide and encainide - were actually leading to an increased risk of death in those taking them by causing the cardiac arrests the drugs were supposed to prevent. The large trial was stopped prematurely and the use of these drugs curtailed.
In 1995, US health policy researcher and investigative journalist Thomas J. Moore wrote about the extraordinary case in a book called Deadly Medicine: Why tens of thousands of heart patients died in America's worst drug disaster. His meticulous research showed that hundreds of thousands of people were taking flecainide and drugs like it before they were properly tested. The book revealed that an estimated 50,000 patients died from taking these drugs.
The disaster sent a warning to the medical profession around the world about the dangers of relying on 'surrogate endpoints' rather than properly assessing a new drug's actual impact on the health of those taking it."
- Moynihan (1998)
Mibefradil
"Advertisements for mibefradil suggest that there was little modesty in the promotion of the drug, which was claimed to be a first-line therapeutic agent for hypertension and angina. Given the absence of long-term studies such as those that have established thiazide diuretics and β-blockers as truly first-line therapy for hypertension, objective observers would classify mibefradil as a second-line agent. ...
Pharmacological plausibility plays a large part in the licensing of drugs, as is evidenced by the wide use of surrogate outcomes. Perhaps it is time to be less enamoured by pharmacological rationale and to accept that the true value of a drug and whether it should be a first-line agent can be assessed only through properly conducted randomised controlled trials with hard outcomes."
- Li Wan Po, Zhang (1998)
More about surrogate endpoints
"In a veiled slight on surrogate end points, Sackett and his team remind us that the choice of specific treatment should be determined by evidence of what does work and not on what seems to work or ought to work. "Today's therapy", they warn, "when derived from biologic facts or uncontrolled clinical experience, may become tomorrow's bad joke."
If you are a practising (and non-academic) clinician, your main contact with published papers may well be through what gets fed to you by a "drug rep". The pharmaceutical industry is a slick player at the surrogate end point game, and I make no apology for labouring the point that such outcome measures must be evaluated very carefully.
I will define a surrogate end point as "a variable that is relatively easily measured and that predicts a rare or distant outcome of either a toxic stimulus (for example, pollutant) or a therapeutic intervention (for example, surgical procedure, piece of advice), but which is not itself a direct measure of either harm or clinical benefit". The growing interest in surrogate end points in medical research reflects two important features of their use:
In the evaluation of pharmaceutical products, commonly used surrogate end points include:
Surrogate end points have several drawbacks. Firstly, a change in the surrogate end point does not itself answer the essential preliminary questions: "What is the objective of treatment in this patient?" and "What, according to valid and reliable research studies, is the best available treatment for this condition?" Secondly, the surrogate end point may not closely reflect the treatment target - in other words, it may not be valid or reliable. Thirdly, the use of a surrogate end point has the same limitations as the use of any other single measure of the success or failure of treatment - it ignores all the other measures! Over-reliance on a single surrogate end point as a measure of therapeutic success usually reflects a narrow or naïve clinical perspective.
Finally, surrogate end points are often developed in animal models of disease so that changes in a specific variable can be measured under controlled conditions in a well defined population. Extrapolation of these findings to human disease, however, is liable to be invalid.
The ideal features of a surrogate end point are shown below. If the "rep" who is trying to persuade you of the value of the drug cannot justify the end points used, you should challenge him or her to produce additional evidence.
Ideal features of a surrogate end point
- The surrogate end point should be reliable, reproducible, clinically available, easily quantifiable, affordable, and exhibit a "dose-response" effect (that is, the higher the level of the surrogate end point, the greater the probability of disease).
- It should be a true predictor of disease (or risk of disease) and not merely express exposure to a covariable. The relation between the surrogate end point and the disease should have a biologically plausible explanation.
- It should be sensitive - that is, a "positive" result for the surrogate end point should pick up all or most patients at increased risk of adverse outcome.
- It should be specific - that is, a "negative" result should exclude all or most of those without increased risk of adverse outcome.
- There should be a precise cut off between normal and abnormal values.
- It should have an acceptable positive predictive value - that is, a "positive" result should always or usually mean that the patient thus identified is at increased risk of adverse outcome.
- It should have an acceptable negative predictive value - that is, a "negative" result should always or usually mean that the patient thus identified is not at increased risk of adverse outcome.
- It should be amenable to quality control monitoring.
- Changes in the surrogate end point should rapidly and accurately reflect the response to treatment - in particular, levels should normalise in states of remission or cure.
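Four of the features listed above (sensitivity, specificity, positive predictive value and negative predictive value) can be computed from a simple two-by-two table of surrogate result against eventual outcome. The following is a minimal sketch using the standard textbook definitions; the function name `surrogate_accuracy` and all of the counts are invented for illustration and do not come from the quoted text or from any real study.

```python
def surrogate_accuracy(tp, fp, fn, tn):
    """Return four accuracy measures for a 2x2 table.

    tp: surrogate "positive", patient did have the adverse outcome
    fp: surrogate "positive", patient did not have the adverse outcome
    fn: surrogate "negative", patient did have the adverse outcome
    tn: surrogate "negative", patient did not have the adverse outcome
    """
    return {
        # Share of patients with the adverse outcome whom the surrogate flagged
        "sensitivity": tp / (tp + fn),
        # Share of patients without the adverse outcome whom the surrogate cleared
        "specificity": tn / (tn + fp),
        # Share of "positive" results that were followed by the adverse outcome
        "positive_predictive_value": tp / (tp + fp),
        # Share of "negative" results that were not followed by the adverse outcome
        "negative_predictive_value": tn / (tn + fn),
    }


# Invented counts for illustration only; NOT data from any real study.
result = surrogate_accuracy(tp=80, fp=120, fn=20, tn=780)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

In this made-up table the surrogate is reasonably sensitive (0.80) and specific (about 0.87), yet only 40% of "positive" results are followed by the adverse outcome, because the outcome is uncommon. A surrogate can therefore look respectable on one measure and still fall short on another.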
One important example of the invalid use of a surrogate end point is the CD4 cell count (a measure of one type of white blood cell which, when I was at medical school, was known as the "T helper cell") in monitoring progression to AIDS in HIV positive subjects. The CONCORDE trial was a randomised controlled trial comparing early versus late initiation of zidovudine treatment in patients who were HIV positive but clinically asymptomatic. Previous studies had shown that early initiation of treatment led to a slower decline in the CD4 cell count (a variable that had been shown to fall with the progression of AIDS), and it was assumed that a higher CD4 cell count would reflect improved chances of survival.
The CONCORDE trial, however, showed that while CD4 cell counts fell more slowly in the treatment group, the survival rates at three years were identical in the two groups. This experience confirmed a warning issued earlier by authors suspicious of the validity of this end point. Subsequent research has attempted to identify a surrogate end point that correlates with real therapeutic benefit - that is, progression of asymptomatic HIV infection to clinical AIDS and survival time after the onset of AIDS. Using multiple regression analysis, investigators in the United States found that a combination of several markers (proportion of CD4 C29 cells, degree of fatigue, age, and haemoglobin concentration) was the best predictor of progression.
If you think this is an isolated example of the world's best scientists all barking up the wrong tree in pursuit of a bogus end point, check out the literature on the use of ventricular premature beats (a minor irregularity of the heartbeat) to predict death from serious heart rhythm disturbance, blood concentrations of antibiotics to predict clinical cure of infection, or plaques on magnetic resonance imaging to chart the progression of multiple sclerosis. You might also like to see the fascinating literature on the development of valid and relevant surrogate end points in the important field of cancer prevention.
Clinicians are increasingly sceptical of arguments for using new drugs, or old drugs in new indications, that are not justified by direct evidence of effectiveness. Before surrogate end points can be used in the marketing of pharmaceuticals, those in the industry must justify the utility of these measures by demonstrating a plausible and consistent link between the end point and the development or progression of disease.
It would be wrong to suggest that the pharmaceutical industry develops surrogate end points with the deliberate intention to mislead the licensing authorities and health professionals. Surrogate end points have both ethical and economic imperatives. The industry does, however, have a vested interest in overstating its case on the strength of these end points."
- Greenhalgh (1997)
References
Greenhalgh T. How to read a paper: the basics of evidence based medicine. BMJ Publishing. London 1997;90-4.
Li Wan Po A, Zhang WY. What lessons can be learnt from withdrawal of mibefradil from the market? Lancet 1998;351(9119):1829-30 (20 June).
Moynihan R. Too much medicine? ABC Books. Sydney 1998;56-7.
Roland M, Torgerson D. Understanding controlled trials: What outcomes should be measured? BMJ 1998;317:1075-1080 (17 October).
Shaughnessy AF, Slawson DC, Bennett JH. Separating the wheat from the chaff: identifying fallacies in pharmaceutical promotion. J Gen Intern Med 1994;9:563-8.