This is the third talk in a series that began with Relationship Based Medicine, continued with Beware of Doctors Bearing Gifts and concludes with this talk, which could be called History of a Medical Psychosis, Medical Neoliberalism, Evident versus Evidence Based Medicine, A Lutheran Moment, or Does Objectivity Come from using Chance to Control Bias or Bias to Control Chance?
It is the most important talk I have ever given.
The first lecture was delivered to clinicians in New York with a Q and A afterwards.
The second was delivered to the public in Lethbridge, Alberta, thanks to Jennifer Williams and Dan Johnson, but owing to tech difficulties at the venue (see In Memory of Dexter Johnson), it was difficult to record the Q and A with the public. Suffice to say that between the technical difficulties, the lecture and the Q and A, we were all there for the better part of 3 hours and the discussion was great.
This third lecture was delivered to Aaron Kesselheim’s PORTAL group – Program on Regulation, Therapeutics and Law. There are two versions. The History of a Medical Psychosis was recorded by Bill James the day before in case of glitches – same day as Putin and Biden gave speeches. The second was recorded by Aaron – Faulty Evidence and Moral Hazard.
There are slight differences between them. The text and slides below add some detail to both talks but the tone of voice and gestures in the talks likely convey things not in the text.
Slide 1: Faulty Evidence and Moral Hazard
Welcome to a very conservative talk – based on a belief in the medical model and in evaluating the drugs we use thoroughly.
Slide 2: These quotes are a precis of key points in the deposition of Ian Hudson, Chief Safety Officer of GlaxoSmithKline (GSK) in 2000 in the Tobin v SmithKline trial.
Forty-eight hours after starting Paxil, Don Schell shot his wife, daughter and granddaughter and then himself. Hudson is being asked – Can SSRIs cause suicide?
The jury dismissed Hudson’s Evidence Based Medicine view in favor of Evident Based Medicine and in this civil trial found GSK guilty of negligence that resulted in the death of this family.
Hudson’s view, however, remains ensconced at the top of Britain’s drugs regulator, of which he was later the CEO – as well as at FDA, EMA, TGA, Health Canada, WHO, and Boston institutions like Harvard, MRCT, and Vivli. The advisers to Joe Biden and the Pope will also endorse it and tell their bosses to say – Yes, RCTs are the Way, the Truth and the Light.
Slide 3: Hudson’s views originate 70 years earlier in the work of a strange man – Ronnie Fisher.
Here you see Fisher smoking a pipe. He dismissed the later link between smoking and lung cancer, saying personality types predisposed to both cancer and smoking. Evidence was not Fisher’s strong point.
He had nothing to do with medicine and never ran an RCT. Controlled trials and randomization were there before Fisher and were no big deal, but for no clear reason his book The Design of Experiments transformed what came next.
Fisher ran a thought experiment to characterize expert knowledge. He mentioned randomization as a means to control for any trivial unknown unknowns. Randomization later became semi-mystical.
Fisher’s expert knew parachutes worked, so if we set up two groups, one with parachutes and the other without, we might randomize in case there was someone with webbed feet who might behave differently when falling. Otherwise, we would expect those wearing parachutes to live and those not wearing them to die – unless a chance strong wind lands a person in snow-covered trees.
If randomization eliminated webbing as a factor, the only thing that could get in the way of an expert being right was chance and this could be assigned a statistically significant value. If 1 in 20 of those without parachutes lived we wouldn’t say the expert didn’t know what he was talking about. Fisher was characterizing expertise rather than characterizing an exploration of the unknown.
Randomization can’t control for ignorance.
Slide 4: Fisher’s expert is a Robin Hood who 19 times out of 20 can split a prior arrow lodged in the Bull.
Slide 5: But the trials done to license drugs especially antidepressants look more like this. A mismatch on this scale indicates medical RCTs are nothing like what Fisher had in mind.
Slide 6: The first RCT in medicine was a trial of streptomycin for tuberculosis. Tony Hill used randomization as a method of fair allocation – he was not managing mystical confounders. Hill helped put the effects of smoking on the map. He had no time for Fisher. He also knew doctors were not experts. His trial was not a demonstration of expertise.
Hill’s RCT found out less about streptomycin than a prior non-randomized trial in the Mayo Clinic, which showed that it can cause deafness and that tolerance develops rapidly.
Slide 7: Twenty years later, here is Tony Hill taking stock of controlled trials. In this 1965 lecture, he mentions that it is interesting that the people now most heavily promoting controlled trials are pharmaceutical companies.
Hill didn’t think trials had to be randomized. He thought double-blinds could get in the way of doctors evaluating a drug. He was a believer in Evident Based rather than Evidence Based Medicine.
Hill said we needed RCTs around 1950 to work out if anything worked. By 1960 he figured we had lots of things that worked – none of which had been brought on the market through an RCT – and he thought the need was to find out which drug worked best. This is not something RCTs can do – there is no such thing as a best drug. RCTs have instead become a way for companies to get weaker drugs on the market.
He said that RCTs produce average effects which are not much good in telling a doctor what to do for the patient in front of them.
All drugs do 3000 + things – one of which might be useful for treatment purposes. In focusing on one element, by default, Hill is saying RCTs are not a good way to evaluate a drug. All RCTs generate ignorance. But we can bring good out of this harm if we remain on top of what we are doing. Hill never saw RCTs replacing clinical judgement.
Slide 8: This 1960 RCT run by Louis Lasagna makes Hill’s point well. Thalidomide has therapeutic efficacy as a sleeping pill but the trial missed the SSRI-like sexual dysfunction, suicidality, agitation, nausea and peripheral neuropathy it causes.
Two years later, Lasagna was responsible for incorporating RCTs in the 1962 Food and Drugs Act Amendments – in order to minimise the chance of another thalidomide. By doing this, more than anyone else, Lasagna was the man who got us using RCTs.
This trial would have licensed thalidomide today. The 1938 Act had no requirement for RCTs.
Slide 9: Many claim RCTs demonstrate cause and effect in a way no other study design can.
The 1950s was a golden age of new drugs that gave us the best antihypertensives, hypoglycemics, antibiotics and psychotropic drugs we have ever had without RCT input into any discoveries.
Imipramine was the first antidepressant. It and other antidepressants beat SSRIs in later RCTs. It can treat melancholia – SSRIs can’t. Melancholia comes with a high risk of suicide.
Imipramine was launched in 1958. At a meeting in 1959, European experts made clear that while it was a wonderful treatment imipramine made some people suicidal. Stop the drug and it clears. Re-introduce and it comes back. This was Evident Based Medicine showing this drug can cause suicide.
Like Fisher, let’s do a thought RCT of imipramine versus placebo in melancholia. Even though it can cause suicide, we would expect it to reduce the number of suicides because it treats this high risk condition. If you didn’t know better, this RCT would look like evidence antidepressants do not cause suicide.
Slide 10: Here is the data on the trials in mild depression that brought the SSRIs to market – mild depression because SSRIs are no use in melancholia. You see an increase of suicidal events compared to placebo in people at little or no risk of suicide.
Slide 11: This is what the data for imipramine look like in the same mild depressions. This is not a thought experiment – it was used as a comparator in SSRI trials. Now it too causes suicides.
RCTs can give us diametrically opposite answers. This is because these are not Drug Trials. They are Treatment Trials and if the condition and treatment produce superficially similar effects, randomized trials cause confounding rather than solve it. This is true for most medical conditions and their treatments.
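The flip between Slides 9, 10 and 11 can be sketched as simple arithmetic. The Python below is an illustration only: the function name expected_events and every rate in it are hypothetical numbers chosen to show the mechanism, not figures from any trial. It assumes the drug removes a share of the risk that comes from the condition and adds a small risk of its own, whatever the condition.

```python
def expected_events(n, condition_risk, relief_fraction, drug_risk):
    """Expected suicidal events in each arm of a trial with n patients per arm.

    condition_risk  : risk that comes from the illness itself (placebo arm)
    relief_fraction : share of that illness-driven risk the drug removes
    drug_risk       : extra risk the drug adds on its own account
    All inputs are hypothetical, for illustration only.
    """
    placebo = n * condition_risk
    drug = n * (condition_risk * (1 - relief_fraction) + drug_risk)
    return placebo, drug


# Hypothetical settings: same drug, same drug-induced risk, different illness.
MELANCHOLIA = dict(condition_risk=0.05, relief_fraction=0.6, drug_risk=0.01)
MILD_DEPRESSION = dict(condition_risk=0.002, relief_fraction=0.6, drug_risk=0.01)

if __name__ == "__main__":
    for label, params in [("melancholia", MELANCHOLIA),
                          ("mild depression", MILD_DEPRESSION)]:
        placebo, drug = expected_events(1000, **params)
        print(f"{label}: placebo {placebo:.1f}, drug {drug:.1f} events per 1000")
```

With these made-up inputs the same drug, carrying the same drug-induced risk, shows fewer events than placebo in the high risk group and more events than placebo in the low risk group. The trial result depends on the condition treated, not only on what the drug does.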
People evaluating drugs in traditional clinical trials, before RCTs, knew this. When a patient becomes suicidal in a trial you have to use your judgement to work out what is happening, but in RCTs clinicians are not supposed to use their judgement. RCTs are more objective than our judgements – supposedly.
Slide 12: Here is what a Drug Trial looks like. In healthy volunteer studies in the 1980s, companies found SSRIs cause volunteers to become suicidal, dependent and sexually dysfunctional. We heard nothing about these problems when the drugs launched in part because Drug Trials enabled companies to engineer Treatment Trials to hide these problems.
Slide 13: If you break a limb and get recruited to an RCT randomly applying casts to one limb – not necessarily the broken one – the trial will show random application beats placebo. Practicing Evidence Based Medicine rather than Evident Based Medicine here would clearly be crazy.
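The cast example can be worked through as a short calculation. This is a sketch under stated assumptions: four limbs, a cast placed on one limb at random, and healing rates of 9 in 10 with a cast on the broken limb and 1 in 2 without. The function name healing_rates and all of these numbers are hypothetical.

```python
from fractions import Fraction


def healing_rates(limbs=4,
                  heal_with_cast=Fraction(9, 10),
                  heal_without=Fraction(1, 2)):
    """Average healing rate per arm; all default values are hypothetical."""
    # Chance that a randomly placed cast lands on the broken limb.
    hit = Fraction(1, limbs)
    # The random arm is a mix of correctly cast and effectively uncast patients.
    random_cast = hit * heal_with_cast + (1 - hit) * heal_without
    return {
        "placebo": heal_without,
        "random_cast": random_cast,
        "cast_on_broken_limb": heal_with_cast,
    }


if __name__ == "__main__":
    for arm, rate in healing_rates().items():
        print(f"{arm}: {float(rate):.2f}")
```

On these assumptions the random arm beats placebo on average, so the trial is positive, yet it does far worse than simply looking at the patient and putting the cast on the limb that is evidently broken.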
Slide 14: Here is a James Webb telescope image. James Webb is marvellously bringing out the infinite individuality of stars.
In addition to randomization, Fisher put a premium on Statistical Significance. By 1980 every leading medical statistician was saying we need to get rid of statistical significance in favor of Confidence Intervals.
Confidence Intervals had been introduced by Gauss around 1810. Because of measurement error, the telescopes in use often failed to establish whether there were one or two stars in a location. Measurement errors should distribute normally and so constructing confidence intervals could help us distinguish individual stars.
We have moved a long way forward in this respect with the James Webb telescope you see here.
Slide 15: Confidence intervals rushed into medicine in the mid-1980s. All the authorities on the right – many linked to Boston – argued they were much more appropriate than significance testing. They are appropriate for measurement error but are they any more a cure for ignorance than statistical significance?
Slide 16: Confidence intervals, we are told, allow us to estimate the size of an effect and the precision with which it is known. We have much more precise details on the likelihood of the Red Drug here killing you than we have for the Yellow Drug. The best estimate of the lethal effect for the Yellow Drug, however, is greater. The standard view is that if we increase the size of the Yellow Drug Trial we will have greater precision and know better what the risks are. As we shall see, this is wrong.
As things stand, if you are asked to take one of these drugs, should you be guided by precision or effect size? Ian Hudson, FDA and WHO say the only dangerous drug here is the Red One. This is because more than 95% of the data, more than 19 out of 20 lie to the right of the line through 1.0 – confidence intervals have defaulted into statistical significance.
I would take the Red rather than the Yellow one. This is not measurement error and we don’t know what confidence intervals represent when they are not representing measurement error.
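The Red and Yellow comparison can be sketched numerically. The counts below are invented for illustration and are not the numbers on the slide. The sketch assumes the slide plots a risk ratio with 1.0 as the line of no effect, and it uses the usual large-sample interval on the log risk ratio. The function name risk_ratio_ci is hypothetical.

```python
import math


def risk_ratio_ci(events_drug, n_drug, events_placebo, n_placebo, z=1.96):
    """Risk ratio with an approximate 95% interval (log method).

    Returns (risk_ratio, lower, upper). Requires non-zero event counts.
    """
    rr = (events_drug / n_drug) / (events_placebo / n_placebo)
    # Standard error of log(rr) for two independent binomial arms.
    se = math.sqrt(1 / events_drug - 1 / n_drug
                   + 1 / events_placebo - 1 / n_placebo)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper


# Hypothetical counts: (deaths on drug, n on drug, deaths on placebo, n on placebo)
RED = (150, 10000, 100, 10000)   # large trial, modest effect
YELLOW = (9, 300, 3, 300)        # small trial, larger estimated effect

if __name__ == "__main__":
    for name, counts in [("Red", RED), ("Yellow", YELLOW)]:
        rr, lower, upper = risk_ratio_ci(*counts)
        print(f"{name}: risk ratio {rr:.2f}, interval {lower:.2f} to {upper:.2f}")
```

With these invented counts the Red interval is narrow and sits wholly to the right of 1.0, while the Yellow interval is wide and crosses 1.0 even though its best estimate is the larger of the two. Read as a significance test, only Red counts as dangerous. Read as an effect size, Yellow looks worse.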
Slide 17: Faced with claims Prozac causes suicide, Lilly analysed their clinical trials and claimed there is no evidence their drug causes suicide. Confidence Intervals are being spun here as indicating we don’t know Prozac causes suicide as nothing is statistically significant. This is Ian Hudson thinking – at odds with all statistical expertise. It’s wrong. The consistency across young and old, depression and eating disorders strongly suggests that in real life there is an excess of suicidal events.
Slide 18: There is an intriguing mystery behind these figures. Here you see a representation of the suicidal events that happened in the trials that brought Prozac, Paxil and Zoloft to market around 1990. You’ll note there are events under the word Screening here. There is a 2-week washout period before a trial starts, in which people are whipped off their prior drugs before being put on the new treatment or placebo. This is a highly dangerous phase where people are in withdrawal and very likely to go on to a suicide attempt.
Slide 19: And here you see the moves companies made..