No Significant Results? Try a Different Research Measure

As I’ve discussed previously, drug-gene testing, also referred to as pharmacogenomics or pharmacogenetics, doesn’t really work yet for psychiatric drugs and disorders. People are buying a promise that’s not backed up by the research.

Recently, one company in this space published a follow-up study to a large outpatient study of patients with clinical depression. Since the first study did not show statistical significance on its primary outcome measure, the company decided to simply re-crunch the data with another measure. Voila! Significance found.

In early 2019, Myriad Genetics, maker of the GeneSight Psychotropic test, had the results of a study it funded published (Greden et al., 2019). This is referred to as the GUIDED study (Genomics Used to Improve DEpression Decisions).

The primary measure used in that study, the Hamilton Depression Rating Scale-17 (HAM-D17), showed no statistically significant difference between a group of patients whose treatment was guided by the drug-gene test and a group that had treatment as usual. This scale is commonly used in depression drug trials as a “gold standard” for measuring the effectiveness of depression treatment.

The difference in symptom improvement scores between the two groups was 2.8%, with the “guided care” (the drug-gene testing) group experiencing slightly better symptom improvement. This difference, however, was not statistically significant.

The study also found that the guided-care group experienced significant improvements in response and remission rates.

I guess the lack of statistical significance on the HAM-D17 was bothersome to the company, as it undermines their marketing message about the superiority of their drug-gene test. After all, the HAM-D17 was listed as the only primary outcome measure in the Clinical Trials database. Since that outcome measure did not show statistical — much less clinical — significance, that suggested the GeneSight test perhaps wasn’t as helpful as the company claimed.

Twenty-five additional secondary measures were also listed. The ones actually reported in the study showed mixed statistical significance for the guided-care group.

Let’s “Re-Analyze”!

So the company decided to re-evaluate the data from the GUIDED study by looking at another measure: the HAM-D6. As you can likely guess, the HAM-D6 is a subset of the HAM-D17, consisting of just 6 of the 17 questions found on the longer measure. The HAM-D6 was developed to cut down on the time needed to administer the test. It also purports to more closely measure the symptoms related to the DSM-IV diagnostic criteria for clinical depression; that is, it is meant to be more sensitive in detecting the depressive symptoms used in diagnosis.
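To make the subset relationship concrete, here is a minimal Python sketch of how a HAM-D6 score can be pulled out of item-level HAM-D17 data. The item numbers (1, 2, 7, 8, 10, 13) follow the HAM-D6 subscale as it is usually described; the patient scores, function names, and visit labels are hypothetical and made up purely for illustration. None of this is taken from the GUIDED data or the researchers' actual analysis.

```python
# HAM-D17 item numbers usually listed as making up the HAM-D6:
# depressed mood, guilt, work and activities, psychomotor retardation,
# psychic anxiety, and general somatic symptoms.
HAMD6_ITEMS = (1, 2, 7, 8, 10, 13)


def hamd17_total(item_scores):
    """Sum all 17 item scores (dict of item number -> score)."""
    return sum(item_scores[i] for i in range(1, 18))


def hamd6_total(item_scores):
    """Sum only the six items that belong to the HAM-D6 subscale."""
    return sum(item_scores[i] for i in HAMD6_ITEMS)


def percent_improvement(baseline, followup):
    """Percent drop in score from baseline to follow-up."""
    return 100.0 * (baseline - followup) / baseline


# Hypothetical single patient (NOT real study data).
baseline = {1: 3, 2: 2, 3: 1, 4: 2, 5: 1, 6: 1, 7: 3, 8: 1, 9: 1,
            10: 2, 11: 2, 12: 1, 13: 2, 14: 1, 15: 1, 16: 1, 17: 0}
week_8 = {1: 2, 2: 1, 3: 0, 4: 2, 5: 1, 6: 1, 7: 2, 8: 0, 9: 1,
          10: 1, 11: 2, 12: 1, 13: 1, 14: 1, 15: 1, 16: 1, 17: 0}

for name, scorer in (("HAM-D17", hamd17_total), ("HAM-D6", hamd6_total)):
    before, after = scorer(baseline), scorer(week_8)
    change = percent_improvement(before, after)
    print(f"{name}: {before} -> {after} ({change:.1f}% improvement)")
```

The point of the sketch is only that the same set of answers from the same patient can yield a different percent improvement depending on which items are counted. No new data collection is needed to switch from one measure to the other.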

This reanalysis could be done since they had all the data from the HAM-D17. All they had to do was just look at those 6 questions used on the shorter measure to see what they might find. Here’s what the lead researcher of the new study claims in the company’s press release:

“The HAM-D6 scale has been shown to be a better measure of core depressive symptoms than the HAM-D17 scale,” said Boadie W. Dunlop, M.D., one of the study investigators and associate professor of Psychiatry and Behavioral Sciences at Emory University School of Medicine.

“This post hoc analysis provides further evidence that the GeneSight test led to significant and clinically meaningful improvements in clinical outcomes for patients with major depressive disorder relative to treatment-as-usual care.”

Now, honestly, this is just BS. If the original study had found statistical significance with the HAM-D17, there’s no way the same set of researchers would then go on to conduct what amounts to a big ol’ fishing expedition, in my opinion. In fact, it raises the obvious question: if the HAM-D6 is such a superior measure, why wasn’t it used (even as a secondary measure) in the original study?
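Why does a fishing expedition matter statistically? The more measures you test, the more likely at least one of them crosses the significance threshold by chance alone. Here is a rough Python sketch of that arithmetic. It assumes the conventional 0.05 threshold and, unrealistically, that the measures are independent of one another (depression scales that share items certainly are not), so treat the numbers as an illustration of the general problem rather than an estimate for this study. The function name is my own.

```python
def chance_of_false_positive(num_measures, alpha=0.05):
    """Chance that at least one of `num_measures` independent tests comes up
    'significant' at level `alpha` when there is no real effect at all."""
    return 1.0 - (1.0 - alpha) ** num_measures


# 1 test, a handful of tests, and 25 tests (the number of secondary
# measures registered for the trial).
for n in (1, 5, 25):
    print(f"{n:>2} measures: {chance_of_false_positive(n):.1%}")
```

Under those simplifying assumptions, testing 25 measures gives roughly a 72% chance of at least one spurious "significant" result. That is why outcome measures are supposed to be chosen before the data come in, not after.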

The new study found that patients in the guided-care group experienced a 4.4% greater symptom improvement than the treatment-as-usual group. Voila again!

Since that difference is statistically significant, it now allows the researchers to claim that the GeneSight test is superior to treatment as usual according to a widely accepted depression measure. The researchers nicely milked that 1.6-percentage-point difference between the two studies (2.8% versus 4.4%), the amount apparently needed to claim statistical significance.

Does It Matter to Patients Clinically?

Researchers can babble all day long about data and statistical significance. It means little to most people. And it’s no wonder, because statistical significance in the data doesn’t automatically translate into clinical significance in a doctor’s office.

In short, do patients subjectively feel that 4.4% difference in symptom improvement in their lives?

Arguably, the answer in this case is a firm “maybe.” The response and remission rates found in the study speak more strongly to the possible impact that the guided-care arm had in treatment, since those in that group seemed to have a quicker response to the treatment they were prescribed, and were able to keep the depression symptoms at bay more often than those in standard care.

But in terms of the actual subjective feeling of symptom improvement, I believe the results are decidedly less clear. I don’t believe that most patients would experience much of a subjective difference in their symptoms in the guided-care group versus the treatment-as-usual group.

Keep in mind that both groups studied had fewer depression symptoms over time. It’s just that in the GeneSight group, those patients reported a slightly greater improvement in their symptoms.

If Myriad Genetics was looking for a grand slam in terms of evidence clearly demonstrating the efficacy of their drug-gene test, I don’t think they found it in either of these studies. What the studies demonstrate instead, in my opinion, is a slightly better outcome for some patients who take the GeneSight test. It is not an outcome that I believe to be clinically significant, nor one that justifies the widespread use of any GeneSight test for psychiatric disorders at this time.



Bech, P. (2006). Rating scales in depression: limitations and pitfalls. Dialogues in Clinical Neuroscience, 8(2), 207-215.

Dunlop BW, Parikh SV, Rothschild AJ, Thase ME, DeBattista C, Conway CR, Forester BP, Mondimore FM, Shelton RC, Macaluso M, Logan J, Traxler P, Li J, Johnson H, Greden JF. (2019). Comparing sensitivity to change using the 6-item versus the 17-item Hamilton depression rating scale in the GUIDED randomized controlled trial. BMC Psychiatry, 19(1):420. doi: 10.1186/s12888-019-2410-2.

Greden JF, Parikh SV, Rothschild AJ, Thase ME, Dunlop BW, DeBattista C, Conway CR, Forester BP, Mondimore FM, Shelton RC, Macaluso M, Li J, Brown K, Gilbert A, Burns L, Jablonski MR, Dechairo B. (2019). Impact of pharmacogenomics on clinical outcomes in major depressive disorder in the GUIDED trial: A large, patient- and rater-blinded, randomized, controlled study. J Psychiatr Res., 111:59-67. doi: 10.1016/j.jpsychires.2019.01.003. Epub 2019 Jan 4.

Thase ME, Parikh SV, Rothschild AJ, Dunlop BW, DeBattista C, Conway CR, Forester BP, Mondimore FM, Shelton RC, Macaluso M, Li J, Brown K, Jablonski MR, Greden JF. (2019). Impact of Pharmacogenomics on Clinical Outcomes for Patients Taking Medications With Gene-Drug Interactions in a Randomized Controlled Trial. J Clin Psychiatry, 80(6). pii: 19m12910. doi: 10.4088/JCP.19m12910.
