Daniel Simons: July 2013

Tuesday, July 9, 2013

Pop Quiz - What can we learn from an intervention study?

Pop Quiz

1. Why is a double-blind, placebo-controlled study with random assignment to conditions the gold standard for testing the effectiveness of a treatment?

2. If participants are not blind to their condition and know the nature of their treatment, what problems does that lack of control introduce?

3. Would you use a drug if the only study showing that it was effective used a design in which those people who were given the drug knew that they were taking the treatment and those who were not given the drug knew they were not receiving the treatment? If not, why not?

Stop reading now, and think about your answers.

Most people who have taken a research methods class (or introductory psychology) will be able to answer all three. The gold standard controls for participant and experimenter expectations and helps to control for unwanted variation between the people in each group. If participants know their treatment, then their beliefs and expectations might affect the outcome. I would hope that you wouldn't trust a drug tested without a double-blind design. Without such a design, any improvement by the treatment group need not have resulted from the drug.

In a paper out today in Perspectives on Psychological Science, my colleagues (Walter Boot, Cary Stothart, and Cassie Stutts) and I note that psychology interventions typically cannot blind participants to the nature of the intervention—you know what's in your "pill." If you spend 30 hours playing an action video game, you know which game you're playing. If you are receiving treatment for depression, you know what is involved in your treatment. Such studies almost never confront the issues introduced by the lack of blinding to conditions, and most make claims about the effectiveness of their interventions when the design does not permit that sort of inference. Here is the problem:

If participants know the treatment they are receiving, they may form expectations about how that treatment will affect their performance on the outcome measures. And, participants in the control condition might form different expectations. If so, any difference between the two groups might result from the consequences of those expectations (e.g., arousal, motivation, demand characteristics, etc.) rather than from the treatment itself.

A truly double blind design addresses that problem—if people don't know whether they are receiving the treatment or the placebo, their expectations won't differ. Without a double blind design, researchers have an obligation to use other means to control for differential expectations. If they don't, then a bigger improvement in the treatment group tells you nothing conclusive about the effectiveness of the treatment. Any improvement could be due to the treatment, to different expectations, or to some combination of the two. No causal claims about the effectiveness of the treatment are justified.

If we wouldn't trust the effectiveness of a new drug when the only study testing it lacked a control for placebo effects, why should we believe a psychology intervention if it lacked any controls for differential expectations? Yet, almost all published psychology interventions attribute causal potency to interventions that lack such controls. Authors seem to ignore this known problem, reviewers don't block publication of such papers, and editors don't reject them.

Most psychology interventions have deeper problems than just a lack of controls for differential expectations. Many do not include a control group that is matched to the treatment group on everything other than the hypothesized critical ingredient of the treatment. Without such matching, any difference between the tasks could contribute to the difference performance. Some psychology interventions use completely different control tasks (e.g., crosswords puzzles as a control for working memory training, educational DVDs a control for auditory memory training, etc). Even worse, some do not even use an active control group, instead comparing performance to a "no-contact" control group that just takes a pre-test and a post-test. Worst of all, some studies use a wait-list control group that doesn't even complete the outcome measures before and after the intervention.

In my view, a psychology intervention that uses a waitlist or no-contact control should not be published. Period. Reviewers and editors should reject it without further consideration -- it tells us almost nothing about whether the treatment had any effect, and is just a pilot study (and a weak one at that).

Studies with active control groups that are not matched to the treatment intervention should be viewed as suspect—we have no idea what differences between the treatment and control condition were necessary. Even closely matched control groups do not permit causal claims if the study did nothing to check for differential expectations.

To make it easy to understand these shortcomings, here is a flow chart from our paper that illustrates when causal conclusions are merited and what we can learn from studies with weaker control conditions (short answer -- not much):

Figure illustrating the appropriate conclusions as a function of the intervention design

Almost no psychology interventions even fall into that lower-right box, but almost all of them make causal claims anyway. That needs to stop.

If you want to read more, check out our OpenScienceFramework Page for this paper/project. It includes an answers to a set of Frequent Questions.

Tuesday, July 2, 2013

Six simple steps scientists can take to avoid having their work misrepresented

Posted by Daniel J. Simons

Writing a journal article? Here are six simple steps you can take to avoid having your claims misinterpreted and misrepresented. None of these steps require any special analyses or changes to your lab practices. They are steps you should take when writing about your findings. I haven't always followed these steps in my own articles, but I will be in the future (whenever I have final say on a paper or can convince my collaborators).

Do not speculate in your abstract. Abstracts are the place to report what you did, why you did it, and what you found. It is fine to report any conclusion that follows directly from your data. But, you should not use the abstract as a place to make claims that exceed your evidence. For example, even if you think your findings with undergrads in your laboratory might be relevant for a better understanding of autism, your abstract should not mention autism unless you actually studied it. Readers of your abstract (often the only thing people read) will assume that what you said is what you found, and media reports will focus on that your speculation rather than your findings.
Separate planned and exploratory analyses and label them. If you registered your analysis plan and stick to it, you can mark those analyses as planned, documenting that you are testing what you originally intended to test. It is fine to explore your data fully, but you should flag any unplanned analyses as exploratory and note explicitly that they require replication and verification. Your exploratory analyses should be treated as speculative rather than definitive tests.
Combine results and discussion sections. Justify each analysis and explain what it shows in the same place in your manuscript. If you separate your analyses and explanations, non-expert readers will skip your evidence and focus on your conclusions. By combining them, you allow the reader to better evaluate the link between your evidence and your conclusions.
Add a caveats and limitations section. In your general discussion, you should add a description of any limitations of your study. That includes shortcomings of the method, but also limitations to the generalizability of your sample, effects in need of replication, etc. If your effects are small, you should note if and how that limits their practical implications. By identifying limitations and caveats in your paper, your readers will better understanding what your findings do and do not show.
Specify the limits of generalization. Few papers do this, but all of them should. Most papers in psychology test undergraduates and then make claims as if they apply to all of humanity. Perhaps they do, but any generalization beyond the tested population should be justified. If you tested undergraduates and expect your studies to generalize to similar undergraduate populations, you should say so. If you think they also will generalize to the elderly or to children, you should say so and explain why. Spell out the characteristics of your sample that you think are essential to obtain your effect. Specifying generalization has benefits. First, it lets readers know the scope of your effects and helps them to predict whether they could obtain the same result with their own available population. Second, it clarifies the importance of your findings. If you expect that your effects are limited to subjects at your university in December of 2012 and won't generalize to other times or places, then it is less clear that anyone should care. Third, by specifying your generalization, you are making a more precise claim about your effect that others can then test. If you claim your effect should generalize to all undergraduates, then anyone testing undergraduates should be able to find it (assuming adequate statistical power), and if they can't, that undermines your claim. If you restrict generalization too much to protect yourself against challenges, then others will have no reason to bother testing your effect. Perhaps most importantly, if you appropriately limit your generalization in the paper itself, then media coverage will be less likely generalize your claims beyond what you actually intended.
Flag speculation as speculation. If you must discuss implications that go beyond what your data show, explicitly flag those conclusions as speculative and note that they are not supported by your study. By calling speculation what it is, you avoid having others assume that your wildest and most provocative ideas are evidence-based. Speculation is okay as long as everyone reading your paper knows what it is.

Bonus suggestion: If you have a multiple-author paper, the Acknowledgements or Author's Note should specify each author's contributions clearly and completely. By doing so, you assign both credit and blame where it is deserved. For example, when I collaborate on a neuroimaging project, I make clear that I had nothing to do with any of the imaging data collection, coding, or analysis. I should get no credit for that part of a study (given that I know nothing about imaging), but I also should take no blame for any missteps in that part of the project.