Try to spot the flaw in this study. A scientist recruits a group of subjects to test the effectiveness of a new drug that purportedly improves attention. After giving subjects a pre-test to measure their attention, the experimenter tells the subjects all about the exciting new pill, and after they take the pill, the experimenter re-tests their attention. The subjects show significantly better performance the second time they’re tested.
This study would never pass muster in the peer review process: the flaws are too glaring. First, the subjects are not blind to the hypothesis (the experimenter told them about the exciting new drug), so they could be motivated to try harder the second time they take the test. The experimenter isn’t blind to the hypothesis either, so they might influence subject performance as well. There’s also no placebo control condition to account for the typical improvement people make when performing a task for the second time. In fact, this study lacks all of the gold-standard controls needed in a clinical intervention.
A couple years ago, Walter Boot, Daniel Blakely and I wrote a paper in Frontiers in Psychology that describes serious flaws in many of the studies underlying the popular notion that playing action video games enhances cognitive abilities. The flaws are sometimes more subtle, but they’re remarkably common: None of the existing studies include all the gold-standard controls necessary to draw a firm conclusion about the benefits of gaming on cognition. When coupled with publication biases that exclude failures to replicate from the published literature, these flaws raise doubts about whether the cumulative evidence supports any claim of a benefit.
The evidence in favor of a benefit from video games on cognition takes two forms: (a) expert/novice differences and (b) training studies.
The majority of studies compare the performance of experienced gamers to non-gamers, and many (although not all) show that gamers outperform non-gamers on measures of attention, processing speed, etc. (e.g., Bialystok, 2006; Chisholm et al., 2010; Clark, Fleck, & Mitroff, 2011; Colzato et al., 2010; Donohue, Woldorff, & Mitroff, 2010; Karle, Watter, & Shedden, 2010; West et al., 2008). Such expert/novice comparisons are useful and informative, but they do not permit any causal claim about the effects of video games on cognition. In essence, they are correlational studies rather than causal ones. Perhaps the experienced gamers took up gaming because they were better at those basic cognitive tasks. That is, gamers might just be better at those cognitive tasks in general, and their superior cognitive skills are what made them successful gamers. Or, some third factor such as intelligence or reaction times might contribute to interest in gaming and performance on the cognitive tasks.
Fortunately, only a few researchers make the mistake of drawing causal conclusions from a comparison of experts and novices. Yet, almost all mistakenly infer the existence of an underlying cognitive difference from a performance difference without controlling for other factors that could lead to performance differences. Experts in these studies are recruited because they are gamers. Many are familiar with claims that gamers outperform non-gamers on cognition and perception tasks. They are akin to a drug treatment group that has been told how wonderful the drug is. In other words, they are not blind to their condition, and they likely will be motivated to perform well. The only way around this motivation effect is to recruit subjects with no mention of gaming and only ask them about their gaming experience after they have completed the primary cognitive tasks, but only a handful of studies have done that. And, even with blind recruiting, gamers might still be more motivated to perform well because they are asked to perform a game-like task on a computer. In other words, any expert-novice performance differences might reflect different motivation and not different cognitive abilities.
Even if we accept the claim that gamers have superior cognitive abilities, such differences do not show that games affect cognition. Only by measuring cognitive improvements before and after game training might a causal conclusion be justified (e.g., Green & Bavelier, 2003, 2006a, 2006b, 2007). Such training studies are expensive and time-consuming to conduct, and only a handful of labs have even attempted them. And, at least one large-scale training study has failed to replicate a benefit from action game training (Boot et al., 2008). Yet, these studies are the sole basis for claims that games benefit cognition. In our Frontiers paper, we discuss a number of problems with the published training studies. Taken together, they raise doubts about the validity of claims that games improve cognition:
- The studies are not double-blind. The experimenters know the hypotheses and could subtly influence the experiment outcome.
- The subjects are not blind to their condition. A truly blind design is impossible because subjects know which game they are playing. And, if they see a connection between their game and the cognitive tasks, then such expectations could lead to improvements via motivation (a placebo effect). Unless a study controls for differential expectations between the experimental condition and the control condition, it does not adequately control for a placebo-effect explanation of any differences. To date, no study has controlled for differential expectations.
- Almost all of the published training studies showing a benefit of video game training relative to a control group show no test-retest effect in the control group. That's bizarre. The control group should show improvement from the pre-test to the post-test—people should get better with practice. The lack of improvement in the baseline condition raises the concern that the “action” in these studies comes not from a benefit of action game training but from some unusual cost in the control condition.
- It is unclear how many independent findings of training benefits actually exist. Many of the papers touting the benefits of training for cognition only discuss the results of one or two outcome measures. It would be prohibitively expensive to do 30-50 hours of training with just one chance to find a benefit. In reality, such studies likely included many outcome measures but reported only a couple. If so, there's a legitimate possibility that the reported results reflect p-hacking. Those papers often note that participants also completed "unrelated experiments," but it's not clear what those are or whether they actually were the same experiment but different outcome measures. Based on the game scores noted in some of these papers, it appears that data from different outcome measures with some of the same trained subjects were reported in separate papers. That is, the groups of subjects tested in separate papers might have overlapped. If so, then the papers do not constitute independent tests of the benefits of gaming. If we don't know whether or not these separate papers constitute separate studies, any meta-analytic estimate of the existence and effect size for game-training benefits is impossible. Together with the known failures to replicate training benefits and possible file drawer issues, it is unclear whether the accumulated evidence supports any claim of game training benefits at all.
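To make the p-hacking concern in the last point concrete, here is a minimal sketch of how quickly false positives accumulate when a study measures many outcomes but reports only the ones that "worked." The 0.05 significance threshold and the assumption that outcome measures are statistically independent are illustrative assumptions, not details taken from any of the training studies, and the function name is hypothetical.

```python
# Illustrative sketch only. Assumes every outcome measure is an independent
# test of a true null effect at the same alpha level; real outcome measures
# are usually correlated, which changes the exact numbers.

def familywise_error_rate(num_measures: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across independent null tests."""
    return 1.0 - (1.0 - alpha) ** num_measures


if __name__ == "__main__":
    for k in (1, 2, 5, 10, 20):
        rate = familywise_error_rate(k)
        print(f"{k:2d} outcome measures: {rate:.1%} chance of at least one spurious benefit")
```

Under these assumptions, a study with 10 outcome measures and no real effect has roughly a 40% chance of producing at least one "significant" benefit, which is why knowing the full set of measures matters.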
Given that expert/novice studies tell us nothing about a causal benefit of video games for cognition and that the evidence for training benefits is mixed and uncertain, we should hesitate to promote game training as a cognitive elixir. In some ways, the case that video games can enhance the mind is the complement to recent fear mongering that the internet is making us stupid. In both cases, the claim is that technology is altering our abilities. And, in both cases, the claims seem to go well beyond the evidence. The cognitive training literature shows that we can enhance cognition, but the effects of practice tend to be narrowly limited to the tasks we practice (see Ball et al., 2002; Hertzog, Kramer, Wilson, & Lindenberger, 2009; Owen et al., 2010; Singley & Anderson, 1989; for examples and discussion). Practicing crossword puzzles will make you better at crossword puzzles, but it won’t help you recall your friend’s name when you meet him on the street. None of the gaming studies provide evidence that the benefits, to the extent that they exist at all, actually transfer to anything other than simple computer-based laboratory tasks.
If you enjoy playing video games, by all means do so. Just don’t view them as an all-purpose mind builder. There’s no reason to think that gaming will help your real world cognition any more than would just going for a walk. If you want to generalize your gaming prowess to real-world skills, you could always try your hand at paintball. Or, if you like Mario, you could spend some time as a plumber and turtle-stomper.
Other Sources Cited:
- Ball, K., Berch, D. B., Helmers, K. F., Jobe, J. B., Leveck, M. D., Marsiske, M., et al. (2002). Effects of cognitive training interventions with older adults: A randomized controlled trial. JAMA: Journal of the American Medical Association, 288(18), 2271-2281.
- Bialystok, E. (2006). Effect of bilingualism and computer video game experience on the Simon task. Canadian Journal of Experimental Psychology, 60, 68-79.
- Boot, W. R., Blakely, D. P., & Simons, D. J. (2011). Do action video games improve perception and cognition? Frontiers in Psychology, 2, 226. doi: 10.3389/fpsyg.2011.00226.
- Chisholm, J. D., Hickey, C., Theeuwes, J., & Kingstone, A. (2010). Reduced attentional capture in video game players. Attention, Perception, & Psychophysics, 72, 667-671.
- Clark, K., Fleck, M. S., & Mitroff, S. R. (2011). Enhanced change detection performance reveals improved strategy use in avid action video game players. Acta Psychologica, 136, 67-72.
- Colzato, L. S., van Leeuwen, P. J. A., van den Wildenberg, W. P. M., & Hommel, B. (2010). DOOM’d to switch: superior cognitive flexibility in players of first person shooter games. Frontiers in Psychology, 1, 1-5.
- Donohue, S. E., Woldorff, M. G., & Mitroff, S. R. (2010). Video game players show more precise multisensory temporal processing abilities. Attention, Perception, & Psychophysics, 72, 1120-1129.
- Green, C. S. & Bavelier, D. (2003). Action video game modifies visual selective attention. Nature, 423, 534-537.
- Green, C.S. & Bavelier, D. (2006a). Effect of action video games on the spatial distribution of visuospatial attention. Journal of Experimental Psychology: Human Perception and Performance, 1465-1468.
- Green, C. S. & Bavelier, D. (2006b). Enumeration versus multiple object tracking: the case of action video game players. Cognition, 101, 217-245.
- Green, C.S. & Bavelier, D. (2007). Action video game experience alters the spatial resolution of attention. Psychological Science, 18, 88-94.
- Hertzog, C., Kramer, A. F., Wilson, R. S., & Lindenberger, U. (2009). Enrichment effects on adult cognitive development. Psychological Science in the Public Interest, 9, 1–65.
- Irons, J. L., Remington, R. W. and McLean, J. P. (2011), Not so fast: Rethinking the effects of action video games on attentional capacity. Australian Journal of Psychology, 63: no. doi: 10.1111/j.1742-9536.2011.00001.x
- Karle, J.W., Watter, S., & Shedden, J.M. (2010). Task switching in video game players: Benefits of selective attention but not resistance to proactive interference. Acta Psychologica, 134, 70-78.
- Murphy, K. & Spencer, A. (2009). Playing video games does not make for better visual attention skills. Journal of Articles in Support of the Null Hypothesis, 6, 1-20.
- Owen, A.M., Hampshire, A., Grahn, J.A., Stenton, R., Dajani, S., Burns, A.S., Howard, R.J., & Ballard, C.G. (2010). Putting brain training to the test. Nature, 465, 775-779.
- Singley, M. K., & Anderson, J. R. (1989). The transfer of cognitive skill. Cambridge, MA.: Harvard University Press.
- West, G. L., Stevens, S. S., Pun, C., & Pratt, J. (2008). Visuospatial experience modulates attentional capture: Evidence from action video game players. Journal of Vision, 8, 1-9.
Our sole source of scientific knowledge is not the double blind study. Reread your own article and replace "gaming" with "smoking" and "cognitive benefit" with "cancer". All the facts you claim remain true and yet it is absurd to claim there is no evidence that smoking causes cancer. Given the large holes one often finds in psychological experiments (my favorite is the fact that a huge variety of alternative hypotheses have the same face validity to be linked to any given experiment), I don't get how suddenly one thinks the gold standard of double blind medically oriented tests is suddenly the issue with cognition improvement with gaming. Given the horrific record much of psychology has with reproducibility and design validity, your argument can be used to dismiss much of all psychology findings. Applying this general argument to a specific area of research seems problematic if not disingenuous. Karen Gold
Karen -- Yes, some relationships can only be studied using correlational methods. Pretty much all of epidemiological evidence takes that form. Still, the gold standard for causal inference is the placebo-controlled clinical trial. In some cases, that's impossible. You can't randomly assign people to smoke or not and measure outcomes decades later. Even if you could, you couldn't blind people to whether or not they were smoking. That said, such epidemiological and correlational studies go to great lengths to rule out other possible factors driving the association. They base their conclusions on a plausibility argument - after removing all other reasonable influences, if you still find a link, then it's reasonable to assume for practical purposes that the two are _linked_. The inference of cause is dicier, but in the case of smoking and lung cancer, reverse causality isn't plausible and epidemiologists try to rule out as many third-factor causes as possible.
Clinical interventions are an entirely different situation. There, you are explicitly manipulating the conditions to see whether one produces a bigger effect than the other. The entire purpose of an intervention study is to demonstrate the causal effectiveness of the intervention. Doing that depends on equating everything about the conditions other than the variable of interest (in the case of gaming, the difference between the games themselves). Double blind studies with random assignment to conditions are unquestionably the gold standard for intervention studies designed to infer cause. Ask anyone conducting psychological interventions and they will tell you exactly that. My point was that even though researchers know the gold standard and can explain why it's essential, they often take few or no steps to ensure that they are ruling out problems with their baseline.
With an inappropriate baseline, no causal inference is valid. Perhaps with many different flawed baseline comparisons, you could zero in on a single causal factor. But, there haven't been any replications of existing gaming effects using baselines with different flaws and shortcomings. Consequently, it is not possible to converge on a single cause (and even then, all of the converging evidence would be flawed).
As you note, a huge number of alternative explanations can account for improvements in any condition. The random-assignment clinical trial is designed to average across all of those alternatives, leaving only the primary difference between the treatment and control. That's precisely why it is important to use an appropriate baseline. And, I completely disagree with your conclusion about psychology. This argument applies to all intervention studies and to most studies in psychology that use random assignment to infer causal effects. Using an appropriate baseline is common practice in psychology. In fact, every decent fMRI study is premised on subtracting out an appropriate baseline (the results are based on differential activation, not absolute activation). Every cognitive study that compares response times across conditions is based on equating those conditions on other dimensions. The subtraction logic underlying the clinical trial -- subtract a baseline that includes everything but the effect of interest -- is the same as that underlying pretty much all psychology experiments (using experiment in the proper sense meaning random assignment to conditions, etc). There is nothing special about psychology, and I'm not treating gaming interventions any differently than I treat any other experiments. I don't get why you think I'm being disingenuous, but I hope this clarification helps.
Hi Karen,
Caution should be exercised when interpreting the results of cross-sectional studies, regardless of the field. Take the example of hormone replacement therapy (HRT). Cross-sectional studies seemed to indicate a clear benefit, with HRT reducing the risk of heart disease. However, when randomized clinical trials were done, the results were the exact opposite. HRT increased, not decreased, heart disease risk.
The case of smoking causing cancer is pretty clear: researchers have gone through extensive efforts to rule out third-variable problems, and directionality problems are of little concern. In the video game literature, however, cross-sectional studies often make no attempt to ensure gamers and non-gamers don't differ on other important variables, and the reverse directionality problem is a plausible one.
These are all valid points but since your 2011 paper newer studies have started to appear with better controls that have continued to find an effect.
It is worth noting that none of these completely control for the points you mention, although Cain et al. just bypassed the whole issue and included a measure of something not predicted by hypothesis to improve - in this case verbal memory - on the basis that if the biases exist, they should also affect a non-predicted measure, which they didn't. You don't always have to control for potential confounds, you can also test for them.
However, a word about methodological critiques. Christopher Chabris asked on Twitter, echoing your own comment, “What (if any) studies of video game benefits for cognition replicate and lack confounds?”
Well, as it is impossible to have a study free of confounds, none. Because ‘perfect studies’ are impossible, judgements can only be made on the weight of evidence. Coming up with a potential confound does not therefore mean it exists or automatically invalidates the evidence (suggested confounds are also hypotheses of course). They should, however, affect our confidence in the results.
The weight of evidence in the gaming literature suggests that the effect is likely to be real, and the fact that this effect has remained as studies have become methodologically more rigorous adds to this weight.
In other words, what would have to be true for it not to be a real effect is growing increasingly improbable. It is possible that studies from numerous labs, in numerous countries, from independent research groups, have recruited participants who have guessed the 'desired outcome' (tested for by Strobach et al and not found by the way) and have experienced unconscious social pressure from experimenters that influenced how participants performed on the computerised dependent measures etc. I find this unlikely.
The p-value hacking issue is a genuine concern I think and I suspect when controlled for will reduce the extent of the effect that is seen. I was also curious about the fact that you mentioned some researchers have ‘conflicts of interest’. As far as I know, none have been declared and considering we are discussing studies on off-the-shelf video games it seems unlikely that they have any personal gain from products. However, I’d be interested to hear more.
But even with this in mind, your conclusion “There’s no reason to think that gaming will help your real world cognition any more than would just going for a walk” is just daft. Really, no reason? Apart from the RCTs, observational studies, meta-analyses and neuroimaging experiments?
It may still turn out to be the case that the effect is not real, but it’s certainly not ‘obvious’ and probably quite unlikely.
Hi Vaughan -- here are some responses, broken across a couple of comments to make them fit.
1. Not one of the new studies (and there aren't many new interventions) addresses the concerns we raised. Cain et al included a "story memory" task and found no differential improvements, but even they note that expectations for improvements might not differ for that task. It's a start, but it doesn't really address the differential expectations problem. A few expert/novice comparisons have been using blind recruiting (rather than recruiting people because they are experts), which is an improvement as well. But, cross-sectional comparisons do not permit causal claims about the effect of one variable on another. See our published response to Strobach for how they misunderstood the nature of our critiques about motivation -- they claimed to have equated overall motivation, but we're focusing on differential expectations for improvements.
2. True, a perfect study is impossible for most psychological interventions because it isn't possible to have truly blind conditions. That is, a truly double-blind, random assignment clinical trial can't be done for gaming. That said, the primary purpose of blind designs is to equate for differential expectations (i.e., placebo effects). If you can't have people blind to conditions, you have an obligation to at least measure and try to equate for placebo effects. Not one study has done so adequately. This is not a case of just coming up with any potential confound -- it is dealing with THE FUNDAMENTAL CONFOUND in an intervention study that lacks blinded conditions. And, researchers know this. It's Research Methods 101. If you ask researchers what design best permits causal inference, they'll tell you that it's a double blind, RCT. If you ask them why each component of that design is essential, they'll tell you that it is to rule out demand characteristics, placebo effects, and subject population differences. Yet, when that ideal design isn't possible, researchers must take efforts to control for such problems. Published studies haven't.
3. As we noted in our Frontiers paper, it is impossible to evaluate the weight of the evidence from the game training literature because it appears that some of the studies are not independent experiments. It is not clear how many independent, positive effects there actually are. There also are failed replications as well as a file drawer.
4. I have not argued that implicit pressure from experimenters drives these differences. I doubt that would be the case for most of these interventions. That said, I think it's entirely reasonable to assume that subjects in these studies will have expectations about how their training might affect their performance on the outcome measures. (We have evidence that such expectations are closely aligned with the reported effects -- paper about to be submitted.)
6. I think the p-hacking issue is potentially huge. You don't conduct 30 hours of training and measure just one outcome. That'd be prohibitively expensive. If you report one outcome measure out of every 10, you are in serious danger of producing a lot of false positives. And, if a number of the papers reporting individual outcome measures were actually part of the same experimental intervention, then there is even less evidence.
7. In Bavelier's paper in Annual Review of Neuroscience this year, they acknowledged the following conflict of interest: "D.B., C.S.G., and A.P. have patents pending concerning the use of video games for learning. D.B. is a consultant for PureTech Ventures, a company that develops approaches for various health areas, including cognition. D.B., C.S.G., and A.P. have a patent pending on action-video-game-based mathematics training; D.B. has a patent pending on action-video-game-based vision training." That seems to be a growing trend. The brain training industry makes a lot of money.
8. As for my "just daft" conclusion about walking, there is plenty of evidence from RCTs that physical activity (i.e., walking) improves cognition and the brain. There may be issues with those studies as well, but the converging evidence is better grounded, I think. More broadly, what does the action in action game training add? People spend tremendous amounts of time walking down crowded sidewalks, avoiding obstacles, crossing traffic, etc. Those tasks and daily activities like driving would seem to impose many of the same demands on cognition as gaming, so what is added by the games themselves? What would an additional 10 (or now 30 or 50) hours add?
9. If the small number of random trials have potential placebo problems, we should hesitate to draw causal conclusions. Observational studies do not permit causal inferences unless other obvious explanations (e.g., self selection effects, etc) are implausible. In this case, they are quite plausible—people are far more likely to keep gaming if they happen to have cognitive talents that make them good at gaming. Meta-analyses are only as good as the data that goes into them, and in the case of random gaming interventions, we have no idea how many have even been conducted.
Quick clarification. The Cain et al study in #1 above was a cross-sectional comparison, not an intervention study. Causal conclusion not merited.
"...newer studies have started to appear with better controls that have continued to find an effect."
What "newer studies"? Could you please cite them, Mr. Bell?
The p-hacking issue (or to put it more gently, the possible multiple comparisons problem) which you mentioned as your item #4 seems so important (especially in the context of all the marketing efforts going on in this area--I am hearing radio ads for these products daily) that it would be a service for someone to formally ask this question of all the authors of significant papers in this area. We may not really have any established social norms yet as to the matter of "Do you have to answer this question if asked?" although I would say yes if you are an ethical investigator you do have to answer it. I have faith that all or almost all investigators would be willing to provide this information.
Hal Pashler
The Feng, Spence, & Pratt (2007) study found that females actually improved in mental rotation abilities and spatial attention more than males in their sample after playing 10 hours of action video games (but found no significant gains in a control group after playing a 3D puzzle game). If the Bavelier lab was the only lab that studied the positive effect of video games, it might be easier to dismiss their research, but there are other labs that have found benefits for at least some groups of people. Action video games aren't a one-size-fits-all panacea, but I don't think we should throw out all the existing training studies completely based on the common set of criticisms you could attack any psychological intervention study with. I don't see how that invalidates the positive effects found for every intervention study that psychology has ever published (you can't ever have double-blind psychology or behavioral interventions, you don't always find training effects in intervention studies with long periods between pre-test and post-test, you don't always use a huge battery of tests or ever publish on a dozen outcome measures if you think that your effect is localized to a small subset of measures). However, the more sophisticated studies these days in other areas outside gaming are asking questions about who does and doesn't benefit from an intervention. It's like any other psychological intervention study: Some people benefit for some skills. The important questions then are: What predicts whether people will actually benefit from a game or not (e.g., age, gender, other baseline measures, etc.)? If older individuals and females are more likely to benefit, then studies looking at undergraduate males may not show benefits of action games, for example. What entertainment and educational games have actual benefits or not? The low action puzzle games actually make decent control games if you are looking at essentially puzzle solving outcome measures.
What skills and abilities are improving vs not improving? We should be better at measuring and including tasks we don't think will improve (and report on them) as appropriate controls, though this is almost never done in the field of psychology as a whole (because reviewers don't always want to publish it, and collecting data on things you don't expect to change can be a waste of time and money resources).
Elisabeth -- a few points and comments:
1) Other than the Strobach et al paper, the Feng et al (2007) paper is the only one other than those from the extended Bavelier lab to the best of my knowledge. It suffers from the same shortcomings noted in our 2011 paper as all of the other published training studies.
2) The sex difference really should be taken with a grain of salt. It was based on a grand total of 6 men (and 14 women), with half of each assigned to the experimental group and half assigned to the control group. That means there were 3 men and 7 women in the experimental group, and the differential improvements are based on those 7 women showing more of an improvement than the 3 men. That sample size is about an order of magnitude too small to draw strong conclusions about sex differences. It's suggestive, but replication is needed (and, to my knowledge, nobody has even tried).
3) The reason the criticisms potentially undermine all of the reported positive effects is that all of the published findings share the same critical shortcomings, the most important of which is the lack of any check for differential expectations for the outcome measures across conditions.
4) You might well be right that there are individual and group differences in who might benefit from gaming (or any other intervention). Those would be interesting to explore. There might also be individual and group differences in how much they expect to improve on outcome measures, and those would be important to rule out before claiming that those group differences moderate improvements.
5) The active controls used in gaming studies are better than those used in many training interventions. I have no in-principle objections to the particular games used, and in many respects they do match the demands across conditions. That said, without checking whether they induce differential placebo effects, it's not clear that they are an adequate control.
6) Your last point is key. We need to be better about including tasks for which we do not expect improvements. More critically, we need to include tasks for which SUBJECTS do not expect differential improvements. That's the only way to avoid a placebo effect when we can't keep subjects blind to conditions.
7) It's far less costly to include many outcome measures (that typically take minutes to complete) than to conduct multiple separate training studies for each. I know of almost no training studies that have truly tested only a couple of outcome measures. It's just not feasible to test large numbers of people for 10-50 hours and only check for improvements on a couple measures. Ideally, studies should include multiple, distinct measures of the same construct.
8) Yes, we could attack any psychological intervention that lacks blinding to conditions using many of the same critiques. That's because any study in which participants know the nature of their treatment is subject to placebo effects. The fact that other literatures also fall flat on this basic design principle does not make it okay, especially when an easy solution is to check for such differential expectations and rule them out.
9) Minor point: In the game training literature, almost all of the training studies use women rather than men because it's harder to find male non-gamers.
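As a rough illustration of the sample-size concern in point 2 above, the sketch below computes the standard error of a difference between two group means, expressed in standard-deviation units. It assumes independent groups with equal variance and is not a reanalysis of Feng et al.'s data; the function name and the comparison sample of 30 and 70 are hypothetical.

```python
import math

# Illustrative sketch only. Assumes two independent groups with a common
# standard deviation (sd); it does not reproduce any published analysis.

def se_of_mean_difference(n_a: int, n_b: int, sd: float = 1.0) -> float:
    """Standard error of the difference between two independent group means."""
    return sd * math.sqrt(1.0 / n_a + 1.0 / n_b)


if __name__ == "__main__":
    small = se_of_mean_difference(3, 7)    # group sizes mentioned in point 2
    large = se_of_mean_difference(30, 70)  # hypothetical sample ten times larger
    print(f"3 vs 7 per group:   SE = {small:.2f} SD")
    print(f"30 vs 70 per group: SE = {large:.2f} SD")
```

With 3 and 7 people per group, the standard error is about 0.69 standard deviations, so only a very large difference in improvement could be distinguished from noise. Ten times the sample shrinks it to about 0.22.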
Nice article. I have a question though. Do your critiques/remarks only apply to the study of beneficial effects of video games, or do they also apply to studies regarding the detrimental effects of video games (such as studies showing a link between video games and violence, etc.)?
Fab -- I was writing about the benefits of game training, but many of those critiques also apply to studies linking gaming to violence. I haven't done a thorough review of that literature recently, though.