Friday 17 August 2012

The Small Effect Size Effect - Why Do We Put Up With Small Effects?

Small effects sometimes matter - but psychology can do better
One of the things that bugs me about 'embodied' cognition research is that the effects, while statistically significant, tend to be small. What this means is that the groups were indeed different in the direction the authors claim, but only slightly, and that the authors tested enough people for that slight difference to show up in the average.

The problem with small effect sizes is that they mean all you've done is nudge the system. The embodied nervous system is exquisitely sensitive to variations in the flow of information it is interacting with, and it's not clear to me that merely nudging such a system is all that great an achievement. What's really impressive is when you properly break it: if you can alter the information in a task so that the task simply becomes impossible for an organism, then you have found something that the system considers really important. The reverse is also true, of course: if you find the right way to present the information the system needs, then performance should become trivially easy.

Psychology has become enthralled by statistical significance (to the point that we're possibly gaming the system in order to cross this magical marker). If your effect comes with a p value of less than .05, it is interesting, regardless of how small the effect is in terms of function. This is a problem, and we don't have to put up with it. If you ask a question about the right thing, you should get an unambiguous answer. If your answer is ambiguous, you may not be asking about the right thing. 
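To make the gap between significance and size concrete, here is a minimal sketch in Python. It assumes a simple two-group comparison with equal group sizes and uses a normal approximation to the t test; the function names and the example effect sizes are hypothetical and are not taken from any of the studies discussed here. It estimates roughly how many participants per group it takes before an observed standardised effect (Cohen's d) of a given size crosses p < .05.

```python
import math
from statistics import NormalDist


def approx_p_value(d: float, n_per_group: int) -> float:
    """Two-sided p value for an observed Cohen's d with two equal groups.

    Normal approximation: the test statistic is roughly d * sqrt(n / 2).
    """
    z = abs(d) * math.sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(z))


def n_needed_for_significance(d: float, alpha: float = 0.05) -> int:
    """Approximate smallest n per group at which an observed d gives p < alpha."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil(2 * (z_crit / abs(d)) ** 2)


if __name__ == "__main__":
    for d in (0.1, 0.2, 0.5, 0.8):  # illustrative effect sizes only
        n = n_needed_for_significance(d)
        print(f"d = {d}: n per group = {n}, p = {approx_p_value(d, n):.3f}")
```

The required sample grows with 1/d², so halving the effect roughly quadruples the number of people needed. Any non-zero effect can be pushed under .05 by testing enough people, which is why the p value alone says nothing about whether the effect matters.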

I want to remind readers of a couple of examples of nuisance small effects I've covered here before, then talk a little about some work which either broke or fixed the right thing, to highlight that we don't actually have to suffer from the tyranny of the small effect effect.

Small effects in 'embodied' cognition
Regular readers will know we have a low opinion of most of the research that calls itself 'embodied cognition'. My two main examples of this work are 'moving through time' (Miles et al, 2010) and 'leaning to the left' (Eerland et al, 2011). Not only are they good examples of what my field spotter's guide is designed to find, they also suffer from the small effect effect. 

Miles et al (2010): The authors measured postural sway while people thought about either the future or the past. People's sway was slightly more forwards in the future condition, and slightly more back in the past condition. Miles et al interpreted this as the effect of metaphors about time, which are grounded in things like forward and back motion. The problem is that the significant effect on sway peaked at approximately 2mm in each direction. In terms of posture, this is meaningless. 

Eerland et al (2011): The authors made people lean either left or right, and then had them estimate the magnitude of a wide range of things whose actual size people would not know, but whose ballpark size they would (such as the Eiffel Tower). Leaning to the left reduced estimates of size by a z score of about .08 relative to being upright. Leaning to the right had no effect. The authors interpreted this as showing how access to the mental representation of magnitude (thought to be like a number line) can be primed by posture - a state of the body affects access to a mental state. The problem here is not only the tiny overall effect and the lack of any effect to the right, but also that out of 39 items only 25 show a difference in the predicted direction between the left and right leaning conditions, and only 9 show the predicted overall ordering of left leaning < upright < right leaning.
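One rough way to put a number on that item-level pattern is a sign test: if leaning had no effect at all, each item would be about equally likely to come out in either direction. The sketch below is an illustrative check only, not an analysis reported by Eerland et al. It assumes the 39 items are independent (they may well not be), and the function name is hypothetical.

```python
from math import comb


def sign_test_p(successes: int, n: int) -> float:
    """Exact two-sided binomial (sign) test against a 50/50 chance rate."""
    k = max(successes, n - successes)  # fold the count onto the upper tail
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


if __name__ == "__main__":
    # Counts quoted in the text: 25 of 39 items differ in the predicted direction.
    print(f"25 of 39 items: p = {sign_test_p(25, 39):.3f}")
```

Under those assumptions, 25 of 39 gives a two-sided p of roughly .11, so at the level of individual items the pattern is hard to tell apart from coin flipping.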

This is the rule, not the exception, in 'embodied' cognition, and is a hint that they aren't really tapping anything interesting. 

The small effect effect vs. Gibson
One other great example of this problem that I've covered before comes from Gehringer & Engel (1986). They set out to test an ecological analysis of the Ames Room which suggests that people will come to resolve the ambiguous nature of the room if they are allowed to explore. They allowed people progressively more opportunity to move around, and had them judge the relative size of two discs, one in each corner. In the worst case, people erred by about 21mm (out of a possible 30mm error); in the best case (lots of exploration) the error was reduced to 2.6mm, which was of course statistically different from zero error. Gibson, they concluded, was wrong, and direct perception should be thrown out. Of course, a 2.6mm error in a size matching judgement task is actually not a bad effort at all, and Sverker Runeson (1988) proceeded to tear this paper a new one with the kind of keen analysis that makes him my favourite perceptual psychophysicist.

The point here is that the authors pointed to their statistical significance and tried to conclude that Gibson's entire theory was flawed, when they had actually almost entirely eradicated the Ames room illusion by letting people move around. Way to miss the real story, chaps.
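The arithmetic behind that reading is simple. A minimal sketch, using only the two error values quoted above and treating the worst-case error as the full size of the illusion (an assumption made here for illustration; the function name is hypothetical):

```python
def proportion_of_illusion_removed(baseline_error_mm: float,
                                   remaining_error_mm: float) -> float:
    """Fraction of the baseline error that exploration eliminated."""
    return 1 - remaining_error_mm / baseline_error_mm


if __name__ == "__main__":
    # 21mm error in the worst case, 2.6mm with lots of exploration.
    removed = proportion_of_illusion_removed(21.0, 2.6)
    print(f"Illusion removed: {removed:.0%}")
```

On those figures, exploration removed close to 90% of the error. That is the effect worth reporting, not the fact that the leftover 2.6mm differs from zero.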

Small ape effects
I reviewed a paper here recently about tool use in chimps, and how they are (contrary to what is apparently a heated argument in the literature) able to use weight to distinguish between objects. One of the interesting things about this paper is that the group level effect was often small, and (in Expt 2) not seen in any individual ape! My interest in this work is that I think it would benefit from a proper analysis of the affordances chimps perceive, rather than assumptions about the role of weight. I think that if you did this, you should end up with huge, unambiguous effects like the ones I'm about to describe for context.

What it looks like when you break the right thing
For a given visually guided action, the perception-action approach suggests that there is an invariant feature of the optic array which a) specifies the property of the world required to make the action function correctly and b) has to be present for the action to work. In coordinated rhythmic movement, that information is the relative direction of motion, which specifies relative phase. After training at 90°, people use something else, and Geoff Bingham & I wanted to know what it was. It could only be one of three things: relative speed, relative frequency, or relative position, so we designed displays that preserved relative phase (the property in the world) but broke one (and only one) of these potential information variables at a time. Relative to unperturbed performance, all the perturbations had significant effects on performance at 90°. Frequency had some effect, related to its overlap with the Position perturbation; but the Position perturbation simply blew performance out of the water. Interestingly (and something I am working on following up) three participants in the first study showed this effect at 0° and 180° where I wasn't expecting it, suggesting they were using the new information there, instead of relative direction.

Figure: Data from Wilson & Bingham, 2008. Trained performance at 90° was utterly disrupted by the right information manipulation.

What it looks like when you fix the right thing
A real problem in visually guided action is the accurate, metric perception of size (to pick an object up, you need to scale your hand to the right size ahead of time). Study after study after study has shown that vision simply can't provide this without haptic feedback from touching the object; but we do scale our hands correctly! The question is, how do we do it? Geoff has been plugging away at this for years, trying to provide people with what he thought were sensible opportunities to explore objects visually, with no luck, until he rotated the objects through 45° (a huge amount in vision). BAM! Suddenly people could visually perceive metric shape, and this persisted over time without being constantly topped up (Lee & Bingham, 2010). Suddenly we knew how we did this task; metric visual perception of shape is enabled by all the large scale locomotion we get up to - moving into a room, for example. Without this calibration, the task was impossible, but as soon as the right manipulation was made, the impossible became straightforward, and the effect size is huge.

We don't need to suffer the tyranny of the small effect effect any more. Careful, theory driven hypothesis testing means we can start finding the factors that actually contribute to our behaviour. After all, isn't that what we psychologists actually want to do?

Eerland, A., Guadalupe, T., & Zwaan, R. (2011). Leaning to the Left Makes the Eiffel Tower Seem Smaller: Posture-Modulated Estimation. Psychological Science, 22 (12), 1511-1514 DOI: 10.1177/0956797611420731

Gehringer, W., & Engel, E. (1986). Effect of ecological viewing conditions on the Ames' distorted room illusion. Journal of Experimental Psychology: Human Perception and Performance, 12 (2), 181-185 DOI: 10.1037//0096-1523.12.2.181

Lee, Y.L., & Bingham, G.P. (2010). Large perspective changes yield perception of metric shape that allows accurate feedforward reaches-to-grasp and it persists after the optic flow has stopped! Experimental Brain Research, 204 (4), 559-73 PMID: 20563715

Miles, L., Nind, L., & Macrae, C. (2010). Moving Through Time. Psychological Science, 21 (2), 222-223 DOI: 10.1177/0956797609359333

Runeson, S. (1988). The distorted room illusion, equivalent configurations, and the specificity of static optic arrays. Journal of Experimental Psychology: Human Perception and Performance, 14 (2), 295-304 DOI: 10.1037//0096-1523.14.2.295

Wilson, A. D., & Bingham, G. P. (2008). Identifying the information for the visual perception of relative phase. Perception & Psychophysics, 70 (3), 465-476 DOI: 10.3758/PP.70.3.465


  1. I know you were bemoaning the inability to start good conversations on the blog recently, but all I have to add is: "YES!"

    1. Ha! I'll have to start writing less complete posts :)

  2. I agree that big effects are more interesting but I doubt there is any single prescription for finding big effects. Even if you can't do that, maybe the most useful prescription is: study whichever known effects are big. Seems like the heuristic used by embodiment researchers is: study effects that are cute, regardless of whether they are big.

    Also interesting to reverse the question and ask: are there effects that are tiny in terms of Cohen's d which have proven very worthwhile? I think yes, in the world of reaction times, there are.

  3. Oh? Which RT effects then? What mechanism or explanation have these tiny RT effects revealed, other than the phenomenon that was found to be statistically significant?

  4. Have you seen de Groot et al (2012) 'Chemosignals communicate human emotions' in Psychological Science 23(11) 1417–1424? Reported effect sizes are all between .01 and .07