Wednesday 20 November 2013

What are the units that perception measures the world in? Firestone vs Proffitt

Perception is an act of measurement, and, like all acts of measurement, it needs a scale in order to be useful. Think about placing something on your kitchen scales; all that actually happens is that the object presses on the scale and the scale registers that something has changed by some amount in response (the location of a tray, for example). In order to know what that change means, the change is presented to us on a calibrated scale (by moving a needle around to point at some number, for example). The needle always moves the same amount for a given weight but the resulting number can vary (you might have an imperial rather than metric kitchen scale, for example). Without the scale, you can say that one thing is heavier than another by noting that it moves the scale more (this is an ordinal evaluation) but you need the scale in order to say what the weight difference is (the metric evaluation).

Visual perception measures the world in terms of angles; objects subtend a visual angle that depends on their size, distance, etc. Your thumbnail held at arm's length subtends about 1° of visual angle. You can get ordinal information directly from angles (the fact that one thing is closer/bigger/etc) but you need a scale to get the metric information required to use vision to control action. For example, you need to perceive how big something actually is in useful units in order to scale your hand size appropriately when grasping it; relative size doesn't help. One of the fundamental questions in (visual) perception research is, therefore, what are the metric units that the perceptual systems use to scale their measurements?
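To make the point about angles concrete, here is a minimal sketch of the geometry, assuming a flat object viewed head-on. The function name and the thumbnail and arm-length values are illustrative assumptions, not measurements from any of the papers discussed.

```python
import math

def visual_angle_deg(size, distance):
    """Visual angle (degrees) subtended by an object of a given size
    viewed head-on at a given distance (same length units for both)."""
    return math.degrees(2 * math.atan(size / (2 * distance)))

# Assumed values: a thumbnail about 1.2 cm wide at about 70 cm (arm's length).
thumbnail = visual_angle_deg(1.2, 70)

# An object twice as wide at twice the distance subtends the same angle,
# so the angle alone cannot tell you how big the object is.
doubled = visual_angle_deg(2.4, 140)

print(f"thumbnail: {thumbnail:.2f} deg, doubled: {doubled:.2f} deg")
```

The two angles come out the same, which is the sense in which an angular measurement needs a scale before it can say anything metric about size.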

Dennis Proffitt has been studying this question for a long time and is in favour of task-specific, body-scaled units. His evidence comes from studies in which people perceive their environments differently as a function of their ability to act on that environment. Probably the most well-known example is the study that showed people judge hills to be steeper when they are wearing a heavy backpack (Bhalla & Proffitt, 1999). The idea is that the backpack will make traversing that hill more difficult, and when the visual system measures the slope, it scales its measurement in line with this perceived effort. The hypothesis is that this is functional; it's a feature of the visual system that helps us plan appropriate actions. 

Perspectives on Psychological Science recently hosted a point-counterpoint debate on this topic. Firestone (2013) reviewed the literature on this type of action-scaling in perception and concluded that not only do the data not really support Proffitt's account, but that this account couldn't work even in principle. Proffitt (2013) rebutted Firestone's arguments and defended his view. I'm interested in this because Proffitt is at least a little ecological, and the basic idea he defends is one I would defend as well (although not in the form that he proposes). So who won?

The paternalism of spatial vision
Firestone begins by framing Proffitt's theory as paternalistic. Proffitt's theory is that vision contains systematic biases that are introduced in order to make us behave in a certain way. These biases are 'well intentioned white lies' that 'bias perceivers towards favourable actions' (Firestone, 2013, p. 456). Proffitt rejects this characterisation as inaccurate; this body scaling is not about lying, it's just about calibration. Although I think Proffitt (and to a greater extent his student, Jessica Witt) do sometimes talk in a way that opens them up to the paternalism label, paternalism is not really a fair label because it implies being misled. Calibration is not a process of distortion, it's a critical part of measurement, and just because the perceived result doesn't match what a physicist might produce doesn't make the perceiver in error (an argument I lay out in detail here).

That aside, let's score the main arguments. 

Argument 1: The effect sizes are the wrong size for the job
Firestone's first real argument against Proffitt is about whether or not the action-scaling found in their experiments can possibly be functional. He reviews a range of results and notes that the resulting change in perceived slope (or passability of an aperture, or what have you) is typically quite small, and generally smaller than the actual change that has occurred as a result of the experimental manipulation. So if vision is telling us white lies, they are not very useful white lies because they don't match the change in the world.

Proffitt has three replies: his account is the only one that even predicts these effects should occur in the direction they do (irrelevant); adaptation takes time (true, but weak; see below); and you can see good matching between bias and action in overlearned tasks such as grasping (ok but still weak). In effect he's arguing that calibration has a dynamic (i.e. it occurs over time and in a characteristic way), which is true. The problem is that Proffitt has never studied the dynamics in any detail (see Mon-Williams & Bingham, 2007 for an example of how to do this), which he should have done by now if he wants to make this argument. In addition, if the effects are to be functional, then the adaptation really needs to occur on the timescales he measures. Firestone takes this one because there is critical work yet to be done. Score: 1-0 Firestone.

Argument 2: Action specific units cannot be compared
Firestone notes that if you want to choose how to traverse some distance, action scaling is a problem because the units for walking, running, throwing etc will all be different. If the scales are different, you can't compare the measurements to pick the best option. Proffitt agrees, but notes that this isn't a problem because he never claims the system tries to compare measurements to choose actions and the evidence suggests that the action-scales really are only applied within relevant tasks (what he calls action boundaries).

Action scaling is indeed task-specific and cannot be directly compared, and Proffitt is right that it's not a problem (because action selection is about affordances, not action scaling; I've argued this in some talks on throwing recently that I should really write up). Score: 1-1, although note that Proffitt doesn't propose a solution to the problem of action selection; I wouldn't pick on this except that he gets huffy about Firestone critiquing without replacing.

Argument 3: There's no information for ability scaling
In order to apply a metric to a visual measurement, there has to be a relation between these two things that has detectable consequences; there needs to be information about the scale as well as the measurement. Firestone discusses eye-height scaling of object sizes. The horizon always cuts objects at eye-height, regardless of distance. Simple geometry does the rest: because the visual system has access to the angular size of the object and the angular size of the part below the horizon, it has access to their ratio (the size of the object in eye-height units). Eye height is a viable scale because it has visually detectable consequences; action scales such as walkability or jumpability do not, and therefore cannot be viable scales.
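The eye-height geometry can be sketched as follows, assuming flat ground, an object standing on that ground, and a horizon at eye level. All names and numbers here are illustrative assumptions. Strictly speaking the ratio is between the tangents of the angles; for small angles that is close to the ratio of the angles themselves.

```python
import math

def angles_for(object_height, eye_height, distance):
    """Simulate the scene: return (total angular size of the object,
    angular size of the part below the horizon), both in radians."""
    below = math.atan(eye_height / distance)
    above = math.atan((object_height - eye_height) / distance)
    return below + above, below

def size_in_eye_heights(total_angle, below_angle):
    """Recover object height in eye-height units from angles alone.
    No distance or physical size is needed."""
    above_angle = total_angle - below_angle
    return (math.tan(below_angle) + math.tan(above_angle)) / math.tan(below_angle)

# Hypothetical scene: a 2.4 m object seen from an eye height of 1.6 m.
for distance in (2, 10, 50):
    total, below = angles_for(2.4, 1.6, distance)
    print(distance, round(size_in_eye_heights(total, below), 3))
```

The recovered value is the same at every distance, which is what makes eye height usable as a purely visual scale in this simplified setup.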

Proffitt appeals to dynamics again; calibration takes time and purposeful behaviour (you act, perceive the consequences and correct the errors). He also highlights that eye-height scaling, while a nice simple example, doesn't actually seem to get used much anyway.

Proffitt misses a couple of key points; calibration requires information, so this is a problem for him. Of course, there are non-visual sources of information that might solve the problem (Firestone only talks about vision). Firestone gets the point for being basically right and for Proffitt not paying enough attention to the critique. Score: 2-1 Firestone.

Argument 4: Visual space doesn't look like it's warping
Firestone says that if vision is distorting space to help guide action, we should experience this warping (because some of the effects can be quite large, contra Argument 1). But we don't. Proffitt has three replies. First, he states that vision is trying to provide stable access to the environment, so lots of perturbations are filtered by the system. As an example, he notes that, thanks to saccadic suppression, our view of the world does not whizz around as we saccade our eyes three times a second. He then notes some work which found that we don't notice even when a (virtual) environment really is shrinking and growing (Glennerster et al., 2006). Finally he notes that changing which ruler you're using doesn't actually change the locations of objects in space. Think about measuring a gap in centimetres, then switching to an inch ruler. The distance is the same, the number has changed, but this only matters if you keep acting as if you still used a centimetre ruler. In the same way, people use the current calibration, not the previous one.
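The ruler analogy can be worked through in a few lines. This is only a sketch of the arithmetic in the centimetre/inch example; the function names and the 30 cm gap are made up for illustration and are not a model of perceptual calibration.

```python
CM_PER_INCH = 2.54

def measure(gap_cm, unit_cm):
    """Reading obtained with a ruler whose unit is unit_cm centimetres long."""
    return gap_cm / unit_cm

def act(reading, assumed_unit_cm):
    """Physical extent (in cm) produced by acting on a reading
    under an assumed unit length."""
    return reading * assumed_unit_cm

gap_cm = 30.0                               # the gap itself never changes
reading_cm = measure(gap_cm, 1.0)           # centimetre ruler
reading_in = measure(gap_cm, CM_PER_INCH)   # inch ruler: a smaller number

# Acting with the current calibration recovers the real gap.
consistent = act(reading_in, CM_PER_INCH)

# Acting on the inch reading as if it were still centimetres does not.
stale = act(reading_in, 1.0)

print(reading_cm, round(reading_in, 2), round(consistent, 2), round(stale, 2))
```

The number changes with the ruler, but behaviour only goes wrong in the last case, where the reading and the calibration used to act on it no longer match.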

Proffitt wins this one across the board. Appeals to visual experience never help in arguments about how vision actually does what it does. While we do perceive the world, we do so by detecting information, and we have no real access to the experience of detecting information per se. More important, however, is Proffitt's point about how all that's changing is the ruler. We have no privileged access to the world; all we 'know' is what the calibrated detection of information tells us, and different calibrations are incompatible (see Argument 2) which means there's no way to compare them and identify a difference. As Firestone notes in Argument 3, you need information to detect anything, and this applies to him as much as to Proffitt. Final Score: 2-2.

Tiebreak: Everyone loses
I really want to like Proffitt's work. His heart is in the right place, after all; perception really is scaled in task-specific action units. But his work only ever scratches the surface and rarely deeply enough to justify his conclusions. He needs task dynamics, he needs to frame his work in terms of calibration and he is (as Firestone rightly points out) in desperate need of some information to back this all up.

Given this, I really wanted Firestone's critique to have the requisite substance, because a solid review of this literature and an analysis of what's missing would be a valuable contribution to the literature. But he really only has one major point (about information) and while he's right to say Proffitt suffers here, he's wrong to say action scaling can never have informational consequences. It might not have visual consequences the way eye-height does, but that's not the point. 

So really the losers here are us, the readers. Instead of a substantive analysis and defence of a problematic but on-the-right-track theory of perceptual scaling, we got a mixed bag of viable points generally poorly defended and hidden in amongst some irrelevant information and a surprising amount of snark. I'm all for being feisty and punchy when it's called for but everyone just seemed a bit pushy throughout which made this exchange less productive than it could have been. Full marks to Perspectives, though, for allowing the debate and successfully negotiating what I'm sure was a complicated review process.


Bhalla M. & Proffitt D.R. (1999). Visual-motor recalibration in geographical slant perception. Journal of Experimental Psychology: Human Perception and Performance, 25 (4) 1076-1096.

Firestone C. (2013). How "Paternalistic" Is Spatial Perception? Why Wearing a Heavy Backpack Doesn't--and Couldn't--Make Hills Look Steeper. Perspectives on Psychological Science, 8 (4) 455-473.

Glennerster A., Tcheang L., Gilson S.J., Fitzgibbon A.W. & Parker A.J. (2006). Humans Ignore Motion and Stereo Cues in Favor of a Fictional Stable World. Current Biology, 16 (4) 428-432.

Mon-Williams M. & Bingham G.P. (2007). Calibrating reach distance to visual targets. Journal of Experimental Psychology: Human Perception and Performance, 33 (3) 645-656.

Proffitt D.R. (2013). An Embodied Approach to Perception: By What Units Are Visual Perceptions Scaled? Perspectives on Psychological Science, 8 (4) 474-483.


  1. Hi Andrew,

    Thanks for engaging so deeply with this discussion — I'm glad to have your voice and expertise in the conversation!

    Given that you scored most of the preliminary rounds for me (and the one early round you scored for Proffitt seems to have been given with some reluctance), I'll mostly focus on Argument 4, which was the argument that (e.g.) if backpacks really do make hills look 25% steeper, then we should be able to experience this for ourselves. I thought this argument was really a KO, so I'm disappointed that you're still standing! Here are some thoughts:

    1. I'm worried that the concerns you raise with visual phenomenology are misplaced. Of course I agree that the mind is not so constituted as to grant direct access to the inner workings of our visual systems. If anything, my paper defended that very view: I think vision is modular, and one of the properties usually attributed to modular input systems is the lack of 'access to interlevels'. So, I'm very much with you that "we have no real access to the experience of detecting information per se", and that we're not going to figure out how we see just by meditating on it. In fact, I take this to be the central lesson of cognitive science; we don't know 'from the inside' how our minds really work.

    Fortunately, the fact that we cannot directly access the information-processing done by the visual system does not at all entail that phenomenology "never helps" ("never"!) in discussions about how perception works. Access to the inner workings of a system is not the same as access to its output, and when a theory about how perception works also makes explicit claims about *awareness*, then the way things look to us is perfectly relevant. This strikes me as an utterly pedestrian thing to say, especially in a field like vision science where 'demos' are so ubiquitous in research. Indeed, imagine that a color vision scientist claimed to discover some manipulation that produced huge shifts in color reports, but that nobody was able to experience these effects for themselves when they underwent the manipulation; surely we would be skeptical (and rightly so) that the manipulation really affects perception. And it seems to me that this sort of evidence is commonplace in cognitive science: For example, we can, famously, determine whether a sentence is grammatical even though we cannot access the rules our minds apply to deliver such determinations. Clearly, the lack of access to the processing rules themselves doesn't mean that intuitions about grammaticality are irrelevant to psycholinguistics; here, too, to think otherwise would be to conflate access to the workings of the system with access to its output. This situation is perfectly analogous to the present case: We may not know *how* we see, but surely we know at least something about *what* we see! (in the right circumstances, given certain assumptions, etc.). 
    If someone is going to tell me that wearing a heavy backpack makes hills look steeper, really *look* steeper, then I'd better see a steeper hill when I put on a backpack, especially when (a) the alleged increase in steepness is big; (b) I know exactly what to look for and can attend appropriately (which rules out appeals to change blindness, inattentional blindness, etc.); and (c) I can attend for as long as I please (which rules out hysteresis effects).

    2. Of course, you went one step further, and said of ability-scaling that "It might not have visual consequences the way eye-height does, but that's not the point". I have two comments here:

      (a) It definitely is the point — and Proffitt and I are in agreement about this. The very first sentence of Proffitt's reply contains the popular Koffka quote, "Why do things look as they do?". As I understand things, he and I both take embodied perception / paternalistic vision to be a thesis about (inter alia) *how things look*. That, I imagine, is why Proffitt gave arguments as to why we don't notice a warping perceptual world; if the lack of visual consequences were "not the point", then he could have just said that and moved on. Now, it sounds from this post and others that your own view is that conscious visual experience is unimportant, and I am aware of (though not so sympathetic to) the perspective that if perception is fundamentally for action, then questions about appearance are secondary. But that's really orthogonal to this issue; the studies and theories under discussion here explicitly claim that the world looks different when one can act differently, and the broader field has (rightly) treated the studies as intending this interpretation. That's the claim I am interested in.

      (b) If I understand you correctly when you say about ability-scaling that "It might not have visual consequences the way eye-height does", then you actually *agree* that eye-height scaling has visual consequences! But given that you thought Proffitt's replies to the phenomenology argument were successful, why wouldn't those replies work equally well against the eye-height case too? For example, you mentioned liking Proffitt's reply that changes to a 'perceptual ruler' shouldn't make the world look bigger or smaller. Doesn't that argument apply equally to changes in eye-height as to changes in climbing ability? If not, why not? But if so, then I have an easy reductio argument against those replies: They should apply just as well to cases where everyone already agrees there are visual consequences (e.g. eye-height); therefore, they must go wrong somehow.

    3. As it happens, I'm quite unmoved by Proffitt's replies to the phenomenology argument. One main reason for this, independently of any specific argument's details, is just that so many discussions of embodiment effects are filled with claims about how intuitive they are. Even the very first paper on embodied perception (Proffitt et al., 1995) notes that runners and cyclists often think the later miles of a racecourse are steeper than the earlier miles even when the course is looped such that the later miles retrace the earlier miles (e.g. p.427). This level of noticeability is all that is asked for by the objection from perceptual phenomenology, and yet we don't even get that when we wear heavy backpacks. Moreover, the embodied perception group has itself reported a phenomenologically noticeable finding: Linkenauger et al.'s (2010) 'illusory shrinkage and growth' paper. This makes the replies to the phenomenology argument self-defeating: If the replies really are successful, then this particular embodied perception study shouldn't have worked!

      But, that issue aside, I also just don't feel the force of the replies. For example, the reply that the world doesn't actually move around when a ruler changes size seems like a red herring. On the embodied perception view, perception is the ruler, and our experience represents the world in 'units' of that ruler; even on this view of perception (which I don't particularly share), it is trivial to set up cases that should yield perceptually noticeable results. For example, suppose there were a passable aperture one meter wide in front of me, and someone stood beside the aperture holding a meterstick perpendicular to the aperture (i.e. extending into the depth plane). At this point, the meterstick looks to be the same size as the aperture (people are pretty accurate at aperture-width judgments of this sort). Next, I hold a wide rod that makes the aperture less passable, which has been shown to reduce aperture-width judgments (but should leave the meterstick perceptually unchanged). At this point, the aperture and the meterstick should no longer look the same size! But, of course, this doesn't actually happen. And note that I'm *not* saying that the aperture should appear to dynamically *shrink* before my eyes; I'm just saying that something should look different before and after adjusting my arms — namely, the size ratio between the aperture and the meterstick. None of these assumptions violates any tenets of the 'perceptual ruler' theory, but they are clearly contradicted by experience.

      I feel similarly about the Glennerster et al. (2006) citation. Their very interesting study indeed found that subjects did not notice changes in room-size; but, crucially, those subjects instead made enormous errors in judging the sizes of *objects* in the room. For example, on a typical trial, subjects saw a cube at one end of a room, then walked to the other end of the room, where there was a cube of the same physical size; while subjects walked, the room grew to four times its original size, and then subjects misjudged the cube’s size by the same factor of four. In other words, the change in room size wasn't *ignored* by the visual system, but rather misinterpreted as a (massive) change in the size of objects in the room. Glennerster et al. even liken their result to the famous “Ames room” illusion in which differences in the scale of a cleverly built room are instead experienced as (utterly noticeable) differences in the scale of objects in the room. There is thus every reason to think that if the world were resized before our eyes, we would at least notice *something* amiss, and yet it is agreed that we do not subjectively observe any such effects.

      Thanks again for your very thoughtful comments on this issue. You've given me a lot to think about, and I'd definitely be up to continue this discussion if you have any other thoughts about these topics.

      All the best,

  2. Thanks, Chaz, glad you enjoyed it.

    Some thoughts. My take on the argument about visual experience is that it's dealt with by the argument about incommensurability (which I give to Proffitt with no hesitation, by the by; I just point out he got huffy where he had no need or right to :). Specifically, because different action scales are different scales and therefore cannot meaningfully be compared, then the change from one scale to another scale also won't be detectable in any meaningful way.

    The hill has not actually changed steepness; what has changed is the relation between the hill and the person's ability to climb the hill. At time 0, with no pack, the relation is expressed in 'ability of person to climb hill' units (crudely speaking). At time 1, the backpack goes on and after some recalibration, by time 2 the relation is now expressed in 'ability of person wearing pack to climb hill' units. These are incommensurable and people only have access to one of these at a time, thus Proffitt's account actually predicts no detectable warping.

    Now, I will grant you that a lot of this work is very careless with the way they talk about their approach, so you're 100% right to pick up on that. But I think the actual nuts and bolts of the theory, as well as what the data suggest, means that their laziness aside, Proffitt is right not to expect detectable changes.

    A side note: people are rubbish at comparing lengths in orthogonal dimensions. This has led people to conclude that the relevant geometry that describes visual space perception is affine, not Euclidean. So that's a complicated example, is all.

    I'll think more on the rest as well.

    1. Hi Andrew,

      I worry that you're underestimating the implications of the 'perceptual ruler' theory, both as Proffitt and his collaborators have articulated it and even as you have characterized it here. You already conceded that there are "visual consequences" of eye-height scaling; why do you believe this, given your remarks just now? Why don't the same reasons that have you thinking that there are noticeable visual consequences to eye-height scaling apply to this case, which supposedly operates on similar principles?

      Note again that the issue isn't just about whether embodied perception should *expect* noticeable perceptual changes (though it is indeed about that!), but also whether embodied perception studies do *in fact* find evidence for such changes. You mentioned that the data themselves don't suggest noticeable perceptual changes; but if you've seen the Linkenauger et al. (2010) paper (reference below), then you know that in some of these cases observers are literally claiming to experience objects growing and shrinking before their eyes! I don't think this particular effect is actually explained by 'embodied perception' — do you? If not, then we may agree here; but if so, then why are there noticeable dynamic changes there and not elsewhere?

      Thanks for bringing up the affine geometry work. I'm a big fan of Todd, Koenderink, and the others who have promoted those views, and I know the work decently well. I do think the particular example I gave escapes most of that criticism, though: There's been work by Stefanucci on almost exactly the setup I described, and the subjects seem to have done fairly well in estimating aperture-width. But, at any rate, you'll agree that it would be easy to construct an analogous scenario involving geometric properties that are preserved under affine transformations (such as parallelism, which has also been studied by embodied perception researchers); either way you cut it, visual experience doesn't corroborate these claims.

      Finally, I do feel compelled to add that I'm not quite onboard with your characterization of the claims made by the embodied perception approach as "careless" or the product of "laziness". I don't really think that at all (and I hope I didn't give that impression); to the contrary, I think the embodied perception view is a rich and interesting theory, and I really respect the thinking of the researchers who have worked on this — I wouldn't be spending so much time thinking and writing about it if I thought otherwise!

      Linkenauger, S. A., Ramenzoni, V., & Proffitt, D. R. (2010). Illusory shrinkage and growth: Body-based rescaling affects the perception of size. Psychological Science, 21, 1318–1325.

    2. "You already conceded that there are 'visual consequences' of eye-height scaling; why do you believe this, given your remarks just now? Why don't the same reasons that have you thinking that there are noticeable visual consequences to eye-height scaling apply to this case, which supposedly operates on similar principles?"
      This is all I meant; eye height has a visually detectable consequence (the invariance of the horizon) which, when combined with the fact that part of each object falls above this and part below, is purely visual information about scale. It's not the same as the action scaling stuff (note eye height ratios tell you little if anything about affordances).

      I don't know the Linkenauger paper in detail; I'll have a look when I get a chance. Sally's coming to Leeds to give a talk next year, I'll ask her annoying questions then :)

      It's my view that some of the talk around Proffitt style work is lazy; a lot of the papers recycle the introductions and simply don't push the idea hard enough. I think if they did, they'd get pushed towards a more careful, formal informational analysis which would solve a lot of these detail problems.

  3. Ah, I see that I misunderstood then — for whatever reason I'd thought you meant 'consequences for visual experience' rather than 'consequences for visual information' (though, looking back, it should have been clear...). As you say, I agree that there are no relevant visual consequences, in that sense, of actions. Whether or not it's the point is hard to say! Since neither proper action calibration nor affordance perception requires any effect on spatial perception, it's quite possible (even probable) that whatever the non-visual informational basis for action scaling turns out to be (e.g. some structure in the 'global array', à la Stoffregen) still won't entail that spatial perception should be altered by such scaling. There's a nice Gibson quote on this issue that didn't make it into my paper: "When the constant properties of constant objects are perceived (the shape, size, color, texture, composition, motion, animation, and position relative to other objects) the observer can go on to detect their affordances" (Gibson, 1966, p. 285, emphasis added). So, it was at least Gibson's own view that spatial perception is prior to affordance detection, and I suppose that would be my view too (though I'm no Gibson scholar, and it's quite possible that he amended this view in the 1979 book; I'm sure you'd know better than I!).

    At any rate, I'm still a bit floored by your taking issue with perceptual phenomenology as a relevant source of evidence here. If you think there are technical reasons that the embodied perception / paternalistic vision view doesn't require differences in perceptual experience as a result of embodied 'rescaling', then fair enough — although I continue to disagree on this issue (both in spirit and with respect to particular details, such as the Glennerster et al. study), and I think Proffitt's use of the Koffka quote is telling; at bottom this is an issue about how things look, and so it seems to me that phenomenology simply must be relevant. As I mentioned earlier, even if embodied perception doesn't require that molehills grow into mountains before our eyes, I think it is still quite easy to generate cases where there should be subjectively appreciable 'upshots' of embodied rescaling, including changes in appearance of relative size, slant, or distance, etc.

    And do let me know what you think of the Linkenauger paper! It is a really fascinating set of results, but I think the precise nature of its relation to embodied perception is yet to be determined. I'm sure Sally will be up for discussing that though! I've spoken with her at length about these and related issues and she has sharpened my thinking about them in all sorts of ways.

    1. "At any rate, I'm still a bit floored by your taking issue with perceptual phenomenology as a relevant source of evidence here."
      Given how much of the important work of vision is done without any conscious awareness at all, I gave up worrying about what things look like a long time ago and got more interested in how they affect action.

    2. You may be underestimating how much information - not just geometrical - the conscious percept contains.

  4. I wouldn't worry so much about snark. It's emotion that gets people motivated to get into it. There's actually too much of the opposite problem - people being too polite to say what they want to say.

    1. I'm all for being feisty, but I've learned that snark is distracting. Think about what's wrong with basically every argument on the internet and you'll see what I mean.