Notes from Two Scientific Psychologists: What are the units that perception measures the world in? Firestone vs Proffitt

Wednesday, 20 November 2013

What are the units that perception measures the world in? Firestone vs Proffitt

Perception is an act of measurement, and, like all acts of measurement, it needs a scale in order to be useful. Think about placing something on your kitchen scales; all that actually happens is that the object presses on the scale and the scale registers that something has changed by some amount in response (the location of a tray, for example). In order to know what that change means, the change is presented to us on a calibrated scale (by moving a needle around to point at some number, for example). The needle always moves the same amount for a given weight but the resulting number can vary (you might have an imperial rather than metric kitchen scale, for example). Without the scale, you can say that one thing is heavier than another by noting that it moves the scale more (this is an ordinal evaluation) but you need the scale in order to say what the weight difference is (the metric evaluation).

Visual perception measures the world in terms of angles; objects subtend a visual angle that depends on their size, distance, etc. Your thumbnail held at arm's length is about 1° of visual angle. You can get ordinal information directly from angles (the fact that one thing is closer/bigger/etc) but you need a scale to get the metric information required to use vision to control action. For example, you need to perceive how big something actually is in useful units in order to scale your hand size appropriately when grasping it; relative size doesn't help. One of the fundamental questions in (visual) perception research is, therefore, what are the metric units that the perceptual systems use to scale their measurements?
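To make the point about angles concrete, here is a minimal sketch of the standard visual-angle formula, angle = 2·atan(size / (2·distance)). The thumbnail width and arm length are assumed, illustrative numbers rather than measurements from any study, and the function name is hypothetical.

```python
import math

def visual_angle_deg(size, distance):
    """Visual angle in degrees subtended by an object of a given size,
    viewed face-on at a given distance (same units for both)."""
    return math.degrees(2 * math.atan(size / (2 * distance)))

# Assumed, illustrative numbers: a 1.2 cm wide thumbnail held 70 cm away.
thumbnail = visual_angle_deg(1.2, 70.0)

# An object twice as wide at twice the distance subtends the same angle,
# so the angle on its own does not fix metric size.
twice = visual_angle_deg(2.4, 140.0)

print(f"thumbnail: {thumbnail:.2f} deg, doubled object: {twice:.2f} deg")
```

The two angles come out the same, which is the sense in which an angle carries only relative information until some scale is applied to it.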

Dennis Proffitt has been studying this question for a long time and is in favour of task-specific, body-scaled units. His evidence comes from studies in which people perceive their environments differently as a function of their ability to act on that environment. Probably the most well-known example is the study that showed people judge hills to be steeper when they are wearing a heavy backpack (Bhalla & Proffitt, 1999). The idea is that the backpack will make traversing that hill more difficult, and when the visual system measures the slope, it scales its measurement in line with this perceived effort. The hypothesis is that this is functional; it's a feature of the visual system that helps us plan appropriate actions.

Perspectives on Psychological Science recently hosted a point-counterpoint debate on this topic. Firestone (2013) reviewed the literature on this type of action-scaling in perception and concluded that not only do the data not really support Proffitt's account, but that this account couldn't work even in principle. Proffitt (2013) rebutted Firestone's arguments and defended his view. I'm interested in this because Proffitt is at least a little ecological, and the basic idea he defends is one I would defend as well (although not in the form that he proposes). So who won?

The paternalism of spatial vision
Firestone begins by framing Proffitt's theory as paternalistic. Proffitt's theory is that vision contains systematic biases that are introduced in order to make us behave in a certain way. These biases are 'well intentioned white lies' that 'bias perceivers towards favourable actions' (Firestone, 2013, p. 456). Proffitt rejects this characterisation as inaccurate; this body scaling is not about lying, it's just about calibration. Although I think Proffitt (and to a greater extent his student, Jessica Witt) do sometimes talk in a way that opens them up to the paternalism label, paternalism is not really a fair label because it implies being misled. Calibration is not a process of distortion, it's a critical part of measurement, and just because the perceived result doesn't match what a physicist might produce doesn't make the perceiver in error (an argument I lay out in detail here).

That aside, let's score the main arguments.

Argument 1: The effect sizes are the wrong size for the job
Firestone's first real argument against Proffitt is about whether or not the action-scaling found in their experiments can possibly be functional. He reviews a range of results and notes that the resulting change in perceived slope (or passability of an aperture, or what have you) is typically quite small, and generally smaller than the actual change that has occurred as a result of the experimental manipulation. So if vision is telling us white lies, they are not very useful white lies because they don't match the change in the world.

Proffitt has three replies: his account is the only one that even predicts these effects should occur in the direction they do (irrelevant); adaptation takes time (true, but weak; see below); and you can see good matching between bias and action in overlearned tasks such as grasping (ok, but still weak). In effect he's arguing that calibration has a dynamic (i.e. it occurs over time and in a characteristic way), which is true. The problem is that Proffitt has never studied the dynamics in any detail (see Mon-Williams & Bingham, 2007 for an example of how to do this), which he should have done by now if he wants to make this argument. In addition, if the effects are to be functional, then the adaptation really needs to occur on the timescales he measures. Firestone takes this one because there is critical work yet to be done. Score: 1-0 Firestone.

Argument 2: Action specific units cannot be compared
Firestone notes that if you want to choose how to traverse some distance, action scaling is a problem because the units for walking, running, throwing etc will all be different. If the scales are different, you can't compare the measurements to pick the best option. Proffitt agrees, but notes that this isn't a problem because he never claims the system tries to compare measurements to choose actions and the evidence suggests that the action-scales really are only applied within relevant tasks (what he calls action boundaries).

Action scaling is indeed task-specific and cannot be directly compared, and Proffitt is right that it's not a problem (because action selection is about affordances, not action scaling; I've argued this in some talks on throwing recently that I should really write up). Score 1-1, although note that Proffitt doesn't propose a solution to the problem of action selection; I wouldn't pick on this except that he gets huffy about Firestone critiquing without replacing.

Argument 3: There's no information for ability scaling
In order to apply a metric to a visual measurement, there has to be a relation between these two things that has detectable consequences; there needs to be information about the scale as well as the measurement. Firestone discusses eye-height scaling of object sizes. The horizon always cuts objects at eye-height, regardless of distance. Simple geometry means that, because the visual system has access to both the angular size of the object and the angular size of the part below the horizon, it also has access to their ratio (the size of the object in eye-height units). Eye height is a viable scale because it has visually detectable consequences; action scales such as walkability or jumpability do not, and therefore cannot be viable scales.
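The eye-height geometry can be sketched in a few lines. This is an illustration under stated assumptions, not anything taken from Firestone's paper: flat ground, an object standing on that ground, and a horizon at eye level. The function names and the 1.6 m / 4 m values are hypothetical. Strictly, the ratio is a ratio of tangents of the angles; for small angles it is close to the ratio of the angles themselves.

```python
import math

def angles_from_scene(height, distance, eye_height):
    """Forward model: the two angles (radians) available at the eye.
    Returns (total angular size of the object,
             angular size of the part below the horizon)."""
    below = math.atan(eye_height / distance)             # horizon down to the base
    above = math.atan((height - eye_height) / distance)  # horizon up to the top
    return below + above, below

def size_in_eye_heights(total, below):
    """Recover object height in eye-height units from the angles alone."""
    above = total - below
    return (math.tan(below) + math.tan(above)) / math.tan(below)

# Hypothetical scene: eye height 1.6 m, object 4 m tall, three viewing distances.
ratios = []
for d in (10.0, 50.0, 200.0):
    total, below = angles_from_scene(4.0, d, 1.6)
    ratios.append(size_in_eye_heights(total, below))

print(ratios)  # roughly 2.5 eye-heights at every distance
```

The recovered size is the same at every distance, and neither the distance nor the eye height in metres is needed to compute it. That is what makes eye height usable as a scale in this argument.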

Proffitt appeals to dynamics again; calibration takes time and purposeful behaviour (you act, perceive the consequences and correct the errors). He also highlights that eye-height scaling, while a nice simple example, doesn't actually seem to get used much anyway.

Proffitt misses a couple of key points; calibration requires information, so this is a problem for him. Of course, there are non-visual sources of information that might solve the problem (Firestone only talks about vision). Firestone gets the point for being basically right and for Proffitt not paying enough attention to the critique. Score: 2-1 Firestone.

Argument 4: Visual space doesn't look like it's warping
Firestone says that if vision is distorting space to help guide action, we should experience this warping (because some of the effects can be quite large, contra Argument 1). But we don't. Proffitt has three replies. First, he states that vision is trying to provide stable access to the environment, so lots of perturbations are filtered by the system. As an example, he notes the fact that our view of the world does not whizz around as we saccade our eyes three times a second because of saccadic suppression. He then notes some work which found that we don't notice even when a (virtual) environment really is shrinking and growing (Glennerster et al, 2006). Finally he notes that changing which ruler you're using doesn't actually change the locations of objects in space. Think about measuring a gap in centimetres, then switching to an inch ruler. The distance is the same, the number has changed, but this only matters if you keep acting as if you still used a centimetre ruler. In the same way, people use the current calibration, not the previous one.
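The ruler analogy can be made concrete with a small sketch. Everything here is illustrative: the 30 cm gap and the function names are assumptions, and the only fixed fact is that one inch is 2.54 cm.

```python
# Length of one unit of each ruler, expressed in centimetres.
CM_PER_UNIT = {"cm": 1.0, "inch": 2.54}

def measure(gap_cm, ruler):
    """Number read off the chosen ruler for a physical gap."""
    return gap_cm / CM_PER_UNIT[ruler]

def act(reading, ruler):
    """Physical distance (cm) covered by moving `reading` units of `ruler`."""
    return reading * CM_PER_UNIT[ruler]

gap_cm = 30.0                          # the gap itself never changes
reading_cm = measure(gap_cm, "cm")     # number on the centimetre ruler
reading_in = measure(gap_cm, "inch")   # a different number, same gap

consistent = act(reading_in, "inch")   # measure and act with the same ruler
stale = act(reading_in, "cm")          # act as if still using the old ruler

print(reading_cm, round(reading_in, 2), round(consistent, 2), round(stale, 2))
```

Acting with the ruler you measured with covers the gap; an error only appears when a reading from one ruler is used with the other.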

Proffitt wins this one across the board. Appeals to visual experience never help in arguments about how vision actually does what it does. While we do perceive the world, we do so by detecting information, and we have no real access to the experience of detecting information per se. More important, however, is Proffitt's point about how all that's changing is the ruler. We have no privileged access to the world; all we 'know' is what the calibrated detection of information tells us, and different calibrations are incompatible (see Argument 2) which means there's no way to compare them and identify a difference. As Firestone notes in Argument 3, you need information to detect everything, and this applies to him as much as to Proffitt. Final Score: 2-2.

Tiebreak: Everyone loses
I really want to like Proffitt's work. His heart is in the right place, after all; perception really is scaled in task-specific action units. But his work only ever scratches the surface and rarely deeply enough to justify his conclusions. He needs task dynamics, he needs to frame his work in terms of calibration and he is (as Firestone rightly points out) in desperate need of some information to back this all up.

Given this, I really wanted Firestone's critique to have the requisite substance, because a solid review of this literature and an analysis of what's missing would be a valuable contribution to the literature. But he really only has one major point (about information) and while he's right to say Proffitt suffers here, he's wrong to say action scaling can never have informational consequences. It might not have visual consequences the way eye-height does, but that's not the point.

So really the losers here are us, the readers. Instead of a substantive analysis and defence of a problematic but on-the-right-track theory of perceptual scaling, we got a mixed bag of viable points generally poorly defended and hidden in amongst some irrelevant information and a surprising amount of snark. I'm all for being feisty and punchy when it's called for but everyone just seemed a bit pushy throughout which made this exchange less productive than it could have been. Full marks to Perspectives, though, for allowing the debate and successfully negotiating what I'm sure was a complicated review process.

References

Bhalla M. & Proffitt D.R. (1999). Visual-motor recalibration in geographical slant perception, Journal of Experimental Psychology: Human Perception and Performance, 25 (4) 1076-1096. DOI: 10.1037//0096-1523.25.4.1076

Firestone C. (2013). How "Paternalistic" Is Spatial Perception? Why Wearing a Heavy Backpack Doesn't--and Couldn't--Make Hills Look Steeper, Perspectives on Psychological Science, 8 (4) 455-473. DOI: 10.1177/1745691613489835

Glennerster A., Tcheang L., Gilson S.J., Fitzgibbon A.W. & Parker A.J. (2006). Humans Ignore Motion and Stereo Cues in Favor of a Fictional Stable World, Current Biology, 16 (4) 428-432. DOI: 10.1016/j.cub.2006.01.019

Mon-Williams M. & Bingham G.P. (2007). Calibrating reach distance to visual targets, Journal of Experimental Psychology: Human Perception and Performance, 33 (3) 645-656. DOI: 10.1037/0096-1523.33.3.645

Proffitt D.R. (2013). An Embodied Approach to Perception: By What Units Are Visual Perceptions Scaled? Perspectives on Psychological Science, 8 (4) 474-483. DOI: 10.1177/1745691613489837

11 comments:

Chaz Firestone, 27 November 2013 at 04:51
Hi Andrew,

Thanks for engaging so deeply with this discussion — I'm glad to have your voice and expertise in the conversation!

Given that you scored most of the preliminary rounds for me (and the one early round you scored for Proffitt seems to have been given with some reluctance), I'll mostly focus on Argument 4, which was the argument that (e.g.) if backpacks really do make hills look 25% steeper, then we should be able to experience this for ourselves. I thought this argument was really a KO, so I'm disappointed that you're still standing! Here are some thoughts:

1. I'm worried that the concerns you raise with visual phenomenology are misplaced. Of course I agree that the mind is not so constituted as to grant direct access to the inner workings of our visual systems. If anything, my paper defended that very view: I think vision is modular, and one of the properties usually attributed to modular input systems is the lack of 'access to interlevels'. So, I'm very much with you that "we have no real access to the experience of detecting information per se", and that we're not going to figure out how we see just by meditating on it. In fact, I take this to be the central lesson of cognitive science; we don't know 'from the inside' how our minds really work.

Fortunately, the fact that we cannot directly access the information-processing done by the visual system does not at all entail that phenomenology "never helps" ("never"!) in discussions about how perception works. Access to the inner workings of a system is not the same as access to its output, and when a theory about how perception works also makes explicit claims about *awareness*, then the way things look to us is perfectly relevant. This strikes me as an utterly pedestrian thing to say, especially in a field like vision science where 'demos' are so ubiquitous in research. Indeed, imagine that a color vision scientist claimed to discover some manipulation that produced huge shifts in color reports, but that nobody was able to experience these effects for themselves when they underwent the manipulation; surely we would be skeptical (and rightly so) that the manipulation really affects perception. And it seems to me that this sort of evidence is commonplace in cognitive science: For example, we can, famously, determine whether a sentence is grammatical even though we cannot access the rules our minds apply to deliver such determinations. Clearly, the lack of access to the processing rules themselves doesn't mean that intuitions about grammaticality are irrelevant to psycholinguistics; here, too, to think otherwise would be to conflate access to the workings of the system with access to its output. This situation is perfectly analogous to the present case: We may not know *how* we see, but surely we know at least something about *what* we see! (in the right circumstances, given certain assumptions, etc.). 
If someone is going to tell me that wearing a heavy backpack makes hills look steeper — really *look* steeper — then I'd better see a steeper hill when I put on a backpack, especially when (a) the alleged increase in steepness is big; (b) I know exactly what to look for and can attend appropriately (which rules out appeals to change blindness, inattentional blindness, etc.); and (c) I can attend for as long as I please (which rules out hysteresis effects).
Andrew, 27 November 2013 at 10:05
Thanks, Chaz, glad you enjoyed it.

Some thoughts. My take on the argument about visual experience is that it's dealt with by the argument about incommensurability (which I give to Proffitt with no hesitation, by the by; I just point out he got huffy where he had no need or right to :). Specifically, because different action scales are different scales and therefore cannot meaningfully be compared, then the change from one scale to another scale also won't be detectable in any meaningful way.

The hill has not actually changed steepness; what has changed is the relation between the hill and the person's ability to climb the hill. At time 0, with no pack, the relation is expressed in 'ability of person to climb hill' units (crudely speaking). At time 1, the backpack goes on and after some recalibration, by time 2 the relation is now expressed in 'ability of person wearing pack to climb hill' units. These are incommensurable and people only have access to one of these at a time, thus Proffitt's account actually predicts no detectable warping.

Now, I will grant you that a lot of this work is very careless with the way they talk about their approach, so you're 100% right to pick up on that. But I think the actual nuts and bolts of the theory, as well as what the data suggest, means that their laziness aside, Proffitt is right not to expect detectable changes.

A side note: people are rubbish at comparing lengths in orthogonal dimensions. This has led people to conclude that the relevant geometry that describes visual space perception is affine, not Euclidean. So that's a complicated example, is all.

I'll think more on the rest as well.
Chaz Firestone, 6 December 2013 at 05:45
Ah, I see that I misunderstood then — for whatever reason I'd thought you meant 'consequences for visual experience' rather than 'consequences for visual information' (though, looking back, it should have been clear...). As you say, I agree that there are no relevant visual consequences, in that sense, of actions. Whether or not it's the point is hard to say! Since neither proper action calibration nor affordance perception requires any effect on spatial perception, it's quite possible (even probable) that whatever the non-visual informational basis for action scaling turns out to be (e.g. some structure in the 'global array', à la Stoffregen) still won't entail that spatial perception should be altered by such scaling. There's a nice Gibson quote on this issue that didn't make it into my paper: "When the constant properties of constant objects are perceived (the shape, size, color, texture, composition, motion, animation, and position relative to other objects) the observer can go on to detect their affordances" (Gibson, 1966, p. 285, emphasis added). So, it was at least Gibson's own view that spatial perception is prior to affordance detection, and I suppose that would be my view too (though I'm no Gibson scholar, and it's quite possible that he amended this view in the 1979 book; I'm sure you'd know better than I!).

At any rate, I'm still a bit floored by your taking issue with perceptual phenomenology as a relevant source of evidence here. If you think there are technical reasons that the embodied perception / paternalistic vision view doesn't require differences in perceptual experience as a result of embodied 'rescaling', then fair enough — although I continue to disagree on this issue (both in spirit and with respect to particular details, such as the Glennerster et al. study), and I think Proffitt's use of the Koffka quote is telling; at bottom this is an issue about how things look, and so it seems to me that phenomenology simply must be relevant. As I mentioned earlier, even if embodied perception doesn't require that molehills grow into mountains before our eyes, I think it is still quite easy to generate cases where there should be subjectively appreciable 'upshots' of embodied rescaling, including changes in appearance of relative size, slant, or distance, etc.

And do let me know what you think of the Linkenauger paper! It is a really fascinating set of results, but I think the precise nature of its relation to embodied perception is yet to be determined. I'm sure Sally will be up for discussing that though! I've spoken with her at length about these and related issues and she has sharpened my thinking about them in all sorts of ways.
Anonymous, 3 January 2014 at 08:26
I wouldn't worry so much about snark. It's emotion that gets people motivated to get into it. There's actually too much of the opposite problem - people being too polite to say what they want to say.