Friday, 26 April 2013

The Information Available in Pictures

I've become fascinated with the problem of pictures and how they relate to the things they are pictures of. One reason is the regular use of pictures of objects to study how the affordances of those objects might ground cognition; this, I think, is a major problem.

A more positive reason is that, like language, pictures contain information about something they themselves are not (see Sabrina's information taxonomy). I have a hunch that an ecological study of picture perception might help guide an ecological study of language. The former can take more direct advantage of the work already done on how we perceive meaning in events via ecological laws, and so it could act as a bridge, a point along the way to the conventional world of language meaning.

Finally, the topic seems to be woefully understudied in the ecological approach. There is some work, however. In the comments section on my rant about using pictures to study affordances, I was pointed to the work of John Kennedy (a Gibson student, now emeritus at the University of Toronto). I have downloaded his 1974 book, 'A Psychology of Picture Perception', and am working my way through it. Matthieu de Wit then linked me to an archive of a discussion, in papers, between Gibson and Ernst Gombrich about picture perception. I thought I'd start with Gibson (1971), The Information Available in Pictures, to begin to sketch out what we know and what we don't.


The question at hand is this: can pictures provide the same information as the things they depict?

Two (incorrect) theories about what pictures are
Gibson first discusses two theories about pictures and how they can inform us about their subject:
  1. Pictures re-present the sheaf of light rays the eye would intercept if you were standing in the right place looking at the real world version of the scene in the picture. This then simply triggers the same chain of information processing events ending in the same percept. This is based in standard projective geometry and the idea that vision begins with a retinal image. Besides being an incorrect analysis of vision, this definition of how a picture informs us about its subject fails to explain anything other than photographs or realistic paintings. Line drawings and (especially) caricatures cannot be explained using these tools. Pictures therefore do not inform us about their subject by presenting identical sensations as their subject.
  2. Pictures, like language, are symbol systems. Art is not constrained to projective geometry but instead develops new 'languages' with which to express concepts, and the observer must learn to 'read' these in order to be informed by the picture about its subject. Gibson suggests that the only conventional elements of picture perception are the 'rules' for observing a picture in order to be convinced it's real (monocular vision, no slant on the image, etc; 'aperture' viewing, in effect), and that everything else about viewing a picture is just an instance of visual perception and will follow the same rules.
Ecological optics and pictures
Gibson then talks about those rules, grounded in ecological optics. He suggests that a picture is a display of optical information. Information is to be found in the invariant relations that are preserved over transformations; information is what remains stable over time and as we move around. These invariants are typically revealed by our motion simply by being the things that remain the same. However, the invariants are also present at all points in time as we move and change views, and so a picture of one of those views should also contain the invariant elements. If it doesn't, it is not a good picture of the object.
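
To make the idea of an invariant that is present in every single view a little more concrete, here is a minimal sketch in Python. It uses the cross-ratio of four collinear points, a textbook projective invariant; this is my choice of illustration, not an example taken from Gibson's paper, and the points, station points and function names are all hypothetical. The projected positions change as the station point moves, but the cross-ratio computed from any one view is the same.

```python
import math

def project(point, eye):
    """Perspective-project a 3D point onto the plane one unit in front of the eye (looking along +z)."""
    x, y, z = (p - e for p, e in zip(point, eye))
    return (x / z, y / z)

def cross_ratio(a, b, c, d):
    """Cross-ratio of four collinear points given by a coordinate along their line."""
    return ((c - a) * (d - b)) / ((c - b) * (d - a))

# Four collinear points along an edge receding in depth (made-up values).
ts = [0.0, 1.0, 2.0, 4.0]
origin, direction = (-1.0, 0.0, 4.0), (1.0, 0.0, 1.0)
points = [tuple(o + t * d for o, d in zip(origin, direction)) for t in ts]

def view(eye):
    """Image x-coordinates of the four points from one station point, plus their cross-ratio."""
    xs = [project(p, eye)[0] for p in points]
    return xs, cross_ratio(*xs)

xs_1, cr_1 = view((0.0, 0.0, 0.0))    # first station point
xs_2, cr_2 = view((2.0, 1.0, -1.0))   # a different station point

positions_differ = not all(math.isclose(a, b) for a, b in zip(xs_1, xs_2))
invariant_shared = math.isclose(cr_1, cr_2) and math.isclose(cr_1, cross_ratio(*ts))
print(positions_differ, invariant_shared)
```

The two views do not share a single image coordinate, yet each one on its own carries the same cross-ratio as the edge itself. That is the sense in which a snapshot can contain an invariant even though the invariant is most easily noticed across a transformation.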

His formal definition of a picture is:
A picture is a surface so treated that a delimited optic array to a point of observation is made available that contains the same kind of information that is found in the ambient optic arrays of an ordinary environment. (pg 31)
and he notes that given this,
...the optic array from a picture and the optic array from a world can provide the same information without providing the same stimulation. (pg 31)
When we perceive a real object from a given station point, we don't simply see the form from that side, we perceive the object as a whole. If the same information is present, then the picture of the object can be perceived in the same way as we perceive the object; as Gibson phrases it, "as an object in the phenomenal world, not a form in the phenomenal visual field".

Given this, Gibson seems to be saying that pictures can in principle provide the same information for the affordances of an object as the object itself.  

A note: Since Gibson, work on identifying information has progressed and largely focused on motion invariants. Take relative phase; it is specified by the relative direction of motion, and a snapshot of two things moving does not tell you their relative phase because direction is only specified over motion. So not all information can be present in an image; in fact, probably most information can't. However, something like the graspability of an object entails information about object size (e.g. the maximum object extent; Mon-Williams & Bingham, 2011), and information of this kind could conceivably be present in a picture. If the object and the picture of the object are not the same size, for example, the grasping affordances will vary in ways that affect the validity of using the picture. It is therefore always an empirical question as to whether the information is present and perceived. This has never been tested for the 'grounding cognition' studies I critiqued here.
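
The relative phase point can be sketched in a few lines of Python. This is only an illustration under simple assumptions: two unit-amplitude sinusoidal oscillators, with hypothetical function names and phase values chosen for convenience. Two different relative phases produce exactly the same snapshot of positions; only the directions of motion, which a still image does not contain, tell them apart.

```python
import math

def snapshot(phase_a, phase_b):
    """Positions of two unit-amplitude oscillators at a single instant."""
    return (math.sin(phase_a), math.sin(phase_b))

def directions(phase_a, phase_b):
    """Direction of motion of each oscillator at that instant (+1.0 or -1.0)."""
    return (math.copysign(1.0, math.cos(phase_a)),
            math.copysign(1.0, math.cos(phase_b)))

# Case 1: both oscillators at the same phase (relative phase 0 degrees).
in_phase = (math.pi / 6, math.pi / 6)
# Case 2: second oscillator 120 degrees ahead of the first.
offset = (math.pi / 6, 5 * math.pi / 6)

same_positions = all(
    math.isclose(p, q) for p, q in zip(snapshot(*in_phase), snapshot(*offset))
)
same_directions = directions(*in_phase) == directions(*offset)
print(same_positions, same_directions)
```

Both cases put the two oscillators at the same positions, so a single frame cannot distinguish a relative phase of 0 from one of 120 degrees. The difference only shows up in which way each one is moving.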

The duality of pictures

Gibson talks about something I've been thinking about a lot, namely the fact that we can perceive both the information that there is a picture present and the information about the thing depicted in the picture. Gibson notes that adults are able to notice information about both of these aspects, and shift attention from one to the other. He mentions an experiment he ran in which he presented people with a large photo-mural of a street lined with trees, and asked people to judge distances to the trees in the picture and to the picture itself. People did both tasks happily and in the former task with about the same accuracy and confidence as in the real world. (Mirrors have the same duality but the information 'depicted' is so overwhelmingly stable and valid that we are not good at perceiving the surface per se; Lawson et al, 2007). 

Gibson denies that this means we ever mistake the picture for reality. By the perspective projection theory above, if the picture and the scene kick off the same process of cognitive enrichment, then they would be mistaken for one another. But this is not how perception works; it's about information. We would never mistake the edge of a picture for the frame of a window (because when we move there would be no accretion and deletion of optical texture to specify the world coming in and out of view). The dual nature of the representation is therefore always apparent, and he says
A mediated perception cannot become a direct perception by stages. No matter how faithful, how lifelike, how realistic a picture becomes, it does not become the object pictured. Perception at second hand will never be perception at first hand....The experience obtained by a picture is as if one were confronted with a material layout of light-reflecting surfaces, but only as if. (pg 33)
Summary
This paper wanders around quite a lot. At times Gibson seems to be saying that pictures can inform us about the world the same way as the world does, so pictures should be useful experimental tools. But throughout he draws attention to critical limits on how well this works, and I think on balance that this paper is more about clarifying the discussion than endorsing pictures as stand-ins for the world.

I think the key points are these:
  • Perceiving a picture follows the same rules as perceiving anything else, and pictures therefore only inform us about something to the extent that they contain the relevant information.
  • Pictures present two sets of information: information about the presence of a depiction and information about the thing depicted. We are therefore never confused that the picture is the reality, unless steps are taken to remove the information about the presence of a depiction. Even the compelling experience of a mirror doesn't actually completely fool the system. 
  • Information is invariant over time and space, and so is typically present at all moments in the event; therefore any snapshot from the event should also contain the information. (This only holds for information that doesn't entail motion). A picture therefore can, in principle, contain the same information as the event and a trained system can be sensitive enough to detect this information (which is typically made really obvious by being invariant over a transformation). 
Pictures can provide the same information as the thing they depict. This is important, because it is clearly the case that we find pictures informative about those things! It is not the case that pictures are always informative or informative about all of the same things as the thing they depict. You must still do the ecological optics analysis to identify what information is present and what it specifies. This analysis will reveal that pictures have an inherent duality that will always affect how they are perceived and which must be addressed.
It is now possible to distinguish between the pictorially mediated perception of the features of a world and the direct perception of the features of the surroundings, and yet to understand that there is common information for the features they have in common. (pg 34)

References
Gibson, J. J. (1971). The information available in pictures. Leonardo, 4(1). DOI: 10.2307/1572228

Kennedy, J.M. (1974). A psychology of picture perception. San Francisco: Jossey-Bass.


Lawson, R., Bertamini, M., & Liu, D. (2007). Overestimation of the projected size of objects on the surface of mirrors and windows. Journal of Experimental Psychology: Human Perception and Performance, 33(5), 1027-1044.

Mon-Williams, M. & Bingham, G.P. (2011). Discovering affordances that determine the spatial structure of reach-to-grasp movements. Experimental Brain Research, 211(1), 145-160.

5 comments:

  1. Fascinating stuff... I've been thinking a lot about the accounts of symbolic behaviour presented here and this has added another bunch of reading and kicked off more thoughts.

    Random thought though:

    "Gibson denies that this means we ever mistake the picture for reality." - This immediately made me think of the (possibly apocryphal) stories of the audiences who first viewed animation and moving pictures reacting to the image as real, even though they were projected on (presumably perceivable) 2D screens in low quality. Obviously, moving pictures would be in a different class of "picture" to still images since they would contain a lot more information (depending on the camerawork), but it raises the possibility that when pictures are first encountered they could be mistaken for reality, however briefly, if the organism has no history of responding to that sort of symbol.

    Not that I think that would be a real problem for the account, and it'd be nearly impossible to test unless we invent a new medium for presenting symbols and keep it terribly secret while we experiment. (Or find a tribe in deepest darkest somewhere that has no visual art... which seems unlikely... though I do seem to remember something about certain tribes having trouble with line drawings, so that might give something to an account whereby a particular history is required to pull the correct information from an image)

    - Anthony

    Replies
    1. Moving pictures are a more interesting case; they have the potential to contain more kinds of information and thus are more capable of being realistic. To follow Gibson's point, though, they still don't flow correctly (ie when you move your head your view of the scene depicted doesn't move correctly) so there is again this necessary duality and information for the duality.

      He also suggests that the 'naive' (no special training) perspective is to perceive invariants, not forms; in order to draw forms, you have to learn to perceive forms and not invariants. You can do it, but you must learn to do it. So I think he's actually suggesting that your tribe with no experience of images would be less likely to confuse it for reality, because it's so unlike the real thing.

      Developmentally too, he says that children must learn to draw forms (recapitulating this historical, cultural process in developmental time). So there's a potential test: are children less likely to mistake a picture for the real thing?

  2. Minor first thought re mirrors... Have you seen the relevant deleted scene from T2? If you don't care about Terminator history and lore, just skip to the punchline starting at about 5 minutes https://www.youtube.com/watch?v=F1uTCMym2tM

    Anthony,
    Do you know anything about the cool research on Aboriginal Australian art and their interaction with missionaries? It is actually a really good context for your thinking, because they have pictures, but it works very differently from the European tradition. For example, their art assumes a view from above (because it grows out of a tradition of mixed drawing/story-telling in the sand), rather than from the side.

    Replies
    1. I had a vague awareness of it existing but couldn't remember details.

      Thanks!

  3. Thanks for sharing, I will bookmark and be back again...


