I'm particularly interested in the information in pictures and mirrors: surfaces that present information about being surfaces and about being something else. This post is me thinking out loud about the implications. These are by no means my final thoughts on the matter; I'm just taking the taxonomy for a spin and seeing where I end up. Feedback welcome!
Ecological Direct vs Ecological Indirect Perception
One of the most important things to know about perception is that while we detect information (structure in local energy arrays), what we perceive is what that information is about. In perception-for-action, that meaning is about the dynamics of the event that created the information and it is grounded in the law-based process by which event dynamics interact with energy arrays such as light. This is ecological, direct perception.
There are other cases, however, namely language and picture perception, in which the relevant meaning is not about the local dynamics. Instead, the meaning of, say, a speech event is the meaning of the words created by the local dynamics, while a picture is about the thing it depicts, not the surface doing the depicting. This, I'm going to say, is ecological, indirect perception.
We argue (Wilson & Golonka, 2013) that while this distinction matters, it doesn't make these tasks different in kind, because the information is not different in kind (Gibson, 1971 makes this argument about pictures in particular, and we extend this to language). From the first person perspective of the organism, the task is always just 'detect information and learn what that information means', and humans happen to come prepared and able to learn the indirect meaning of language, rather than the direct one.
A nice example of this came up recently. Sabrina and I were chatting with a colleague in Speech and Language therapy. She teaches students the skills needed to be a speech therapist, and this requires them to learn to listen to someone speak and to perceive the details of the articulation, rather than the linguistic contents of the speech. The students, she said, find this extremely difficult to do - they persist in happily understanding what the person has said while not being able to latch onto the underlying dynamics of the speech act itself. To a trained language user, speech events point to their linguistic meaning, not to the dynamics that created them.
If they are not different in kind we can therefore study them the same way; language is special but not magical! This is important, and we will continue to push hard on this point. But we also have to face up to the consequences of the differences. Sabrina has begun this with her taxonomy of information and I want to use this to frame a discussion of one important aspect of ecological indirect perception. The SLT example above illustrates this nicely: cases of indirect perception (where the meaning is not about the local dynamics) still produce information about the local dynamics:
- Language information is about the linguistic content, but the information is structured by language dynamics and by articulation dynamics.
- Mirrors reflect optic flow that specifies a 3D environment, not a 2D surface (and people perceive the former more clearly than the latter).
- Pictures are static and this limits what they can do, but they can provide information about both the thing depicted and the surface doing the depicting.
Information can specify the world, but not by being identical to it
As I've discussed before in detail, we need our perceptual systems to be capable of informing us about the dynamics of objects and events in the world. Our perceptual access to those dynamics is mediated by information, which is kinematic. Kinematics can specify dynamics, so that detecting the former can be equivalent to perceiving the latter. Direct perception is therefore possible.
The fact that perception of the world must work via this kinematic layer has two consequences.
- First, behaviour is organised with respect to the information for the underlying dynamics, and not the underlying dynamics per se. Kinematic information can specify dynamics, but it doesn't do this by being identical to the dynamics, and so we must identify both the relevant world properties (affordances) and the information. The simplest example comes from my work on coordination dynamics: the relevant 'world' property is relative phase, while the information which specifies relative phase is the relative direction of motion. The variation in behaviour that we see in coordinated rhythmic movement (0° easy, 180° harder, everything else difficult without training) is driven by the stability of relative direction, not relative phase.
- Second, we have to learn what the kinematic information that we detect means. We come equipped with perceptual systems that can detect kinematic patterns and tell the difference between them. But we have to spend a lot of time learning before we come to use those different patterns as information for different underlying events. A good example of this is work done by Wickelgren & Bingham (2001), who showed that infants could readily distinguish between the kinematic patterns of normal vs. time reversed events, but showed no preference for the dynamics of the 'real' event over the 'impossible' event. Adults have no problems seeing that one is right and the other is wrong, but this took extensive learning.
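To make the relative phase vs. relative direction distinction a little more concrete, here is a toy sketch. It is only an illustration under my own simplifying assumptions, not the model from the coordination work: two oscillators are treated as pure sinusoids at the same frequency, and the function name `same_direction_fraction` and the 'fraction of the cycle spent moving in the same direction' measure are hypothetical choices made for this example.

```python
import math


def same_direction_fraction(relative_phase_deg, samples=3600):
    """Fraction of one cycle in which two sinusoidal oscillators
    (same frequency, offset by relative_phase_deg) move the same way.

    Assumption: x1 = sin(theta), x2 = sin(theta + phi), so the
    velocities are proportional to cos(theta) and cos(theta + phi).
    """
    phi = math.radians(relative_phase_deg)
    same = 0
    for i in range(samples):
        # Sample mid-bin so we never land exactly on a velocity zero-crossing.
        theta = 2 * math.pi * (i + 0.5) / samples
        v1 = math.cos(theta)
        v2 = math.cos(theta + phi)
        if v1 * v2 > 0:  # velocities share a sign: same direction
            same += 1
    return same / samples


if __name__ == "__main__":
    for phase in (0, 45, 90, 135, 180):
        print(phase, round(same_direction_fraction(phase), 3))
```

Under these assumptions, relative direction is constant at 0° (always the same direction) and at 180° (always opposite), and it switches back and forth within each cycle at every phase in between. That is one rough way to see why relative direction could be a stable pattern at the two extremes and an unstable one elsewhere. The sketch says nothing about why 180° is harder than 0°.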
Learning the meaning of information about action vs. language
Sabrina has worked out a taxonomy of information, in which she lays out the various dimensions along which information varies. One critical dimension is Aboutness, which Sabrina explains this way:
Pole 1 - The meaning of the structure in the array is about* the event or property specified by the structure.
Pole 2 - The meaning of the structure is about something else - not about the property specified by the structure.
*Aboutness is similar to the specification versus convention distinction I wrote about in the language posts. Meaning is defined in terms of an organism's ability to take appropriate action as a consequence of detecting information. When the action is related to the event in the world that caused the structure in the energy array (e.g., we duck upon hearing a loud noise) then the information is defined in terms of Pole 1. When the action is not related to the event in the world that caused the structure (e.g., when we duck upon hearing the exclamation "Duck!"), then this information is defined in terms of Pole 2.
A speech event actually creates information at both Poles. There is information that is 'about' both the dynamics that produced it (articulation) and also 'about' the linguistic content of the speech act. Sabrina rightly points out that we must learn that the meaning we want for language is the second one, and we can do this because only that mapping can shape our actions. If you try to organise any behaviour on the basis of the first meaning, nothing much happens because that meaning doesn't typically relate reliably to anything, well, meaningful (unless you are a speech therapist!). In effect, we learn the linguistic mapping for speech because it's the only mapping that is stable and reliable enough to be learned (and because humans come prepared to do so, a fact I think is dictated by the duality of the information).
I've been watching our son learn language, and it took him a long time to realise that the sounds we make when we speak have this second layer of meaning to learn. He's recently 'cracked the code' however, and is accumulating words at a rate of knots. He copies what we say with alarming accuracy and speed (no more swearing in our house :) and the easiest way to get a smile out of him is to act on his speech. He loves being able to get us to do things by speaking ('glasses on!', or 'more toast!'). He hasn't quite solved all the issues; he'll babble away in non-English nonsense and clearly expect us to act on it; it's like he's learned that verbal behaviour produces consequences but is still learning the specific set of sounds that work on his English speaking parents. But he's well and truly up and running towards a meaning that no other animal even goes looking for.
One consequence of the dual nature
But there's a reason children learn motor skills like visually guided reaching faster than language; learning the meaning of language information means ignoring the local dynamics interpretation and going after the more distal meaning of the words. The Speech & Language students have a problem; they have spent their entire lives learning that the meaning of the information created in a speech event is linguistic, and not about the dynamics of articulation. At the age of 19 or so, they are now trying to learn the other mapping, and have to fight both their extensive prior experience and the human disposition to acquire linguistic meaning.
Gibson (1971) talks about a related problem in learning to produce pictures. He suggests that we have to learn to draw forms as seen from a specific perspective because perceptual systems don't work that way naturally:
When one sees an object....[it] is an object in the phenomenal world, not a form in the phenomenal visual field. (pg 31)
The modern child also has to learn [how to draw]. He is surrounded by pictures and is encouraged by his parents to convert his scribblings into representations as soon as possible. But this is not easy, for contrary to orthodox theory, he does not experience his retinal image. And so, in learning to draw, he has to learn to pay attention to the projected forms as distinguished from the formless invariants. If the young child experienced his retinal image he should not have to learn to draw. The 'innocent eye', far from registering points of color or even patches of color, picks up invariant relations. (pg 32)
A continuum of cases?
Sabrina and I started out trying to find ways to use the ecological research programme to structure an analysis of language. This meant thinking about information. Along the way, Sabrina began picking apart the ecological concept of specifying information used for the control of action and identified that this is just one small part of the space of biologically, psychologically relevant information. Action and language are, roughly, two extreme ends of a continuum of informationally controlled behaviours, and Sabrina's taxonomy has started to flesh out the intermediate regions.
People think the jump from perception-action to language is too big, though. I've gotten interested in pictures and mirrors because I think they might be useful intermediate cases. People might be happier to see an ecological analysis of the information in pictures and mirrors, which gets us past that initial resistance. But studying these topics will also allow us to tackle one of the features of ecological indirect perception, namely how the system copes with the dual presentation of information about the surface and the thing depicted on the surface. This will, I hope, enable us to start slotting some data into Sabrina's taxonomy, solve a few of the methodological issues and get ecological and non-ecological types thinking more broadly about information.
Gibson J.J. (1971). The Information Available in Pictures, Leonardo, 4 (1) 27. DOI: 10.2307/1572228
Wickelgren E.A. & Bingham G.P. (2001). Infant sensitivity to trajectory forms, Journal of Experimental Psychology: Human Perception and Performance, 27 (4) 942-952. DOI: 10.1037/0096-1523.27.4.942
Wilson A.D. & Golonka S. (2013). Embodied Cognition is Not What you Think it is, Frontiers in Psychology, 4 DOI: 10.3389/fpsyg.2013.00058