This post is a detailed reply to Sergio Graziosi's useful critique of our Ecological Representation pre-print. As such, it's specific to his particular concerns about our argument, but I'm putting it here so that others can join in the discussion. If you are reading this and are not Sergio, you might want to head over to his blog and read the critique first.
First, thank you very much for your detailed critique of the paper. It is incredibly useful and we are sincerely grateful for the time you've taken to comment on our crazy ideas.
A quick note to start with. I am largely writing this reply while reading a little bit of what you’ve written at a time, because I want to respond to each point you make and to keep myself honest in evaluating your later proposed solutions to some of the problems you’ve identified in the paper (so I can’t shift my positions on basic issues!). So, apologies if my responses to these points aren’t relevant because of something you say later in your reply.
(Note: things in quotes are text from Sergio's critique)
“if I were formally peer-reviewing your paper I would recommend to reject unless you are willing to show how organisms manage (or may manage) to extract EI from unspecific stimuli.”
“EI is indeed present outside, but already considering it a representation is at the very least misleading, as it is effectively hidden by the vast amount of potentially irrelevant data.”
I’ll offer an analogy (and then, hopefully, do a lot better, but this might be a useful idea to keep in mind): This relates to the idea of EI being “hidden” by irrelevant data. Think of a lake. There are a lot of independent water molecules jostling around. There is little structure – high entropy. Now, let’s say that a kid throws a rock into the lake. Suddenly, a subset of those molecules is moving in a specific way – structured by the physical event of a rock crashing into them. Would you consider the resulting ripples to be “hidden” in the same way as EI (structure amidst a bunch of irrelevant stuff)? Ecological information variables are like the ripples caused by the rock – structure in an otherwise symmetrical medium. If the ripple isn’t “hidden” then neither is EI. You may very well think the ripple is hidden, but this would be useful for me to know so that I can home in on your concerns!
Of course, this is a nonsense example because what is “hidden” is relative to an observer. When I ask whether the ripples are hidden, I mean: can the ripple be objectively physically identified? The answer, of course, is yes. The right kind of measurement device can indeed detect the ripples. The cog psy question is whether this measurement device could, in principle, detect the ripple without having to separate the wheat from the chaff (without having to keep the relevant stuff and disregard the irrelevant). I’d say that if such a measurement device can exist, then this is consistent with saying that EI in the world is a representation. If such a measurement device can’t exist, if there must be some kind of stimulus extraction, then EI wouldn’t count as a representation, though neural reps caused by EI would.
A point that I don’t think comes through well enough in the paper is that structures in energy arrays only get labelled as informational representations if they play a role in action selection or control in some organism. That is, they have to function as representations (as stand-ins) for something, in order to be representational (this is a pretty classic move and we’ve adopted this perspective to fit in better with traditional representational accounts). The reason we might want to label this thing in the world as a representation (rather than just the resultant neural activity) is related to the first order isomorphism fallacy. This is particularly clear for very basic nervous systems. Essentially, if the representation exists in the world in the form of EI, then the nervous system can get on with the business of coordinating with that information, without needing to copy it over into a representation in neural activity. If there is a reasonably immediate connection between sensors and effectors, then there is no need to invoke neural representations to explain the preservation of informational structure in behavior. For more complex nervous systems (like ours), it is very likely the case that this structure must be preserved neurally to explain information/behavior correlations. Otherwise, the burden is to explain how the right structure is somehow recovered later on in processing (which, as you probably know, gets us in the territory of all of classical cognitive science's biggest problems).
Okay, so that’s a brief justification for why we would like to keep external EI representations (and a better description of when structure in energy counts as a representation). The meat of your critique (and a quite important one) is how we can justify that this structure somehow directly structures neural activity without some kind of pre-processing, at the very least, to separate the structure from irrelevant variation in stimulation.
“Once transduced, what was before unspecific energy or molecules becomes something which can be directly interpreted as a signal (the action potentials travelling through the axons of sensory neurons). Nothing particularly new in this, but this very general and universally accepted picture is apparently hard to reconcile with the vision you are proposing.”
There is an important ecological critique to make of neuroscientific paradigms investigating energy transduction and its consequences for subsequent neural activity. First, this line of research comes directly from an extensional analysis of what is available to be perceived. That is, the hypothesis about what there is “out there” to be perceived is related to an analysis of how sets of individuals with certain properties structure energy media. As Turvey et al (1981) show, this analysis leads to the inevitable conclusion that structures in energy media aren’t specific to biologically or psychologically relevant properties. Second, this line of research is almost entirely concerned with how the nervous system detects primitive features (or perceptual primitives) and then builds a coherent representation out of these. This follows from the belief (justified by the extensional analysis) that psychological properties must necessarily be constructed. Perceptual primitives are basic attributes of physics (e.g., line orientation, stimulus intensity); they are not ecological properties. Therefore, results based on investigation of these (which the ecological position argues are not behaviorally relevant) don’t tell us anything about the transduction and propagation of EI. Neuroscientists have not been in the business of identifying ecological information variables and measuring how the structure of these variables is preserved or transformed by perceptual receptors and consequent neural activity. So, rather than the positions being difficult to reconcile, I would argue that we mostly haven’t been doing the right experiments to see what happens when EI makes contact with perceptual receptors and nervous systems.
I have, however, found several experiments which suggest (to me, at least) that the picture from neuroscience would be a lot more coherent if we went looking for the consequences of EI rather than correlations between perceptual primitives and neural activity. I’ll mention some of these below.
I think there’s also a good argument to make that our nervous systems are built to be systematically shaped (either in evolutionary or individual time scales) by EI, but not by irrelevant stimulation.
Let’s have another analogy (sorry, I can’t help myself). We have a rock floor in a cave. The cave is humid and water condenses on the ceiling and drops to the ground. If the cave ceiling is fairly uniform, then these drops will be uniformly distributed. Although each drop might displace a molecule or two of the cave floor, over time you will have a uniformly worn surface. Now, let’s have a discontinuity in the cave ceiling. Maybe there is a stalactite which funnels all of the condensation from a relatively large area to a single point so that more water hits the cave floor at that point than at any other (please note, I have no idea if this is how caves work). Over time, this will cause a depression in the cave floor relative to any other location. In a crude sense, this is the difference between ecological information and stimulation in terms of nervous system activity. This analogy, I think, points to how ecological information comes to structure neural activity without the need to actively filter out irrelevant stimulation. There is a related issue concerning why a system currently responds to one set of information variables rather than another and I deal with this a bit later. This section deals only with how ecological information is different from stimulation and how this might affect the nervous system.
So, argument 1 - logically, EI wins over stimulation: Ecological information is the only thing stable enough to cause long term systematic changes to nervous system activity (this is EJ Gibson, education of attention logic). These changes are in the direction of increasing sensitivity to EI, as nervous system structures and connectivity are shaped by ongoing exposure to these discontinuities in energy fields. In complex nervous systems like ours, the shaping of neural activity by EI takes place largely during the lifespan of an individual. The point is, no active filter or extraction process needs to be assumed. Let’s let everything in. Over time, only stable spatiotemporal structures like EI will cause systematic changes to nervous system activity.
For behavior to exhibit any stability, it must be linked to stable (and context-appropriate) nervous system activity. If you like, the filter is the shape of the trained (or evolved) nervous system. It’s the cave floor worn unevenly by a discontinuity (i.e., structure) in the location of water dripping from the ceiling. I think this is an important point because it does the work you want (namely, selectivity to EI over random stimulation) but achieves it without requiring the type of selection you mention.
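The cave-floor idea can be made concrete with a toy simulation (this is my own illustrative sketch, not a model from the paper; the function name and parameters are invented). Drops landing uniformly at random wear the floor evenly, while a stalactite funneling even a small fraction of drops to one site produces a clear depression there – no filter needed, just accumulation:

```python
import random

def erode(n_drops, floor_size=100, funnel_frac=0.0, funnel_site=50, seed=0):
    """Simulate drips wearing a cave floor. Each drop removes one unit of
    material where it lands. With funnel_frac = 0 drops land uniformly at
    random (unstructured stimulation); with funnel_frac > 0 a 'stalactite'
    sends that fraction of drops to a single site (a stable structure)."""
    rng = random.Random(seed)
    floor = [0] * floor_size  # wear depth at each location
    for _ in range(n_drops):
        if rng.random() < funnel_frac:
            spot = funnel_site
        else:
            spot = rng.randrange(floor_size)
        floor[spot] += 1
    return floor

uniform = erode(100_000)                     # no structure in the input
funneled = erode(100_000, funnel_frac=0.1)   # 10% of drops hit one site

# Uniform drips wear the floor roughly evenly; the funnel leaves a
# depression roughly an order of magnitude deeper than the average.
print(max(uniform) / (sum(uniform) / len(uniform)))
print(funneled[50] / (sum(funneled) / len(funneled)))
```

The same logic carries over to the nervous-system claim: random stimulation averages out over time, while only stable spatiotemporal structure accumulates into a systematic change.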
Argument 2 - the structure of EI is apparent at the level of perceptual receptors: The structure of retinal flow on the retina preserves the structure of optic flow in the world (Li & Warren 2002). This shows a relationship between the structure of EI and the structure of activity at the site of transduction. Behavior is organized with respect to structure in optic flow. Nervous system activity must somehow carry information about this structure. As argued above, EI variables are the only things available to cause systematic changes to nervous system connectivity, making nervous system activity increasingly efficient at preserving the structure of the relevant EI (see van der Weel and van der Meer 2009).
“To be considered as such, one needs to take as a given (gloss over?) the context and internal state of the perceiving organism: depending on contingent factors, including the task at hand, what counts as EI changes all the time, so I think we’d be better off by accepting that EI is such in virtue of internal factors as defined by the organism itself and, crucially, its own ecological needs.”
“However, in the world out there there is a hell of a lot more structures and dynamics, all of them co-existing in a seemingly chaotic mixture. A priori, all of them may have important ecological implications for a perceiving agent. Importantly, in your own example, what makes the “relative direction” criteria relevant to the subject is determined by something inside the subject (in this case, what the subject is trying to do).”
These points move into the territory where we need to explain why a system is currently coordinating with one set of information variables rather than another. One thing I find useful to think about is that changes in internal state are changes in physical state. The brain at time A is, in many important ways, physically different than the brain at time B. There is nothing mysterious about the fact that EI making contact with brain time A has a different consequence to that of information making contact with brain time B. There is also no mystery about how the system might change so as to begin responding to different variables as the context (or evolution of internal physical state) changes. Even a very simple mechanism of neural habituation provides an example of how such a change might occur.
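To make the habituation point concrete, here is a minimal sketch (my own toy example; the function and parameter values are invented for illustration). Response strength to a repeatedly presented variable decays with exposure and slowly recovers otherwise, so which variable currently drives the system falls out of the system’s state history – no executive selector is required:

```python
def habituating_response(stimuli, decay=0.7, recovery=0.05):
    """Minimal habituation: response strength for each stimulus type
    decays with repeated exposure and slowly recovers otherwise, so the
    consequence of a stimulus depends on the system's current physical
    state, not on the stimulus alone."""
    strength = {}   # per-stimulus response strength (the 'physical state')
    responses = []
    for s in stimuli:
        for k in strength:                       # unexposed channels recover
            strength[k] = min(1.0, strength[k] + recovery)
        strength.setdefault(s, 1.0)              # novel stimuli start at full strength
        responses.append((s, strength[s]))
        strength[s] *= decay                     # exposure habituates this channel
    return responses

# Repeated "A" evokes progressively weaker responses; a novel "B"
# evokes a full response.
out = habituating_response(["A", "A", "A", "B"])
```

The brain at time A and the brain at time B are literally different physical systems here: `strength` differs, so the same EI variable has different consequences.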
This, crucially, has no bearing on what counts as EI, which is objectively definable apart from what a given organism is doing at any given time. What changes is which particular variables are having consequences on action selection or control, and this is explained by the current context (what is available) interacting with the current state (the physical state of the system, including the entire body, not just the brain, and embodying constraints from evolution, learning, and current physiological functioning – are you exhausted, hungry, etc.?). So, it is fair to say that a currently present EI variable that is not having consequences on behavior does not represent anything to that system at that time. But this doesn’t seem any different from the fact that a particular mental representation might only be active in some circumstances (depending on current internal state and context), so I don’t think this introduces a particular problem. Nor is it different from the fact that a representation in a formal computational system might not describe the relationship between the physical and abstract levels of the computation currently underway (to invoke a more formal sense of representation, a la Horsman et al 2014).
I do think we agree on the idea that something about the notion of EI is organism-dependent. But, I think this is accomplished by the fact that EI variables can specify dispositional properties of the environment, which are inherently defined in terms of effectivities. For example, there is a complementarity necessitated by the dispositional property “throwable.” This property can be effected only by organisms who themselves have certain properties. Nonetheless, the property can be defined without explicit reference to the organism (it is a property, not a relation). The perception of such properties (via specifying EI) IS relational – that is, the organism can organize behavior according to whether that property specifies “throwable by me” – and this successful coordination arises through either evolutionary or learning level constraints.
“Yes, in a sense the EI is out there, it is external, but what makes it “ecological” or, if you prefer, what makes it possible to extract the signal, differentiate it from the irrelevant (not ecological, not relevant for the organism for the current task) is exclusively internal.”
Hopefully, I’ve gone some way toward justifying why we want to call the external thing a representation (the first order isomorphism fallacy) and toward limiting the scope in a way I think you’d like (only calling it a representation if it’s used by some system as a stand-in for a property of the environment). There’s another point worth making that is related to this particular way you’ve worded the issue.
Ecological behavioral models are, indeed, very good at predicting and explaining behavior, in large part because they identify the actual structures in energy arrays that are relevant to particular behaviors. But (and I talked about this recently in Poland), these models only work because WE are particular types of physical systems that respond in particular ways to these structures. You can’t just throw EI at any system and get functional behavior (even if it had the right kind of body). What’s required is the right kind of perceptual systems, nervous system, and other bodily systems so that kinematic energy patterns cause reliable changes to nervous system activity that has reliable consequences for other bodily systems (particularly the musculotendon system) such that behavior can be coordinated with respect to the kinematic structures.
I 100% agree that this part of the story is overlooked by basically the entire ecological literature. Our paper is meant to go a small way to begin addressing this by positing that ecological information could structure nervous system activity and that this relationship would be a starting point for talking about how it is that we are the right kind of physical system to respond to EI. I would go further than this though; we (all animals) have evolved into the kind of physical systems we are only because ecological information is (and was) available to act as a stand-in for relevant properties of the world. My bet is that it was the ability of this external structure to function as a stand-in, to designate distal properties, that got the whole animal show running. I encourage you to have a read through some of Fred Keijzer’s stuff on the evolution of nervous systems – it challenges the neuron doctrine premise (that nervous systems are best understood as information processors) and argues that nervous systems are primarily coordination devices. In my view this doesn’t mean that nervous systems don’t also process information, but that’s another issue!
“the information needed is by definition out there, but it actually becomes proper Ecological Information because of how it is internally processed.”
A couple of points. First, by our application of Newell’s definition, only a subset of neural activity is representational. EI could and does (at least in some types of nervous systems) structure behavior without the resulting neural activity preserving the EI structure. Any case of associative learning is a good candidate for this type of example because in these cases the structure of behavior does not correlate with the structure of the information. So, it would be a mistake to think that the only real EI is a neural rep of EI, because this would omit lots of cases where behavior is structured with respect to the information but there is no internal representation.
Second, whatever the result of the “internal processing”, the resulting activity is only useful in structuring behavior (in an action control task, e.g.) to the extent that it preserves or systematically transforms the structure of the EI variable. I could see a motivation for you to claim that the information out there only becomes proper information AFTER it is internally processed (though I disagree). But, even if I thought that some kind of active filter process was required to separate out EI from irrelevant stimulation, our hypothesized neural representations owe everything useful about them to the extent to which they preserve behaviorally relevant aspects of the external structure. I guess my problem with the quote above is that it seems to imbue internal processing with the power to make something ambiguous informative, and this is precisely what we argue is not the case. Now, it is obviously not the case that there is a direct line through the nervous system that perfectly carries an EI signal. EI makes contact with a number of physical and chemical systems and the structure is changed as a result. The point, though, is that these changes are not in service of enriching ambiguous information, which is the classical cognitive perspective. Furthermore, I think we’d be in a much better position to understand the function of these changes if we adopted an ecological neuroscientific position where we try to understand how the nervous system supports action selection and control in the presence of particular EI variables in a particular task context.
Okay, these are my comments up until the main event in your commentary! I’m looking forward to seeing what’s next…
“The aim is to isolate EI and to discard the rest.”
Depending on how you talk about this process, I may or may not agree. I aimed to show earlier that getting a system to respond selectively to EI doesn’t require an active filter. If you are planning to argue that the effect of repeated exposure to structured EI amidst random stimulation is to shape the nervous system to progressively respond more selectively to EI, then I’m on board.
“The problem is that what counts as EI is both context-dependent and internally defined (depends on the state of the organism).”
This is, I hope to have shown earlier, not true. The physical system in contact with EI (us) is, in real and important ways, different at different times and across different contexts. What is available as EI is, at any time, externally and objectively definable, but the consequences of those variables on action will depend on the current state of the system. Again, I see no difference between this and any other representational theory, in that 1) only a subset of representations will structure behavior at any given time and 2) the consequence of a given representation will depend on the rest of the internal state of the system.
“Such a system needs to be dynamically able to identify the correct kinematic projections from the original (outside world) dynamics. At any given time, the set of possible kinematic projections is effectively infinite. How can a system optimally isolate the correct ones when it can’t make many assumptions on what will make them “correct”? [If you wish, I’m merely restating the framing problem.]”
This is where talking more about Bingham’s task specific devices comes in handy. But, it also relates to the point I made above. We are literally different types of physical systems at different times and in different contexts. Some of these systems are built to be responsive to particular EI variables, other systems to others. Mechanisms within these systems change over time according to their own dynamics which changes sensitivity to external information. The environment changes to offer different opportunities to the system such that, if the system is in a state that is potentially responsive to those new variables, it is shaped by this new information. I think the error is in thinking that we need to explain how an identical system somehow responds flexibly and adaptively to a subset of information. The point is that we are not identical systems across time. Mechanisms (e.g., low dimensional assemblies of inherent and incidental task dynamics) that structure behavior at one time may completely cease to exist at another time because task specific devices are softly assembled.
“One solution comes from the prediction-based approach: if you can manage to transform input at time A in such a way that it efficaciously predicts input at time A+1, you are guaranteed that you are keeping as much potential EI as possible, while at the same time you are discarding everything else – you are distilling the potential EI while filtering out all the noise.”
Okay, so my first reading of this tells me I need to do some background work before I can answer properly. In doing this I’ve gone to your information post and I was thrilled to see your criticism of Shannon information as a theory of moving things around and not of information, itself. This has always bugged me! I’ve also gone and read some of the stuff you linked to (re genes and information, e.g. – great stuff).
I also like “information is a structure that makes a difference.” The fact that you think this gives me hope that you’ll like my clarification of when a simple structure in an energy array gets to be considered EI. Although, my initial thought on this as a definition of information is that it is potentially trivially broad. For instance, the cave example I gave above introduces a structure (a non-uniform ceiling) that makes a difference (to the contour of the floor).
**Cue time spent thinking about your proposed solution…**
Based on the first part of your review, it seems like your main issue with our current explication is that we don’t satisfactorily account for how the system comes to respond to EI when EI is present in a sea of irrelevant stuff. When I read that critique I took it as a challenge to justify how this problem might be solved based on what we know about how nervous systems change over time. In other words, I took it to be a problem that requires a solution grounded in biology. Perhaps you agree that this would be ideal as well. This seems to be the case as you say you would "recommend to reject unless you are willing to show how organisms manage (or may manage) to extract EI from unspecific stimuli.”
In any case, this expectation caused me to scratch my head a bit at your proposed solution since it’s a very abstract description of how such selection might occur. If I understand you correctly, your solution would work because only prediction based on stable structures (EI) would lead to accurate predictions – predictions based on irrelevant stimulation would be inaccurate. I don’t think that this solution succeeds in closing the gap you identify in our original argument. Despite many people using the language of prediction to describe what brains do, and even if we decide to allow that at least some of this language is justified, I don’t follow how we can apply the idea to how real brains use prediction to home in on EI (I’ve explained earlier how I believe they do it, which is more like, EI is the only stable stuff to be perceived so it's the only thing that can cause stable changes to nervous systems).
First, how is the transformation relation established? If it is learned, what is the basis for learning? What drives exploration of the solution space to arrive at the optimal transformation relation? Second, what is the mechanism for evaluating prediction error? This is an important question even if you intend evaluation to be an emergent feature of the system (rather than something requiring an executive). We still need to know how this particular physical system differentially responds on the basis of correct or incorrect prediction. As currently described, I don’t think this proposed solution meets your requirement of showing “how organisms manage (or may manage) to extract EI from unspecific stimuli.” This is because the solution is not linked to the operation of a particular physical system – sure, a system that could implement it would solve the problem, but is the human brain this type of system?
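To make these questions concrete, here is the simplest instantiation of the prediction idea I can think of (my own sketch, not Sergio’s actual proposal, and all names and parameters are invented): a least-mean-squares predictor in which the prediction error itself drives learning, so no separate evaluator is needed. Trained on a stable structured signal, its error collapses; trained on unstructured noise, it gains nothing:

```python
import math
import random

def lms_prediction_error(x, lr=0.05, n_taps=2):
    """One-step-ahead LMS predictor: learn weights w so that the dot
    product of w with the last n_taps samples approximates the next
    sample. The error signal both measures and drives learning.
    Returns the mean squared prediction error over the second half
    of the signal (i.e., after some learning has occurred)."""
    w = [0.0] * n_taps
    errs = []
    for t in range(n_taps, len(x)):
        past = x[t - n_taps:t]
        pred = sum(wi * xi for wi, xi in zip(w, past))
        err = x[t] - pred
        w = [wi + lr * err * xi for wi, xi in zip(w, past)]  # LMS update
        if t > len(x) // 2:
            errs.append(err * err)
    return sum(errs) / len(errs)

rng = random.Random(0)
structured = [math.sin(0.2 * t) for t in range(2000)]   # stable structure ("EI")
noise = [rng.gauss(0, 1) for _ in range(2000)]          # unspecific stimulation

# The predictor locks onto the structured signal (error becomes small),
# but can gain nothing on the noise (error stays near the noise variance).
err_structured = lms_prediction_error(structured)
err_noise = lms_prediction_error(noise)
```

This answers the two questions above in the most trivial possible way (the error drives learning; the error is evaluated by the update rule itself), which makes the remaining gap visible: nothing here says whether real nervous systems are this type of physical system.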
I, personally, don’t get a lot of mileage out of using the word “prediction” to talk about what brains do. But there is certainly something to the idea that an organism operating as a particular kind of task specific device has internal neural dynamics that can unfold for some time even when contact is lost with task-relevant information. This is the kind of thing that supports our ability to keep track of where a moving object is when it is temporarily occluded. If someone wants to call that “prediction”, okay, but I think it obscures the real explanation. That said, I think having a formal way to characterize why stable energy structures “make a difference” to our nervous systems is a useful and important goal. There is way more work to do on this problem than the very cursory sketch I’ve provided. If you have more to say about how this links with Shannon information, that would be very interesting as well!
Even though I don’t think the solution (if I’ve understood it correctly) exactly solves the problem, thank you very much once again for your thorough critique; it’s been very useful. The paper will be much improved by taking on board this criticism and trying to make an explicit case for how biological systems learn to (or evolve to) respond to EI, but not to unspecific stimuli. I hope we can keep going back and forth on these ideas!