Computational Foundations of Perception & Action
28th Symposium: June 1-3, 2012
All talks & discussion sessions are in the Class of '62 Auditorium, Medical Center
All breaks and lunches are in the Flaum Atrium
Thursday, May 31
7:00 - 9:00 pm—Registration & Welcome Reception, Meliora Hall
Friday, June 1
8:00 am—Registration & Breakfast
8:45 am—Welcome, David Knill
Talk session I: Integration and prediction in perception (Moderator: Greg DeAngelis)
Accurate perception of a dynamic environment requires continuous multisensory calibration. When present, external feedback is particularly beneficial for multisensory calibration, since it serves as a "teacher". However, the principles of interaction between external feedback and relative cue-reliability, and their combined influence on multisensory calibration are currently unknown. Five monkeys were trained to perform a heading discrimination task in which they were required to report whether self-motion was to the right/left of straight ahead. Stimuli comprised either visual (optic-flow), vestibular (motion platform) or combined (visual-vestibular) passive motion. For each experimental session, coherence of the visual stimulus was manipulated such that either visual or vestibular reliability was relatively higher. A systematic heading discrepancy was introduced between the visual and vestibular stimuli and external feedback was congruent with either the more-reliable or less-reliable cue. When external feedback was aligned with the more-reliable cue, the less-reliable cue shifted towards the feedback, and the more-reliable cue (which was already accurate) did not shift. However, when external feedback was aligned with the less-reliable cue, a surprising form of calibration occurred: cues were yoked and shifted together in the same direction. Hence, whilst the more-reliable cue shifted to become more accurate, the less-reliable cue simultaneously shifted away from the external feedback, becoming less accurate. We propose two different mechanisms of multisensory calibration: 1) cue-specific (local) calibration, and 2) reference-frame (global) calibration. When the more-reliable cue is incongruent with external feedback, the entire reference frame (zero) is considered to be inaccurate. Hence cues are yoked and shift in conjunction. 
When the more-reliable cue is congruent with external feedback, the global reference frame is considered accurate and the less-reliable cue is calibrated individually/locally. These results suggest that the Bayesian-optimal cue-combination is used to assess global accuracy.
The accompanying figure simulates different conditions of multisensory calibration. (A) In the absence of external feedback, discrepant cues shift towards one another (at a fixed ratio, independent of relative cue reliability, as demonstrated by Zaidel et al., J. Neurosci. 2011). (B) When feedback is contingent on the more-reliable cue, only the less-reliable cue shifts (demonstrated by the behavioral data in this study). (C) When feedback is contingent on the less-reliable cue, two scenarios are presented: i) only the more-reliable cue shifts (upper plot); ii) cues are yoked and therefore shift together in the same direction (lower plot). The behavioral data in this study follow the latter scenario, whereby cues are yoked and shift together (global calibration). This result is surprising since the less-reliable cue, which is initially accurate, shifts away from the external feedback.
The generality and robustness of these phenomena are demonstrated by the finding of complementary results when the visual cue was more reliable than the vestibular cue and vice versa. Furthermore, each monkey individually demonstrated these results.
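The Bayesian-optimal cue combination referred to in this abstract is commonly formalized as a reliability-weighted average of the single-cue estimates. The following is a minimal sketch of that textbook rule, assuming independent Gaussian noise on each cue; the function name, parameter names, and numerical values are illustrative assumptions and are not taken from the study.

```python
def combine_cues(mu_vis, sigma_vis, mu_vest, sigma_vest):
    """Reliability-weighted combination of two independent Gaussian cues.

    Reliability is defined as inverse variance. All names and values here
    are hypothetical and chosen only for illustration.
    """
    rel_vis = 1.0 / sigma_vis ** 2
    rel_vest = 1.0 / sigma_vest ** 2
    # Each cue is weighted by its share of the total reliability.
    w_vis = rel_vis / (rel_vis + rel_vest)
    mu = w_vis * mu_vis + (1.0 - w_vis) * mu_vest
    # The combined estimate is less variable than either cue alone.
    sigma = (1.0 / (rel_vis + rel_vest)) ** 0.5
    return mu, sigma, w_vis


# Hypothetical example: visual heading +4 deg (sd 1), vestibular -4 deg (sd 2).
mu, sigma, w_vis = combine_cues(mu_vis=4.0, sigma_vis=1.0,
                                mu_vest=-4.0, sigma_vest=2.0)
print(mu, sigma, w_vis)
```

In this sketch the combined estimate sits closer to the more-reliable cue, which is the quantity the abstract suggests may serve as the reference for judging global accuracy.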
Humans often perceive, act and make decisions based on noisy and ambiguous sensory information. In order to optimize behavior, the CNS should learn and use knowledge of the statistics of objects and events in the environment to reduce uncertainty. Moreover, it should adapt to changes in those statistics over time. I will discuss two problems whose solutions rely on accurate models of the statistics of scenes – estimating a scene parameter from sparse and uncertain sensory information and integrating different sensory cues. As an example of the first type of problem, we have been studying the learning and application of statistical models of object speed for guiding a simple motor behavior - timing movements to hit a moving object. We use hierarchical Bayesian analysis techniques to fit the implicit statistical models that subjects use to plan their movements. Our results show that subjects learn accurate models of the first-order statistics (means and std. deviations) of object speeds, but overestimate the temporal correlations across trials. Despite this apparent sub-optimality, subjects do adapt their implicit models to stimulus sets with different temporal correlations, though they maintain a strong bias toward positive correlations. As an example of the second type of problem, we have been studying how the visual system adapts its internal model of the statistics of figure shape – something that determines the reliability of figural cues to surface slant. Our results show that subjects adapt their internal models very quickly to match the statistics of local scenes. These adaptive changes have significant impact on the contribution of figural relative to binocular cues to surface slant, leading to fast fluctuations in the relative "weighting" of figural cues in response to changes in environmental statistics. 
Furthermore, we show that subjects are able to learn multiple statistical models whose use is gated by context, showing further sophistication in the types of statistical models that subjects use to interpret sensory signals.
Sensory stimuli can be ambiguous and uncertain. Considerable recent research has focused on how animals can generate more accurate estimates of a parameter of interest by integrating visual information across time. I will argue that the same circumstances that lead animals to integrate information across time, ambiguous and uncertain stimuli, lead them to integrate information across sensory modalities. My laboratory has developed a novel multisensory decision task that uses dynamic, time varying auditory and visual stimuli. We have collected data from both rats and humans on the task and report three main findings. First, we have found that for multisensory stimuli, both species show improvements in accuracy that are close to the statistically optimal prediction. Next, we report that subjects make use of time in a similar way for unisensory and multisensory stimuli, and for reliable and unreliable stimuli. Finally, we report that synchronous activation of auditory and visual circuitry likely does not drive the improvements in accuracy, since a comparable improvement was evident even when auditory and visual stimuli were presented asynchronously.
Taken together, these findings identify two possible strategies, integrating across time and integrating across sensory modalities, that can help animals overcome sensory uncertainty to make better decisions. Because the inherent variability of cortical neurons renders all sensory stimuli, to some degree, uncertain, these strategies are likely used in many circumstances.
Integrating information across time is fundamental to important behaviors involving or requiring motion, like playing Angry Birds. Optimal temporal integration involves Bayesian filtering, which cycles between making predictions using an internal model and correcting these predictions by integrating sensory information. Accurate integration depends on having an accurate internal model; however, little is known about how internal models for perception are acquired.
The goal of this talk is to computationally model and experimentally investigate the acquisition of internal models in trajectory prediction tasks. To be concrete, if I expose you to the projectile motions in a game like Angry Birds, how much will you learn about the trajectories of the birds? At one extreme, you might use extensive feedback to hone strategies for controlling the bird's destructive desires without understanding the details of the trajectory, which eliminates the need for an internal model. At the other extreme, you may acquire a highly accurate predictive model for trajectories that can allow for complex and novel interactions, like mid-flight control.
Current learning theory offers little guidance to predict what aspects of the environment will be learned, and what sorts of task and feedback will facilitate or inhibit learning richer internal models. I will describe experimental results from our lab that support a kind of minimalist learning strategy, which we could phrase as "only learn what you need." Minimalist learning predicts that internal models will only be acquired when the task requires prediction/counterfactual reasoning and model improvement can be anticipated to improve performance. I show counter-intuitive experimental evidence for this hypothesis in a family of trajectory tasks, including more learning with less-reliable data, no learning with full feedback, and how subtle changes to the predictive requirements in a task can lead to large differences in internal model learning. I will show how modeling learning from a Bayesian adaptive control perspective with cognitive costs can provide a normative framework for minimalist learning, and will argue that minimalist learning may be critical for skill formation.
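The predict-and-correct cycle of Bayesian filtering mentioned at the start of this abstract can be illustrated with a scalar Kalman filter, the simplest linear-Gaussian case. This is a minimal sketch under assumed dynamics and noise levels; the internal model (`a`, `process_var`), the observation noise, and the data are hypothetical and do not come from the talk.

```python
def kalman_step(mean, var, obs, a=1.0, process_var=0.1, obs_var=1.0):
    """One predict/correct cycle of a scalar Kalman filter.

    a, process_var and obs_var stand in for the observer's internal model;
    their values are assumptions made for this illustration.
    """
    # Predict: push the current belief through the internal model.
    pred_mean = a * mean
    pred_var = a * a * var + process_var
    # Correct: weight the new observation by the Kalman gain.
    gain = pred_var / (pred_var + obs_var)
    new_mean = pred_mean + gain * (obs - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var, gain


# Start from a vague belief and integrate four noisy observations.
mean, var = 0.0, 10.0
gains = []
for obs in [1.2, 0.9, 1.1, 1.0]:
    mean, var, gain = kalman_step(mean, var, obs)
    gains.append(gain)
print(mean, var, gains)
```

The gain shrinks as the belief sharpens, so later observations move the estimate less. If the assumed dynamics were wrong, the prediction step would bias every subsequent correction, which is why the accuracy of the internal model matters.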
12:10 pm—Panel Discussion
1:00 - 2:00 PM—Lunch
Talk session II: Sensorimotor control (Moderator: David Knill)
Movement variability and motor learning are ubiquitous features of sensorimotor control, and the relationship between them is of longstanding interest in the field. The current, prevalent view is that motor variability is due to a combination of central and peripheral "noise" sources and that sensorimotor learning acts to minimize movement variability in the face of that noise. We have examined both elements of this view, using a combination of behavioral and physiological approaches. I will show that neural variability in cortical motor planning areas only weakly predicts trial-by-trial fluctuations in reaching movements. Instead, these areas contribute to movement variability through a slow, random drift in neural activity patterns (across hundreds of movements) that is linked to a parallel drift in movement metrics. These drifts are strikingly captured by a simple model in which noise continually accumulates in an online error-corrective learning process. Thus, while the goal of motor learning is to minimize errors, a substantial component of motor variability appears to arise as a result of learning. I will discuss interpretations of this apparently paradoxical state of affairs.
The fields of decision making and sensorimotor control have developed in parallel although both require acting in real time on streams of noisy evidence. I will review our recent work showing the intimate interactions between decision making and sensorimotor control processes. This includes the relation between vacillation and changes of mind in decision making and the bidirectional flow of information between elements of decision formations such as accumulated evidence and motor processes such as reflex gains and effort.
I will present new fMRI data on the representation of trained and untrained movements in cortical networks. Applying novel multivariate methods, we can show that trained movements are represented more distinctly than untrained movements, often without any change of overall activity levels. I will relate these findings to a normative network model that predicts the representation of hand movements in primary motor cortex from the natural statistics of these movements.
Most theories of motor cortex have assumed that neural activity represents movement parameters. This view derives from an analogous approach to primary visual cortex, where neural activity represents patterns of light. Yet it is unclear how well that analogy holds. Single-neuron responses in motor cortex appear strikingly complex, and there is marked disagreement regarding which movement parameters are represented. A better analogy might be with other motor systems, where a common principle is rhythmic neural activity. We found that motor cortex responses during reaching contain a brief but strong oscillatory component, something quite unexpected for a non-periodic behavior. Oscillation amplitude and phase followed naturally from the preparatory state, suggesting a mechanistic role for preparatory neural activity. These results demonstrate unexpected yet surprisingly simple structure in the population response. That underlying structure explains many of the confusing features of individual-neuron responses.
5:20 pm—Panel Discussion
6:00 - 7:00 pm—Poster session
7:30 - 9:30 pm—Dinner, Flaum Atrium
Saturday, June 2
Talk session III: Sensory coding (Moderator: Xaq Pitkow)
A pressing problem in systems neuroscience is determining the neural code. We know that neurons send their signals in the form of trains of action potentials, but we don't know what the code is, that is, we don't know what the unit of information is. Is it the number of spikes per unit time? Is it the individual spike or some pattern of spikes? Getting a clear answer to this affects a great deal of work in systems neuroscience, both basic and applied. For basic research, it tells us what quantity we need for building models of neural computations (i.e., what spike train features we need). For applied research, it tells us what quantity we need to effectively transmit information from one brain area to another via prosthetic devices. Here we describe a strategy for finding neural codes and use it to develop a novel retinal prosthetic approach. We then test the approach in an animal (mouse) model and show its advantages.
Our percepts rely on an internal model of the environment, relating physical processes of the world to inputs received by our senses, and thus their veracity critically hinges upon how well this internal model is adapted to the statistical properties of the environment. I will describe two recent studies each addressing a fundamental question about internal models relevant for perception: how are they acquired, and what is their neural underpinning?
Theories of learning can be divided into two qualitatively different classes: supervised learning predicts that we develop specialised representations for each task; unsupervised learning predicts general-purpose representations that can be used in a wide range of tasks with different tasks requiring different operations on the same underlying representation. We developed a novel method that can extract complex, multi-dimensional mental representations from behaviour. We showed that the representations of human faces vary dramatically across subjects, but are invariant across tasks within a subject. This provides evidence for a strong unsupervised learning-based representation of faces in humans.
Although a number of behavioural studies have demonstrated that internal models are optimally adapted to the statistics of the environment, the neural underpinning of this adaptation is unknown. Using a Bayesian model of sensory cortical processing, we related stimulus-evoked and spontaneous neural activities to inferences and prior expectations in an internal model and predicted that they should match if the model is statistically optimal. To test this prediction, we analysed visual cortical activity of awake ferrets during development. Similarity between spontaneous and evoked activities increased with age and was specific to responses evoked by natural scenes. This demonstrates the progressive adaptation of internal models to the statistics of natural stimuli at the neural level.
Spiking activity in cortex is coordinated on a range of spatial and temporal scales. Numerous studies have shown that external events and internal states can alter this coordination, and suggested that this affects encoding by neuronal populations. Much less explored is how coordinated activity influences the relaying of signals between cortical areas and the computations they perform. To tackle this issue, we have recorded simultaneously from populations of neurons in the superficial layers of primary visual cortex (V1) of macaque monkeys, and from their downstream targets in the middle layers of V2. We find that spiking activity in V2 neurons is associated with a brief increase in V1 spiking correlations. Stimulus manipulations that enhance brief timescale V1 synchrony lead to stronger coupling between these networks. Our results suggest that the coordination of spiking activity within a cortical area influences its coupling with downstream areas.
The nonlinear dynamics of single neurons allows them to act as sophisticated computational elements. During development, individual neurons approach a state in which they are able to encode incoming fluctuations in units normalized to the typical fluctuation scale. We show how the biophysics of neurons provides the basis for the ability of this neuronal population to adapt to stimulus variance. The intrinsic properties of single neurons can strongly impact the ability of a network of such neurons to encode and propagate information. We discuss the implications of this change in intrinsic properties on a network's ability to propagate waves of activity.
12:10 pm—Panel Discussion
1:00 - 2:00 PM—Lunch
Talk Session IV: Decision-making (Moderator: Alex Pouget)
The anterior cingulate cortex sits at the interface between the reward system and the motor system and is therefore well positioned to use reward information to influence eye movements. I will present data from recent studies arguing that anterior cingulate cortex monitors diverse sources of reward and task-relevant information to generate a high-level control signal that can be used to govern eye movement decisions. I will argue that, while ACC has long been associated with tracking reward values, its activity is more consistent with computing a control signal derived from reward inputs. I will argue further that ACC carries a decision variable that reflects the current estimate of the value of switching, changing, or learning. Finally, I will discuss the implications of these results for our understanding of saccadic decision-making.
Intensive neurophysiological investigations in monkeys have identified key operations involved in visual attention, including modulations of sensory responses in feature selective visual areas, saccade generation processes in frontal and subcortical areas and an intermediate stage of target selection in the parietal and frontal lobe. A challenging open question concerns the nature of the target selection response: how do frontal and parietal neurons know what target to select and how do they determine the attention worthiness of a visual cue?
Here I present a framework for addressing this question that is based on the normative role of attention in selecting information, i.e., learning and updating predictions. Consistent with recent findings from studies of oculomotor decisions, I propose that target selective neurons in the lateral intraparietal area (LIP) convey an estimate of the relative utility of alternative options. However, rather than encoding values based solely on physical rewards, neurons also assign credit for learning and information selection. I discuss two experiments that address this point. One is a sequential decision making task showing that LIP neurons have elevated learning rates and non-spatial responses that assign credit to informative steps in temporally-extended actions. A second is a stimulus-reward (Pavlovian) paradigm showing that neurons assign attentional weight to visual stimuli based on their Pavlovian associations independently of operant rewards. I discuss the implications of these findings for attentional selection and the possibility of multiple attentional mechanisms.
All voluntary behaviors are learned. Actions followed by pleasant outcomes are repeated, whereas those leading to aversive outcomes are avoided. We have studied the contribution of the prefrontal cortex and striatum in non-human primates using computer-simulated competitive games. In particular, we examined activity related to rewards and penalties using a biased matching pennies task in which the positive and negative payoffs were implemented by delivery or removal of conditioned reinforcers. Activity in the prefrontal cortex and striatum often encoded the animal's choice and its outcome conjunctively. Moreover, activity during the feedback period tended to encode the effect of gains on the action values, whereas activity related to the losses from specific actions was temporally delayed. We have also investigated how the neurons in the prefrontal cortex process the information about hypothetical outcomes from actions not chosen by the animal during a rock-paper-scissors task. Signals related to actual and hypothetical outcomes were present in both the dorsolateral prefrontal and orbital prefrontal cortex. Signals related to hypothetical outcomes associated with a particular action were found more frequently in the dorsolateral prefrontal cortex than in the orbitofrontal cortex, suggesting that different cortical areas focus on cognitive and emotional aspects of counterfactual learning.
The posterior parietal cortex (PPC) has an important role in many cognitive behaviors, including decision-making, movement planning, and spatial attention. However, the neural circuit dynamics underlying PPC function are not well understood. We optically imaged the spatial and temporal activity patterns of neuronal populations in mice performing a PPC-dependent task that combined a perceptual decision and memory-guided navigation in a virtual environment. Individual neurons had transient activation staggered relative to one another in time, forming a sequence of neuronal activation spanning the entire length of a task trial. Distinct sequences of neurons were triggered on trials with opposite behavioral choices and defined divergent, choice-specific trajectories through a state space of neuronal population activity. Cells participating in the different sequences and at distinct time points in the task were anatomically intermixed over microcircuit length scales (< 100 micrometers). During working memory decision tasks the PPC may therefore perform computations through sequence-based circuit dynamics, rather than long-lived stable states, implemented using anatomically intermingled microcircuits.
5:20 pm—Panel Discussion
6:00 - 7:00 pm—Poster session
7:30 - 9:30 pm—Dinner, Johnson House
Sunday, June 3
Talk session V: Memory and learning (Moderator: Robert Jacobs)
Visual working memory (VWM) capacity is severely limited. These limits constrain human performance across many tasks, and stand in stark contrast to the seemingly infinite storage capacity of visual long-term memory. However, the exact nature of the limits on VWM remains elusive. Existing models are largely divided between those that assume a fixed number of discrete storage "slots" in memory, and models that assume that memory is not limited by the number of objects stored, but rather by the resolution of memory representations. Missing from this debate, however, is a theoretically grounded explanation of why capacity limits should arise, and how these limits should relate to observed behavioral performance. By applying results from information theory, it is possible to develop an "ideal observer analysis" of visual working memory. This analysis yields a quantitative and task-independent definition of memory capacity, generates novel behavioral predictions, and further offers a principled re-interpretation of existing models of visual working memory.
We learn from experience to make more advantageous decisions, often by adjusting our expectations to match past outcomes. In a dynamic world, this adjustment process must itself be adaptive, because changes can occur that render past outcomes irrelevant to future expectations. For example, historical yields from a fruit tree that has since died should no longer affect future expectations. A history of stable stock prices can become irrelevant after a major change in corporate leadership. I will talk about ongoing work in my lab that has begun to identify the neural mechanisms responsible for making effective decisions in these kinds of dynamic environments. I will describe an ideal-observer model to recognize and respond appropriately to environmental change-points under certain conditions, which we have systematically reduced to a simple analytic form that makes specific predictions about the underlying neural computations. Human behavior on a novel predictive-inference task is consistent with many of the principles described by the model. Moreover, several key computations are reflected in non-luminance-mediated changes in pupil diameter of human subjects performing the task. This work demonstrates that pupil-linked arousal systems can help regulate the influence of incoming data on existing beliefs to calibrate expectations in a dynamic environment.
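One common way to express this kind of adaptive adjustment is a delta rule whose learning rate rises when an outcome is surprising enough to suggest a change-point. The sketch below is only an illustration of that general idea and is not the reduced model described in the talk; the hazard rate, noise level, outcome range, and baseline learning rate are all assumed values.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def update_belief(belief, outcome, hazard=0.1, noise_sd=1.0,
                  outcome_range=20.0, base_rate=0.2):
    """Delta-rule update with a surprise-dependent learning rate.

    All parameter values are hypothetical. After a change-point the outcome
    is assumed uniform over outcome_range; otherwise it is assumed Gaussian
    around the current belief.
    """
    like_change = 1.0 / outcome_range
    like_stay = gaussian_pdf(outcome, belief, noise_sd)
    # Posterior probability that a change-point just occurred.
    cp_prob = (hazard * like_change) / (
        hazard * like_change + (1.0 - hazard) * like_stay)
    # A likely change-point pushes the learning rate toward 1,
    # which discards the now-irrelevant history.
    learning_rate = cp_prob + (1.0 - cp_prob) * base_rate
    new_belief = belief + learning_rate * (outcome - belief)
    return new_belief, learning_rate


_, lr_small = update_belief(belief=10.0, outcome=10.5)  # unsurprising outcome
_, lr_large = update_belief(belief=10.0, outcome=18.0)  # surprising outcome
print(lr_small, lr_large)
```

With these assumed settings, a small prediction error is averaged in slowly, while a large one nearly replaces the old belief.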
Task-irrelevant perceptual learning (TIPL) has captured a growing interest in the field of perceptual learning. The basic phenomenon is that stimulus features that are irrelevant to a subject's task (i.e. convey no useful information to that task) can be learned due to their consistent presentation during task-performance. While TIPL has been touted as an example of learning without attention, recent research shows a complex role of attention in TIPL, where attention sometimes disrupts and at other times facilitates such learning. Here I give an overview of existing research on TIPL with an emphasis on recent findings. I introduce a new form of fast-TIPL where enhanced memorization of visual scenes is seen on the time-scale of a single experimental trial. With this fast-TIPL paradigm, in which the scenes to be memorized are important to the participants, although irrelevant to the training task, we find that TIPL can occur for salient stimuli and is enhanced by attention. These results show a counterpoint to previous studies of TIPL that found that learning of salient, but unimportant, and often distracting, stimuli is suppressed by attention. I will also discuss recent results showing that TIPL can result in equal to or greater learning than direct training procedures on the same stimuli; a finding that suggests that attention can be disruptive to direct training of difficult-to-discriminate, complex stimuli. Together these studies suggest a model of learning in which attention and reinforcement signals work in concert to gate learning.
Recent experiments have demonstrated that humans and animals typically reason probabilistically about their environment. These experiments and the corresponding neural models have largely focused on simple situations for which probabilistic inference and learning are both straightforward and tractable. Unfortunately, for most of the problems faced by cortex this is not the case. Here we will consider a problem faced by the nervous system in a more general form, namely the identification of latent causes of complex patterns of sensory input, and the spikes that encode them. Formally, this problem can be described within the framework of topic models used for document classification: topic distributions are distributions over latent causes, topics are patterns of spiking activity, and documents are the observed mixtures of spike patterns.
Since exact Bayesian inference for probabilistic models of this kind is intractable, I will propose that neural circuits approximate posterior inference over latent causes by using a Variational Bayesian Expectation Maximization (VBEM) algorithm. This algorithm maps nicely onto a biologically plausible neural network which encodes probability distributions via a probabilistic population code, a particular type of neural code that represents probability distributions in a way that is consistent with the distribution of neural variability. This implementation requires that neural circuits implement a specific form of nonlinearity known as divisive normalization, an operation which is found in all neural circuits. Additionally, this inference algorithm can be implemented in a single layer of cortex and the learning of model parameters can be accomplished using a purely Hebbian mechanism. Finally, I will describe some extensions of this work which make it possible to deal with the more complex problems of time-varying inputs, correlated latent causes, and hierarchical learning.
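Divisive normalization, the nonlinearity named in this abstract, is often written in a simple generic form in which each unit's drive is divided by a constant plus the pooled drive of the population. The sketch below shows only that generic form; it is not the network from the talk, and the semi-saturation constant `sigma` and the input values are assumptions made for illustration.

```python
def divisive_normalization(drives, sigma=1.0):
    """Divide each unit's drive by a constant plus the summed population drive.

    sigma is a hypothetical semi-saturation constant; drives are assumed
    to be non-negative.
    """
    pooled = sigma + sum(drives)
    return [d / pooled for d in drives]


responses = divisive_normalization([2.0, 6.0, 1.0], sigma=1.0)
doubled = divisive_normalization([4.0, 12.0, 2.0], sigma=1.0)
print(responses, doubled)
```

Doubling every input changes the normalized responses only slightly, so the population mostly signals the relative strength of its inputs rather than their absolute scale.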
12:10 pm—Panel Discussion
1:00 - 2:00 PM—Closing Lunch