Symposium Session 5 - Nature and nurture revisited: new insights about core knowledge and visual development across cognitive systems
Symposium Session 5: Monday, March 31, 2025, 10:00 am – 12:00 pm EDT, Grand Ballroom
Chairs: Gabriel Kreiman1,2, Elisabetta Versace3; 1Harvard Medical School, 2Boston Children's Hospital, 3Queen Mary University of London
Presenters: Elisabetta Versace, Tomer Ullman, Judit Gervain, Eugenio Piasini, Davide Zoccolan
The convergence between advances in Artificial Intelligence (AI) and models of how brains perform computations highlights a central question common to cognition and AI: the extent to which natural and artificial circuits are hard-wired to solve specific problems versus how much is learned from experience. This question re-centers the age-old debate of nature vs. nurture in terms of innate priors and inductive biases (nature) versus experience and learning (nurture). The extraordinary successes of artificial neural networks in AI mainly rely on supervised learning through vast amounts of data. All the weights in a network are initially randomized and then learned from millions of labeled examples via backpropagation, an algorithm that is unlikely to be implemented in the brain. In stark contrast, humans and other animals are born with somewhat functional neuronal circuits and, after birth, most refinement of these circuits likely occurs via unsupervised interactions with the environment. Cognitive circuits reflect information available at multiple time scales: (a) the evolutionary past shapes architectural structures; (b) the experience available during early development sculpts circuits via largely unsupervised mechanisms; and (c) learning in adults is a combination of supervised and unsupervised processes. Using vision as a paradigmatic example, this symposium will highlight recent advances in our understanding of inductive biases across multiple species (chicks, rats, humans), the nature of core knowledge, and the fundamental constraints imposed by exposure to the statistical regularities of the environment.
Presentations
The combinatorial advantage of predispositions
Elisabetta Versace1; 1Queen Mary University of London
The more we observe newborn animals, the more cognitive preparedness we discover. For example, our studies show that shortly after hatching, inexperienced chicks and tortoise hatchlings are naturally drawn to certain stimuli, such as upward movement against gravity, changes in speed, hollow objects, specific colours, or combinations of these cues. These features are reliable indicators of the presence of animate living beings or even adult animals, suggesting that predispositions that appear widespread across different species, including humans, aid young animals in identifying and responding to stimuli of high adaptive relevance. Interestingly, early predispositions are not merely reflections of the most frequent environmental patterns. These expectations are flexible and transient rather than rigid and unchanging. Why should this be, if evolutionary pressures shaped them to help inexperienced animals respond to the stimuli they are likely to encounter? We have analysed the role of these predispositions at the beginning of life, treating them as predictors with varying strengths. By examining the trade-offs associated with false positives and false negatives at different developmental stages, we propose a new model that underscores the adaptive function of multiple predispositions in early life. By integrating various features, this model mitigates the risk of errors. Furthermore, we highlight how insights from biology can inform computational and artificial intelligence models, enabling the development of general intelligence systems that prioritise a combination of predictors, enhancing the ability of inexperienced cognitive systems to navigate complex environments.
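The error-mitigating effect of combining several weak predispositions can be sketched with a toy simulation (our illustration, not the authors' model; the cue identities and accuracies below are assumed for the example): each hypothetical cue detects animacy with some probability, and a majority vote over cues makes fewer mistakes than the best single cue.

```python
import random

random.seed(0)

# Hypothetical per-cue accuracies for three illustrative predispositions,
# e.g. self-propelled motion, speed change, hollowness (assumed values).
CUE_ACCURACIES = [0.70, 0.65, 0.75]

def predict(truth, acc):
    """A noisy binary cue: reports the true label with probability `acc`."""
    return truth if random.random() < acc else (not truth)

def error_rate(accs, trials=20000):
    """Monte Carlo error rate of a majority vote over the given cues."""
    wrong = 0
    for _ in range(trials):
        truth = random.random() < 0.5           # animate or not, 50/50
        votes = sum(predict(truth, a) for a in accs)  # count "animate" votes
        guess = votes > len(accs) / 2           # majority decision
        wrong += guess != truth
    return wrong / trials

single = error_rate([max(CUE_ACCURACIES)])      # best single cue alone
combined = error_rate(CUE_ACCURACIES)           # majority vote over all cues
```

With these assumed accuracies the best single cue errs about 25% of the time, while the three-cue majority vote errs about 21%: even weak extra predictors reduce both false positives and false negatives.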
Through a glass, darkly: Approximations, hacks, and workarounds in intuitive physics and imagination
Tomer Ullman1; 1Harvard University
People handle everyday interactions with ordinary objects with remarkable ease. One current model of human 'intuitive physics' proposes that people carry out a mental simulation, moving objects in the mind step by step. This proposal is motivated by the use of world simulators in other areas, including game engines. While successful in many cases, even the champions of this proposal recognize that humans cannot be running a perfect simulation. I consider several principled bounds and approximations that may underlie imperfect mental simulation in humans, including approximate bodies in tracking, lazy evaluation in imagery, and bounds on the number of objects that can be simulated at once. I will also discuss computational models that capture these approximations, and behavioral studies that bring empirical evidence to bear on these arguments.
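The idea of a capacity bound on simulated objects can be caricatured in code (a hypothetical toy, not one of the models discussed in the talk; the capacity value and attention rule are assumptions): a step-by-step physics rollout that only advances the K objects nearest the current focus of attention and freezes the rest.

```python
CAPACITY = 3      # assumed bound on simultaneously simulated objects
GRAVITY = -9.8    # m/s^2
DT = 0.1          # simulation time step, s

def step(objects, attention_x, k=CAPACITY):
    """Advance one time step, but only for the k objects whose horizontal
    position is closest to `attention_x`; the rest are left frozen."""
    tracked = sorted(objects, key=lambda o: abs(o["x"] - attention_x))[:k]
    tracked_ids = {id(o) for o in tracked}
    out = []
    for o in objects:
        if id(o) in tracked_ids:
            vy = o["vy"] + GRAVITY * DT                 # simulated: falls
            out.append({"x": o["x"], "y": max(0.0, o["y"] + vy * DT), "vy": vy})
        else:
            out.append(dict(o))                          # outside capacity: frozen
    return out

# Six objects dropped from the same height; attention is fixed near x = 0.
state = [{"x": float(i), "y": 10.0, "vy": 0.0} for i in range(6)]
for _ in range(5):
    state = step(state, attention_x=0.0)
```

After five steps, only the three attended objects have fallen; the unattended ones remain, incorrectly, at their initial height, mimicking how a resource-bounded imaginer can produce systematic physical errors.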
The efficient coding of visual textures in rats, chicks and human infants
Judit Gervain1, Eugenio Piasini2; 1University of Padua, Italy, 2SISSA
Neural encoding should be adapted to the statistics of natural stimuli to efficiently process sensory inputs. This principle, known as efficient coding, successfully explains many aspects of the early stages of sensory processing. Recently, human adults’ sensitivity to visual textures defined by multipoint correlations has been found to match the variability of these correlations in natural images, suggesting that visual perception obeys efficient coding principles at the cortical level, too. How does efficient coding of visual textures arise ontogenetically? Do young infants already show adult-like sensitivities early in development, or is experience needed for these to emerge? We measured human infants’ sensitivity to the multipoint correlations (one-, two-, three-, and four-point) previously tested in adults. In one study, we measured 6-12-month-old infants’ spontaneous looking-time preference for images with different multipoint correlations, and found that infants preferred (i.e., showed the longest looking times for) images defined by 1-point correlations (i.e., light-intensity statistics) over all other patterns, while they showed the shortest looking times for 2-point correlations, i.e., lines. Preliminary observations in a second study, testing whether infants can discriminate these correlation patterns from one another and from white noise, suggest that all statistical patterns are discriminated. These results suggest that early sensitivity (discrimination ability) to image statistics is already present from the onset of relatively mature visual acuity (4-6 months) in human development. Preference patterns may reflect additional developmental specificities, different from those of adults, whose functional relevance remains to be better understood.
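One common way to quantify such texture statistics (a minimal sketch in the spirit of glider-based texture analysis; this is our illustration, not the authors' analysis code) codes binary pixels as ±1, so that an n-point statistic is the average product of n pixels in a fixed spatial configuration. White noise then gives values near 0 for all orders, while a structured pattern such as vertical stripes gives extreme 2- and 4-point values.

```python
import random

random.seed(1)

def make_noise(h, w):
    """A binary white-noise image with pixels coded as +1/-1."""
    return [[random.choice([-1, 1]) for _ in range(w)] for _ in range(h)]

def one_point(img):
    """1-point statistic: mean pixel value (light-intensity statistic)."""
    return sum(map(sum, img)) / (len(img) * len(img[0]))

def two_point_horiz(img):
    """2-point statistic: average product of horizontally adjacent pixels."""
    h, w = len(img), len(img[0])
    return sum(img[i][j] * img[i][j + 1]
               for i in range(h) for j in range(w - 1)) / (h * (w - 1))

def four_point(img):
    """4-point statistic: average product over 2x2 pixel blocks."""
    h, w = len(img), len(img[0])
    return sum(img[i][j] * img[i][j + 1] * img[i + 1][j] * img[i + 1][j + 1]
               for i in range(h - 1) for j in range(w - 1)) / ((h - 1) * (w - 1))

noise = make_noise(64, 64)
# Vertical stripes: alternating +1/-1 columns.
stripes = [[1 if j % 2 == 0 else -1 for j in range(64)] for _ in range(64)]
```

For the stripes, the 2-point horizontal statistic is exactly -1 and the 4-point statistic exactly +1, while all statistics of the noise image hover near 0, which is the sense in which such textures are "defined by" specific multipoint correlations.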
Mechanisms of unsupervised learning exposed via controlled rearing experiments
Davide Zoccolan1; 1SISSA
Unsupervised learning of mid-level visual representations is becoming a central topic in cognitive, computational and neurodevelopmental studies. In this talk, I will present convergent evidence from controlled-rearing studies in animal models, showing how visual cortex harnesses the statistical structure of the environment, without the need for explicit training targets or rewards, to learn shape representations. Such statistical regularities occur in both the spatial and temporal domains. Degrading the temporal continuity of visual experience during passive exposure in early postnatal life impairs the development of complex cells in the rat primary visual cortex while leaving simple cells unaffected. This observation emphasizes the significance of unsupervised learning from structured dynamic stimuli in the development of transformation tolerance. As further evidence of the relevance of temporal dynamics in shaping visual cortical representations, I will discuss how deeper brain regions encode slower features of dynamic sensory input, as both stimulus-driven responses and intrinsic neuronal activity extend over longer time scales across visual cortical areas in rats and mice. Overall, these findings indicate that unsupervised temporal learning is crucial for shaping mid-level visual representations by fostering neuronal plasticity driven by the natural spatiotemporal patterns of visual inputs.
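The slowness principle often invoked in unsupervised temporal learning can be sketched as follows (a toy illustration with assumed signals and mixing coefficients, not the speaker's model): among unit-variance linear projections of a multichannel signal, the "slow feature" is the projection whose value changes least from one time step to the next, and picking it recovers a slowly varying latent that was mixed with a fast one.

```python
import math

T = 2000
# Two hypothetical latent signals: one slow, one fast.
slow = [math.sin(2 * math.pi * t / 400) for t in range(T)]
fast = [math.sin(2 * math.pi * t / 7) for t in range(T)]

# Observed channels are orthogonal mixtures of the latents (assumed weights).
x1 = [0.6 * s + 0.8 * f for s, f in zip(slow, fast)]
x2 = [0.8 * s - 0.6 * f for s, f in zip(slow, fast)]

def slowness(w1, w2):
    """Mean squared temporal derivative of the variance-normalized
    projection y(t) = w1*x1(t) + w2*x2(t); smaller means slower."""
    y = [w1 * a + w2 * b for a, b in zip(x1, x2)]
    mean = sum(y) / len(y)
    var = sum((v - mean) ** 2 for v in y) / len(y)
    y = [(v - mean) / math.sqrt(var) for v in y]
    return sum((y[t + 1] - y[t]) ** 2 for t in range(len(y) - 1)) / (len(y) - 1)

# Scan unit projection directions and keep the slowest one.
best = min((slowness(math.cos(a), math.sin(a)), a)
           for a in [i * math.pi / 180 for i in range(180)])
w = (math.cos(best[1]), math.sin(best[1]))
```

The slowest projection found by the scan aligns with the direction (0.6, 0.8) that unmixes the slow latent, with no labels or rewards involved: temporal continuity alone identifies the feature, which is the intuition behind accounts of how complex-cell-like tolerance could be learned from structured dynamic input.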