2025.
Gianluca Manzan, LISN
When: Friday, December 12th 2025, at 4:00 pm
Where: LISN, building 660, room 2014 (2nd floor)
From Hopfield Inference to Federated Learning: challenges and solutions in Teacher–Student models
In this work, we investigate inference in neural networks through the teacher–student framework, which provides a controlled setting to quantify how a student model learns the underlying signal from data generated by a teacher. Beginning with the Hopfield model, interpreted as a dual formulation of associative memory, we characterize the transition between non-informative and learning phases as a function of dataset size, noise level, and temperature. Extending the analysis to Restricted Boltzmann Machines, we show how choices in unit priors and regularization shape the emergence of the signal-retrieval phase and thus determine learning efficiency. We then address the limitations of the single-student scenario by introducing a collective learning strategy in which multiple student networks are coupled during inference. Recent advances in the statistical physics of learning show that interactions among students enhance generalization, thereby facilitating recovery of the teacher. Our analysis of y interacting Hopfield students confirms this cooperative effect, demonstrating that coupling expands the region of successful inference by lowering data requirements. This collective perspective has a natural application to federated learning (FL), a decentralized paradigm where multiple clients collaboratively train local models without sharing their private data. In FL, each client performs local updates based on its own dataset and communicates only through model parameters. The cooperative mechanism observed in coupled teacher–student systems provides a theoretical analogue to this framework: just as interacting students benefit from mutual alignment, federated clients collectively enrich the global solution while retaining data locality.
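To make the federated mechanism described in the abstract concrete, here is a minimal, hedged sketch of a FedAvg-style loop in Python/numpy: each client runs a few local gradient steps on its own private data, and only the parameters are communicated and averaged. The linear teacher, the quadratic loss, and all sizes and learning rates are illustrative assumptions, not the setup studied in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a linear teacher generates labels; each client keeps a private shard.
N, n_clients, n_per_client = 50, 5, 200
teacher = rng.standard_normal(N) / np.sqrt(N)
shards = []
for _ in range(n_clients):
    X = rng.standard_normal((n_per_client, N))
    shards.append((X, X @ teacher))

def local_update(w, X, y, lr=0.05, steps=10):
    """A few local gradient steps on the client's own quadratic loss (raw data never leaves the client)."""
    for _ in range(steps):
        w = w - lr * X.T @ (X @ w - y) / len(y)
    return w

w_global = np.zeros(N)
for rnd in range(20):
    # Each client refines the shared parameters on its own data ...
    local_models = [local_update(w_global.copy(), X, y) for X, y in shards]
    # ... and only the model parameters are communicated and averaged.
    w_global = np.mean(local_models, axis=0)
    print(rnd, np.linalg.norm(w_global - teacher))  # distance to the teacher shrinks over rounds
```

In this toy picture, parameter averaging plays the role that the explicit coupling plays for the interacting students: it aligns the clients without any exchange of raw data.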
Jerome Garnier-Brun, Bocconi University
When: Thursday, December 4th 2025, at 4:00 pm
Where: LPTMS, Salle des Séminaires (2nd floor)
Uncovering Structure: How Neural Networks Learn and Generalize from Tree-based Data
Statistical-physics approaches have provided key insights into the functioning of neural networks, yet most analyses assume high-dimensional random data. In contrast, real-world data possess rich underlying structure that likely shapes how learning and generalization unfold. In an attempt to bridge this gap, this talk will focus on models trained on tree-based data, a setting where, importantly, Bayes-optimal performance can still be computed exactly. By introducing a controlled filtering procedure that tunes the degree of correlation in the data, we first probe how transformers progressively uncover structure in both supervised and self-supervised inference tasks. The results reveal a hierarchical discovery of correlations, first in time during training and then in space across attention layers, which closely mirrors the exact inference algorithm. In a second part, we turn to generative diffusion models, where the same controlled data model exposes a novel biased generalization regime that precedes overt overfitting. There, access to Bayes-optimal benchmarks allows a precise characterization of when and how this bias emerges, suggesting new directions for optimizing diffusion schedules.
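As a rough illustration of what tree-based data can mean, here is a short Python sketch of a generic broadcasting process on a regular tree; this is a stand-in, not the specific hierarchical model or filtering procedure used in the talk, and the branching factor, alphabet size, and correlation parameter eps are placeholder assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def broadcast_on_tree(depth, branching=2, q=4, eps=0.6):
    """Generic broadcasting process on a regular tree: each child copies its
    parent's symbol with probability 1 - eps, otherwise resamples uniformly.
    The leaves form one correlated 'sentence'; eps tunes the correlation strength."""
    level = [rng.integers(q)]                      # root symbol
    for _ in range(depth):
        nxt = []
        for parent in level:
            for _ in range(branching):
                keep = rng.random() > eps
                nxt.append(parent if keep else int(rng.integers(q)))
        level = nxt
    return np.array(level)                         # leaf sequence, read left to right

leaves = broadcast_on_tree(depth=5)
print(len(leaves), leaves[:16])
```

On tree-structured data of this kind, belief propagation up and down the tree computes the Bayes-optimal posterior exactly, which is what makes an exact benchmark available.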
Christophe Giraud, Paris Saclay University
When: Thursday, November 13th 2025, at 4:00 pm
Where: LISN, building 660, room 2014 (2nd floor)
Clustering and Community Recovery in Polynomial Time below the Kesten-Stigum Threshold
Predictions based on the cavity and replica methods stipulate that clustering in the Gaussian Mixture Model and community recovery in the Stochastic Block Model cannot be achieved in polynomial time below the Kesten-Stigum (KS) threshold. These predictions have stimulated an active line of mathematical research, and have been confirmed rigorously in a wide range of regimes: spectral algorithms succeed above the KS threshold, while computational hardness below the KS threshold has been proved within the low-degree polynomials framework. However, the picture changes when the number of clusters or communities is large. In such regimes, the computational barrier appears to lie below the KS threshold, and seems to be disconnected from spectral algorithms. In this talk, I will present these recent results and discuss some related open questions.
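For orientation, one standard way of stating the Kesten–Stigum threshold in the symmetric Stochastic Block Model with k balanced communities is the following; this is a textbook form of the condition, quoted as background rather than taken from the talk.

```latex
% Symmetric SBM: k balanced communities, within-group edge probability a/n,
% between-group edge probability b/n. Belief propagation / spectral methods are
% expected to detect the communities in polynomial time precisely when the
% signal-to-noise ratio exceeds one:
\[
  \mathrm{SNR} \;=\; \frac{(a-b)^{2}}{k\,\bigl(a+(k-1)\,b\bigr)} \;>\; 1
  \qquad\text{(Kesten--Stigum condition)}.
\]
```

The talk concerns regimes with many communities or clusters, where the computational barrier is argued to sit below this line.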
Damien Barbier, Bocconi University
When: Thursday, November 6th 2025, at 4:00 pm
Where: LISN, building 660, room 2014 (2nd floor)
The strange case of the symmetric binary perceptron: when standard statistical mechanics fails
We define and study a statistical mechanics ensemble that characterizes connected solutions in constraint satisfaction problems (CSPs). Built around a well-known local entropy bias, it allows us to better identify hardness transitions in problems where the energy landscape is dominated by isolated solutions. We apply this new device to the symmetric binary perceptron model (SBP), and study how its manifold of connected solutions behaves. We choose this particular problem because, while its typical solutions are isolated, it can be solved by local algorithms for a certain range of constraint density α and threshold κ. With this new ensemble, we unveil the presence of a cluster composed of delocalized connected solutions. In particular, we demonstrate its stability up to a critical threshold κ_{no-mem loc. stab.} (which depends on α). This transition occurs as paths of solutions shatter, a phenomenon that more conventional statistical mechanics approaches fail to grasp. Finally, we compare our predictions to simulations, using a modified Monte Carlo algorithm designed specifically to target these delocalized solutions. As predicted, the algorithm finds solutions up to κ ≈ κ_{no-mem loc. stab.}.
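For readers who have not met the model, the solution space of the symmetric binary perceptron is usually defined as follows; this is the standard definition of the model, independent of the new ensemble introduced in the talk.

```latex
% M = alpha*N i.i.d. Gaussian patterns x^mu in R^N; a binary weight vector w is
% a solution if every rescaled pre-activation falls inside the symmetric window
% of half-width kappa:
\[
  S(\kappa,\alpha) \;=\;
  \Bigl\{\, w \in \{-1,+1\}^{N} \;:\;
    \Bigl|\tfrac{1}{\sqrt{N}}\, w \cdot x^{\mu}\Bigr| \le \kappa
    \quad \forall\, \mu = 1,\dots,\alpha N \,\Bigr\}.
\]
```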
Jorge Fernandez-de-Cossio-Diaz, IPhT
When: Thursday, October 30th 2025, at 4:00 pm
Where: LISN, building 660, room 2014 (2nd floor)
Generative models of RNA
Riboswitches are structured allosteric RNA molecules that change conformation upon metabolite binding, triggering a regulatory response. We used Restricted Boltzmann Machines to generate novel riboswitch aptamers. We experimentally validated their ability to respond to the ligand with a conformational change, like natural riboswitches, achieving a success rate of ~30% among generated sequences. In a second part of the talk (if time allows), I will briefly cover ongoing work on the phase diagram of undersampled Boltzmann machines trained on data. Reference: J. Fernandez-de-Cossio-Diaz, P. Hardouin, et al. (Nature Communications, 2025, accepted); bioRxiv: 2023.05.10.540155.
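To illustrate how a trained RBM generates new samples, here is a minimal block Gibbs sampler for a Bernoulli-Bernoulli RBM in Python/numpy. The random weights, the unit counts, and the plain binary visibles (rather than one-hot encoded nucleotides) are simplifying assumptions, not the trained model from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_sample(W, b_v, b_h, n_steps=200):
    """Block Gibbs sampling in a Bernoulli-Bernoulli RBM: alternately resample
    hidden units given visibles and visibles given hiddens, then return the
    final visible configuration as one generated sample."""
    n_v, n_h = W.shape
    v = (rng.random(n_v) < 0.5).astype(float)       # random initial visible state
    for _ in range(n_steps):
        h = (rng.random(n_h) < sigmoid(v @ W + b_h)).astype(float)
        v = (rng.random(n_v) < sigmoid(W @ h + b_v)).astype(float)
    return v

# Toy weights standing in for a trained model (the RNA application would use
# one-hot encoded nucleotides and weights learned from aptamer alignments).
n_v, n_h = 120, 30
W = 0.1 * rng.standard_normal((n_v, n_h))
sample = gibbs_sample(W, np.zeros(n_v), np.zeros(n_h))
print(sample[:20])
```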
Beatrice Achilli, Bocconi University
When: Friday, April 25th 2025, at 11:00 am
Where: IPhT, Salle Itzykson (note the change of location!)
Life, death and miracles of diffusion models
In recent years, generative diffusion models have emerged as powerful tools in unsupervised learning, achieving impressive results in image, text, and data generation. In this presentation, I will analyze, from a geometric and statistical-physics perspective, how these models learn, generalize, and eventually memorize data. In particular, I will highlight how the intrinsic structure of manifold data impacts the dynamics of the diffusion process. Our analysis reveals the existence of distinct phases during the generative process, in particular a manifold coverage phase, where the diffusion process fits the distribution internal to the manifold, and a consolidation phase, where the score becomes orthogonal to the manifold. By mapping diffusion models driven by the empirical score function onto the Random Energy Model (REM), we are able to characterize memorization and generalization timescales. These insights clarify the role of data structure in mitigating the curse of dimensionality and contribute to a deeper understanding of how diffusion models capture complex data distributions.
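The empirical score function mentioned above has a closed form: for a finite training set it is the score of a mixture of Gaussians centred on the (rescaled) training points. Below is a hedged numpy sketch of that formula under an Ornstein-Uhlenbeck forward process, which is one common choice; the schedule and the placeholder training set are assumptions. The softmax over training points is the log-sum-exp structure that underlies the mapping onto the Random Energy Model.

```python
import numpy as np

def empirical_score(x, data, t):
    """Exact score of the empirical distribution under an Ornstein-Uhlenbeck
    forward process x_t = a_t * x_0 + sigma_t * noise, with a_t = exp(-t) and
    sigma_t^2 = 1 - exp(-2t)."""
    a_t = np.exp(-t)
    sig2 = 1.0 - a_t ** 2
    diffs = a_t * data - x                    # (n, d): a_t * x_mu - x
    logw = -np.sum(diffs ** 2, axis=1) / (2 * sig2)
    w = np.exp(logw - logw.max())
    w /= w.sum()                              # posterior weight of each training point
    return (w[:, None] * diffs).sum(axis=0) / sig2

rng = np.random.default_rng(3)
data = rng.standard_normal((500, 16))         # placeholder training set
print(empirical_score(rng.standard_normal(16), data, t=0.5))
```

At late times the weights spread over many training points (generalization); at early times a single point dominates, which is the memorization regime the REM analogy makes quantitative.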
Pierfrancesco Urbani, IPhT Saclay
When: Friday, March 21 2025, at 3:00 pm
Where: LISN, building 660, room 2014 (2nd floor)
Generalization and overfitting in overparametrized two-layer neural networks
Understanding the generalization properties of large, overparametrized neural networks is a central problem in theoretical machine learning. Several insightful ideas have been proposed in this regard, among them the implicit regularization hypothesis, the possibility of benign overfitting, and the existence of feature learning regimes where neural networks learn the latent structure of data. However, a precise understanding of the emergence and validity of these behaviors cannot be disentangled from the study of the non-linear training dynamics. We use a technique from statistical physics, dynamical mean field theory, to study the training dynamics and obtain a rich picture of how generalization and overfitting arise in large overparametrized models. In particular, we point out: (i) the emergence of a separation of timescales controlling feature learning and overfitting, (ii) a non-monotone behavior of the test error and, correspondingly, a 'feature unlearning' phase at large times, and (iii) the emergence of an algorithmic inductive bias towards small complexity. Joint work with Andrea Montanari.
Hugo Cui, Harvard
When: Friday, March 14 2025, at 2:00 pm
Where: LISN, building 660, room 2014 (2nd floor)
Learning diffusion models: asymptotic insights
We consider the problem of learning a generative model parametrized by a two-layer auto-encoder, and trained with online stochastic gradient descent, to sample from a high-dimensional data distribution with an underlying low-dimensional structure. We provide a tight asymptotic characterization of low-dimensional projections of the resulting generated density, and evidence how mode(l) collapse can arise. On the other hand, we discuss how, in a case where the architectural bias is suited to the target density, these simple models can efficiently learn to sample from a binary Gaussian mixture target distribution. Based on joint works with Yue M. Lu, Cengiz Pehlevan, Lenka Zdeborová, Florent Krzakala and Eric Vanden-Eijnden.
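As a toy picture of this setting (not the exact architecture or the asymptotic analysis of the papers cited), here is a short numpy sketch of online SGD on a tied-weight two-layer denoising auto-encoder, trained to recover samples of a binary Gaussian mixture from their diffused versions. The bottleneck width, noise scale, schedule, and learning rate are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)
d, p = 100, 8                                      # data dimension, bottleneck width (assumptions)
mu = rng.standard_normal(d) / np.sqrt(d)           # hidden low-dimensional direction

def sample_pair(t=0.5):
    """One clean sample x0 from a binary Gaussian mixture, plus its diffused version xt."""
    s = rng.choice([-1.0, 1.0], size=(1, 1))
    x0 = s * mu + 0.3 * rng.standard_normal((1, d))
    noise = rng.standard_normal((1, d))
    xt = np.exp(-t) * x0 + np.sqrt(1 - np.exp(-2 * t)) * noise
    return xt, x0

W = rng.standard_normal((p, d)) / np.sqrt(d)       # encoder weights (decoder is tied: W.T)
lr = 0.01
for step in range(5000):                           # online SGD: a fresh sample at every step
    xt, x0 = sample_pair()
    h = np.tanh(xt @ W.T)                          # (1, p) hidden representation
    xhat = h @ W                                   # tied-weight reconstruction
    err = xhat - x0                                # denoising error
    grad_W = h.T @ err + ((err @ W.T) * (1 - h ** 2)).T @ xt
    W -= lr * grad_W
    if step % 1000 == 0:
        print(step, float(np.mean(err ** 2)))      # per-coordinate denoising loss
```

Tracking how the rows of W align with the mixture direction mu over training is one simple way to visualize whether the generated density covers both modes or collapses onto one.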
2024.
Mini-workshop on Class Imbalance
When: Friday, November 15 2024, all day
Where: LPTMS, petit Amphi (1st floor)
Emanuele Francazi – A theoretical analysis of the learning dynamics under class imbalance
Stefano Sarao-Mannelli – Bias-inducing geometries: exactly solvable data model with fairness implications
Mauro Pastore – Restoring balance: principled under/oversampling of data for optimal classification
Francesco Saverio Pezzicoli – Anomaly-Detection Class Imbalance in Exactly Solvable Models
Gabriele Sicuro, University of Bologna
When: Friday, October 4 2024, at 11:00 am
Where: LISN, building 660, room 2014 (2nd floor)
Heavy-tailed covariates in high dimensions
Theoretical models in machine learning very often assume a dataset drawn from a Gaussian distribution, or from a Gaussian mixture. The possible limitations of such a Gaussian assumption have recently been the object of investigation and theoretical characterization, leading to a number of "Gaussian universality" results. In this talk I will present an analytical treatment of the high-dimensional performance of simple architectures on heavy-tailed datasets, showing that even simple generalized linear models exhibit a striking dependence on non-Gaussian features in both classification and regression tasks.
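A hedged numerical sketch of the kind of comparison the abstract alludes to: the same generalized linear model (plain logistic regression fitted by gradient descent) on covariates that are either Gaussian or elliptically heavy-tailed (Gaussian rows rescaled by an inverse-chi factor, giving Student-t-like tails). The sizes, the degrees of freedom nu, and the linear teacher are illustrative assumptions rather than the settings analyzed in the talk.

```python
import numpy as np

rng = np.random.default_rng(6)

def experiment(n=2000, d=400, nu=3.0, heavy=True):
    """Toy high-dimensional classification: labels come from the same linear
    teacher in both cases; only the tails of the covariates differ."""
    teacher = rng.standard_normal(d) / np.sqrt(d)
    X = rng.standard_normal((n, d))
    if heavy:
        # Elliptical heavy-tailed covariates: Gaussian rescaled row-wise by an inverse-chi factor.
        X = X / np.sqrt(rng.chisquare(nu, size=(n, 1)) / nu)
    y01 = (np.sign(X @ teacher) + 1) / 2            # labels in {0, 1}
    w = np.zeros(d)
    for _ in range(500):                            # plain logistic-regression gradient descent
        p = 1 / (1 + np.exp(-np.clip(X @ w, -30, 30)))
        w -= 0.05 * X.T @ (p - y01) / n
    Xte = rng.standard_normal((5000, d))
    if heavy:
        Xte = Xte / np.sqrt(rng.chisquare(nu, size=(5000, 1)) / nu)
    return np.mean(np.sign(Xte @ w) != np.sign(Xte @ teacher))

print("Gaussian test error:    ", experiment(heavy=False))
print("heavy-tailed test error:", experiment(heavy=True))
```

Comparing the two printed errors at fixed n/d gives a crude numerical counterpart to the analytical dependence on non-Gaussian features discussed in the talk.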