MLP@P - Machine Learning Physics @ Plateau
Informal meetings on statistical physics & machine learning
Organized by:
Sergio Chibbaro (LISN)
Cyril Furtlehner (LISN)
Valentina Ros (LPTMS)
Pierfrancesco Urbani (IPhT)
To subscribe to the mailing list, write to valentina.ros@cnrs.fr
Seminar by Beatrice Achilli, Bocconi University
When: Friday, April 25 2025, at 11:00 am
Where: IPhT, salle Itzykson *change of location!!*
Life, death and miracles of diffusion models
In recent years, generative diffusion models have emerged as powerful tools in unsupervised learning, achieving impressive results in the generation of images, text, and other data. In this presentation I will analyze, from a geometric and statistical physics perspective, how these models learn, generalize, and eventually memorize data. In particular, I will highlight how the intrinsic manifold structure of the data impacts the dynamics of the diffusion process. Our analysis reveals the existence of distinct phases during the generative process: a manifold coverage phase, where the diffusion process fits the distribution internal to the manifold, and a consolidation phase, where the score becomes orthogonal to the manifold. By mapping diffusion models driven by the empirical score function onto the Random Energy Model (REM), we are able to characterize memorization and generalization timescales. These insights clarify the role of data structure in mitigating the curse of dimensionality and contribute to a deeper understanding of how diffusion models capture complex data distributions.
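For readers less familiar with the object behind the REM mapping, the empirical score is easy to state explicitly. The conventions below (variance-preserving Ornstein-Uhlenbeck noising) are a standard choice of mine, not necessarily the ones used in the talk:

x_t = e^{-t} x^\mu + \sqrt{1 - e^{-2t}}\, \xi, \qquad \xi \sim \mathcal{N}(0, I_d),

\hat{p}_t(x) = \frac{1}{n} \sum_{\mu=1}^{n} \frac{e^{-E_\mu(x,t)}}{(2\pi \Delta_t)^{d/2}}, \qquad E_\mu(x,t) = \frac{\lVert x - e^{-t} x^\mu \rVert^2}{2 \Delta_t}, \quad \Delta_t = 1 - e^{-2t},

s(x,t) = \nabla_x \log \hat{p}_t(x),

where x^1, ..., x^n are the training points. In this form -\log \hat{p}_t(x) is the free energy of n random energy levels E_\mu(x,t), which is the structural analogy with the Random Energy Model that makes the memorization and generalization timescales accessible.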
Seminar by Pierfrancesco Urbani, IPhT Saclay
When: Friday, March 21 2025, at 15:00
Where: LISN, building 660, room 2014 (2nd floor)
Generalization and overfitting in overparametrized two-layer neural networks
Understanding the generalization properties of large, overparametrized neural networks is a central problem in theoretical machine learning. Several insightful ideas have been proposed in this regard, among them the implicit regularization hypothesis, the possibility of benign overfitting, and the existence of feature learning regimes where neural networks learn the latent structure of the data. However, a precise understanding of the emergence and validity of these behaviors cannot be disentangled from the study of the non-linear training dynamics. We use a technique from statistical physics, dynamical mean field theory, to study the training dynamics and obtain a rich picture of how generalization and overfitting arise in large overparametrized models. In particular we point out: (i) the emergence of a separation of timescales controlling feature learning and overfitting, (ii) a non-monotone behavior of the test error and, correspondingly, a 'feature unlearning' phase at large times, and (iii) the emergence of an algorithmic inductive bias towards small complexity. Joint work with Andrea Montanari.
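As a purely illustrative companion (this is not the speaker's DMFT computation; the architecture, data model and hyperparameters below are my own choices), one can look for the effects in points (i)-(iii) in a small simulation: gradient descent on an overparametrized two-layer network fitting a few noisy samples of a single-index target, with train and test error monitored over long training times.

import numpy as np

rng = np.random.default_rng(0)
d, p, n = 50, 400, 100                      # input dim, hidden width (overparametrized), train size
w_star = rng.normal(size=d) / np.sqrt(d)    # teacher direction of a single-index target

Xtr = rng.normal(size=(n, d))
Xte = rng.normal(size=(2000, d))
ytr = np.tanh(Xtr @ w_star) + 0.3 * rng.normal(size=n)   # noisy training labels
yte = np.tanh(Xte @ w_star)                              # clean test labels

W = rng.normal(size=(p, d)) / np.sqrt(d)    # first-layer weights
a = rng.normal(size=p) / np.sqrt(p)         # second-layer weights

def forward(X):
    return np.tanh(X @ W.T) @ a

lr = 0.05
for t in range(50001):
    H = np.tanh(Xtr @ W.T)                  # hidden activations, shape (n, p)
    err = H @ a - ytr                       # residuals on the training set
    if t % 5000 == 0:
        test = np.mean((forward(Xte) - yte) ** 2)
        print(f"step {t:6d}  train {np.mean(err**2):.4f}  test {test:.4f}")
    # full-batch gradient descent on the mean squared error
    grad_a = H.T @ err / n
    grad_W = ((err[:, None] * (1 - H**2)) * a).T @ Xtr / n
    a -= lr * grad_a
    W -= lr * grad_W

Whether the non-monotone test error actually shows up depends on the noise level, the width and the time horizon; the point of the sketch is only to make the quantities in (i)-(iii) concrete.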
Seminar by Hugo Cui, Harvard
When: Friday, March 14 2025, at 14:00
Where: LISN, building 660, room 2014 (2nd floor)
Learning diffusion models: asymptotic insights
We consider the problem of learning a generative model, parametrized by a two-layer auto-encoder and trained with online stochastic gradient descent, to sample from a high-dimensional data distribution with an underlying low-dimensional structure. We provide a tight asymptotic characterization of low-dimensional projections of the resulting generated density, and provide evidence of how mode(l) collapse can arise. On the other hand, we discuss how, when the architectural bias is suited to the target density, these simple models can efficiently learn to sample from a binary Gaussian mixture target distribution. Based on joint works with Yue M. Lu, Cengiz Pehlevan, Lenka Zdeborová, Florent Krzakala and Eric Vanden-Eijnden.
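To make the setting more concrete, here is a deliberately simplified toy version (entirely my own: a rank-one tied autoencoder with a skip connection, a single fixed noise level, and a symmetric binary Gaussian mixture; the actual architecture and training scheme in the talk may differ): online SGD on the denoising objective, monitoring the overlap between the learned weight and the hidden cluster direction.

import numpy as np

rng = np.random.default_rng(1)
d = 200
mu = rng.normal(size=d) / np.sqrt(d)          # hidden cluster direction (low-dim structure)
delta, alpha = 0.5, 0.5                       # cluster width and fixed noise level

def sample_clean():
    s = rng.choice([-1.0, 1.0])               # binary Gaussian mixture: x = s*mu + noise
    return s * mu + np.sqrt(delta) * rng.normal(size=d)

# rank-one tied autoencoder with a skip connection: f(y) = c*y + w*tanh(w . y)
w = rng.normal(size=d) / d
c = 0.0
lr = 0.05

for step in range(200001):
    x = sample_clean()
    y = np.sqrt(alpha) * x + np.sqrt(1 - alpha) * rng.normal(size=d)   # noised input
    h = np.tanh(w @ y)
    res = c * y + w * h - x                    # denoising residual f(y) - x
    # online SGD step on 0.5*||f(y) - x||^2
    c -= lr * (res @ y) / d
    w -= lr * (res * h + (res @ w) * (1 - h**2) * y) / d
    if step % 20000 == 0:
        overlap = (w @ mu) / (np.linalg.norm(w) * np.linalg.norm(mu))
        print(f"step {step:6d}  overlap(w, mu) = {overlap:+.3f}")

A growing |overlap| signals that the network has picked up the low-dimensional structure of the target; whether the generated density then covers both modes or collapses onto one is the kind of question the asymptotic characterization answers precisely.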
Mini-workshop on Class Imbalance
When: Friday, November 15 2024, all day
Where: LPTMS, small lecture hall (1st floor)
Emanuele Francazi – A theoretical analysis of the learning dynamics under class imbalance
Stefano Sarao-Mannelli – Bias-inducing geometries: exactly solvable data model with fairness implications
Mauro Pastore – Restoring balance: principled under/oversampling of data for optimal classification
Francesco Saverio Pezzicoli – Anomaly-detection class imbalance in exactly solvable models
Seminar by Gabriele Sicuro, University of Bologna
When: Friday, October 4 2024, at 11:00 am
Where: LISN, bat 660 salle 2014 (2° étage)
Heavy-tailed covariates in high dimensions
Theoretical models in machine learning very often assume a dataset drawn from a Gaussian distribution or from a Gaussian mixture. The possible limitations of this Gaussian assumption have recently been the object of investigation and theoretical characterization, leading to a number of "Gaussian universality" results. In this talk I will present an analytical treatment of the high-dimensional performance of simple architectures on heavy-tailed datasets, showing that even simple generalized linear models exhibit a striking dependence on non-Gaussian features in both classification and regression tasks.
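A minimal numerical illustration of the kind of comparison at stake (my own toy setup, not the analysis presented in the talk): ridge regression with Gaussian versus Student-t covariates, rescaled so that both families have the same second moment.

import numpy as np

rng = np.random.default_rng(2)
d, n_train, n_test, lam = 300, 600, 5000, 0.1
w_star = rng.normal(size=d) / np.sqrt(d)      # ground-truth linear teacher

def covariates(n, nu=None):
    # Gaussian covariates if nu is None, otherwise Student-t with nu degrees of
    # freedom, rescaled so both have unit variance per component (nu > 2).
    if nu is None:
        return rng.normal(size=(n, d))
    return rng.standard_t(nu, size=(n, d)) / np.sqrt(nu / (nu - 2))

def ridge_test_error(nu=None):
    Xtr, Xte = covariates(n_train, nu), covariates(n_test, nu)
    ytr = Xtr @ w_star + 0.5 * rng.normal(size=n_train)   # noisy labels
    yte = Xte @ w_star
    w_hat = np.linalg.solve(Xtr.T @ Xtr + lam * n_train * np.eye(d), Xtr.T @ ytr)
    return np.mean((Xte @ w_hat - yte) ** 2)

print("Gaussian covariates  :", ridge_test_error())
print("Student-t (nu = 3)   :", ridge_test_error(nu=3))
print("Student-t (nu = 2.5) :", ridge_test_error(nu=2.5))

If Gaussian universality held exactly in this regime, the three numbers would coincide up to finite-size fluctuations; a systematic gap is the kind of non-Gaussian effect the talk characterizes analytically.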