Mean-Field Neural ODEs via Relaxed Optimal Control
A talk in the Oberseminar Numerik series by
David Siska
Meeting ID 926 5310 0938, Password 1928
Abstract:
We develop a framework for the analysis of deep neural
networks and neural ODE models that are trained with stochastic
gradient algorithms.
We do so by identifying connections between high-dimensional
data-driven control problems, deep learning and the theory of statistical
sampling. In particular, we derive and study a mean-field (overdamped)
Langevin algorithm for solving relaxed data-driven control problems. A key
step in the analysis is to derive Pontryagin's optimality principle for
data-driven relaxed control problems. Subsequently, we study uniform-in-time propagation of chaos for the time-discretised mean-field (overdamped) Langevin dynamics. We derive explicit convergence rates in terms of the learning rate, the number of particles/model parameters and the number of iterations of the gradient algorithm.
In addition, we study the error arising when using a
finite training data set and thus provide quantitative bounds on the
generalisation error. Crucially, the obtained rates are dimension-independent. This is possible by exploiting the regularity of the model with respect to the measure over the parameter space (the relaxed
control).
This is joint work with J.-F. Jabir and L. Szpruch.
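As a rough illustration of the kind of scheme the abstract refers to, the sketch below shows a time-discretised mean-field (overdamped) Langevin update for a particle parametrisation of a one-hidden-layer network. It is not the speaker's implementation; the network, loss, regulariser, step size and noise level are assumptions chosen only to make the example self-contained.

```python
import numpy as np

# Minimal sketch (assumptions): N particles parametrise a mean-field network
# f(x) = (1/N) * sum_i a_i * tanh(w_i @ x); each particle theta_i = (w_i, a_i)
# is updated with a time-discretised overdamped Langevin step
#   theta <- theta - h * grad(loss + reg) + sqrt(2*h/beta) * xi,  xi ~ N(0, I).

rng = np.random.default_rng(0)

N, d, n_data = 50, 3, 200            # particles, input dimension, training points
h, beta, lam = 1e-2, 100.0, 1e-3     # step size, inverse temperature, L2 weight
X = rng.normal(size=(n_data, d))
y = np.sin(X @ np.ones(d))           # synthetic targets (illustration only)

theta = rng.normal(size=(N, d + 1))  # row i = (w_i, a_i)

def predict(theta, X):
    w, a = theta[:, :d], theta[:, d]
    return np.tanh(X @ w.T) @ a / N  # empirical mean over particles

for it in range(1000):
    w, a = theta[:, :d], theta[:, d]
    hidden = np.tanh(X @ w.T)                      # (n_data, N)
    resid = predict(theta, X) - y                  # (n_data,)
    # gradients of the empirical squared loss with respect to each particle
    grad_a = hidden.T @ resid / (n_data * N)
    grad_w = ((resid[:, None] * (1 - hidden**2) * a).T @ X) / (n_data * N)
    grad = np.concatenate([grad_w, grad_a[:, None]], axis=1) + lam * theta
    noise = rng.normal(size=theta.shape)
    theta = theta - h * grad + np.sqrt(2 * h / beta) * noise  # Langevin step

print("final training MSE:", np.mean((predict(theta, X) - y) ** 2))
```

In this toy setting the empirical measure of the N particles plays the role of the relaxed control over the parameter space, and the quantities h, N and the iteration count correspond to the learning rate, number of particles/model parameters and number of gradient iterations appearing in the convergence rates mentioned in the abstract.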