Active STUDENTSHIP UKRI Gateway to Research

Wasserstein-Type Gradient Flow via Propagation by Chaos for a Continuous Formulation of a Shallow Neural Network


Funder Engineering and Physical Sciences Research Council
Recipient Organization University of Oxford
Country United Kingdom
Start Date Sep 30, 2023
End Date Mar 30, 2027
Duration 1,277 days
Number of Grantees 1
Roles Student
Data Source UKRI Gateway to Research
Grant ID 2879236
Grant Description

Scientific disciplines such as statistics, data analysis and artificial intelligence have seen a surge of progress and popularity over the last few decades. This growth can be attributed to the discovery of the machine learning optimization framework, a simple, scalable and parameterizable method of interpolating data. Its applications span a wide range of fields, including finance, computer programming, 3D graphics, data analysis, health care, epidemiology, law, urban studies, animal behaviour and many more.

Although these algorithms are widely recognised as increasingly ubiquitous and intertwined with our daily lives, we have, rather counterintuitively, yet to develop a satisfactory mathematical model of their behaviour, and research is still at an early stage. This is reflected in the fact that we call these algorithms "black boxes".

It is therefore imperative that we study the machine learning framework and the gradient descent algorithm. To optimize the performance and efficiency of these machines, we must understand how they work internally. This will allow us to adjust them to produce better results with fewer computational resources and less data.

We focus on the specific problem of approximating a dataset with a single layer neural network, using the gradient descent algorithm to minimise the mean square error together with some regularization. A single layer neural network consists of finitely many nodes, each described by a parameter and a weight. Today, one of the most widely used mathematical frameworks for analysing this problem is to employ a mean-field description of the single layer neural network and to investigate the well-posedness and properties of the gradient flow with respect to the square Wasserstein metric.
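
The finite-node setting described above can be illustrated with a small numerical sketch. This is not the project's method, only a minimal example under stated assumptions: the tanh activation, the L2 form of the regularization, the 1/N mean-field scaling, the synthetic one-dimensional data and all function names (`network`, `loss`, `gradient_step`, `train`) are hypothetical choices made here for illustration.

```python
import numpy as np

def network(x, w, theta):
    """Shallow network with mean-field scaling (assumed form):
    f(x) = (1/N) * sum_i w_i * tanh(theta_i * x)."""
    return np.tanh(np.outer(x, theta)) @ w / w.size

def loss(x, y, w, theta, lam):
    """Mean square error plus an assumed L2 regularization on (w, theta)."""
    resid = network(x, w, theta) - y
    return 0.5 * np.mean(resid ** 2) + 0.5 * lam * np.mean(w ** 2 + theta ** 2)

def gradient_step(x, y, w, theta, lam, lr):
    """One explicit gradient descent step on all nodes."""
    n, m = w.size, x.size
    act = np.tanh(np.outer(x, theta))      # shape (samples, nodes)
    resid = act @ w / n - y                # shape (samples,)
    dact = (1.0 - act ** 2) * x[:, None]   # d/dtheta of tanh(theta * x)
    grad_w = act.T @ resid / (m * n) + lam * w / n
    grad_theta = (dact.T @ resid) * w / (m * n) + lam * theta / n
    # Multiply by n so that each node moves at O(1) speed (mean-field time scale).
    return w - lr * n * grad_w, theta - lr * n * grad_theta

def train(n_nodes=50, steps=200, lam=1e-3, lr=0.05, seed=0):
    """Run gradient descent on synthetic data and return the loss history."""
    rng = np.random.default_rng(seed)
    x = np.linspace(-1.0, 1.0, 40)
    y = np.sin(2.0 * x)                    # synthetic target, illustration only
    w = rng.standard_normal(n_nodes)       # weights
    theta = rng.standard_normal(n_nodes)   # parameters
    losses = [loss(x, y, w, theta, lam)]
    for _ in range(steps):
        w, theta = gradient_step(x, y, w, theta, lam, lr)
        losses.append(loss(x, y, w, theta, lam))
    return losses

if __name__ == "__main__":
    history = train()
    print(f"initial loss {history[0]:.4f}, final loss {history[-1]:.4f}")
```

In this sketch the nodes play the role of particles: the empirical distribution of the pairs (w_i, theta_i) is the object whose large-N limit a mean-field description would track.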

However, this representation contains redundancy, in the sense that the neural network is invariant to the variance of the weight of a node. There is considerable evidence that the variance in the weight of a node decreases monotonically during the gradient descent algorithm.

In particular, an initial zero variance in the weight parameter will be propagated throughout the entire gradient flow; that is, we end up with a curve of so-called Young measures.
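
One possible way to make the redundancy and the zero-variance case concrete is sketched below. The notation is an assumption introduced here and does not come from the grant text: $\mu$ is a probability measure on pairs (weight $w$, parameter $\theta$), $\sigma$ is an unspecified activation function, $\nu$ is the marginal of $\mu$ in $\theta$, and $\mu_\theta$ is the conditional law of the weight given $\theta$.

```latex
% Assumed mean-field form of the network output
f_\mu(x) = \int w \, \sigma(\theta \cdot x) \, \mathrm{d}\mu(w, \theta).

% Disintegrate \mu(\mathrm{d}w, \mathrm{d}\theta) = \mu_\theta(\mathrm{d}w) \, \nu(\mathrm{d}\theta):
f_\mu(x) = \int \bar{w}(\theta) \, \sigma(\theta \cdot x) \, \mathrm{d}\nu(\theta),
\qquad
\bar{w}(\theta) = \int w \, \mathrm{d}\mu_\theta(w).

% Zero variance of the weight given \theta corresponds to
\mu_\theta = \delta_{\bar{w}(\theta)} \quad \text{for } \nu\text{-almost every } \theta.
```

Under these assumptions the output depends on the conditional law $\mu_\theta$ only through its mean $\bar{w}(\theta)$, so its variance does not affect the network. When the variance is zero, the measure is concentrated on the graph of $\theta \mapsto \bar{w}(\theta)$, which is one reading of the weights being "expressed as a function of the parameters".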

The aim of this project is to establish a rigorous mathematical proof of the stated conjecture. Moreover, this new framework may provide the missing tools to determine quantitative convergence bounds or properties of the asymptotic limit of the gradient descent algorithm, both of which are still poorly understood. Additionally, this framework gives us the opportunity to analyse the evolution and regularity of the weights of the single layer neural network expressed as a function of the parameters.

The project is novel, in the sense that there are no known sources that study the geometry of Young measures with respect to the gradient descent equation. Furthermore, this research will give us more insight into the geometry of mean-field solutions to the gradient flow equation.

All Grantees

University of Oxford
