Understanding the Training of Neural Nets to Solve PDEs

The University of Manchester

About the Project

The UKRI AI Centre for Doctoral Training (CDT) in Decision Making for Complex Systems is a joint CDT between The University of Manchester and the University of Cambridge. The CDT provides funding for four years of advanced studies towards a PhD. The first year consists of a taught programme at Manchester covering the fundamentals of machine learning. This year is followed by three years of research at either Manchester or Cambridge. Please note the research element of the PhD will take place at the host institution of the supervisor listed for each project.


In the last few years, there has been a surge of literature on provable training of various kinds of neural nets in certain regimes of their widths or depths, or for very specifically structured data. This quest for provable deep learning is turning out to be an exciting pathway into hitherto unexplored regimes of mathematics. Motivated by an abundance of experimental studies, it has often been surmised that Stochastic Gradient Descent (SGD) on neural net losses – with proper initialization and learning rate – converges to a low-complexity solution, one that generalizes, when such a solution exists (ZLR+17). Only very recently have there appeared proofs of convergence of SGD on neural losses that assume nothing about either the width or the data. Rarer still are studies of the limits on the quality of neural nets obtainable by any algorithm at all. In this Ph.D. project, the student will explore both of the above questions – specifically for setups being developed for scientific ML, such as Physics-Informed Neural Nets (PINNs) and Neural Operators (NOs) (CLL, RPO23).
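To make the PINN setup concrete, the sketch below writes down a PINN loss for a depth-2 tanh net on the 1D Poisson problem u'' = f with zero boundary values. The width, collocation grid, and source term are illustrative choices, not part of the project description; derivatives are hand-coded here rather than taken by automatic differentiation, purely to keep the sketch self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)
m = 16                                         # width of the depth-2 net (illustrative)
w = rng.normal(size=m)                         # hidden weights
b = rng.normal(size=m)                         # hidden biases
a = rng.normal(size=m) / m                     # outer weights

def u(x):
    # u(x) = sum_i a_i * tanh(w_i * x + b_i)
    return np.tanh(np.outer(x, w) + b) @ a

def u_xx(x):
    # second derivative in closed form: tanh''(z) = -2 tanh(z) (1 - tanh(z)^2)
    t = np.tanh(np.outer(x, w) + b)
    return (-2.0 * t * (1.0 - t**2) * w**2) @ a

# illustrative source term: u'' = f with true solution u(x) = sin(pi x)
f = lambda x: -np.pi**2 * np.sin(np.pi * x)

xs = np.linspace(0.0, 1.0, 64)                 # interior collocation points
interior = np.mean((u_xx(xs) - f(xs))**2)      # PDE residual term
boundary = u(np.array([0.0]))[0]**2 + u(np.array([1.0]))[0]**2
pinn_loss = interior + boundary                # the loss a PINN would minimize
```

Training a PINN then amounts to running (S)GD on `pinn_loss` over the net's parameters; the provable-training questions above ask when such descent can be guaranteed to reach a global minimum.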

In (GM22), in a first-of-its-kind result, certain recent developments in the theory of SDEs and Villani functions were leveraged to show that continuous-time SGD converges to the global minima of an appropriately Frobenius-norm-regularized squared loss on any depth-2 neural net with smooth activations – for arbitrary width and data. This immediately leads to the following intriguing question: are there PDEs for which loss functions can be written down, in either the PINN or the NO setup, that are Villani functions?
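The loss class in the (GM22) result above can be sketched as follows: a squared loss on a depth-2 tanh net plus a Frobenius-norm penalty on the weights. Everything concrete here (width 8, synthetic sine data, regularization weight 1e-3, finite-difference gradients in place of true SGD) is an illustrative assumption; the point is only the shape of the objective and that plain gradient steps reduce it.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=50)            # toy 1D inputs (assumption)
Y = np.sin(2.0 * X)                            # toy targets (assumption)
m = 8                                          # width (assumption)
lam = 1e-3                                     # regularization weight (assumption)

def loss(p):
    # p packs (hidden weights w, hidden biases b, outer weights a)
    w, b, a = p[:m], p[m:2*m], p[2*m:]
    pred = np.tanh(np.outer(X, w) + b) @ a
    # squared loss + Frobenius-norm penalty on the weight layers
    return np.mean((pred - Y)**2) + lam * (np.sum(w**2) + np.sum(a**2))

def grad(p, eps=1e-6):
    # central finite differences, used here only to keep the sketch short
    g = np.zeros_like(p)
    for i in range(p.size):
        e = np.zeros_like(p); e[i] = eps
        g[i] = (loss(p + e) - loss(p - e)) / (2.0 * eps)
    return g

p = rng.normal(size=3 * m) * 0.5               # initialization
loss_start = loss(p)
for _ in range(300):                           # plain full-batch gradient descent
    p -= 0.05 * grad(p)
loss_end = loss(p)
```

The (GM22) theorem concerns continuous-time SGD on objectives of exactly this regularized form; the question posed above is whether PINN or NO losses for some PDEs fall into the same Villani class.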

Identification of the kinds of PDEs where the above question has an affirmative answer immediately and significantly pushes further our understanding of what PDEs are possibly efficiently solvable via neural nets. One can also see works like (BHT23) for another perspective on this question which can also be explored.

In works like (MR23) the authors have initiated a study of provably necessary architectural properties of the nets involved in a DeepONet setup (an instantiation of the Neural Operator formalism) for low-error predictors to exist at all. On the other hand, motivated by failure modes of using neural nets in medical imaging (MRR+21), a novel approach to investigating the limitations of training neural nets was demonstrated in (CAH22). It was shown that there are well-conditioned imaging problems for which accurate and stable neural networks provably exist, yet for any integer K > 2 one can construct such a class of imaging problems on which no algorithm can compute a neural network providing a 10^(−K)-accurate solution, while a training algorithm does exist that computes a 10^(−(K−1))-accurate neural network using an arbitrarily large amount of data, and a 10^(−(K−2))-accurate neural network can be trained from just a few samples. Notably, these results hold for any distribution of the training data.

The proof techniques of (CAH22) stem from the mathematics behind the Solvability Complexity Index (SCI) hierarchy (BACH+15). The second part of this Ph.D. plan aims to extend these methods to scientific ML and discover provable limitations of solving PDEs via deep-learning methods. It is envisioned that this framework will reveal stability vs. accuracy trade-offs in solving PDEs via neural nets.

Lastly, we note that similar questions as above can also be asked in the context of Koopman operators for nonlinear dynamical systems (BBKK22, CT24). 
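For intuition on the Koopman setting, a standard data-driven approximation is Extended Dynamic Mode Decomposition (EDMD): fit a linear map on a dictionary of observables by least squares. The sketch below (the dynamics, dictionary, and sample size are illustrative choices) uses a nonlinear map whose Koopman operator is exact on the dictionary {x, y, x²}, so the least-squares fit recovers it to numerical precision.

```python
import numpy as np

rng = np.random.default_rng(0)
a, b, c = 0.9, 0.5, 0.3                        # parameters of the toy map (assumption)

# nonlinear dynamics: x' = a*x,  y' = b*y + c*x^2
X = rng.uniform(-1.0, 1.0, size=(2, 200))      # snapshot states (rows: x, y)
Xn = np.vstack([a * X[0], b * X[1] + c * X[0]**2])

def psi(Z):
    # dictionary of observables spanning a Koopman-invariant subspace
    return np.vstack([Z[0], Z[1], Z[0]**2])

PX, PY = psi(X), psi(Xn)
# EDMD: least-squares fit of K such that psi(next state) ≈ K @ psi(state)
K = PY @ np.linalg.pinv(PX)

# exact Koopman matrix on this dictionary, for comparison:
# x -> a*x,  y -> b*y + c*x^2,  x^2 -> a^2 * x^2
K_true = np.array([[a, 0.0, 0.0],
                   [0.0, b, c],
                   [0.0, 0.0, a**2]])
```

For generic nonlinear systems no finite dictionary is invariant, and the questions above (trainability, and provable limits of accuracy) reappear when the dictionary itself is parameterized by a neural net.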

Entry requirements

Applicants should have, or expect to achieve, at least a 2.1 honours degree or a master’s (or international equivalent) in a relevant science or engineering related discipline.

How to Apply

As the CDT has only recently been awarded we strongly encourage you to contact the supervisor of the project you are interested in with your CV and supporting documents. You will have a chance to meet with prospective supervisors prior to submitting an application – further details will be provided.

Equality, diversity and inclusion is fundamental to the success of The University of Manchester, and is at the heart of all of our activities. We know that diversity strengthens our research community, leading to enhanced research creativity, productivity and quality, and societal and economic impact.

We actively encourage applicants from diverse career paths and backgrounds and from all sections of the community, regardless of age, disability, ethnicity, gender, gender expression, sexual orientation and transgender status.

We also support applications from those returning from a career break or other roles. We consider offering flexible study arrangements (including part-time: 50%, 60% or 80%, depending on the project/funder).
