Constrained machine learning methods for biomedical data analysis

Danks, Dominic James (2023). Constrained machine learning methods for biomedical data analysis. University of Birmingham. Ph.D.

Danks2023PhD.pdf
Text - Accepted Version
Available under License All rights reserved.

Abstract

In recent years, machine learning (ML) methods have shown great promise in many application settings, with ML-based systems now representing the state of the art in a wide range of intelligence-based computational tasks, from image classification to natural language processing. Their impact in health data science is also becoming more tangible, with clinicians and health professionals beginning to adopt ML-based systems as tools to improve disease understanding and clinical pathways. The importance of such models is likely to continue to grow as health settings become increasingly data-aware and data-driven.

However, the field of ML-based health data science is far from fully developed: many clinically relevant settings have not yet been fully explored, and there have been limited efforts to consider how general ML methodology can be tailored to the health data science setting. In this thesis, we show that by considering the constraints relevant to health-based ML problems and explicitly incorporating them into our models, it is possible to derive novel approaches which tend to retain the performance characteristic of machine learning whilst offering the plausibility and interpretability of model outputs typically associated with more traditional statistical approaches.

This thesis demonstrates this approach in three health-relevant settings. We begin by considering the problem of pseudotemporal modelling, where we highlight that prior knowledge about the pseudotemporal trajectory is often available and can be exploited to aid learning and reduce human burden. Next, we consider the problem of time-to-event modelling (survival analysis) from a novel machine learning perspective and demonstrate that, by suitably constraining a neural network, it is possible to model general survival functions and, in turn, obtain strong model performance with little model tuning or computational difficulty. Finally, we consider the problem of survival analysis in the presence of longitudinal data and present a series of approaches which derive from our constrained general survival model and offer various benefits over established methods.

Type of Work:

Thesis (Doctorates > Ph.D.)

Award Type:

Doctorates > Ph.D.

Supervisor(s):