Tonks, Samuel Mark (ORCID: 0009-0008-1408-1493) (2025). Quantifying the uncertainty and generalisability of machine learning-based virtual staining for high-throughput screening. University of Birmingham. Ph.D.
This is the latest version of this item.
Tonks2025PhD.pdf: Text, Accepted Version (19MB). Available under licence: All rights reserved.
Abstract
High-throughput screening (HTS) [1] with fluorescence microscopy [2] reveals different biological structures within cells simultaneously and at scale, so that the effects of drug compounds can be quantified. Fluorescence microscopy, however, is expensive and time-consuming [3], and can negatively impact cell samples [4]. Label-free microscopy [5, 6] is cost-effective and widely available, but its biological information is less easily accessible than with fluorescence microscopy because of a lack of intrinsic optical contrast between cellular components. Virtual staining via image-to-image translation [4, 7–12], a machine learning (ML) technique, aims to translate unstained label-free microscopy images into multiple fluorescence microscopy images, obtaining the benefits of fluorescence staining without the associated costs.
Despite impressive qualitative results [4, 7–10], little is known about the feasibility of virtual staining with HTS because quantitative, biologically meaningful evaluation metrics are lacking. HTS data is continuously generated across different laboratory settings and new compound perturbations [1]. Nevertheless, existing virtual staining works [4, 7–12] have trained and tested their models on images sampled from the same setting-specific training data distribution. They have not evaluated the generalisability of virtual staining to never-before-seen data sampled from a different laboratory or cell type. As a result, we cannot know whether virtual staining models trained on setting-specific data can generalise to other experimental settings, which puts the scalability of virtual staining with HTS in doubt.
Virtual staining is an ill-posed problem; given access to a bright-field image, it is very challenging to define a single virtual staining solution. This uncertainty is expressed through the posterior distributions of possible virtual staining solutions and instance biological features given a bright-field image. Currently, very little understanding exists of how well virtual staining models can describe these posterior distributions. Furthermore, the challenge of comparing different predicted instance posterior distributions to a single target sample value from the true posterior distribution remains under-explored.
In the first part of this thesis, we introduce the motivations and core questions this work will address. We formally explore the use of HTS with conventional fluorescence microscopy and different virtual staining approaches. We then provide background on the mathematical basis of some of the core approaches to virtual staining explored within this work and on how they are connected. We also provide a comprehensive review of the literature on adaptations of these core approaches, on approaches to evaluating virtual staining, and on generalisation within virtual staining.
In Chapter 4 we benchmark different virtual staining approaches and present the results of a newly developed evaluation pipeline that provides biologically meaningful results.
In Chapter 5 we further utilise this pipeline and provide an extensive analysis of the generalisation performance of virtual staining models for three common generalisation tasks within HTS.
In Chapter 6 we explore the predicted instance posterior distributions of two generative modelling approaches and present a principled method for directly predicting these distributions from the bright-field image. We systematically compare the predicted instance posterior distributions of both generative modelling approaches using three newly proposed evaluation criteria and compare our findings to population metrics.
| Type of Work: | Thesis (Doctorates > Ph.D.) |
|---|---|
| Award Type: | Doctorates > Ph.D. |
| Supervisor(s): | |
| Licence: | All rights reserved |
| College/Faculty: | Colleges > College of Engineering & Physical Sciences |
| School or Department: | School of Computer Science |
| Funders: | Engineering and Physical Sciences Research Council |
| Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
| URI: | http://etheses.bham.ac.uk/id/eprint/16352 |
Available Versions of this Item
- Quantifying the uncertainty and generalisability of machine learning-based virtual staining for high-throughput screening. (deposited 18 Jul 2025 14:23) [Currently Displayed]