Weaver, Christopher Dennis
ORCID: 0000-0002-2116-9089
(2024).
Enhanced sensing and electronic characterisation of single-molecules using machine learning.
University of Birmingham.
Ph.D.
Weaver2024PhD.pdf: Text, Accepted Version. Available under licence: All rights reserved.
Abstract
Due to the complex and inherently stochastic nature of single-molecule science, experimental measurements typically need to produce large datasets to be able to infer meaningful conclusions regarding the properties of single molecules. Due to both the large volume of data and the highly complex nature of single-molecule systems, experimental results are notoriously difficult to analyse effectively by traditional means. Typically, statistical techniques, such as hypothesis testing, are employed such that the average behaviour of single molecules can be inferred. However, these approaches can be ill-suited to these datasets as assumptions are made to allow for statistical inference. These assumptions can result in limited predictive accuracies due to information loss that coincides with holding potentially misleading prior expectations. As an alternative to statistical methods, the techniques provided by the field of machine learning might be better suited to data analysis of complex datasets as they can yield predictions that do not rely on the presence of assumptions.
Therefore, this thesis explores the benefits of applying machine learning algorithms to the analysis of single-molecule datasets. To this end, scanning tunnelling spectroscopy techniques were implemented to measure the electrical properties of single molecules and acquire challenging datasets. From the desire to measure single molecules, a new current-time technique workflow was developed whereby current-distance spectroscopy was used to experimentally determine the tunnelling decay constant in analyte solutions. This tunnelling decay constant was subsequently used to construct a calibration curve that relates any measured tunnelling current to the corresponding tunnelling distance. It has been demonstrated that this calibration allows the tunnelling distance to be optimised for a given molecule's physical dimensions, resulting in reliable detection of current-time binding events when using a scanning tunnelling microscope (STM) in constant-current mode.
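The calibration idea in this paragraph can be illustrated with a minimal sketch. It assumes the textbook exponential tunnelling model, I(d) = I0 exp(-beta d), which the abstract does not state explicitly. The function names, units, and numerical values are hypothetical, and the distance is relative to the point at which the extrapolated current equals I0.

```python
import math

def fit_decay_constant(distances_nm, currents_nA):
    """Least-squares line through ln(I) versus d.

    Returns (beta, ln_i0), assuming I(d) = I0 * exp(-beta * d).
    """
    logs = [math.log(i) for i in currents_nA]
    n = len(distances_nm)
    mean_d = sum(distances_nm) / n
    mean_l = sum(logs) / n
    sxx = sum((d - mean_d) ** 2 for d in distances_nm)
    sxy = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances_nm, logs))
    slope = sxy / sxx
    intercept = mean_l - slope * mean_d
    return -slope, intercept

def current_to_distance(current_nA, beta, ln_i0):
    """Invert the fitted model: the calibration curve from current to distance."""
    return (ln_i0 - math.log(current_nA)) / beta

# Synthetic, noise-free current-distance data (illustrative values only).
beta_true, i0_true = 10.0, 50.0              # 1/nm and nA, both assumed
distances = [0.05 * k for k in range(11)]    # 0.0 to 0.5 nm
currents = [i0_true * math.exp(-beta_true * d) for d in distances]

beta, ln_i0 = fit_decay_constant(distances, currents)
setpoint_distance = current_to_distance(1.0, beta, ln_i0)  # distance at a 1 nA set-point
```

With a fitted decay constant in hand, a constant-current set-point could in principle be chosen so that the corresponding distance matches the size of the target molecule.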
Complementary to the development of a reliable current-time technique workflow, advances were made regarding the detection of anomalies in current-time traces. The techniques tested include a simple moving average, a zero-crossing rate detector, and a convolutional autoencoder. It has been demonstrated that current-time events can be detected within a trace with an area under the receiver operating characteristic curve of 0.97 at a signal-to-noise ratio of only 0.1 when employing a moving average. Whilst exploring the optimisation of these anomaly detection methods, it was also discovered that the window size used for each technique directly influences the width of the baseline and event distributions, as per the standard error of the mean. As a result, anomaly detection at lower signal-to-noise ratios requires sufficient temporal resolution such that larger window sizes can be implemented. Because of this, it was concluded that effective detection of current-time, I(t), events requires high sampling frequencies.
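The link between window size and distribution width can be sketched as follows. This is an illustrative toy, not the detector used in the thesis: it assumes white Gaussian baseline noise, an event-free segment from which the baseline is estimated, and a hypothetical threshold of four standard errors. All names and values are invented for the example.

```python
import random
import statistics

def moving_average(trace, window):
    """Mean of each run of `window` consecutive samples (fully covered positions only)."""
    running = sum(trace[:window])
    out = [running / window]
    for k in range(window, len(trace)):
        running += trace[k] - trace[k - window]
        out.append(running / window)
    return out

def detect_events(trace, window, baseline, n_sigma=4.0):
    """Flag window positions whose mean deviates from the baseline mean
    by more than n_sigma standard errors (sigma / sqrt(window))."""
    mu = statistics.fmean(baseline)
    sem = statistics.pstdev(baseline) / window ** 0.5
    return [abs(m - mu) > n_sigma * sem for m in moving_average(trace, window)]

rng = random.Random(0)
trace = [rng.gauss(0.0, 1.0) for _ in range(20000)]   # unit-variance baseline noise
for k in range(5000, 5500):                           # synthetic event, amplitude 1.0
    trace[k] += 1.0

flags = detect_events(trace, window=100, baseline=trace[:2000])

# Width of the smoothed baseline distribution for two window sizes;
# for white noise it is expected to shrink roughly as 1 / sqrt(window).
sd_25 = statistics.pstdev(moving_average(trace[:4000], 25))
sd_100 = statistics.pstdev(moving_average(trace[:4000], 100))
```

A larger window narrows the baseline distribution, but it only helps if events span many samples, which is why the sampling frequency matters at low signal-to-noise ratios.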
In previous studies, the current-time technique has been extensively applied to the detection of DNA nucleotides for the trialling of a potential next-generation sequencing technology. Here, an additional limitation of this technique was revealed where different molecular species produce current-time events of similar appearance. Indeed, the traditionally measured properties of these events demonstrate considerable overlap, which prevents effective classification. However, it has been demonstrated here that a dataset that possesses the same overlap limitation can be classified with an accuracy of 95%. Implementations of convolutional neural networks and random forest classifiers have been able to classify these complex datasets when traditional statistical means cannot.
Alongside the enhancements made with the current-time technique, machine learning techniques have also been demonstrated to enhance the analysis within scanning tunnelling microscopy break junction experiments through implementation of the dimensionality reduction techniques principal component analysis, t-distributed stochastic neighbour embedding, and uniform manifold approximation and projection. To ease the cleaning and segmentation of experimental data, dimensionality reduction was used such that trends in the overall shapes of traces could be visualised in low-dimensional space. It has been shown that traces with similar shapes are located adjacent to one another in these embedded spaces, which facilitated the clustering of traces. To demonstrate the versatility of this method, dimensionality reduction was also applied to a cyclic voltammetry dataset where the differences in complex voltammetric curves could be easily visualised and related to experimental parameters.
Stemming from the development of a machine learning enhanced analysis process of break junctions, a novel pseudo-rotaxane molecule has been characterised. From studying this molecule, multiple molecular conductance values of -1.25, -1.96, and -3.08 \(\log(G/G_0)\) have been observed. All of these values are far higher than one would expect given the molecule's length and lack of conjugation. This highlights rotaxane-like molecules, in general, as potentially highly conductive molecular wires. However, whilst this molecule was shown to possess high conductivity, it also possessed a large variety of plateau break-off distances, which is believed to result from strong sulphur-gold interactions. To address these displacement variations, a novel alignment procedure was developed. Here, change-point detection strategies from the field of anomaly detection were implemented such that the displacement at which molecular plateaus start could be found on a trace-by-trace basis.
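As a rough illustration of locating a plateau start by change-point detection, the sketch below uses a simple least-squares split into two constant-mean segments. The abstract does not say which change-point method was used, so this stands in for it. The synthetic trace, its noise level, and the function name are assumptions.

```python
import random

def single_change_point(values):
    """Index k (1..n-1) that best splits `values` into two constant-mean segments.

    Minimising the two-segment squared error is equivalent to maximising
    S_left**2 / k + S_right**2 / (n - k), where S is the segment sum.
    """
    n = len(values)
    total = sum(values)
    best_k, best_gain, left = 1, float("-inf"), 0.0
    for k in range(1, n):
        left += values[k - 1]
        right = total - left
        gain = left * left / k + right * right / (n - k)
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k

# Synthetic log-conductance trace: a contact region near 0 followed by a
# plateau near -2 that begins at sample 300 (illustrative values only).
rng = random.Random(1)
log_g = [rng.gauss(0.0, 0.1) for _ in range(300)]
log_g += [rng.gauss(-2.0, 0.1) for _ in range(200)]

plateau_start = single_change_point(log_g)
```

Each trace could then be shifted so that its detected plateau start sits at zero displacement before the traces are combined.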
Overall, the implementation of machine learning techniques has been shown to improve the analysis processes of single-molecule scanning tunnelling microscopy techniques, such that the previously challenging classification of current-time events has been achieved with relative ease. Additionally, machine learning has enhanced the characterisation of single molecules for nanoscale electronics such that promising molecules have been highlighted.
| Type of Work: | Thesis (Doctorates > Ph.D.) |
|---|---|
| Award Type: | Doctorates > Ph.D. |
| Supervisor(s): | |
| Licence: | All rights reserved |
| College/Faculty: | Colleges > College of Engineering & Physical Sciences |
| School or Department: | School of Chemistry |
| Funders: | Engineering and Physical Sciences Research Council |
| Subjects: | Q Science > QD Chemistry |
| URI: | http://etheses.bham.ac.uk/id/eprint/14528 |