Evaluating and improving true and false positive annotations in non-targeted (nano)ESI-DIMS and U(H)PLC-MS based clinical and environmental metabolomics

Ngere, Judith Blessing (2022). Evaluating and improving true and false positive annotations in non-targeted (nano)ESI-DIMS and U(H)PLC-MS based clinical and environmental metabolomics. University of Birmingham. Ph.D.

Ngere2022PhD.pdf
Text - Accepted Version
Available under License All rights reserved.


Abstract

Interest in the chemical exposome is increasing due to mounting evidence of the ubiquity of chemicals in the environment. Metabolomics informs on biological perturbations in response to stressors, including chemicals. Metabolomics and exposomics can therefore find utility in chemical risk assessment. However, since exposomics, the study of all non-genetic exposures an organism experiences from conception to death, is still emerging, the chemical coverage, detection reproducibility, and limitations of the non-targeted analysis (NTA) methods applied are unknown. Moreover, all NTA methods face challenges in providing confident identification of analytes. The current strategy for confident identification is through fragmentation. However, good quality fragmentation requires sufficient ion intensities, yet it is known that ~70% of metabolites are too low in intensity to give good quality fragmentation spectra, whilst chemicals found in biological samples are ~1,000 times lower in concentration than endogenous metabolites. As such, they seldom yield good quality spectra for confident spectral matching. This means a large proportion of NTA studies rely on MS1 data annotation, yet there have been comparatively few investigations into how annotation parameters affect the accuracy of annotations.

To characterise NTA chemical coverage, reproducibility, and limitations, direct infusion mass spectrometry (DIMS) and liquid chromatography mass spectrometry (LC-MS) were applied to the analysis of chemical mixtures (standards in solvent, fortified serum, house dust, and wristband extracts) as part of a global ring trial, the US Environmental Protection Agency’s (EPA) Non-Targeted Analysis Collaborative Trial (ENTACT). Initially, sample compositions were not revealed, and annotation was achieved for both techniques by matching, on m/z only, against a suspect screening reference list of 4,462 chemicals (the ToxCast library). In a second round, sample compositions were revealed, and this knowledge was used to create smaller databases against which to match: for the DIMS methods, databases containing only the chemicals revealed to be in each sample, and for the LC-MS methods, retention time (RT) databases containing only the chemicals detected. True and false positive rates (TPR and FPR) were calculated to evaluate method performance. To fill the knowledge gap about how annotation parameters affect the accuracy of MS1 annotations, a software tool called Birmingham mEtabolite Annotation for Mass Spectrometry (BEAMS) was used to annotate four LC-MS datasets (serum). All parameters used in the annotation steps were varied, and TPR and FPR were calculated to ascertain which parameters impacted annotation the most.
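The m/z-only suspect screening described above can be illustrated with a minimal sketch. It is not the thesis workflow or the BEAMS implementation; the function names, the 5 ppm default, and the reference entries (`chem_A`, `chem_B`, `chem_C`) are hypothetical and chosen only to show why matching on m/z alone can return several candidates for one peak.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error of an observed m/z relative to a reference m/z, in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def annotate_by_mz(peaks, reference, tol_ppm=5.0):
    """Map each observed m/z to every reference entry within +/- tol_ppm.

    peaks:     iterable of observed m/z values
    reference: dict of {name: theoretical m/z} (hypothetical suspect list)
    """
    hits = {}
    for mz in peaks:
        matches = [
            name
            for name, ref_mz in reference.items()
            if abs(ppm_error(mz, ref_mz)) <= tol_ppm
        ]
        if matches:
            hits[mz] = matches
    return hits


# Hypothetical values: two reference entries lie about 2 ppm either side of one peak.
reference = {"chem_A": 200.0000, "chem_B": 200.0008, "chem_C": 350.1000}
peaks = [200.0004, 350.2000]

wide = annotate_by_mz(peaks, reference, tol_ppm=5.0)
narrow = annotate_by_mz(peaks, reference, tol_ppm=1.0)
print(wide)
print(narrow)
```

With the wider tolerance, the first peak receives two candidate annotations, at most one of which can be correct. With the narrower tolerance it receives none, which mirrors the trade-off between true and false positive annotations discussed in the abstract.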

For both techniques (values are given for DIMS and LC-MS, respectively), annotation against the ToxCast library yielded higher TPRs at lower sample complexity (48-74% and 0-94% for standard mixtures in a clean solvent matrix), and TPRs fell as sample complexity increased. LC-MS methods yielded higher TPRs than DIMS methods. However, both techniques resulted in high FPRs (254-879% and 650-2031%), which increased with increasing sample complexity. The use of smaller tailored databases during annotation for DIMS and RT databases for LC-MS reduced FPRs (0-24% and 23-130%). However, for LC-MS methods, the TPR of annotation also decreased (0-74%), since the RT databases created were incomplete, containing only chemicals that had been detected repeatedly in three MS1 injections. For DIMS methods, annotation against smaller databases increased the TPR (53-84%).
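The abstract does not give the formulas behind these rates. One reading consistent with FPRs above 100% is that both rates are expressed relative to the number of chemicals known to be in the sample. The sketch below adopts that reading as an assumption; the function name and the example sets are hypothetical, and the thesis itself may define the rates differently.

```python
def annotation_rates(annotated, spiked):
    """Return (TPR, FPR) in percent, both relative to the number of spiked chemicals.

    ASSUMPTION: this denominator is chosen because it lets the FPR exceed 100%,
    as reported in the abstract; it is not confirmed as the thesis definition.
    """
    annotated, spiked = set(annotated), set(spiked)
    if not spiked:
        raise ValueError("spiked must contain at least one chemical")
    true_positives = len(annotated & spiked)   # annotated and really present
    false_positives = len(annotated - spiked)  # annotated but not present
    n = len(spiked)
    return 100.0 * true_positives / n, 100.0 * false_positives / n


# Hypothetical sample: 4 spiked chemicals, 7 annotations, 2 of them correct.
spiked = {"A", "B", "C", "D"}
annotated = {"A", "B", "V", "W", "X", "Y", "Z"}
tpr, fpr = annotation_rates(annotated, spiked)
print(f"TPR = {tpr:.0f}%, FPR = {fpr:.0f}%")
```

Under this definition a large reference list can produce more wrong annotations than there are chemicals in the sample, which is how an FPR of several hundred percent can arise.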

Optimisation of BEAMS parameters showed that, when RT similarity and correlation analysis were used to group degenerate features, the maximum RT difference parameter had little impact on the total number of annotations achieved. However, tighter correlation thresholds reduced the total number of annotations, both true and false positives. Mass error tolerances also affected the number of annotations achieved: tolerances of 0.5-3 ppm reduced both true and false positive annotations, and true positive annotations were highest at 3-10 ppm. Finally, the reference lists used for annotation of degenerate features (adducts, isotopes, and neutral losses) affected the TPRs and FPRs of annotation, with longer, more accurate lists created from each dataset increasing true positive annotation for the positive ion mode datasets relative to shorter default lists. However, these results also demonstrated the pitfalls of using longer reference lists, as TPRs of annotation were reduced for some negative ion mode datasets.
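Grouping degenerate features by RT similarity and intensity correlation can be sketched as follows. This is an illustrative simplification, not the BEAMS code or its API: the function names, the parameter names `max_rt_diff` and `min_corr`, their default values, the use of Pearson correlation, and the example features are all assumptions. Two features are linked when their RTs are close and their intensities across samples correlate strongly, and linked features are then merged into groups.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length intensity vectors (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def group_features(features, max_rt_diff=5.0, min_corr=0.9):
    """Merge features that are linked by RT proximity and intensity correlation."""
    parent = {f["id"]: f["id"] for f in features}

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in combinations(features, 2):
        close_in_rt = abs(a["rt"] - b["rt"]) <= max_rt_diff
        if close_in_rt and pearson(a["intensity"], b["intensity"]) >= min_corr:
            parent[find(a["id"])] = find(b["id"])

    groups = {}
    for f in features:
        groups.setdefault(find(f["id"]), []).append(f["id"])
    return sorted(sorted(g) for g in groups.values())


# Hypothetical features: RT in seconds, intensity across four samples.
features = [
    {"id": "f1", "rt": 100.0, "intensity": [1, 2, 3, 4]},
    {"id": "f2", "rt": 101.0, "intensity": [2, 4, 5, 9]},  # co-elutes and correlates with f1
    {"id": "f3", "rt": 100.5, "intensity": [4, 3, 2, 1]},  # co-elutes but anti-correlated
    {"id": "f4", "rt": 300.0, "intensity": [1, 2, 3, 4]},  # correlates but elutes far away
]
loose = group_features(features, max_rt_diff=5.0, min_corr=0.9)
tight = group_features(features, max_rt_diff=5.0, min_corr=0.99)
print(loose)
print(tight)
```

Raising `min_corr` in this sketch splits the `f1`/`f2` group, so fewer features are treated as adducts or isotopes of the same compound. That is the kind of effect the abstract reports when tighter correlation thresholds reduce the total number of annotations.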

Chemicals in environmental and biological samples can be screened for using both DIMS and LC-MS, yielding high TPRs. However, these techniques do not offer 100% TPRs; therefore, new methods are required to increase chemical coverage. The use of smaller tailored reference lists and RT databases for annotation can reduce the occurrence of false positive annotations, but this work also exemplifies the challenges in creating such small databases. The BEAMS optimisations show which parameters affect MS1 data annotation, and which settings reduce FPR and maximise TPR. Although these results are specific to the datasets and instruments applied herein, they are relevant to anyone using such approaches to group degenerate features and can be used to guide selection of appropriate annotation parameters. Continued efforts into maximising MS1 data annotation are required.

Type of Work: Thesis (Doctorates > Ph.D.)
Award Type: Doctorates > Ph.D.
Supervisor(s): Dunn, Warwick (Email: UNSPECIFIED; ORCID: UNSPECIFIED)
Licence: All rights reserved
College/Faculty: Colleges (2008 onwards) > College of Life & Environmental Sciences
School or Department: School of Biosciences
Funders: Natural Environment Research Council
Subjects: Q Science > QP Physiology
URI: http://etheses.bham.ac.uk/id/eprint/12448
