Predicting multidomain protein structure and function via co-evolved amino acids and application to polyketide synthases

Oruc, Tugce (2021). Predicting multidomain protein structure and function via co-evolved amino acids and application to polyketide synthases. University of Birmingham. Ph.D.

Oruc2021PhD.pdf
Text - Accepted Version
Available under License Creative Commons Attribution Non-commercial Share Alike.

Abstract

Proteins are important building blocks of life and are responsible for many processes in living organisms. Understanding their functions and working mechanisms is therefore vital for answering many questions about diseases and provides a basis for the development of novel drugs. The three-dimensional (3D) structure of a protein determines its function; therefore, the determination of protein 3D structures has been studied widely. Although many experimental techniques have been developed to determine protein structures, they have limitations, especially for large protein complexes. Protein structure can help in understanding protein function, as can looking at conserved residues, but time-consuming mutagenesis experiments combined with protein function assays are typically needed. As an alternative to experimental methods, researchers have been developing computational approaches. While it is relatively easy to predict a structure when the structure of a homologous protein is known, as it can be used as a template, predicting protein structures in the absence of a template is more challenging. For template-free predictions, coevolved amino acid residue pairs, predicted from the alignment of homologous sequences, have provided promising improvements in the field. More recently, the successful implementation of artificial neural networks, fed by the predicted coevolved residue pairs, has further improved the accuracy of the predicted structures. Although there are promising developments in coevolution-based approaches, especially for the structure prediction of small/medium-sized proteins, more developments are needed for predicting protein structure, particularly that of large protein complexes. Here, we show that the prediction of distances between residue pairs, via deep neural networks fed by predictions of coevolved residue pairs, improves the accuracy of structure prediction in small/medium-sized proteins.
The prediction of residue pair distances between two interacting domains, using a similar approach, also allows us to predict how two domains on the same chain interact with each other. Further, we show that prediction of coevolved residue groups, via statistical coupling analysis, allows us to determine functional boundaries of domains and diverged amino acid patterns in the sub-types of the domains in a multi-domain protein complex, a polyketide synthase. We found that using predicted distances, in addition to the predicted residue pairs in contact, allows us to generate structures closer to the experimental structures and to select them as the final models in a straightforward way. Additionally, we reveal that the distances between residue pairs on interacting domain pairs can be predicted accurately, leading to the successful prediction of the structural interface between two interacting proteins when the interface surface is large and the sequence alignment is comprehensive enough. Finally, we found that functional domain boundaries consistent with experimental studies can be determined. Also, some coevolved residue groups have distinct amino acid patterns in different domain sub-types, including positions that are already known as the fingerprint motifs of the different sub-types. These approaches can be applied to predict the structures of individual domains and to predict how two domains interact with each other, which can in turn be used to predict the structure of multi-domain proteins. The work on polyketides here demonstrates how these developments might be applied, since identifying domain boundaries and residues important for substrate specificity should aid in the design of novel polyketide synthases and thus of novel polyketides. This in itself is an important development given the commercial and medicinal importance of polyketides, but it also opens the way to similar analyses of other multidomain proteins.

Type of Work: Thesis (Doctorates > Ph.D.)
Award Type: Doctorates > Ph.D.
Supervisor(s): Winn, Peter J.; Thomas, Chris M.
Licence: Creative Commons: Attribution-Noncommercial-Share Alike 4.0
College/Faculty: Colleges (2008 onwards) > College of Life & Environmental Sciences
School or Department: School of Biosciences
Funders: Other
Other Funders: The Darwin Trust of Edinburgh
Subjects: Q Science > QP Physiology
URI: http://etheses.bham.ac.uk/id/eprint/11530
