A rule-based approach for recognition of chemical structure diagrams

Sadawi, Noureddin (2013). A rule-based approach for recognition of chemical structure diagrams. University of Birmingham. Ph.D.

PDF - Accepted Version

Download (1MB)


In chemical literature much information is given in the form of diagrams depicting chemical structures. In order to access this information electronically, diagrams have to be recognized and translated into a processable format. Although a number of approaches have been proposed for the recognition of molecule diagrams in the literature, they traditionally employ procedural methods with limited flexibility and extensibility. This thesis presents a novel approach that models the principal recognition steps for molecule diagrams in a strictly rule based system. We develop a framework that enables the definition of a set of rules for the recognition of different bond types and arrangements as well as for resolving possible ambiguities. This allows us to view the diagram recognition problem as a process of rewriting an initial set of geometric artefacts into a graph representation of a chemical diagram without the need to adhere to a rigid procedure. We demonstrate the flexibility of the approach by extending it to capture new bond types and compositions. In experimental evaluation we can show that an implementation of our approach outperforms the currently available leading open source system. Finally, we discuss how our framework could be applied to other automatic diagram recognition tasks.

Type of Work: Thesis (Doctorates > Ph.D.)
Award Type: Doctorates > Ph.D.
College/Faculty: Colleges (2008 onwards) > College of Engineering & Physical Sciences
School or Department: School of Computer Science
Funders: None/not applicable
Subjects: Q Science > QA Mathematics > QA76 Computer software
Q Science > QD Chemistry
URI: http://etheses.bham.ac.uk/id/eprint/4325


Request a Correction Request a Correction
View Item View Item


Downloads per month over past year