Table recognition in mathematical documents

Alkalai, Mohamed A. (2015). Table recognition in mathematical documents. University of Birmingham. Ph.D.

Alkalai15PhD.pdf
PDF - Accepted Version


Abstract

While a number of techniques have been developed for table recognition in ordinary text documents, these techniques are often ineffective for tables in mathematical documents, because tables containing mathematical structures can differ significantly from ordinary text tables. In fact, it is difficult even to distinguish clearly between table recognition in mathematics and layout analysis of mathematical formulae. Nor is it straightforward to adapt general layout analysis techniques to mathematical formulae. However, a reliable understanding of formula layout is often a necessary prerequisite for further semantic interpretation of the represented formulae.

In this thesis, we present the necessary preprocessing steps towards a table recognition technique that specialises in tables in mathematical documents. It is based on our novel, robust line recognition technique for mathematical expressions, which does not depend on understanding the content of expressions or on their specialist fonts.

We also present a graph representation for complex mathematical table structures. A set of rewriting rules applied to the graph allows for reliable re-composition of cells in order to identify several valid table interpretations. We demonstrate the effectiveness of our techniques by applying them to a manually ground-truthed set of mathematical tables taken from a standard textbook.

Type of Work: Thesis (Doctorates > Ph.D.)
Award Type: Doctorates > Ph.D.
Supervisor(s): Lee, Mark; Rowe, Jonathan
Licence:
College/Faculty: Colleges (2008 onwards) > College of Engineering & Physical Sciences
School or Department: School of Computer Science
Funders: None/not applicable
Subjects: Q Science > QA Mathematics
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
URI: http://etheses.bham.ac.uk/id/eprint/6333
