We report on a new project to design a semantic ground truth set for mathematical document analysis. The ground truth set will be generated by annotating recognised mathematical symbols with respect to both their global meaning in the context of the considered documents and their local function within the particular mathematical formula in which they occur. Our aim is to provide a reliable database for semantic classification during the formula recognition process, enabling correct interpretation of mathematical formulae and the generation of semantic markup such as Content MathML.
@article{702571, title = {Designing a~Semantic~Ground~Truth for~Mathematical~Formulae}, booktitle = {Towards a Digital Mathematics Library. Paris, France, July 7-8th, 2010}, series = {GDML\_Books}, publisher = {Masaryk University Press}, address = {Brno, Czech Republic}, year = {2010}, pages = {37-42}, url = {http://dml.mathdoc.fr/item/702571} }
Sexton, Alan; Sorge, Volker; Suzuki, Masakazu. Designing a Semantic Ground Truth for Mathematical Formulae, in Towards a Digital Mathematics Library. Paris, France, July 7-8th, 2010, GDML_Books, (2010), pp. 37-42. http://gdmltest.u-ga.fr/item/702571/