Version 1
Received: 5 January 2022 / Approved: 6 January 2022 / Online: 6 January 2022 (12:56:23 CET)
How to cite:
Afzal, M. Z.; Hashmi, K. A.; Pagani, A.; Liwicki, M.; Stricker, D. DeHyFoNet: Deformable Hybrid Network for Formula Detection in Scanned Document Images. Preprints 2022, 2022010090. https://doi.org/10.20944/preprints202201.0090.v1
APA Style
Afzal, M. Z., Hashmi, K. A., Pagani, A., Liwicki, M., & Stricker, D. (2022). DeHyFoNet: Deformable Hybrid Network for Formula Detection in Scanned Document Images. Preprints. https://doi.org/10.20944/preprints202201.0090.v1
Chicago/Turabian Style
Afzal, M. Z., K. A. Hashmi, A. Pagani, Marcus Liwicki, and Didier Stricker. 2022. "DeHyFoNet: Deformable Hybrid Network for Formula Detection in Scanned Document Images." Preprints. https://doi.org/10.20944/preprints202201.0090.v1
Abstract
This work presents an end-to-end trainable approach for detecting mathematical formulas in scanned document images. Since many OCR engines cannot reliably handle formulas, it is essential to isolate them in order to obtain clean text for information extraction from the document. Our proposed pipeline comprises a hybrid task cascade network with deformable convolutions and a ResNeXt-101 backbone. Both of these modifications improve detection. We evaluate the proposed approach on the ICDAR-2017 POD and Marmot datasets. On the ICDAR-2017 POD dataset, we achieve an overall accuracy of 96%, which corresponds to an overall error reduction of 13%. The results on the Marmot dataset also improve for both isolated and embedded formulas: we achieve an accuracy of 98.78% for isolated formulas and an overall accuracy of 90.21% for embedded formulas. This corresponds to an error reduction rate of 43% for isolated formulas and 17.9% for embedded formulas.
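The abstract reports error reduction rates alongside accuracies without stating how the two relate. The sketch below shows one common reading, in which the rate is the fraction of the baseline error (1 − accuracy) that the new model removes. This definition is an assumption and is not confirmed by the abstract; the function names `relative_error_reduction` and `implied_baseline_accuracy` are hypothetical and introduced only for illustration.

```python
def relative_error_reduction(baseline_acc: float, new_acc: float) -> float:
    """Fraction of the baseline error (1 - accuracy) removed by the new model.

    Assumption: "error reduction rate" means a relative reduction in error.
    """
    baseline_err = 1.0 - baseline_acc
    if baseline_err <= 0.0:
        raise ValueError("baseline accuracy must be below 1.0")
    new_err = 1.0 - new_acc
    return (baseline_err - new_err) / baseline_err


def implied_baseline_accuracy(new_acc: float, reduction: float) -> float:
    """Invert the formula above: the baseline accuracy implied by a reduction."""
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must lie in [0, 1)")
    return 1.0 - (1.0 - new_acc) / (1.0 - reduction)


if __name__ == "__main__":
    # Figures taken from the abstract (Marmot, isolated formulas).
    new_acc, reduction = 0.9878, 0.43
    baseline = implied_baseline_accuracy(new_acc, reduction)
    # The baseline printed here is derived under the stated assumption;
    # it is not a figure reported in the abstract.
    print(f"implied baseline accuracy: {baseline:.4f}")
    print(f"round-trip reduction: {relative_error_reduction(baseline, new_acc):.2f}")
```

Under this reading, a small gain in accuracy near the top of the scale can still be a large relative reduction in error, which is why the two kinds of figures are reported together.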
Keywords
formula detection; Hybrid Task Cascade network; mathematical expression detection; document image analysis; deep neural networks; computer vision
Subject
Computer Science and Mathematics, Computer Vision and Graphics
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.