DeHyFoNet: Deformable Hybrid Network for Formula Detection in Scanned Document Images

Muhammad Zeshan Afzal; Khurram Azeem Hashmi; Alain Pagani; Marcus Liwicki; Didier Stricker

doi:10.20944/preprints202201.0090.v1

Submitted: 05 January 2022

Posted: 06 January 2022


Abstract

This work presents an end-to-end trainable approach for detecting mathematical formulas in scanned document images. Because many OCR engines cannot reliably handle formulas, isolating them is essential for obtaining clean text for information extraction from the document. The proposed pipeline comprises a Hybrid Task Cascade network with deformable convolutions and a ResNeXt-101 backbone; both modifications improve detection. We evaluate the proposed approach on the ICDAR-2017 POD and Marmot datasets. On the ICDAR-2017 POD dataset, we achieve an overall accuracy of 96%. We achieve an overall error reduction of 13%. On the Marmot dataset, results improve for both isolated and embedded formulas: we achieve an accuracy of 98.78% for isolated formulas and an overall accuracy of 90.21% for embedded formulas, corresponding to error reduction rates of 43% for isolated and 17.9% for embedded formulas.
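The abstract reports error reduction rates without stating how they are computed. A minimal sketch, assuming the usual definition (the relative drop in error rate, where error rate is one minus accuracy), is given here. The function name and the input values are hypothetical and are not taken from the paper.

```python
def relative_error_reduction(baseline_accuracy: float, new_accuracy: float) -> float:
    """Return the fraction of the baseline error removed by the new model.

    Assumption: error rate = 1 - accuracy, with accuracies given in [0, 1).
    """
    baseline_error = 1.0 - baseline_accuracy
    new_error = 1.0 - new_accuracy
    if baseline_error <= 0.0:
        raise ValueError("baseline accuracy must be below 1.0")
    return (baseline_error - new_error) / baseline_error


# Hypothetical example values, not results from the paper:
# moving from 90% to 95% accuracy halves the error rate.
reduction = relative_error_reduction(0.90, 0.95)
print(f"relative error reduction: {reduction:.1%}")
```

Under this definition, a small gain in accuracy near the top of the scale can correspond to a large relative error reduction, which is why both figures are usually reported together.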

Keywords: formula detection; Hybrid Task Cascade network; mathematical expression detection; document image analysis; deep neural networks; computer vision

Subject: Computer Science and Mathematics - Computer Vision and Graphics

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
