Preprint Article Version 1 This version is not peer-reviewed

MFGFF-BiLSTM-EGP: A Chinese Nested Named Entity Recognition Model for Chicken Disease Based on Multiple Fine-Grained Feature Fusion and Efficient Global Pointer

Version 1 : Received: 21 August 2024 / Approved: 22 August 2024 / Online: 22 August 2024 (07:41:02 CEST)

How to cite: Wang, X.; Peng, C.; Li, Q.; Yu, Q.; Lin, L.; Li, P.; Gao, R.; Wu, W.; Jiang, R.; Yu, L.; Ding, L.; Zhu, L. MFGFF-BiLSTM-EGP: A Chinese Nested Named Entity Recognition Model for Chicken Disease Based on Multiple Fine-Grained Feature Fusion and Efficient Global Pointer. Preprints 2024, 2024081620. https://doi.org/10.20944/preprints202408.1620.v1

Abstract

Extracting entities from large volumes of chicken epidemic texts is crucial for knowledge sharing, integration, and application. However, Named Entity Recognition (NER) faces significant challenges in this domain, particularly the prevalence of nested entities and domain-specific named entities, coupled with a scarcity of labeled data. To address these challenges, we compiled a corpus from 50 books on chicken diseases, covering 28 disease types. Using this corpus, we developed a nested NER model, MFGFF-BiLSTM-EGP. The model integrates a Multiple Fine-Grained Feature Fusion (MFGFF) module with a BiLSTM neural network and employs an Efficient Global Pointer (EGP) to predict the entity location encoding. In the MFGFF module we designed three encoders: a character encoder, a word encoder, and a sentence encoder. This design effectively captures fine-grained features and improves the recognition accuracy of nested entities. Experimental results show that the model performs robustly, with F1 scores of 91.98%, 73.32%, and 82.54% on the CDNER, CMeEE V2, and CLUENER datasets, respectively, outperforming other commonly used NER models. Specifically, on the CDNER dataset, the model achieved an F1 score of 79.68% for nested entity recognition. This research not only advances the development of knowledge graphs and intelligent question-answering systems for chicken diseases but also provides a viable solution for extracting disease information that can be applied to other livestock species.
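
To illustrate why a pointer-style decoder suits nested entities, the following minimal sketch decodes spans from per-type score matrices: every (start, end) pair is scored independently, so overlapping and nested spans can be returned together. It is an illustration under stated assumptions, not the authors' implementation. The tokens, the labels `body_part` and `symptom`, the scores, the helper names, and the zero threshold are all hypothetical, and the MFGFF and BiLSTM stages that would produce such scores are omitted.

```python
from typing import Dict, List, Tuple

# (entity type, start index, end index), with both indices inclusive
Span = Tuple[str, int, int]


def empty_matrix(n: int, fill: float = -1.0) -> List[List[float]]:
    """Build an n x n score matrix in which every span starts as rejected."""
    return [[fill] * n for _ in range(n)]


def decode_spans(scores: Dict[str, List[List[float]]],
                 threshold: float = 0.0) -> List[Span]:
    """Return every (type, start, end) whose score exceeds the threshold.

    Only the upper triangle (start <= end) is inspected, since a span
    cannot end before it starts.
    """
    spans: List[Span] = []
    for label, matrix in scores.items():
        for start, row in enumerate(matrix):
            for end in range(start, len(row)):
                if row[end] > threshold:
                    spans.append((label, start, end))
    return sorted(spans, key=lambda s: (s[1], s[2], s[0]))


# Hypothetical toy input: three tokens and two entity types.
tokens = ["chicken", "liver", "swelling"]
scores = {
    "body_part": empty_matrix(len(tokens)),
    "symptom": empty_matrix(len(tokens)),
}
scores["body_part"][1][1] = 2.5  # "liver"
scores["symptom"][1][2] = 3.1    # "liver swelling", which contains "liver"

found = decode_spans(scores)
for label, start, end in found:
    print(label, " ".join(tokens[start:end + 1]))
```

Because each entity type has its own matrix and each cell is thresholded separately, the inner span "liver" and the outer span "liver swelling" are both recovered, which a single-label-per-token tagging scheme cannot express directly.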

Keywords

nested named entity recognition; chicken disease; multiple fine-grained feature fusion; RoBERTa; efficient global pointer

Subject

Biology and Life Sciences, Animal Science, Veterinary Science and Zoology
