Preprint Article, Version 1. This version is not peer-reviewed.

CJE-PCHF: Chinese Joint Entity and Relation Extraction Model based on Progressive Contrastive Learning and Heterogeneous Feature Fusion

Version 1 : Received: 30 July 2024 / Approved: 30 July 2024 / Online: 30 July 2024 (05:26:48 CEST)

How to cite: He, M.; Bai, Y.; Wei, D. CJE-PCHF: Chinese Joint Entity and Relation Extraction Model based on Progressive Contrastive Learning and Heterogeneous Feature Fusion. Preprints 2024, 2024072384. https://doi.org/10.20944/preprints202407.2384.v1

Abstract

Joint extraction of entities and relations is a critical task in information extraction, and its quality directly affects downstream tasks. However, existing deep-learning-based joint extraction models handle two phenomena of Chinese text poorly: polyphonic characters (one character with multiple pronunciations) and homophones (multiple characters sharing one pronunciation). This weakness leads to performance loss. To address these issues, this paper introduces part-of-speech (POS) and pinyin features to help the model learn semantic features that better fit the context, and proposes CJE-PCHF, a Chinese Joint Entity and Relation Extraction Model based on Progressive Contrastive Learning and Heterogeneous Feature Fusion. During training, an interactive fusion network based on progressive contrastive learning learns the dependencies among pinyin, POS, and semantic features. These dependencies guide the fusion of heterogeneous features and capture higher-order semantic associations between them. On the commonly used DuIE evaluation dataset for joint extraction, the model improves the F1 score by 5.4% over the baseline model CasRel.
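The abstract does not specify the form of the contrastive objective. As a rough illustration only, the sketch below uses a generic InfoNCE-style loss (an assumption, not the paper's progressive scheme) to show how a contrastive term can pull the pinyin feature of a token toward the semantic feature of the same token and push it away from the features of other tokens. The function names, the temperature value, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss over a batch.

    anchors[i] is expected to match positives[i]; every other row of
    `positives` serves as a negative for anchors[i].
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over all candidates.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


# Toy features (hypothetical): one row per token.
semantic = [[1.0, 0.0, 0.2], [0.0, 1.0, 0.1]]
pinyin_aligned = [[0.9, 0.1, 0.2], [0.1, 0.9, 0.1]]
# Swapping the rows simulates pinyin features attached to the wrong tokens.
pinyin_shuffled = [pinyin_aligned[1], pinyin_aligned[0]]

loss_aligned = info_nce(semantic, pinyin_aligned)
loss_shuffled = info_nce(semantic, pinyin_shuffled)
print(loss_aligned < loss_shuffled)
```

In this toy setting the loss is lower when each pinyin vector is paired with its own semantic vector than when the pairing is shuffled. That is the property a contrastive term relies on to learn dependencies between heterogeneous features. The same loss could in principle be applied to POS features, though how the paper stages or weights such terms is not stated here.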

Keywords

joint extraction; contrastive learning; heterogeneous features; interactive fusion; semantic association

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
