Preprint Article Version 1 This version is not peer-reviewed

A Chinese Short Text Similarity Method Integrating Sentence-level and Phrase-level Semantics

Version 1 : Received: 5 November 2024 / Approved: 6 November 2024 / Online: 7 November 2024 (10:09:31 CET)

How to cite: Shen, Z.; Xiao, Z. A Chinese Short Text Similarity Method Integrating Sentence-level and Phrase-level Semantics. Preprints 2024, 2024110453. https://doi.org/10.20944/preprints202411.0453.v1

Abstract

Short text similarity is a core task in Natural Language Processing (NLP), widely used in intelligent search, recommendation systems, and question-answering systems. Most existing short text similarity models focus on matching the overall semantics of whole sentences and often neglect the semantic correlations between individual phrases within them. This problem is particularly acute in Chinese, where synonyms and near-synonyms can substantially interfere with similarity computation. In this paper, we propose a short text similarity method that integrates sentence-level and phrase-level semantics. Using vector representations of Chinese words and phrases as external knowledge, our approach combines global sentence features with local phrase features to compute similarity from both the global and the local perspective. Experimental results show that the proposed model outperforms previous approaches on Chinese short text similarity tasks. Specifically, it achieves an accuracy of 90.16% on the LCQMC dataset, an improvement of 2.23 percentage points over ERNIE and 1.46 percentage points over the previous best-performing model, Glyce + BERT.
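
The idea of scoring a sentence pair from both a global and a local perspective can be illustrated with a minimal sketch. This is not the model proposed in the paper: the function names, the mean-pooled sentence vector, the best-match phrase alignment, and the mixing weight alpha are all assumptions chosen for illustration, and the toy 2-dimensional vectors stand in for real Chinese word/phrase embeddings.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(vectors: List[Vector]) -> List[float]:
    """Assumed sentence representation: the average of its phrase vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def phrase_level_similarity(a: List[Vector], b: List[Vector]) -> float:
    """Assumed local score: each phrase is matched to its most similar
    phrase in the other sentence, averaged in both directions."""
    def directed(src: List[Vector], tgt: List[Vector]) -> float:
        return sum(max(cosine(p, q) for q in tgt) for p in src) / len(src)

    return 0.5 * (directed(a, b) + directed(b, a))


def combined_similarity(a: List[Vector], b: List[Vector], alpha: float = 0.5) -> float:
    """Hypothetical fusion: weighted sum of the global and local scores."""
    global_score = cosine(mean_vector(a), mean_vector(b))
    local_score = phrase_level_similarity(a, b)
    return alpha * global_score + (1.0 - alpha) * local_score


if __name__ == "__main__":
    # Toy phrase vectors (hypothetical): same phrases in a different order.
    sentence_a = [[1.0, 0.0], [0.0, 1.0]]
    sentence_b = [[0.0, 1.0], [1.0, 0.0]]
    print(combined_similarity(sentence_a, sentence_b))
```

In this sketch, two sentences made of the same phrases in a different order receive a score close to 1, because both the pooled sentence vectors and the best-matching phrase pairs coincide. A learned model would replace the fixed weight alpha and the cosine scores with trained components.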

Keywords

Short text similarity; Chinese sentence pair classification; BERT; external knowledge integration

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
