Preprint
Article

The New Version of the ANDDigest Tool with Improved AI-Based Short Names Recognition

A peer-reviewed article of this preprint also exists.

Submitted: 14 October 2022

Posted: 18 October 2022

Abstract
The body of scientific literature continues to grow annually. Over 1.5 million abstracts of biomedical publications were added to the PubMed database in 2021. Cognitive systems that provide specialized search for information in scientific publications, based on subject-area ontologies and modern artificial intelligence methods, are therefore urgently needed. We previously developed a web-based information retrieval system, ANDDigest, designed to search and analyze information in the PubMed database using a customized domain ontology. This paper presents an improved ANDDigest version that uses fine-tuned PubMedBERT classifiers to enhance the quality of short name recognition for molecular-genetics entities in PubMed abstracts across eight biological object types: cell components, diseases, side effects, genes, proteins, pathways, drugs, and metabolites. This approach increased average short name recognition accuracy by 13%. The new ANDDigest version (01.2022) has a web interface and is freely available to users at https://anddigest.sysbio.ru/.
Subject: Social Sciences - Library and Information Sciences
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2024 MDPI (Basel, Switzerland) unless otherwise stated