Preprint Article · Version 1 · Preserved in Portico · This version is not peer-reviewed

Comparison of Large Language Models Versus Traditional Information Extraction Methods for Real World Evidence of Patient Symptomatology in Acute and Post-Acute Sequelae of SARS-CoV-2

Version 1 : Received: 18 September 2024 / Approved: 19 September 2024 / Online: 20 September 2024 (02:48:03 CEST)

How to cite: Thakkar, V.; Silverman, G. M.; Kc, A.; Ingraham, N. E.; Jones, E.; King, S.; Tignanelli, C. J. Comparison of Large Language Models Versus Traditional Information Extraction Methods for Real World Evidence of Patient Symptomatology in Acute and Post-Acute Sequelae of SARS-CoV-2. Preprints 2024, 2024091518. https://doi.org/10.20944/preprints202409.1518.v1

Abstract

Patient symptoms play a critical role in disease progression and diagnosis, yet they are most often captured in unstructured clinical notes. This study explores the use of large language models (LLMs) for extracting patient symptoms from clinical notes, comparing their performance against rule-based information extraction (IE) systems such as BioMedICUS. By fine-tuning an LLM on diverse corpora from multiple healthcare institutions, we aimed to improve the accuracy and efficiency of extracting symptoms related to acute and post-acute sequelae of SARS-CoV-2. We also conducted a prevalence analysis to highlight significant differences in symptom prevalence across corpora and performed a fairness analysis to assess the model's equity across race and gender. Our findings indicate that while LLMs can match the effectiveness of rule-based IE methods, they face significant challenges related to demographic bias and generalizability due to variability in training corpora. This evaluation revealed overfitting and insufficient generalization, especially when models were trained predominantly on limited datasets with single-annotator bias. This study also showed that LLMs offer substantial advantages in efficiency, adaptability, and scalability for biomedical IE, marking a transformative shift in clinical data extraction processes. These results provide real-world evidence of the need for diverse, high-quality training datasets and robust annotation processes to enhance LLMs' performance and reliability in clinical applications.

Keywords

Information Extraction; Natural Language Processing; Large Language Models; Electronic Health Records

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
