Preprint Article Version 1 This version is not peer-reviewed

Exploring the Behavior and Performance of Large Language Models: Can LLMs Infer Answers to Questions Involving Restricted Information?

Version 1 : Received: 6 November 2024 / Approved: 7 November 2024 / Online: 7 November 2024 (08:48:07 CET)

How to cite: Cadena-Bautista, Á.; López-Ponce, F.; Ojeda-Trueba, S.; Sierra, G.; Bel-Enguix, G. Exploring the Behavior and Performance of Large Language Models: Can LLMs Infer Answers to Questions Involving Restricted Information? Preprints 2024, 2024110502. https://doi.org/10.20944/preprints202411.0502.v1

Abstract

In this paper, various LLMs are tested in a specific domain using a Retrieval-Augmented Generation (RAG) system. The study, conducted in Spanish, focuses on the performance and behaviour of the models. A questionnaire based on the Bible, consisting of questions that vary in reasoning complexity, was created to evaluate the reasoning capabilities of each model. The RAG system matches each question with the most similar passage from the Bible and feeds the pair to each LLM. The evaluation aims to determine whether each model can reason solely from the provided information or whether it disregards the instructions given and draws on its pretrained knowledge.
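
The retrieval step described in the abstract (match a question with its most similar passage, then pass the pair to a model) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say which similarity measure, embedding model, or prompt wording the authors used, so this sketch swaps in a plain bag-of-words cosine similarity, and the names `tokenize`, `cosine`, `retrieve`, `build_prompt`, and `PASSAGES` are hypothetical. The sample passages are in English for readability, although the study itself was conducted in Spanish.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus; the study used the full Bible text in Spanish.
PASSAGES = [
    "In the beginning God created the heaven and the earth.",
    "The Lord is my shepherd; I shall not want.",
    "Jesus wept.",
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, passages):
    """Return the passage most similar to the question.

    Assumption: similarity is bag-of-words cosine; the paper may use
    a different retriever (for example, dense embeddings).
    """
    q_vec = Counter(tokenize(question))
    return max(passages, key=lambda p: cosine(q_vec, Counter(tokenize(p))))


def build_prompt(question, passage):
    """Pair the question with its passage and restrict the model to it.

    Assumption: this instruction wording is illustrative, not the
    prompt used in the study.
    """
    return (
        "Answer the question using only the passage below. "
        "Do not use any other knowledge.\n\n"
        f"Passage: {passage}\n"
        f"Question: {question}\n"
        "Answer:"
    )


question = "Who created the heaven and the earth?"
best_passage = retrieve(question, PASSAGES)
prompt = build_prompt(question, best_passage)
```

The resulting `prompt` string is what would be sent to each LLM under test. Comparing the model's answer against the retrieved passage is then one way to check whether the answer stays within the provided information or leans on pretrained knowledge.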

Keywords

RAG; Large Language Models; Information Retrieval; Bible corpus

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
