Preprint Review, Version 1. This version is not peer-reviewed.

In-Context Learning in Large Language Models: A Comprehensive Survey

Version 1: Received: 11 July 2024 / Approved: 11 July 2024 / Online: 11 July 2024 (10:26:13 CEST)

How to cite: Highmore, C. In-Context Learning in Large Language Models: A Comprehensive Survey. Preprints 2024, 2024070926. https://doi.org/10.20944/preprints202407.0926.v1

Abstract

This survey provides a comprehensive overview of in-context learning (ICL) in large language models (LLMs), a phenomenon where models can adapt to new tasks without parameter updates by leveraging task-relevant information within the input context. We explore the definition and mechanisms of ICL, investigate the factors contributing to its emergence, and discuss strategies for optimizing and effectively utilizing ICL in various applications. Through a systematic review of recent literature, we first clarify what ICL is, distinguishing it from traditional fine-tuning approaches and highlighting its unique characteristics. We then delve into the underlying causes of ICL, examining theories ranging from implicit meta-learning during pre-training to the emergence of task vectors in LLMs. The survey also covers various approaches to enhance ICL performance, including prompt engineering techniques, demonstration selection strategies, and methods for improving generalization across diverse tasks. Additionally, we discuss the limitations and challenges of ICL, such as its sensitivity to demonstration ordering and potential biases. By synthesizing findings from numerous studies, we aim to provide researchers and practitioners with a clear understanding of the current state of ICL research, its practical implications, and promising directions for future investigation. This survey serves as a valuable resource for those seeking to leverage ICL capabilities in LLMs and contributes to the ongoing discourse on the remarkable adaptability of these models.
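
To make the core idea concrete, the sketch below (not from the survey itself; the task, labels, and helper name are illustrative) shows how a few-shot ICL prompt is typically assembled: labeled demonstrations are concatenated into the input context, and the frozen model is expected to continue the pattern for a new query without any parameter updates.

# Minimal sketch of few-shot in-context learning prompt construction.
# The task, demonstrations, and helper name are illustrative assumptions,
# not taken from the paper.

def build_icl_prompt(demonstrations, query):
    """Concatenate labeled demonstrations and an unlabeled query into one context.

    The model's parameters are never updated; any "learning" happens purely
    through conditioning on the demonstrations present in the input.
    """
    blocks = [f"Review: {text}\nSentiment: {label}" for text, label in demonstrations]
    blocks.append(f"Review: {query}\nSentiment:")  # the model completes the label
    return "\n\n".join(blocks)

demos = [
    ("The plot was gripping from start to finish.", "positive"),
    ("I walked out halfway through.", "negative"),
]

prompt = build_icl_prompt(demos, "A tedious, overlong mess.")
print(prompt)  # this string would be passed to any LLM completion endpoint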

Keywords

large language model; in-context learning

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
