Review
This version is not peer-reviewed
In-Context Learning in Large Language Models: A Comprehensive Survey
Version 1: Received: 11 July 2024 / Approved: 11 July 2024 / Online: 11 July 2024 (10:26:13 CEST)
How to cite: Highmore, C. In-Context Learning in Large Language Models: A Comprehensive Survey. Preprints 2024, 2024070926. https://doi.org/10.20944/preprints202407.0926.v1
Abstract
This survey provides a comprehensive overview of in-context learning (ICL) in large language models (LLMs), a phenomenon where models can adapt to new tasks without parameter updates by leveraging task-relevant information within the input context. We explore the definition and mechanisms of ICL, investigate the factors contributing to its emergence, and discuss strategies for optimizing and effectively utilizing ICL in various applications. Through a systematic review of recent literature, we first clarify what ICL is, distinguishing it from traditional fine-tuning approaches and highlighting its unique characteristics. We then delve into the underlying causes of ICL, examining theories ranging from implicit meta-learning during pre-training to the emergence of task vectors in LLMs. The survey also covers various approaches to enhance ICL performance, including prompt engineering techniques, demonstration selection strategies, and methods for improving generalization across diverse tasks. Additionally, we discuss the limitations and challenges of ICL, such as its sensitivity to demonstration ordering and potential biases. By synthesizing findings from numerous studies, we aim to provide researchers and practitioners with a clear understanding of the current state of ICL research, its practical implications, and promising directions for future investigation. This survey serves as a valuable resource for those seeking to leverage ICL capabilities in LLMs and contributes to the ongoing discourse on the remarkable adaptability of these models.
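The abstract's core idea — that a model adapts to a new task purely from task demonstrations placed in the input context, with no parameter updates — can be illustrated with a minimal prompt-construction sketch. The helper name, demonstration pairs, and prompt template below are illustrative assumptions, not taken from the survey itself:

```python
# Minimal sketch of in-context learning (ICL): demonstrations are
# concatenated into the prompt, and the model is expected to continue
# the pattern for the final query. No model weights are changed.
# The template and example data here are hypothetical.

def build_icl_prompt(demonstrations, query, template="Input: {x}\nLabel: {y}"):
    """Build a few-shot prompt from (input, label) demonstration pairs."""
    shots = [template.format(x=x, y=y) for x, y in demonstrations]
    # Append the query with the label left blank for the model to fill in.
    shots.append(template.format(x=query, y="").rstrip())
    return "\n\n".join(shots)

demos = [
    ("The movie was wonderful", "positive"),
    ("A tedious, boring film", "negative"),
]
print(build_icl_prompt(demos, "An instant classic"))
```

Factors the survey highlights — demonstration selection and ordering — correspond here to which pairs go into `demonstrations` and in what sequence, both of which are known to affect ICL accuracy.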
Keywords
large language model; in-context learning
Subject
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright: This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.