Submitted: 07 April 2025
Posted: 09 April 2025
Abstract
Keywords:
1. Introduction
2. Background and Preliminaries
2.1. Foundation Models and Their Importance
2.2. Traditional Fine-Tuning and Its Limitations
- High Computational and Memory Costs: Updating billions of parameters requires significant GPU/TPU resources, making full fine-tuning infeasible for many users and organizations with limited computational budgets [26] (see the back-of-the-envelope estimate after this list).
- Catastrophic Forgetting: Fine-tuning a model on a new task may lead to the loss of previously learned knowledge, making it difficult to maintain multi-task generalization [27].
- Storage and Deployment Overhead: For each downstream task, a separately fine-tuned model must be stored, leading to excessive storage requirements and complicating deployment.
- Data Efficiency: Full fine-tuning typically requires substantial labeled data for each new task, which is impractical in many real-world scenarios with limited task-specific annotations [28].
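To make the first bullet concrete, here is a back-of-the-envelope sketch (our own illustration, not drawn from the cited works) of the memory needed to fully fine-tune a model with Adam in mixed precision, using the commonly quoted figure of roughly 16 bytes per parameter:

```python
# Rough memory footprint of full fine-tuning with mixed-precision Adam.
# The 16 bytes/parameter figure is an assumption: 2 (fp16 weights) + 2 (fp16 grads)
# + 4 (fp32 master weights) + 4 + 4 (fp32 Adam first/second moments).
def full_finetune_memory_gb(num_params: float, bytes_per_param: float = 16.0) -> float:
    return num_params * bytes_per_param / 1e9

print(full_finetune_memory_gb(7e9))  # ~112 GB for a 7B-parameter model, before activations
```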
2.3. Parameter-Efficient Fine-Tuning (PEFT): A New Paradigm
- Reduced Training Costs: PEFT methods drastically lower the number of trainable parameters, leading to faster training and lower memory consumption [32].
- Improved Knowledge Retention: Since most parameters remain unchanged, PEFT helps retain the generalization capabilities of foundation models and mitigates catastrophic forgetting [33].
- Efficient Multi-Task Adaptation: Instead of training separate models for each downstream task, PEFT allows multiple tasks to be handled using lightweight task-specific modifications, facilitating scalable deployment [34].
2.4. Overview of PEFT Techniques
- Low-Rank Adaptation (LoRA): Decomposes weight updates into low-rank matrices, reducing the number of trainable parameters while preserving model expressiveness [39] (a minimal code sketch follows this list).
- Prefix and Prompt Tuning: Modify input representations rather than model weights, allowing task adaptation through learnable prompts that guide the model’s behavior.
- BitFit and Other Selective Fine-Tuning Methods: Fine-tune only a small subset of parameters, such as bias terms, to achieve efficiency while maintaining performance [40].
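To ground the LoRA bullet above, the following minimal PyTorch sketch wraps a frozen linear layer with trainable low-rank factors; the class name `LoRALinear` and the `rank`/`alpha` hyperparameters are illustrative choices of ours, not a reference implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update B @ A (LoRA-style)."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # keep the pre-trained weights frozen
            p.requires_grad = False
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no change at start
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W x + (alpha / r) * B A x, with only A and B receiving gradients
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)
```

Because the product B A has the same shape as the base weight, it can be merged into that weight after training, so the adapted layer adds no extra inference latency.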
3. Taxonomy of Parameter-Efficient Fine-Tuning Methods
3.1. Adapter-Based Methods
3.1.1. Standard Adapters
3.1.2. Compacter and HyperAdapters
3.1.3. Residual and Parallel Adapters
3.2. Low-Rank Adaptation (LoRA)
3.2.1. LoRA Mechanism
3.2.2. Advantages and Limitations
3.3. Prefix Tuning and Prompt Tuning
3.3.1. Prefix Tuning
3.3.2. Prompt Tuning
3.3.3. Comparison with Other PEFT Methods
3.4. BitFit and Other Selective Fine-Tuning Approaches
3.4.1. BitFit Mechanism
3.4.2. Layerwise and Tokenwise Fine-Tuning
3.5. Comparison of PEFT Methods
3.6. Summary
4. Practical Applications and Real-World Implementations
4.1. Natural Language Processing (NLP)
4.1.1. Text Classification and Sentiment Analysis
4.1.2. Machine Translation
4.1.3. Dialogue Systems and Chatbots
4.2. Computer Vision
4.2.1. Image Classification and Object Detection
4.2.2. Few-Shot and Zero-Shot Learning
4.3. Speech Processing
4.3.1. Speech Recognition and Transcription
4.3.2. Speaker Identification and Emotion Recognition
4.4. Multimodal Learning
4.4.1. Vision-Language Models
4.4.2. Audio-Visual Learning
4.5. Industry Adoption and Deployment
- OpenAI and Microsoft: LoRA and prompt tuning have been used to efficiently adapt large language models for enterprise-specific applications [75].
- Google and DeepMind: Adapter-based methods have been deployed in vision and language models to improve fine-tuning efficiency [76].
- Meta AI: PEFT techniques have been applied in multimodal models for content moderation and recommendation systems [77].
4.6. Challenges and Future Directions
- Task-Specific Trade-offs: Choosing the best PEFT method for a given task requires extensive experimentation and benchmarking.
- Scalability to Diverse Tasks: Some PEFT methods may struggle with generalization across highly diverse tasks [78].
- Optimization Strategies: Finding optimal hyperparameters for PEFT methods remains an open research problem.
4.7. Summary
5. Theoretical Foundations of Parameter-Efficient Fine-Tuning
5.1. Transfer Learning and Representational Reuse
5.1.1. Pre-Trained Feature Extractors
5.1.2. Linear Mode Connectivity
5.2. Low-Rank Subspace Hypothesis
5.2.1. Low-Rank Decomposition of Weight Updates
5.2.2. Empirical Evidence for Low-Rank Adaptation
5.3. Sparsity and Selective Adaptation
5.3.1. Lottery Ticket Hypothesis and Selective Fine-Tuning
5.3.2. Gradient-Based Parameter Selection
5.4. Generalization Properties of PEFT Methods
5.4.1. Implicit Regularization
5.4.2. Robustness to Distribution Shifts
5.5. Theoretical Limitations and Open Problems
- Optimal Rank Selection: LoRA and other low-rank methods require careful selection of the rank parameter r. Finding the optimal balance between efficiency and expressiveness remains an open question [92]; the parameter-count sketch after this list shows how the adapter budget grows with r.
- Task-Specific Adaptation Boundaries: While PEFT works well for many tasks, some require deeper model modifications [93]. Understanding the theoretical limits of parameter efficiency is an area of ongoing research.
- Interaction Between PEFT Methods: Combining different PEFT techniques, such as LoRA with adapters, is an emerging area that requires deeper theoretical insights [94].
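A simple way to see the trade-off behind the rank parameter r is to count the parameters LoRA adds: each adapted d_out × d_in weight matrix contributes r(d_in + d_out) trainable values. The sketch below uses hypothetical shapes (hidden size 4096, query and value projections across 32 layers) purely for illustration:

```python
# Trainable parameters added by LoRA: r * (d_in + d_out) per adapted weight matrix.
def lora_trainable_params(d_in: int, d_out: int, rank: int, n_matrices: int) -> int:
    return n_matrices * rank * (d_in + d_out)

for r in (4, 8, 16, 64):
    n = lora_trainable_params(4096, 4096, r, n_matrices=2 * 32)  # q and v in 32 layers
    print(f"rank {r:>2}: {n / 1e6:.1f}M trainable parameters")
```

The budget grows linearly in r, which is why rank selection directly controls the efficiency/expressiveness balance discussed above.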
5.6. Summary
6. Empirical Performance and Benchmarking of PEFT Methods
6.1. Evaluation Metrics
- Task Performance: Measured using accuracy (classification), BLEU score (translation), perplexity (language modeling), or mean squared error (regression) [97].
- Number of Trainable Parameters: The number, and fraction, of parameters updated during fine-tuning (see the counting sketch after this list).
- Computational Cost: Training time and memory usage compared to full fine-tuning [98].
- Generalization Performance: Evaluated through cross-domain robustness and performance on few-shot learning tasks [99].
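The trainable-parameter metric can be measured generically in PyTorch; the helper below is our own sketch, not part of any benchmark suite:

```python
import torch.nn as nn

def trainable_fraction(model: nn.Module) -> float:
    """Fraction of parameters with requires_grad=True after a PEFT method is applied."""
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    total = sum(p.numel() for p in model.parameters())
    return trainable / total

# Toy example: freeze everything except bias terms (BitFit-style) and report the ratio.
model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10))
for name, p in model.named_parameters():
    p.requires_grad = name.endswith("bias")
print(f"{trainable_fraction(model):.4%} of parameters are trainable")
```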
6.2. Benchmarking Studies on NLP Tasks
6.2.1. Performance on Text Classification
6.2.2. Machine Translation and Summarization
6.2.3. Open-Ended Language Generation
6.3. Empirical Results on Vision Tasks
6.3.1. Image Classification
6.3.2. Object Detection and Segmentation
6.4. Performance in Speech Processing
6.5. Comparison of PEFT Methods Across Tasks
6.6. Analysis of Trade-Offs
- Task-Specific Performance: Adapters and LoRA match full fine-tuning on most tasks, whereas prompt tuning can underperform in specialized domains.
- Parameter Efficiency: Prompt tuning and BitFit require the fewest trainable parameters but may need additional prompt optimization for best results [108].
- Computational Overhead: LoRA and adapters strike a balance between efficiency and performance, making them practical choices for real-world deployment (an illustrative configuration sketch follows this list).
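As an illustration of how these trade-offs surface in practice, the sketch below attaches a LoRA adapter to a frozen RoBERTa classifier using the Hugging Face `peft` package (an assumption of ours for this example; the survey itself is library-agnostic) and reports how few parameters end up trainable:

```python
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=2)
config = LoraConfig(task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16, lora_dropout=0.1,
                    target_modules=["query", "value"])  # RoBERTa attention projections
model = get_peft_model(base, config)
model.print_trainable_parameters()  # prints trainable vs. total parameter counts
```

Swapping in a different PEFT method is typically a matter of changing the config object, which keeps training and deployment pipelines uniform across tasks.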
6.7. Summary
7. Emerging Trends and Future Directions in PEFT
7.1. Hybrid PEFT Approaches
7.1.1. LoRA with Adapters
7.1.2. Prompt Tuning with LoRA
7.2. Cross-Task Generalization and Multi-Task PEFT
7.2.1. Task-Agnostic Fine-Tuning
7.2.2. Meta-Learning for PEFT
7.3. Memory-Efficient and Hardware-Aware PEFT
7.3.1. Sparse and Quantized PEFT
7.3.2. Hardware-Aware Optimization
7.4. PEFT for Continual Learning and Lifelong Adaptation
7.4.1. Dynamic Parameter Allocation
7.4.2. Memory-Augmented PEFT
7.5. Beyond Traditional ML: PEFT in Scientific and Edge AI Applications
7.5.1. PEFT for Scientific Machine Learning
7.5.2. PEFT in Edge AI and On-Device Learning
7.6. Challenges and Open Questions
- Scalability to Extremely Large Models: As foundation models grow beyond a trillion parameters, ensuring that PEFT remains efficient is an open question.
- Understanding the Limits of PEFT: The theoretical boundaries of parameter-efficient adaptation are still being explored.
- Security and Robustness: Fine-tuning methods may introduce vulnerabilities, such as adversarial attacks and unintended model behaviors, requiring further research [130].
7.7. Summary
8. Conclusion
8.1. Key Takeaways
- Effectiveness of PEFT: Techniques such as adapters, LoRA, prefix tuning, and prompt tuning enable competitive performance compared to full fine-tuning while significantly reducing computational and memory requirements.
- Theoretical Justification: The success of PEFT is supported by principles such as the low-rank subspace hypothesis, sparsity, and transfer learning, explaining why only a small fraction of parameters need to be updated for effective adaptation.
- Empirical Validation: Large-scale benchmarking studies across NLP, vision, and speech tasks demonstrate that PEFT methods achieve near full fine-tuning performance while dramatically improving efficiency.
- Emerging Trends: Hybrid approaches, memory-efficient PEFT, continual learning, and hardware-aware optimizations represent promising directions for further enhancing model adaptation.
8.2. Future Research Directions
8.2.1. Scaling PEFT to Ultra-Large Models
8.2.2. Task-Agnostic and Universal PEFT
8.2.3. Robustness, Security, and Interpretability
8.2.4. PEFT for Resource-Constrained Environments
8.3. Final Thoughts
References
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 770–778.
- Salimans, T.; Kingma, D.P. Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks. In Advances in Neural Information Processing Systems, 2016, p. 901.
- Zha, Y.; Wang, J.; Dai, T.; Chen, B.; Wang, Z.; Xia, S.T. Instance-aware dynamic prompt tuning for pre-trained point cloud models. In Proceedings of the IEEE/CVF International Conference on Computer Vision; 2023.
- Gao, P.; Geng, S.; Zhang, R.; Ma, T.; Fang, R.; Zhang, Y.; Li, H.; Qiao, Y. Clip-adapter: Better vision-language models with feature adapters. International Journal of Computer Vision 2024.
- Zhou, X.; Liang, D.; Xu, W.; Zhu, X.; Xu, Y.; Zou, Z.; Bai, X. Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2024.
- Pavlyshenko, B.M. Financial News Analytics Using Fine-Tuned Llama 2 GPT Model. arXiv preprint 2023. arXiv:2308.13032.
- Xing, Z.; Dai, Q.; Hu, H.; Wu, Z.; Jiang, Y.G. Simda: Simple diffusion adapter for efficient video generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2024.
- Li, Y.; Ma, T.; Zhang, H. Algorithmic Regularization in Over-parameterized Matrix Sensing and Neural Networks with Quadratic Activations. In Proceedings of the Annual Conference on Computational Learning Theory; 2017.
- Sanh, V.; Debut, L.; Chaumond, J.; Wolf, T. DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. ArXiv 2019, abs/1910.01108.
- Blattmann, A.; Dockhorn, T.; Kulal, S.; Mendelevitch, D.; Kilian, M.; Lorenz, D.; Levi, Y.; English, Z.; Voleti, V.; Letts, A.; et al. Stable video diffusion: Scaling latent video diffusion models to large datasets. arXiv preprint 2023. arXiv:2311.15127.
- Zhang, C.; Mao, Y.; Fan, Y.; Mi, Y.; Gao, Y.; Chen, L.; Lou, D.; Lin, J. FinSQL: Model-Agnostic LLMs-based Text-to-SQL Framework for Financial Analysis. In Proceedings of the Companion of the 2024 International Conference on Management of Data, SIGMOD/PODS; 2024; pp. 93–105.
- Vu, T.; Lester, B.; Constant, N.; Al-Rfou, R.; Cer, D.M. SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer. In Proceedings of the Annual Meeting of the Association for Computational Linguistics; 2021.
- Jiang, T.; Huang, S.; Luo, S.; Zhang, Z.; Huang, H.; Wei, F.; Deng, W.; Sun, F.; Zhang, Q.; Wang, D.; et al. MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning. arXiv preprint 2024. arXiv:2405.12130.
- Bałazy, K.; Banaei, M.; Aberer, K.; Tabor, J. LoRA-XS: Low-Rank Adaptation with Extremely Small Number of Parameters. arXiv preprint 2024. arXiv:2405.17604.
- Bai, J.; Gao, K.; Min, S.; Xia, S.T.; Li, Z.; Liu, W. BadCLIP: Trigger-Aware Prompt Learning for Backdoor Attacks on CLIP. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2024.
- Mao, Y.; Huang, K.; Guan, C.; Bao, G.; Mo, F.; Xu, J. DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution. arXiv preprint 2024. arXiv:2405.17357.
- Chen, S.; Ge, C.; Tong, Z.; Wang, J.; Song, Y.; Wang, J.; Luo, P. Adaptformer: Adapting vision transformers for scalable visual recognition. Advances in Neural Information Processing Systems 2022.
- Liu, X.Y.; Zhu, R.; Zha, D.; Gao, J.; Zhong, S.; Qiu, M. Differentially private low-rank adaptation of large language model using federated learning. arXiv preprint 2023. arXiv:2312.17493.
- Clark, K.; Luong, M.T.; Le, Q.V.; Manning, C.D. Electra: Pre-training text encoders as discriminators rather than generators. arXiv preprint 2020. arXiv:2003.10555.
- Kong, Z.; Zhang, Y.; Yang, T.; Wang, T.; Zhang, K.; Wu, B.; Chen, G.; Liu, W.; Luo, W. OMG: Occlusion-friendly Personalized Multi-concept Generation in Diffusion Models. arXiv preprint 2024. arXiv:2403.10983.
- Wang, Y.; Lin, Y.; Zeng, X.; Zhang, G. MultiLoRA: Democratizing LoRA for Better Multi-Task Learning. arXiv preprint 2023. arXiv:2311.11501.
- Shen, Y.; Song, K.; Tan, X.; Li, D.; Lu, W.; Zhuang, Y. HuggingGPT: Solving AI tasks with ChatGPT and its friends in Hugging Face. Advances in Neural Information Processing Systems 2024.
- Liao, B.; Monz, C. ApiQ: Finetuning of 2-Bit Quantized Large Language Model. arXiv preprint 2024. arXiv:2402.05147.
- Pan, J.; Sadé, A.; Kim, J.; Soriano, E.; Sole, G.; Flamant, S. SteloCoder: a Decoder-Only LLM for Multi-Language to Python Code Translation. arXiv preprint 2023. arXiv:2310.15539.
- Zhang, Y.; Zhou, K.; Liu, Z. Neural prompt search. arXiv preprint 2022. arXiv:2206.04673.
- Tang, Z.; Yang, Z.; Zhu, C.; Zeng, M.; Bansal, M. Any-to-any generation via composable diffusion. Advances in Neural Information Processing Systems 2024.
- Li, H.; Koto, F.; Wu, M.; Aji, A.F.; Baldwin, T. Bactrian-X: Multilingual Replicable Instruction-Following Models with Low-Rank Adaptation. arXiv preprint 2023. arXiv:2305.15011.
- Zhang, Q.; Chen, M.; Bukharin, A.; He, P.; Cheng, Y.; Chen, W.; Zhao, T. Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning. In Proceedings of the Eleventh International Conference on Learning Representations; 2023.
- Yin, D.; Yang, Y.; Wang, Z.; Yu, H.; Wei, K.; Sun, X. 1% vs 100%: Parameter-efficient low rank adapter for dense predictions. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2023.
- Chen, J.; Zhang, A.; Shi, X.; Li, M.; Smola, A.J.; Yang, D. Parameter-Efficient Fine-Tuning Design Spaces. ArXiv 2023, abs/2301.01821.
- Sun, J.; Fu, D.; Hu, Y.; Wang, S.; Rassin, R.; Juan, D.C.; Alon, D.; Herrmann, C.; van Steenkiste, S.; Krishna, R.; et al. Dreamsync: Aligning text-to-image generation with image understanding feedback. In Proceedings of the Synthetic Data for Computer Vision Workshop @ CVPR 2024; 2023.
- He, S.; Ding, L.; Dong, D.; Zhang, M.; Tao, D. SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters. ArXiv, 2210.
- Chai, S.; Jain, R.K.; Teng, S.; Liu, J.; Li, Y.; Tateyama, T.; Chen, Y.w. Ladder fine-tuning approach for sam integrating complementary network. arXiv preprint 2023. arXiv:2306.12737.
- Jumper, J.; Evans, R.; Pritzel, A.; Green, T.; Figurnov, M.; Ronneberger, O.; Tunyasuvunakool, K.; Bates, R.; Žídek, A.; Potapenko, A.; et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021, 596, 583–589.
- Qiu, X.; Sun, T.; Xu, Y.; Shao, Y.; Dai, N.; Huang, X. Pre-trained models for natural language processing: A survey. Science China Technological Sciences 2020, 63, 1872–1897.
- Yang, Y.; Jiang, P.; Hou, Q.; Zhang, H.; Chen, J.; Li, B. Multi-Task Dense Prediction via Mixture of Low-Rank Experts. arXiv preprint 2024. arXiv:2403.17749.
- Shao, Z.; Yu, Z.; Wang, M.; Yu, J. Prompting large language models with answer heuristics for knowledge-based visual question answering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 14974–14983.
- Shi, J.; Hua, H. Space Narrative: Generating Images and 3D Scenes of Chinese Garden from Text Using Deep Learning. In Proceedings of the xArch–creativity in the age of digital reproduction symposium; 2023; pp. 236–243.
- Huang, X.; Huang, Z.; Li, S.; Qu, W.; He, T.; Hou, Y.; Zuo, Y.; Ouyang, W. Frozen CLIP Transformer Is an Efficient Point Cloud Encoder. In Proceedings of the AAAI Conference on Artificial Intelligence; 2024.
- Liu, X.; Chen, Q.; Deng, C.; Zeng, H.J.; Chen, J.; Li, D.; Tang, B. LCQMC: A Large-scale Chinese Question Matching Corpus. In Proceedings of the International Conference on Computational Linguistics; 2018.
- Liu, W.; Shen, X.; Pun, C.M.; Cun, X. Explicit visual prompting for low-level structure segmentations. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2023.
- Koubbi, H.; Boussard, M.; Hernandez, L. The Impact of LoRA on the Emergence of Clusters in Transformers. arXiv preprint 2024. arXiv:2402.15415.
- Chen, G.; Liu, F.; Meng, Z.; Liang, S. Revisiting Parameter-Efficient Tuning: Are We Really There Yet? In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2022.
- Chen, Z.; Wang, Z.; Wang, Z.; Liu, H.; Yin, Z.; Liu, S.; Sheng, L.; Ouyang, W.; Qiao, Y.; Shao, J. Octavius: Mitigating Task Interference in MLLMs via MoE. arXiv preprint 2023. arXiv:2311.02684.
- Chitale, R.; Vaidya, A.; Kane, A.; Ghotkar, A. Task Arithmetic with LoRA for Continual Learning. arXiv preprint 2023.
- Xu, M.; Zhang, Z.; Wei, F.; Hu, H.; Bai, X. Side adapter network for open-vocabulary semantic segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2023.
- Li, S. DiffStyler: Diffusion-based Localized Image Style Transfer. arXiv preprint 2024. arXiv:2403.18461.
- Pennington, J.; Socher, R.; Manning, C.D. Glove: Global vectors for word representation. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014, pp. 1532–1543.
- Wang, Z.; Wang, X.; Xie, L.; Qi, Z.; Shan, Y.; Wang, W.; Luo, P. Styleadapter: A single-pass lora-free model for stylized image generation. arXiv preprint 2023. arXiv:2309.01770.
- Ramesh, A.; Pavlov, M.; Goh, G.; Gray, S.; Voss, C.; Radford, A.; Chen, M.; Sutskever, I. Zero-shot text-to-image generation. In Proceedings of the International Conference on Machine Learning. PMLR; 2021.
- Chen, X.; Wang, C.; Ning, H.; Li, S. SAM-OCTA: Prompting Segment-Anything for OCTA Image Segmentation. arXiv preprint 2023. arXiv:2310.07183.
- Sung, Y.L.; Cho, J.; Bansal, M. LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning. ArXiv 2022, abs/2206.06522.
- Wu, T.; Wang, J.; Zhao, Z.; Wong, N. Mixture-of-Subspaces in Low-Rank Adaptation. arXiv preprint 2024. arXiv:2406.11909.
- Ba, J.; Kiros, J.R.; Hinton, G.E. Layer Normalization. ArXiv 2016, abs/1607.06450.
- Hao, Y.; Cao, Y.; Mou, L. Flora: Low-Rank Adapters Are Secretly Gradient Compressors. arXiv preprint 2024. arXiv:2402.03293.
- Yeo, J.H.; Han, S.; Kim, M.; Ro, Y.M. Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing. arXiv preprint 2024. arXiv:2402.15151.
- Fu, C.L.; Chen, Z.C.; Lee, Y.R.; Lee, H.y. Adapterbias: Parameter-efficient token-dependent representation shift for adapters in NLP tasks. NAACL 2022.
- Sang, E.T.K.; Meulder, F.D. Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition. In Proceedings of the Conference on Computational Natural Language Learning; 2003.
- Ye, Q.; Xu, H.; Xu, G.; Ye, J.; Yan, M.; Zhou, Y.; Wang, J.; Hu, A.; Shi, P.; Shi, Y.; et al. mplug-owl: Modularization empowers large language models with multimodality. arXiv preprint 2023. arXiv:2304.14178.
- Gou, Y.; Liu, Z.; Chen, K.; Hong, L.; Xu, H.; Li, A.; Yeung, D.; Kwok, J.T.; Zhang, Y. Mixture of Cluster-conditional LoRA Experts for Vision-language Instruction Tuning. arXiv preprint 2023. arXiv:2312.12379.
- Hendrycks, D.; Burns, C.; Basart, S.; Zou, A.; Mazeika, M.; Song, D.; Steinhardt, J. Measuring massive multitask language understanding. arXiv preprint 2020. arXiv:2009.03300.
- Zhao, W.X.; Zhou, K.; Li, J.; Tang, T.; Wang, X.; Hou, Y.; Min, Y.; Zhang, B.; Zhang, J.; Dong, Z.; et al. A survey of large language models. arXiv preprint 2023. arXiv:2303.18223.
- Shi, H.; Dao, S.D.; Cai, J. LLMFormer: Large Language Model for Open-Vocabulary Semantic Segmentation. International Journal of Computer Vision 2024.
- Liu, S.; Wang, C.; Yin, H.; Molchanov, P.; Wang, Y.F.; Cheng, K.; Chen, M. DoRA: Weight-Decomposed Low-Rank Adaptation. arXiv preprint 2024. arXiv:2402.09353.
- Jie, S.; Deng, Z.H. Convolutional bypasses are better vision transformer adapters. arXiv preprint 2022. arXiv:2207.07039.
- Bai, J.; Chen, D.; Qian, B.; Yao, L.; Li, Y. Federated Fine-tuning of Large Language Models under Heterogeneous Language Tasks and Client Resources. arXiv preprint 2024. arXiv:2402.11505.
- Renduchintala, A.; Konuk, T.; Kuchaiev, O. Tied-LoRA: Enhancing parameter efficiency of LoRA with Weight Tying. In Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies; 2024.
- Wu, P.; Li, K.; Wang, T.; Wang, F. FedMS: Federated Learning with Mixture of Sparsely Activated Foundations Models. arXiv preprint 2023. arXiv:2312.15926.
- Li, X.L.; Liang, P. Prefix-Tuning: Optimizing Continuous Prompts for Generation. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers); 2021.
- Biderman, D.; Ortiz, J.J.G.; Portes, J.; Paul, M.; Greengard, P.; Jennings, C.; King, D.; Havens, S.; Chiley, V.; Frankle, J.; et al. LoRA Learns Less and Forgets Less. arXiv preprint 2024. arXiv:2405.09673.
- Lee, B.; Park, B.; Kim, C.W.; Ro, Y.M. CoLLaVO: Crayon Large Language and Vision mOdel. arXiv preprint 2024. arXiv:2402.11248.
- Fu, M.; Zhu, K.; Wu, J. Dtl: Disentangled transfer learning for visual recognition. In Proceedings of the AAAI Conference on Artificial Intelligence; 2024.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Advances in Neural Information Processing Systems (NeurIPS) 2017, 30.
- Mujadia, V.; Urlana, A.; Bhaskar, Y.; Pavani, P.A.; Shravya, K.; Krishnamurthy, P.; Sharma, D.M. Assessing Translation Capabilities of Large Language Models Involving English and Indian Languages. arXiv preprint 2023. arXiv:2311.09216.
- Reuther, A.; Michaleas, P.; Jones, M.; Gadepally, V.; Samsi, S.; Kepner, J. Survey and Benchmarking of Machine Learning Accelerators. 2019 IEEE High Performance Extreme Computing Conference (HPEC). 2019, pp. 1–9.
- Gema, A.P.; Daines, L.; Minervini, P.; Alex, B. Parameter-Efficient Fine-Tuning of LLaMA for the Clinical Domain. arXiv preprint 2023. arXiv:2307.03042.
- Zhong, M.; Shen, Y.; Wang, S.; Lu, Y.; Jiao, Y.; Ouyang, S.; Yu, D.; Han, J.; Chen, W. Multi-LoRA Composition for Image Generation. arXiv preprint 2024. arXiv:2402.16843.
- Pan, J.; Lin, Z.; Zhu, X.; Shao, J.; Li, H. St-adapter: Parameter-efficient image-to-video transfer learning. Advances in Neural Information Processing Systems 2022.
- Chen, A.; Yao, Y.; Chen, P.Y.; Zhang, Y.; Liu, S. Understanding and improving visual prompting: A label-mapping perspective. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 19133–19143.
- Phang, J.; Févry, T.; Bowman, S.R. Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks. ArXiv 2018, abs/1811.01088.
- Yang, A.X.; Robeyns, M.; Coste, T.; Wang, J.; Bou-Ammar, H.; Aitchison, L. Bayesian Reward Models for LLM Alignment. arXiv preprint 2024. arXiv:2402.13210.
- Wang, A.; Singh, A.; Michael, J.; Hill, F.; Levy, O.; Bowman, S.R. GLUE: A multi-task benchmark and analysis platform for natural language understanding. arXiv preprint 2018. arXiv:1804.07461.
- Qin, J.; Wu, J.; Yan, P.; Li, M.; Yuxi, R.; Xiao, X.; Wang, Y.; Wang, R.; Wen, S.; Pan, X.; et al. Freeseg: Unified, universal and open-vocabulary image segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, pp. 19446–19455.
- Li, J.; Li, D.; Savarese, S.; Hoi, S. Blip-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. In Proceedings of the International Conference on Machine Learning; 2023.
- Zhao, Z.; Gan, L.; Wang, G.; Zhou, W.; Yang, H.; Kuang, K.; Wu, F. LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed Tasks in the Wild. arXiv preprint 2024. arXiv:2402.09997.
- Zniyed, Y.; Nguyen, T.P.; et al. Enhanced network compression through tensor decompositions and pruning. IEEE Transactions on Neural Networks and Learning Systems 2024.
- Liu, Y. Roberta: A robustly optimized bert pretraining approach. arXiv preprint 2019. arXiv:1907.11692.
- Liu, Q.; Wu, X.; Zhao, X.; Zhu, Y.; Xu, D.; Tian, F.; Zheng, Y. Moelora: An moe-based parameter efficient fine-tuning method for multi-task medical applications. arXiv preprint 2023. arXiv:2310.18339.
- Gurrola-Ramos, J.; Dalmau, O.; Alarcón, T.E. A residual dense u-net neural network for image denoising. IEEE Access 2021, 9, 31742–31754.
- Lin, Z.; Madotto, A.; Fung, P. Exploring Versatile Generative Language Model Via Parameter-Efficient Transfer Learning. In Findings of the Association for Computational Linguistics: EMNLP 2020; 2020.
- Wen, Z.; Zhang, J.; Fang, Y. SIBO: A Simple Booster for Parameter-Efficient Fine-Tuning. arXiv preprint 2024. arXiv:2402.11896.
- Guo, D.; Rush, A.M.; Kim, Y. Parameter-Efficient Transfer Learning with Diff Pruning. In Proceedings of the Annual Meeting of the Association for Computational Linguistics; 2020.
- Jeon, H.; Kim, Y.; Kim, J.j. L4q: Parameter efficient quantization-aware training on large language models via lora-wise lsq. arXiv preprint 2024. arXiv:2402.04902.
- Chowdhery, A.; Narang, S.; Devlin, J.; Bosma, M.; Mishra, G.; Roberts, A.; Barham, P.; Chung, H.W.; Sutton, C.; Gehrmann, S.; et al. PaLM: Scaling Language Modeling with Pathways. J. Mach. Learn. Res. 2023, 24, 240:1–240:113.
- OpenAI. GPT-4 Technical Report. ArXiv 2023, abs/2303.08774.
- Santacroce, M.; Lu, Y.; Yu, H.; Li, Y.; Shen, Y. Efficient RLHF: Reducing the Memory Usage of PPO. arXiv preprint 2023. arXiv:2309.00754.
- Shen, Y.; Xu, Z.; Wang, Q.; Cheng, Y.; Yin, W.; Huang, L. Multimodal Instruction Tuning with Conditional Mixture of LoRA. arXiv preprint 2024. arXiv:2402.15896.
- Sun, S.; Gupta, D.; Iyyer, M. Exploring the impact of low-rank adaptation on the performance, efficiency, and regularization of RLHF. arXiv preprint 2023. arXiv:2309.09055.
- Wu, J.; Li, X.; Wei, C.; Wang, H.; Yuille, A.; Zhou, Y.; Xie, C. Unleashing the power of visual prompting at the pixel level. arXiv preprint 2022. arXiv:2212.10556.
- Zheng, Y.; Zhang, R.; Zhang, J.; Ye, Y.; Luo, Z. Llamafactory: Unified efficient fine-tuning of 100+ language models. arXiv preprint 2024. arXiv:2403.13372.
- Kalajdzievski, D. A rank stabilization scaling factor for fine-tuning with lora. arXiv preprint 2023. arXiv:2312.03732.
- Finn, C.; Abbeel, P.; Levine, S. Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks. In Proceedings of the 34th International Conference on Machine Learning, 2017, pp. 1126–1135.
- Wang, B.; Wang, W. TDS-CLIP: Temporal Difference Side Network for Image-to-Video Transfer Learning. arXiv preprint 2024. arXiv:2408.10688.
- Sander, M.E.; Ablin, P.; Blondel, M.; Peyré, G. Sinkformers: Transformers with Doubly Stochastic Attention. In Proceedings of the International Conference on Artificial Intelligence and Statistics; 2022; pp. 3515–3530.
- Fan, T.; Kang, Y.; Ma, G.; Chen, W.; Wei, W.; Fan, L.; Yang, Q. Fate-llm: An industrial grade federated learning framework for large language models. arXiv preprint 2023. arXiv:2310.10049.
- Schneider, S.; Baevski, A.; Collobert, R.; Auli, M. wav2vec: Unsupervised pre-training for speech recognition. arXiv preprint 2019. arXiv:1904.05862.
- Woo, S.; Park, B.; Kim, B.; Jo, M.; Kwon, S.; Jeon, D.; Lee, D. DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation. arXiv preprint 2024. arXiv:2402.17812.
- Yang, X.; Huang, J.Y.; Zhou, W.; Chen, M. Parameter-Efficient Tuning with Special Token Adaptation. ArXiv 2022, abs/2210.04382.
- Almazrouei, E.; Alobeidli, H.; Alshamsi, A.; Cappelli, A.; Cojocaru, R.; Debbah, M.; Goffinet, É.; Hesslow, D.; Launay, J.; Malartic, Q.; et al. The falcon series of open language models. arXiv preprint 2023. arXiv:2311.16867.
- Huang, C.; Liu, Q.; Lin, B.Y.; Pang, T.; Du, C.; Lin, M. LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition. arXiv preprint 2023. arXiv:2307.13269.
- Tu, M.; Berisha, V.; Woolf, M.; Seo, J.s.; Cao, Y. Ranking the parameters of deep neural networks using the fisher information. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2647–2651.
- Feng, W.; Hao, C.; Zhang, Y.; Han, Y.; Wang, H. Mixture-of-LoRAs: An Efficient Multitask Tuning Method for Large Language Models. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, 2024, pp. 11371–11380.
- Ren, Y.; Zhou, Y.; Yang, J.; Shi, J.; Liu, D.; Liu, F.; Kwon, M.; Shrivastava, A. Customize-a-video: One-shot motion customization of text-to-video diffusion models. ECCV 2024.
- Liu, M.; Li, B.; Yu, Y. OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning. arXiv preprint 2024. arXiv:2408.06158.
- Wang, H.; Chang, J.; Zhai, Y.; Luo, X.; Sun, J.; Lin, Z.; Tian, Q. Lion: Implicit vision prompt tuning. In Proceedings of the AAAI Conference on Artificial Intelligence; 2024.
- Hong, W.; Wang, W.; Ding, M.; Yu, W.; Lv, Q.; Wang, Y.; Cheng, Y.; Huang, S.; Ji, J.; Xue, Z.; et al. CogVLM2: Visual Language Models for Image and Video Understanding. arXiv preprint 2024. arXiv:2408.16500.
- Basu, S.; Hu, S.; Massiceti, D.; Feizi, S. Strong Baselines for Parameter-Efficient Few-Shot Fine-Tuning. In Proceedings of the AAAI Conference on Artificial Intelligence; 2024.
- Han, Z.; Gao, C.; Liu, J.; Zhang, S.Q.; et al. Parameter-efficient fine-tuning for large models: A comprehensive survey. arXiv preprint 2024. arXiv:2403.14608.
- Belofsky, J. Token-Level Adaptation of LoRA Adapters for Downstream Task Generalization. In Proceedings of the 6th Artificial Intelligence and Cloud Computing Conference; 2023; pp. 168–172.
- Zhang, L.; Zhang, L.; Shi, S.; Chu, X.; Li, B. Lora-fa: Memory-efficient low-rank adaptation for large language models fine-tuning. arXiv preprint 2023. arXiv:2308.03303.
- Sidahmed, H.; Phatale, S.; Hutcheson, A.; Lin, Z.; Chen, Z.; Yu, Z.; Jin, J.; Komarytsia, R.; Ahlheim, C.; Zhu, Y.; et al. PERL: Parameter Efficient Reinforcement Learning from Human Feedback. arXiv preprint 2024. arXiv:2403.10704.
- Liao, Q.; Xia, G.; Wang, Z. Calliffusion: Chinese Calligraphy Generation and Style Transfer with Diffusion Modeling. arXiv preprint 2023. arXiv:2305.19124.
- Gal, Y.; Ghahramani, Z. Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning. In Proceedings of the 33rd International Conference on Machine Learning, 2016, pp. 1050–1059.
- Cheng, J.; Xie, P.; Xia, X.; Li, J.; Wu, J.; Ren, Y.; Li, H.; Xiao, X.; Zheng, M.; Fu, L. ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models. arXiv preprint 2024. arXiv:2403.02084.
- Zhao, J.; Wang, T.; Abid, W.; Angus, G.; Garg, A.; Kinnison, J.; Sherstinsky, A.; Molino, P.; Addair, T.; Rishi, D. LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report. arXiv preprint 2024. arXiv:2405.00732.
- Sung, Y.L.; Cho, J.; Bansal, M. VL-ADAPTER: Parameter-Efficient Transfer Learning for Vision-and-Language Tasks. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022, pp. 5217–5227.
- Hu, Y.; Xie, Y.; Wang, T.; Chen, M.; Pan, Z. Structure-Aware Low-Rank Adaptation for Parameter-Efficient Fine-Tuning. Mathematics 2023, 11, 4317.
- Li, Y.; Yu, Y.; Liang, C.; He, P.; Karampatziakis, N.; Chen, W.; Zhao, T. Loftq: Lora-fine-tuning-aware quantization for large language models. arXiv preprint 2023. arXiv:2310.08659.
- Xu, J.; Liu, X.; Wu, Y.; Tong, Y.; Li, Q.; Ding, M.; Tang, J.; Dong, Y. Imagereward: Learning and evaluating human preferences for text-to-image generation. Advances in Neural Information Processing Systems 2024, 36.
- Luo, S.; Tan, Y.; Patil, S.; Gu, D.; von Platen, P.; Passos, A.; Huang, L.; Li, J.; Zhao, H. LCM-LoRA: A Universal Stable-Diffusion Acceleration Module. arXiv preprint 2023. arXiv:2311.05556.
- Ahmad, S.; Chanda, S.; Rawat, Y.S. EZ-CLIP: Efficient Zeroshot Video Action Recognition. arXiv preprint 2023. arXiv:2312.08010.
- Ren, W.; Li, X.; Wang, L.; Zhao, T.; Qin, W. Analyzing and Reducing Catastrophic Forgetting in Parameter Efficient Tuning. arXiv preprint 2024. arXiv:2402.18865.
- Zniyed, Y.; Nguyen, T.P.; et al. Efficient tensor decomposition-based filter pruning. Neural Networks 2024, 178, 106393.
- Zi, B.; Qi, X.; Wang, L.; Wang, J.; Wong, K.F.; Zhang, L. Delta-LoRA: Fine-Tuning High-Rank Parameters with the Delta of Low-Rank Matrices. ArXiv, 2309.
- Workshop, B.; Scao, T.L.; Fan, A.; Akiki, C.; Pavlick, E.; Ilić, S.; Hesslow, D.; Castagné, R.; Luccioni, A.S.; Yvon, F.; et al. Bloom: A 176b-parameter open-access multilingual language model. arXiv preprint 2022. arXiv:2211.05100.
Qualitative comparison of full fine-tuning and PEFT methods:

| Method | Trainable Parameters | Training Cost | Model Modification | Generalization |
|---|---|---|---|---|
| Full Fine-Tuning | High | High | Yes | High |
| Adapter-Based | Moderate | Moderate | Yes | High |
| LoRA | Low | Low | Minimal | High |
| Prefix Tuning | Very Low | Low | No | Moderate |
| Prompt Tuning | Very Low | Low | No | Task-Specific |
| BitFit | Extremely Low | Very Low | Minimal | Moderate |
Representative performance of PEFT methods relative to full fine-tuning (full fine-tuning = 100%):

| Method | GLUE | ImageNet | LibriSpeech | Trainable Params (% of total) |
|---|---|---|---|---|
| Full Fine-Tuning | 100% | 100% | 100% | 100% |
| Adapters | 98% | 96% | 97% | 3–5% |
| LoRA | 98% | 97% | 98% | 0.5–1% |
| Prefix Tuning | 95% | 93% | 94% | 0.1–0.3% |
| Prompt Tuning | 92% | 90% | 91% | 0.01–0.1% |
| BitFit | 94% | 92% | 95% | 0.01–0.1% |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
