Preprint Review, Version 1 (this version is not peer-reviewed)

Advances in the Neural Network Quantization: A Comprehensive Review

Version 1 : Received: 1 July 2024 / Approved: 1 July 2024 / Online: 1 July 2024 (15:03:00 CEST)

How to cite: Wei, L.; Ma, Z.; Yang, C.; Yao, Q. Advances in the Neural Network Quantization: A Comprehensive Review. Preprints 2024, 2024070076. https://doi.org/10.20944/preprints202407.0076.v1

Abstract

Artificial intelligence technologies based on deep convolutional neural networks and large language models have achieved significant breakthroughs in tasks such as image recognition, object detection, semantic segmentation, and natural language processing, but they also face a fundamental tension between the high computational demands of these algorithms and the limited resources available for deployment. Quantization, which converts floating-point neural networks into low-bit-width integer networks, is an essential technique for efficient deployment and cost reduction in edge computing. This paper analyzes various existing quantization methods, showcases the deployment accuracy of advanced techniques, and discusses future challenges and trends in this domain.
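As a rough illustration of the float-to-integer mapping the abstract describes, the sketch below shows uniform affine (asymmetric) quantization to 8-bit integers, one of the most common schemes in this literature. It is not drawn from the paper; the function names and the NumPy-based formulation are choices made here for illustration.

```python
import numpy as np

def quantize_affine(x, num_bits=8):
    """Uniform affine quantization: map floats in [min(x), max(x)]
    to integers in [0, 2**num_bits - 1] via a scale and zero-point."""
    qmin, qmax = 0, 2**num_bits - 1
    x_min, x_max = float(x.min()), float(x.max())
    # Guard against a degenerate range (e.g. a constant tensor).
    scale = max(x_max - x_min, 1e-8) / (qmax - qmin)
    zero_point = int(round(qmin - x_min / scale))
    zero_point = min(max(zero_point, qmin), qmax)
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def dequantize_affine(q, scale, zero_point):
    """Approximate reconstruction of the original float values."""
    return scale * (q.astype(np.float32) - zero_point)

# Usage: quantize a random weight tensor and measure the rounding error.
w = np.random.randn(4, 4).astype(np.float32)
q, s, z = quantize_affine(w)
w_hat = dequantize_affine(q, s, z)
print("max abs error:", np.abs(w - w_hat).max())
```

The reconstruction error is bounded by roughly half the scale step, which is why accuracy at low bit widths depends so heavily on how the quantization range and granularity are chosen.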

Keywords

Quantization; neural network; large language model; deployment; accuracy

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning
