Version 1: Received: 15 June 2024 / Approved: 17 June 2024 / Online: 17 June 2024 (12:11:58 CEST)
How to cite:
Yin, S.; Jiang, L. Distilling Knowledge from Multiple Foundation Models for Zero-shot Image classification. Preprints 2024, 2024061153. https://doi.org/10.20944/preprints202406.1153.v1
APA Style
Yin, S., & Jiang, L. (2024). Distilling Knowledge from Multiple Foundation Models for Zero-shot Image classification. Preprints. https://doi.org/10.20944/preprints202406.1153.v1
Chicago/Turabian Style
Yin, S., and Lifan Jiang. 2024. "Distilling Knowledge from Multiple Foundation Models for Zero-shot Image classification." Preprints. https://doi.org/10.20944/preprints202406.1153.v1
Abstract
This paper introduces a novel framework for zero-shot learning (ZSL), i.e., recognizing new categories that are unseen during training, by distilling knowledge from foundation models. Specifically, we first employ ChatGPT and DALL-E to synthesize reference images of unseen categories from text prompts. Then, the test image is aligned with the text and reference images using CLIP and DINO. Finally, the predicted logits are aggregated according to their confidence to produce the final prediction. Experiments are conducted on multiple datasets, including CIFAR-10, CIFAR-100, and TinyImageNet. The results demonstrate that our model significantly improves classification accuracy over previous approaches, achieving AUROC scores above 96% across all test datasets. Our code is available at https://github.com/1134112149/MICW-ZIC.
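The confidence-based aggregation of per-model logits described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes each model's confidence is its maximum softmax probability (a common proxy; the paper may define confidence differently), and the function name `aggregate_logits` and the toy logit values are hypothetical.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def aggregate_logits(logit_sets):
    """Fuse predictions from several models, weighting each model's
    class distribution by its own confidence (max softmax probability)."""
    probs = [softmax(l) for l in logit_sets]
    confs = [p.max(axis=-1, keepdims=True) for p in probs]
    weighted = sum(c * p for c, p in zip(confs, probs))
    return weighted / sum(confs)  # normalized to a valid distribution

# Toy example: logits from text alignment (e.g., CLIP) and from
# reference-image alignment (e.g., DINO) for a single test image.
clip_logits = np.array([[2.0, 0.5, 0.1]])
dino_logits = np.array([[1.8, 1.6, 0.2]])
fused = aggregate_logits([clip_logits, dino_logits])
pred = fused.argmax(axis=-1)  # index of the predicted class
```

Because each model's distribution is scaled by its own confidence before averaging, a model that is unsure spreads little weight, while a confident model dominates the fused prediction.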
Subject: Computer Science and Mathematics, Computer Vision and Graphics
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.