Anomaly detection and segmentation aim to distinguish abnormal images from normal ones and to localize the anomalous regions. Feature-reconstruction-based methods have become one of the mainstream approaches to this task. Such methods rest on two assumptions: (1) the features extracted by a neural network are a good representation of the image; (2) an autoencoder trained solely on the features of normal images cannot reconstruct the features of anomalous regions well. However, these two assumptions are hard to satisfy in practice. In this paper, we propose a new anomaly segmentation method based on feature reconstruction. Our approach consists of two main parts: (1) we use a pretrained vision transformer (ViT) to extract the features of the input image; (2) we design a self-attention autoencoder to reconstruct those features. We argue that the self-attention operation, which has a global receptive field, benefits feature-reconstruction-based methods in both feature extraction and reconstruction. Experiments show that our method outperforms state-of-the-art approaches for anomaly segmentation on the MVTec dataset, and that it is both effective and time-efficient.
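The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names (`self_attention`, `anomaly_map`, `image_score`) are hypothetical, and so are the identity query/key/value projections, the squared-L2 error, and the max-pooled image score; the abstract does not specify the projections or the scoring. The first function shows why self-attention has a global receptive field: every output token is a weighted sum over all input tokens. The other two show the generic feature-reconstruction scoring step: tokens that are reconstructed poorly receive a high anomaly score.

```python
import math

def self_attention(tokens):
    """Single-head self-attention with identity Q/K/V projections (illustrative assumption)."""
    d = len(tokens[0])
    out = []
    for q in tokens:
        # Scaled dot-product score of this token against EVERY token: global receptive field.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Output token = attention-weighted average of all tokens.
        out.append([sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)])
    return out

def anomaly_map(features, reconstructed):
    """Per-token squared L2 reconstruction error (assumed scoring rule)."""
    return [sum((a - b) ** 2 for a, b in zip(f, r))
            for f, r in zip(features, reconstructed)]

def image_score(amap):
    """Image-level score as the maximum token error (assumed pooling rule)."""
    return max(amap)

# Toy example: three patch tokens; only the last one is reconstructed badly.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
reconstructed = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
amap = anomaly_map(features, reconstructed)
print(amap, image_score(amap))
```

In a real pipeline the tokens would come from a pretrained ViT and the reconstruction from a trained autoencoder; here both are hand-written lists so that the scoring logic can be read in isolation.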
Subject: Computer Science and Mathematics - Computer Science
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.