Preprint
Article

A Saturation Artifacts Inpainting Method Based on Two-Stage GAN for Fluorescence Microscope Images


Submitted: 04 June 2024

Posted: 06 June 2024

Abstract
Fluorescence microscope images of cells contain a large number of morphological features that serve as an unbiased source of quantitative information about cell status; researchers extract this quantitative information and study cellular biological phenomena through statistical analysis. Because images are the primary input of phenotypic analysis, their quality strongly influences the research results. Saturation artifacts in an image cause a loss of grayscale information, so the affected pixels no longer reflect the true fluorescence intensity. From the perspective of data post-processing, we propose a two-stage cell image restoration model based on a generative adversarial network (GAN) to address the loss of phenotypic features caused by saturation artifacts. The model can restore large areas of missing phenotypic features. In our experiments, we adopt a progressive restoration strategy to improve the robustness of training and add a contextual attention structure to stabilize the restoration results. We hope that deep learning methods can mitigate the effects of saturation artifacts and help reveal how chemical, genetic, and environmental factors affect cell state, providing an effective tool for studying biological variability and improving image quality for analysis.
Keywords: 
Subject: Computer Science and Mathematics - Computer Vision and Graphics

1. Introduction

Fluorescence microscope images are obtained by imaging cells or tissues labeled with fluorescent dyes and fluorescent proteins. These images contain many biologically relevant phenotypic features, which allow researchers to experimentally characterize gene expression, protein expression, and molecular interactions in living cells [1]. Cell analysis based on fluorescence microscopy images has diverse applications, including identifying disease phenotypes, gene functions, and the mechanisms of action, toxicity, or targets of drugs [2]. Analysis methods based on fluorescence microscope images, such as cell classification, segmentation, colocalization analysis, and morphological analysis, require high-quality microscopic images. However, because proteins bind to an excessive amount of fluorescent dye, exposure times are long, and illumination is inhomogeneous [3], microscopic images usually contain artifacts such as blur, boundary shadows, and saturation artifacts. These artifacts may severely affect the analysis results. For example, uneven illumination increased the false-detection and missed-detection rates of CellProfiler on yeast cell images by 35% [1], and saturation artifacts invalidate measurements of protein position in colocalization analysis.
At present, research on processing fluorescence image artifacts focuses mainly on inhomogeneous illumination, super-resolution reconstruction, and denoising. Smith et al. [4], Goswami et al. [5], and Wang et al. [6] use prospective or retrospective methods to correct illumination across different fluorescence images, reducing abiotic structural differences between images or removing artifact noise from a single microscopic image. However, none of these methods can eliminate saturation artifacts in a single microscopic image. Saturation artifacts can be regarded as an extreme illumination imbalance: excessive exposure leaves the artifact area blank, so a large area of biological structure information is missing. Micrographs missing a large amount of biological structure information are usually screened out of quantitative analysis experiments [7]. To date, no method can eliminate the influence of saturation artifacts.
Generative adversarial networks (GANs) were proposed by Goodfellow et al. [8] in 2014 as a tool for generating data. In recent years, GANs and their improved variants have been widely used in data-driven image generation, image inpainting, and related fields, with excellent performance. Zhang et al. [9] used a GAN to provide an effective method for medical image data augmentation and forgery detection, improving the accuracy and reliability of computer-aided diagnostic tasks. GANs have also had striking success in the processing of fluorescence microscopy images. Chen et al. [10] used a GAN-based method to achieve super-resolution reconstruction of fluorescence microscope images, making the biological structure information stand out clearly from the artifact. In this paper, we propose a method to restore the biological structure information lost to saturation artifacts in each image. To the best of our knowledge, this is the first study to address this lost biological information. Belthangady et al. [11] showed that CNN-based techniques for inpainting missing image regions are well positioned to address the problem of lost information. Their work led us to believe that deep learning is a promising way to recover the biological information lost to saturation artifacts.
In this work, we further explore GAN-based methods to address the missing biological information caused by saturation artifacts in fluorescence microscope images. The method is based on EdgeConnect [12]; we call it the Two-stage Cell image GAN (TC-GAN). To obtain more stable and credible inpainting results, the model adopts a two-step progressive repair strategy. In the first stage, the shape features of each cell and the context features between cells are restored by the proposed Edge-GAN. In the second stage, the texture features and intensity-based features of the cells are restored by the proposed Content-GAN, conditioned on the edge information. We introduce the contextual attention [13] architecture into the model to learn where to borrow or copy feature information from known background patches when generating missing patches. With this model, images with lost information can have their phenotypic features restored from the features that remain, making up for the scarcity of usable samples in morphological analysis experiments.
The structure of the paper is as follows. Section 2 describes the structure and loss functions of the model. Section 3 introduces the data and processing methods. Section 4 presents the image inpainting and verification experiments in detail. Section 5 concludes the paper.

2. Methodology

This section describes in detail the structure of the proposed fluorescence microscope cell image inpainting model and the loss functions used.

2.1. Generative Adversarial Networks

GANs were proposed by Goodfellow et al. [8] in 2014 to generate signals with the same feature distribution as the training set. A typical GAN consists of a generator and a discriminator: the generator tries to generate data that matches the distribution of the training set, while the discriminator determines whether an input is an original sample or one produced by the generator.
However, content generated by GANs often suffers from blurred edges in the restored region or a semantic mismatch between the restored content and the background. We focus on the restoration of four categories of numerical features of cell fluorescence microscope images and use a two-stage GAN with a contextual attention layer to restore the different feature contents.
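To make the adversarial setup above concrete, the following is a minimal, illustrative PyTorch sketch of one alternating discriminator/generator update. It is a generic GAN training step, not the exact TC-GAN code, and all names are ours.

```python
# Minimal sketch of one GAN training step (illustrative; not the exact TC-GAN code).
import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()

def gan_step(generator, discriminator, g_opt, d_opt, real, z):
    # --- Discriminator update: distinguish real samples from generated ones ---
    d_opt.zero_grad()
    fake = generator(z).detach()                      # stop gradients into G
    real_logits, fake_logits = discriminator(real), discriminator(fake)
    d_loss = bce(real_logits, torch.ones_like(real_logits)) + \
             bce(fake_logits, torch.zeros_like(fake_logits))
    d_loss.backward()
    d_opt.step()

    # --- Generator update: try to make D label generated samples as real ---
    g_opt.zero_grad()
    fake_logits = discriminator(generator(z))
    g_loss = bce(fake_logits, torch.ones_like(fake_logits))
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```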

2.2. Feature Restoration

The phenotypic features of cells provide the raw data for profiling; they can be extracted to quantitatively describe complex cell-morphology phenotypes. These phenotypic features can be separated into four categories [1,14]: (1) Shape features, which represent the boundaries, size, or shape of nuclei, cells, or other organelles. (2) Microenvironment and context features, which describe the distribution among cells and subcellular structures in the field of view. (3) Texture features, which describe the distribution of pixel-intensity values within cellular structures; these features intuitively display the fluorescent protein structure of a single cell. (4) Intensity-based features, which are computed from the actual intensity values on a single-cell basis. Intensity-based features are closely related to texture features: intensity-based features dominate when only a few pixels are analyzed, while texture features dominate as the number of distinguishable discrete intensities within a small area increases [15]. In fluorescence microscope images, saturation artifacts cause sparse texture features.
We use a two-stage network (Edge-GAN followed by Content-GAN) to restore these four categories of features within saturation artifacts. Edge-GAN generates the shape and context features, including cell shape and centroid orientation, which establish the basic morphology of the cell phenotype. Once this most basic and important information in the saturation-artifact region is determined, the texture features, usually seen as the protein structure, organelle structure, and cytoplasm of the cell, can be further restored by Content-GAN. The contextual attention architecture [13] is added to the network so that the boundary and texture features of the patched area remain morphologically consistent with the surrounding cells.

2.3. Model Structure

The modules of our network are shown in Figure 1. The model is divided into two parts, Edge-GAN and Content-GAN. The Edge-GAN consists of a generator $G_1$ and a discriminator $D_1$. Its inputs are the original grayscale image, the imaging mask, and the masked edge image obtained from the saturation-artifact region with the Canny operator. By learning the distribution of the features extracted from the input image, the Edge-GAN outputs the completed edge image. The Content-GAN consists of $G_2$ and $D_2$. Its inputs are the original grayscale image and the completed edge image. By learning the texture features from the original image and the shape features from the edge image, it outputs the restored image without saturation artifacts.
The generator of Edge-GAN, $G_1$, is composed of an encoder-decoder convolutional architecture with a contextual attention branch [13]. Specifically, the encoder-decoder architecture consists of the encoder, a ResNet module, and the decoder, with the contextual attention branch running parallel to the encoder. The discriminator of Edge-GAN, $D_1$, follows the 70×70 PatchGAN architecture [16]. The detailed structures of $G_1$ and $D_1$ are shown in Table 1.
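As an illustration of the encoder-ResNet-decoder backbone listed in Table 1, the sketch below builds a simplified $G_1$ in PyTorch. The contextual attention branch and spectral normalization are omitted, so the residual blocks use 256 channels here rather than the 384 channels obtained after concatenating the attention features; it is a sketch under these assumptions, not the exact implementation.

```python
# Simplified sketch of the Edge-GAN generator backbone (encoder -> ResNet blocks -> decoder),
# following the layer sizes in Table 1. The contextual attention branch and the
# spectral normalization are omitted for brevity.
import torch
import torch.nn as nn

class ResNetBlock(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.ReflectionPad2d(1), nn.Conv2d(ch, ch, 3), nn.InstanceNorm2d(ch), nn.ReLU(True),
            nn.ReflectionPad2d(1), nn.Conv2d(ch, ch, 3), nn.InstanceNorm2d(ch),
        )

    def forward(self, x):
        return x + self.block(x)  # residual connection

class EdgeGenerator(nn.Module):
    def __init__(self, in_ch=3, res_blocks=8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.ReflectionPad2d(3), nn.Conv2d(in_ch, 64, 7), nn.InstanceNorm2d(64), nn.ReLU(True),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.InstanceNorm2d(128), nn.ReLU(True),
            nn.Conv2d(128, 256, 4, stride=2, padding=1), nn.InstanceNorm2d(256), nn.ReLU(True),
        )
        # 256 channels here; the paper uses 384 after concatenating the attention features.
        self.resnet = nn.Sequential(*[ResNetBlock(256) for _ in range(res_blocks)])
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(256, 128, 4, stride=2, padding=1), nn.InstanceNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.InstanceNorm2d(64), nn.ReLU(True),
            nn.ReflectionPad2d(3), nn.Conv2d(64, 1, 7), nn.Sigmoid(),
        )

    def forward(self, x):
        # x: grayscale image, imaging mask, and masked edge map stacked along the channel axis
        return self.decoder(self.resnet(self.encoder(x)))
```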
The generator of Content-GAN, $G_2$, has the same architecture as $G_1$, except that all spectral normalization is removed. The discriminator of Content-GAN, $D_2$, has the same architecture as $D_1$ and is used to judge whether the semantic information of the content generated by $G_2$ is reasonable.

2.4. Contextual Attention

The contextual attention architecture was proposed by Yu et al. [13] to learn where to borrow or copy feature information from known background patches when generating missing patches. Its detailed structure is shown in the contextual attention rows of Table 1. We use the contextual attention layer to accelerate the convergence of model training and enhance the semantic plausibility of the generated region. The similarity between a patch $f_{x,y}$ centered in the region to be restored and a background patch $b_{x',y'}$ is defined as the normalized inner product (cosine similarity):

$$S_{x,y,x',y'} = \left\langle \frac{f_{x,y}}{\left\| f_{x,y} \right\|}, \frac{b_{x',y'}}{\left\| b_{x',y'} \right\|} \right\rangle$$

According to the calculated similarity score $S_{x,y,x',y'}$, the contextual attention layer learns which parts of the background features should be used to restore the missing texture information.
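The scoring step can be sketched as follows: patches are flattened, L2-normalized, and matched by inner product, and a scaled softmax turns the scores into attention weights over the background (a common choice following [13]). The patch-extraction and reconstruction steps around it are omitted.

```python
# Illustrative sketch of the cosine-similarity scoring used by contextual attention:
# each patch in the missing region is matched against known background patches.
import torch
import torch.nn.functional as F

def attention_scores(foreground_patches, background_patches, softmax_scale=10.0):
    """foreground_patches: (Nf, C*k*k), background_patches: (Nb, C*k*k), flattened patches."""
    f = F.normalize(foreground_patches, dim=1)   # f / ||f||
    b = F.normalize(background_patches, dim=1)   # b / ||b||
    scores = f @ b.t()                           # S_{x,y,x',y'} = <f/||f||, b/||b||>
    return F.softmax(scores * softmax_scale, dim=1)  # attention weights over background patches
```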

2.5. Edge-GAN Loss Function

The Edge-GAN is trained with an adversarial loss and a feature-matching loss [17]:
$$\min_{G_1}\max_{D_1} \mathcal{L}_{G_1} = \min_{G_1}\Big(\lambda_{adv,1}\max_{D_1}\mathcal{L}_{adv,1} + \lambda_{FM}\,\mathcal{L}_{FM}\Big)$$
where $\mathcal{L}_{adv,1}$ is the adversarial loss, $\mathcal{L}_{FM}$ is the feature-matching loss, and $\lambda_{adv,1}$ and $\lambda_{FM}$ are regularization parameters. The adversarial loss $\mathcal{L}_{adv,1}$ is defined as:
$$\mathcal{L}_{adv,1} = \mathbb{E}_{(E,I)}\big[\log D_1(E, I)\big] + \mathbb{E}_{I}\big[\log\big(1 - D_1(\tilde{Z}_{pred}, I)\big)\big]$$
where $I$ denotes the ground-truth images, $E$ is the edge map of $I$, and $\tilde{Z}_{pred}$ is the predicted edge map for the masked region.
The feature-matching loss $\mathcal{L}_{FM}$ compares activations extracted from the intermediate layers of the discriminator. It is defined as:
$$\mathcal{L}_{FM} = \mathbb{E}\left[\sum_{i=1}^{L} \frac{1}{N_i}\left\| D_1^{(i)}(E) - D_1^{(i)}\big(\tilde{Z}_{pred}\big) \right\|_1\right]$$
where $i$ indexes the feature layers, $L$ is the final layer of $D_1$, $N_i$ is the number of elements in the $i$-th layer, and $D_1^{(i)}$ denotes the activations of the $i$-th layer of $D_1$.
In our experiments, $\lambda_{adv,1} = 1$ and $\lambda_{FM} = 10$.
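A hedged sketch of the resulting generator objective is shown below. The discriminator interface (returning both logits and intermediate features via a `return_features` flag) is an assumption of ours, and the non-saturating form of the adversarial term is used, as is common in practice.

```python
# Sketch of the Edge-GAN generator objective: adversarial term plus feature matching
# over D1's intermediate layers, weighted with lambda_adv = 1 and lambda_FM = 10
# as reported in Section 2.5. `d1(x, return_features=True)` is an assumed interface
# that returns (logits, list_of_feature_maps).
import torch
import torch.nn.functional as F

def edge_generator_loss(d1, real_edge, pred_edge, gray, lambda_adv=1.0, lambda_fm=10.0):
    # Adversarial term: G1 tries to make D1 classify the predicted edge map as real
    # (non-saturating form of the adversarial objective).
    fake_logits, fake_feats = d1(torch.cat([pred_edge, gray], dim=1), return_features=True)
    adv = F.binary_cross_entropy_with_logits(fake_logits, torch.ones_like(fake_logits))

    # Feature-matching term: L1 distance between D1's intermediate activations for the
    # ground-truth edge map and for the predicted edge map.
    _, real_feats = d1(torch.cat([real_edge, gray], dim=1), return_features=True)
    fm = sum(F.l1_loss(f_fake, f_real.detach())
             for f_fake, f_real in zip(fake_feats, real_feats))

    return lambda_adv * adv + lambda_fm * fm
```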

2.6. Content-GAN Loss Function

The Content-GAN is trained with four losses. The overall objective is $\min_{G_2}\max_{D_2}\mathcal{L}_{G_2}$, defined as:
$$\min_{G_2}\max_{D_2} \mathcal{L}_{G_2} = \min_{G_2}\Big(\lambda_{\ell_1}\mathcal{L}_{\ell_1} + \lambda_{adv,2}\max_{D_2}\mathcal{L}_{adv,2} + \lambda_{p}\,\mathcal{L}_{perc} + \lambda_{s}\,\mathcal{L}_{style}\Big)$$
where $\mathcal{L}_{\ell_1}$ is the $\ell_1$ loss, $\mathcal{L}_{adv,2}$ is the adversarial loss, $\mathcal{L}_{perc}$ is the perceptual loss [18], and $\mathcal{L}_{style}$ is the style loss [19]. $\lambda_{\ell_1}$, $\lambda_{adv,2}$, $\lambda_{p}$, and $\lambda_{s}$ are regularization parameters.
The adversarial loss $\mathcal{L}_{adv,2}$ is defined as:
$$\mathcal{L}_{adv,2} = \mathbb{E}_{(I,\tilde{Z}_{comp})}\big[\log D_2(I, \tilde{Z}_{comp})\big] + \mathbb{E}_{\tilde{Z}_{comp}}\big[\log\big(1 - D_2(Z_{pred}, \tilde{Z}_{comp})\big)\big]$$
where the composite edge map is $\tilde{Z}_{comp} = E \odot (1-M) + \tilde{Z}_{pred} \odot M$ and the inpainted image is $Z_{pred} = G_2(Z, \tilde{Z}_{comp})$, with $Z$ the grayscale image input to the Content-GAN and $M$ the binary mask of the region to be restored.
The perceptual loss is similar to the feature-matching loss in that it compares activations extracted from intermediate feature layers. Using the perceptual loss avoids requiring the generated content to match the ground truth exactly at the pixel level, as long as the abstract features are the same. It is defined as:
$$\mathcal{L}_{perc} = \mathbb{E}\left[\sum_{i} \frac{1}{N_i}\left\| \phi_i(I) - \phi_i(Z_{pred}) \right\|_1\right]$$
where $\phi_i$ is the activated feature map of the $i$-th layer of the feature extraction network; here we use VGG-19 pre-trained on the ImageNet dataset [20] as the feature extractor.
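A minimal sketch of such a VGG-19 perceptual loss is given below (it requires torchvision ≥ 0.13 for the weights API). The specific layer choice, relu1_1 through relu5_1, is our assumption, since the paper does not list which activations are compared.

```python
# Minimal sketch of a VGG-19 perceptual loss. Layer indices 1, 6, 11, 20, 29 correspond
# to relu1_1 .. relu5_1 in torchvision's VGG-19 feature extractor (assumed layer choice).
import torch
import torch.nn.functional as F
from torchvision.models import vgg19, VGG19_Weights

class PerceptualLoss(torch.nn.Module):
    def __init__(self, layers=(1, 6, 11, 20, 29)):
        super().__init__()
        self.vgg = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features.eval()
        for p in self.vgg.parameters():
            p.requires_grad_(False)      # frozen feature extractor
        self.layers = set(layers)

    def forward(self, pred, target):
        # pred/target are assumed to be 3-channel, ImageNet-normalized tensors;
        # grayscale microscope images can be repeated along the channel dimension.
        loss, x, y = 0.0, pred, target
        for i, layer in enumerate(self.vgg):
            x, y = layer(x), layer(y)
            if i in self.layers:
                loss = loss + F.l1_loss(x, y)   # L1 distance between activated feature maps
        return loss
```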
The style loss penalizes transformations other than intensity changes and reduces distortion of the cell morphology. It is defined as:
$$\mathcal{L}_{style} = \mathbb{E}_{j}\left[\left\| G_j^{\phi}\big(I \odot \bar{M} + Z_{pred}\odot M\big) - G_j^{\phi}(Z) \right\|_1\right]$$
where $G_j^{\phi}$ is the $C_j \times C_j$ Gram matrix constructed from the feature maps $\phi_j$, and $\bar{M} = 1 - M$.
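The Gram-matrix computation behind $G_j^{\phi}$ can be sketched as follows; the feature maps are assumed to come from the same pre-trained network as the perceptual loss.

```python
# Sketch of the Gram-matrix style loss: the Gram matrix of a (C_j, H, W) feature map is a
# C_j x C_j matrix of channel correlations, so comparing Gram matrices penalizes texture
# distortions irrespective of exact pixel positions.
import torch
import torch.nn.functional as F

def gram_matrix(feat):
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return torch.bmm(f, f.transpose(1, 2)) / (c * h * w)   # (b, C_j, C_j), normalized

def style_loss(feats_pred, feats_target):
    # feats_*: lists of feature maps phi_j taken from the pre-trained feature network
    return sum(F.l1_loss(gram_matrix(p), gram_matrix(t))
               for p, t in zip(feats_pred, feats_target))
```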

3. Data and Processing

3.1. Data of Fluorescence Microscope Image

The data used in this study are obtained from the training set of RxRx1 from the NeurIPS 2019 competition track (https://www.kaggle.com/c/recursion-cellular-image-classification/data). This dataset contains fluorescence microscope images of cells collected from well plates in high-throughput screening (HTS).
There are 1,108 different small interfering RNAs (siRNAs) introduced into four cell types to create distinct genetic conditions. The experiment uses a modified Cell Painting staining protocol in which six different stains adhere to different parts of the cell. The stains fluoresce at different wavelengths and are therefore captured in different imaging channels, so there are six channels per imaging site in a well.
Differences between cell types are reflected in the morphological differences of the fluorescence microscope images, and the morphological analysis of cells is usually based on these morphological features. Saturation artifacts, shown in Figure 2, have the most significant influence on these features.

3.2. Training Set Preparation

The data used in this study were divided into two groups (T1 and SET1). Data set T1 is used to train the TC-GAN models, which restore images affected by saturation artifacts. SET1 is used to evaluate the validity of the restored features.
The images in T1 were selected from RxRx1 images that do not contain saturation artifacts and are morphologically rich, which ensures that the trained model can fill the missing regions with rich textures. SET1 includes 5 images without saturation artifacts selected from the original RxRx1 data, 5 masked images with a regularly distributed rectangular mask added to simulate saturation artifacts over 20% of the image area, and the 5 corresponding images restored by TC-GAN.
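The artificial masking of SET1 could be reproduced along the following lines; the exact placement of the rectangular mask is not specified in the paper, so a single centered rectangle covering roughly 20% of the area is used here purely as an example.

```python
# Sketch of generating an artificial saturation-artifact mask: a rectangular region
# covering about 20% of the image is set to the saturated value (255).
import numpy as np

def add_rect_mask(img, area_fraction=0.2, fill_value=255):
    h, w = img.shape[:2]
    mh, mw = int(h * np.sqrt(area_fraction)), int(w * np.sqrt(area_fraction))
    top, left = (h - mh) // 2, (w - mw) // 2      # centered placement (illustrative choice)
    masked = img.copy()
    masked[top:top + mh, left:left + mw] = fill_value   # simulate saturated (lost) pixels
    mask = np.zeros((h, w), dtype=np.uint8)
    mask[top:top + mh, left:left + mw] = 1              # 1 marks the region to be restored
    return masked, mask
```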

4. Training Strategy and Analysis

In this section, we first introduce the progressive training strategy and its ablation results, and then describe the training process and experimental results of TC-GAN. The Peak Signal-to-Noise Ratio (PSNR), the Structural Similarity Index (SSIM), and the Fréchet Inception Distance (FID) are used to evaluate the validity of the restoration results.

4.1. Training Strategy

It is challenging to restore the phenotypic features directly, especially the shape and context features of cells within a large area of saturation artifacts. We therefore use a progressive generation method when training Edge-GAN. Using the results of Edge-GAN, Content-GAN can then restore the apparent texture features.
Specifically, Edge-GAN is first pre-trained on low-resolution images; we then use transfer learning from this pre-training to train Edge-GAN on high-resolution images. The staged restoration of shape and context features is shown in Figure 3; as it shows, the phenotypic features between cells are gradually restored.
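A hedged sketch of this coarse-to-fine schedule is shown below; the resolutions and checkpoint name are illustrative, and `train_fn` stands in for the actual training loop.

```python
# Sketch of the progressive (coarse-to-fine) training schedule for Edge-GAN:
# pre-train on down-sampled images, then transfer the weights to full-resolution training.
import torch

def train_progressively(edge_gan, dataset, train_fn):
    # Stage 1: pre-train on low-resolution images (e.g. 128 x 128 crops)
    train_fn(edge_gan, dataset, image_size=128)
    torch.save(edge_gan.state_dict(), "edge_gan_lowres.pth")

    # Stage 2: transfer learning - reload the pre-trained weights and continue
    # training on high-resolution images (e.g. 512 x 512)
    edge_gan.load_state_dict(torch.load("edge_gan_lowres.pth"))
    train_fn(edge_gan, dataset, image_size=512)
```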
In an ablation experiment, we compared the results obtained with and without progressive restoration. Figure 4 shows the restoration of missing edge information at the 50,000th training step of Edge-GAN.

4.2. Model Training and Results

The TC-GAN restoration models are trained on T1, whose images have rich phenotypic features and no saturation artifacts. The images in the training set, the imaging mask, and the masked edge map constitute the input of TC-GAN, and the final output of TC-GAN is the restored image without saturation artifacts.
For Edge-GAN, the optimizer is Adam [21] with a learning rate of α = 0.0001 and β₁ = 0, β₂ = 0.9, trained for a total of 1,000,000 iterations. Content-GAN uses the same optimizer settings and is trained for 200,000 iterations.
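Expressed in PyTorch, the reported optimizer settings amount to the following; the generator and discriminator objects passed in are assumed to be defined elsewhere.

```python
# The optimizer settings reported above (Adam, lr = 0.0001, beta1 = 0, beta2 = 0.9),
# expressed in PyTorch. The same settings are used for Edge-GAN and Content-GAN.
import torch

def make_optimizers(generator, discriminator, lr=1e-4, betas=(0.0, 0.9)):
    g_opt = torch.optim.Adam(generator.parameters(), lr=lr, betas=betas)
    d_opt = torch.optim.Adam(discriminator.parameters(), lr=lr, betas=betas)
    return g_opt, d_opt
```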
The result of the two-stage restoration is shown in Figure 5. As shown, the restored image fills in the phenotypic features lost in the original saturation-artifact area.

4.3. Evaluation of Validity

4.3.1. Evaluation Indicators

We use several quantitative metrics to verify the effectiveness of the method: the peak signal-to-noise ratio (PSNR), the structural similarity index (SSIM) [22], and the Fréchet Inception Distance (FID) [23] quantitatively evaluate the quality of the generated features.
PSNR evaluates the quality of the generated features compared with the original features. The higher the PSNR, the smaller the distortion of the generated features.
SSIM is an index to measure the similarity of features in two images. The closer the SSIM is to 1, the closer the patched features are to the original cell.
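For reference, PSNR and SSIM can be computed with scikit-image as in the sketch below (assuming 8-bit grayscale images loaded as NumPy arrays); FID requires an Inception-based implementation and is not shown here.

```python
# Example of computing the PSNR and SSIM metrics between an original image and a
# masked or restored image, using scikit-image.
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate_pair(original, candidate):
    # data_range=255 assumes 8-bit grayscale images
    psnr = peak_signal_noise_ratio(original, candidate, data_range=255)
    ssim = structural_similarity(original, candidate, data_range=255)
    return psnr, ssim
```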

4.3.2. Evaluation Methods

By comparing the verification metrics of the mask group (images with artificial artifacts) with those of the restoration group, we verify whether restoration effectively recovers the image information. We calculate the PSNR, SSIM, and FID between the 5 original images without saturation artifacts in SET1 and the 5 artificially masked images to obtain the mask-group data, and between the 5 original images and the 5 restored images to obtain the restoration-group data. The results for the two groups show the image quality before and after restoration and reflect the similarity between the restored area and the original cells.

4.3.3. Result

We use PSNR, SSIM, and FID to evaluate the validity of the generated features in SET1. The differences between the restored and masked images are shown in Table 2. The PSNR and SSIM of the restored images are higher than those of the masked images, which means the restored phenotypic features effectively fill the gaps left by saturation artifacts and bring the restored image closer to the original than the masked image. The FID of the restored images is lower than that of the masked images, which means the restored images are more similar to the original images than the masked images are. Two examples of original, masked, and restored images are shown in Figure 6.

5. Conclusions

This paper introduces the TC-GAN model, a two-stage phenotypic feature restoration approach addressing saturation artifacts in fluorescence microscopy images. The model separately restores the shape and texture features of cells. Through ablation studies and quantitative and qualitative experiments, the effectiveness of the network under progressive training is validated. The results demonstrate the model's practical significance and its potential to enhance qualitative and quantitative analysis of cell fluorescence microscopy images.

References

  1. Caicedo, J.; et al. Data-analysis strategies for image-based cell profiling. Nat. Methods 2017, 14, 849–863. [Google Scholar] [CrossRef]
  2. Bougen-Zhukov, N.; Loh, S.; Hwee-Kuan, L.; Loo, L.-H. Large-scale image-based screening and profiling of cellular phenotypes: Phenotypic Screening and Profiling. Cytom. Part A 2016, 91. [Google Scholar]
  3. Ettinger, A.; Wittmann, T.J.M. Fluorescence live cell imaging. Methods Cell Biol. 2014, 123, 77–94. [Google Scholar]
  4. Smith, K.; et al. CIDRE: An illumination-correction method for optical microscopy. Nat. Methods 2015, 12. [Google Scholar] [CrossRef] [PubMed]
  5. Goswami, S.; Singh, A. A Simple Deep Learning Based Image Illumination Correction Method for Paintings. Pattern Recognit. Lett. 2020, 138. [Google Scholar] [CrossRef]
  6. Wang, Q.; Li, Z.; Zhang, S.; Chi, N.; Dai, Q. A versatile Wavelet-Enhanced CNN-Transformer for improved fluorescence microscopy image restoration. Neural Netw. 2024, 170, 227–241. [Google Scholar] [CrossRef] [PubMed]
  7. Aladeokin, A. ; et al. Network-guided analysis of hippocampal proteome identifies novel proteins that colocalize with Aβ in a mice model of early-stage Alzheimer’s disease. Neurobiol. Dis. 2019, 132, 104603. [Google Scholar] [CrossRef] [PubMed]
  8. Goodfellow, I.; et al. Generative Adversarial Networks. Adv. Neural Inf. Process. Syst. 2014, 3. [Google Scholar] [CrossRef]
  9. Zhang, J.; Huang, X.; Liu, Y.; Han, Y.; Xiang, Z. GAN-based medical image small region forgery detection via a two-stage cascade framework. PLoS ONE 2024, 19, e0290303. [Google Scholar] [CrossRef] [PubMed]
  10. Chen, X.; Zhang, C.; Zhao, J.; Xiong, Z.; Zha, Z.J.; Wu, F. Weakly Supervised Neuron Reconstruction From Optical Microscopy Images With Morphological Priors. IEEE Trans. Med. Imaging 2021, 40, 3205–3216. [Google Scholar] [CrossRef] [PubMed]
  11. Belthangady, C.; Royer, L. Applications, Promises, and Pitfalls of Deep Learning for Fluorescence Image Reconstruction. Nat. Methods 2018, 16, 1215–1225. [Google Scholar] [CrossRef] [PubMed]
  12. Nazeri, K.; Ng, E.; Joseph, T.; Qureshi, F.; Ebrahimi, M. EdgeConnect: Generative Image Inpainting with Adversarial Edge Learning. arXiv preprint, 2019.
  13. Yu, J.; Lin, Z.; Yang, J.; Shen, X.; Lu, X. Generative Image Inpainting with Contextual Attention. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018; pp. 5505–5514.
  14. Boutros, M.; Heigwer, F.; Laufer, C. Microscopy-Based High-Content Screening. Cell 2015, 163. [Google Scholar] [CrossRef] [PubMed]
  15. Haralick, R.M.; Shanmugam, K.; Dinstein, I. Textural Features for Image Classification. IEEE Trans. Syst. Man Cybern. 1973. SMC-3. [Google Scholar] [CrossRef]
  16. Isola, P.; Zhu, J.-Y.; Zhou, T.; Efros, A.A. Image-to-Image Translation with Conditional Adversarial Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017; pp. 5967–5976.
  17. Wang, T.-C.; Liu, M.-Y.; Zhu, J.-Y.; Tao, A.; Kautz, J.; Catanzaro, B. High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018; pp. 8798–8807.
  18. Johnson, J.; Alahi, A.; Li, F. Perceptual Losses for Real-Time Style Transfer and Super-Resolution. European Conference on Computer Vision, 2016.
  19. Gatys, L.A.; Ecker, A.S.; Bethge, M. Image Style Transfer Using Convolutional Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016; pp. 2414–2423.
  20. Russakovsky, O.; et al. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision 2014, 115. [Google Scholar]
  21. Kingma, D.; Ba, J. Adam: A Method for Stochastic Optimization. Int. Conf. Learn. Represent. 2014.
  22. Elharrouss, O.; Almaadeed, N.; Al-Maadeed, S.; Akbari, Y. Image Inpainting: A Review. Neural Process. Lett. 2020, 51, 2007–2028. [CrossRef]
  23. Bynagari, N. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. Asian J. Appl. Sci. Eng. 2019, 8. [Google Scholar] [CrossRef]
Figure 1. The modules of the proposed TC-GAN networks. $G_1$ and $D_1$ compose Edge-GAN; $G_2$ and $D_2$ compose Content-GAN.
Figure 2. Examples of (a) normal images and (b) problematic images with saturated pixels, with their histograms. (a) shows images with a normal grayscale value distribution; the red arrow in (b) shows a large number of saturated pixels caused by saturation artifacts. Images with saturated pixels have more pixels with a value of 255, which means losing a lot of information.
Figure 3. The result of progressive restoration. From left to right: the shape features extracted from the image when saturation artifacts are present, the shape features restored in the first stage, and the shape features restored in the second stage.
Figure 4. The results of the progressive-restoration ablation experiment. (a) is the training result of direct restoration on high-resolution images; (b) is the training result obtained by gradually training from low-resolution to high-resolution images.
Figure 5. Two examples of images with saturation artifacts restored by TC-GAN.
Figure 6. Two examples of (a) original images, (b) masked images, and (c) restored images from SET1. The masked images in (b) lose some of the original morphological features in (a); these missing features are restored in the (c) restored images.
Table 1. The structure of $G_1$ and $D_1$.
        Architecture            Input        Filter               Channel/Stride/Padding   Act             Output
G_1     Encoder                 X            Conv 7×7             64/1/0                   S/I/ReLU
                                             Conv 4×4             128/2/1                  S/I/ReLU
                                             Conv 4×4             256/2/1                  S/I/ReLU        Encoder X_1
        Contextual attention    X            Conv 5×5             32/1/2                   ELU
                                             Conv 3×3             32/2/1                   ELU
                                             Conv 3×3             64/1/1                   ELU
                                             Conv 3×3             128/2/1                  ELU
                                             Conv 3×3             128/1/1                  ELU
                                             Conv 3×3             128/1/1                  ReLU
                                             Contextual attention layer
                                             Conv 3×3             128/1/1                  ELU
                                             Conv 3×3             128/1/1                  ELU             Feature X_2
        ResNet (block ×8)       X_1 + X_2    Conv 3×3             384/1/0                  S/I/ReLU
                                             Conv 3×3             384/1/0                  S/I             Feature X_3
        Decoder                 X_3          TransposeConv 4×4    128/2/1                  S/I/ReLU
                                             TransposeConv 4×4    64/2/1                   S/I/ReLU
                                             Conv 7×7             1/1/0                    Sigmoid         Y
D_1     Encoder                 Y            Conv 4×4             64/2/1                   S/LReLU
                                             Conv 4×4             128/2/1                  S/LReLU
                                             Conv 4×4             256/2/1                  S/LReLU
                                             Conv 4×4             512/1/1                  S/LReLU
                                             Conv 4×4             1/1/1                    LReLU/Sigmoid   1×32×32
Conv = convolution filter, S = spectral normalization, I = instance normalization, LReLU = LeakyReLU. X is the input of $G_1$, which consists of three channels: the original grayscale image, the imaging mask, and the masked edge image. $X_1$, $X_2$, and $X_3$ are the feature maps calculated by the intermediate layers. Y is the input of $D_1$, which is the output image of $G_1$. The structure of $G_2$ is almost the same as $G_1$, except that the ResNet of $G_2$ has 4 blocks instead of 8 and the loss function of $G_2$ differs from that of $G_1$. The structure of $D_2$ is the same as $D_1$.
Table 2. The evaluation metrics of the masked and restored images.
Metric                    Image 1    Image 2    Image 3    Image 4    Image 5
PSNR   masked image         9.077      8.952      8.860      8.414     10.201
       restored image      25.137     24.502     24.565     29.028     26.508
SSIM   masked image         0.726      0.726      0.726      0.706      0.739
       restored image       0.860      0.853      0.858      0.839      0.858
FID    masked image       645.929    612.348    716.363    467.555    603.574
       restored image      36.397     49.163     46.545     78.578     41.041
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.