We present a method to improve the reconstruction and generation performance of variational autoencoders (VAEs) by introducing adversarial learning. In addition, instead of comparing the reconstructed data with the original data to calculate the reconstruction loss, we use a consistency principle for deep features. The training process of the VAE is then divided into two steps: training the encoder, and then training the decoder. This two-step learning process allows our method to be used more widely in applications other than image processing. While training the encoder, label information is integrated to better structure the latent space in a supervised way. The adversarial constraints allow the decoder to generate data that is more authentic and realistic than that of the conventional VAE. We present experimental results showing that our method performs better than the original VAE.
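
As a rough illustration of the ideas summarized above, and not the exact objective used in this work, the conventional VAE loss and one possible feature-consistency and adversarial variant can be sketched as follows. The notation is an assumption introduced here for illustration: $q_\phi$ is the encoder, $p_\theta$ the decoder, $\hat{x}$ the reconstruction, $f$ a deep feature extractor, $D$ a discriminator, and $\lambda$ a weighting factor.

```latex
% Conventional VAE objective (negative ELBO), to be minimized:
\mathcal{L}_{\mathrm{VAE}}(x)
  = -\,\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
    + \mathrm{KL}\!\left(q_\phi(z \mid x)\,\|\,p(z)\right)

% Assumed form of a deep-feature consistency term: the reconstruction
% is compared with the input in feature space rather than data space.
\mathcal{L}_{\mathrm{feat}}(x)
  = \left\| f(x) - f(\hat{x}) \right\|_2^2

% Assumed form of a decoder objective with an adversarial term
% (non-saturating generator loss), weighted by \lambda:
\mathcal{L}_{\mathrm{dec}}(x)
  = \mathcal{L}_{\mathrm{feat}}(x) - \lambda \,\log D(\hat{x})
```

Under this reading, the first step would fit the encoder (with label supervision on the latent space) and the second step would fit the decoder against a loss of the last form; the precise terms and their weighting are left to the body of the paper.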