In this study, we explore whether question paraphrasing improves the accuracy of modern question answering systems. We first analyze how Bidirectional Encoder Representations from Transformers (BERT), a widely used pre-trained language model, behaves when presented with paraphrased questions and with documents containing distracting details. We find that BERT is overly sensitive to such inputs, which prompts us to investigate strategies for mitigating this sensitivity. To address the observed drop in performance, we propose a fine-tuning approach for BERT that combines data augmentation with multitask learning. Specifically, given a question and its context, we generate a set of paraphrases through back-translation. In addition to minimizing the supervised loss on the original questions, we also minimize the supervised loss on the augmented questions. Furthermore, we introduce an auxiliary objective that minimizes an unsupervised consistency loss between the answer distributions predicted for the original and augmented questions. This auxiliary loss is unsupervised because it does not rely on labels for the augmented examples. We compute it using the symmetric Kullback-Leibler divergence and the Jensen-Shannon distance, which act as regularizers for the model. In our experiments, the purely supervised model outperformed both the original model and our proposed model on the standard development set. However, our proposed model achieved a nearly 2% improvement in both Exact Match (EM) and F1 scores on a paraphrased development set that we constructed. This suggests that the proposed approach can improve model robustness and accuracy on paraphrased inputs.
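The auxiliary consistency loss above can be sketched as follows. This is an illustrative NumPy implementation of the two divergence measures applied to toy answer distributions; the arrays `p` and `q` are hypothetical answer-start distributions for an original question and its paraphrase, standing in for BERT's actual output distributions.

```python
import numpy as np

def sym_kl(p, q, eps=1e-12):
    """Symmetric Kullback-Leibler divergence: KL(p||q) + KL(q||p)."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def js_distance(p, q, eps=1e-12):
    """Jensen-Shannon distance: the square root of the JS divergence,
    computed against the midpoint distribution m = (p + q) / 2."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    m = 0.5 * (p + q)
    js_div = 0.5 * np.sum(p * np.log(p / m)) + 0.5 * np.sum(q * np.log(q / m))
    return float(np.sqrt(max(js_div, 0.0)))

# Toy answer-start distributions: original question vs. its paraphrase.
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.6, 0.3, 0.1])

# Either measure can serve as the unsupervised consistency term; the total
# training objective would add it to the supervised losses with some weight.
consistency_loss = sym_kl(p, q)
```

Both measures are zero when the two distributions agree and grow as they diverge, so minimizing them pushes the model toward consistent predictions across paraphrases without needing labels for the augmented examples.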