Zhu, L.; Yan, S.; Cao, X.; Zhang, S.; Sha, Q. Integrating External Controls by Regression Calibration for Genome-Wide Association Study. Genes 2024, 15, 67.
Abstract
Genome-wide association studies (GWAS) have successfully revealed many disease-associated genetic variants. In a case-control study, adequate power for an association test can be achieved with a large sample size, but genotyping large samples is expensive. A cost-effective strategy to boost power is to integrate external control samples with publicly available genotype data. However, naïve integration of external controls may inflate type I error rates if systematic differences between studies (batch effects) are ignored, such as differences in sequencing platforms, genotype calling procedures, and population stratification. To account for batch effects, we propose iECAT-RC, an approach that integrates External Controls into the Association Test by Regression Calibration in case-control association studies. Extensive simulation studies show that iECAT-RC not only controls type I error rates but also boosts statistical power in all models. We also apply iECAT-RC to UK Biobank data for M72 Fibroblastic disorders, treating genotype calling as the batch effect. Four SNPs associated with Fibroblastic disorders are detected by iECAT-RC and by the two comparison methods. However, our method has a higher probability of identifying these significant SNPs when the case-control study is unbalanced.
Keywords
Genome-wide association test; case-control study; batch effect; data integration
Subject
Computer Science and Mathematics, Probability and Statistics
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.