Consensus Genotyper for Exome Sequencing (CGES, pronounced “sea-guess”) is a tool designed to increase the fidelity of genotypes identified by exome sequencing. CGES uses a machine learning ensemble approach that combines the output of multiple models to dramatically improve classifier performance. It uses a two-stage voting scheme among four algorithm implementations. While our ensemble method can accept variants generated by any variant-calling algorithm, we used GATK2.8, SAMtools, FreeBayes and Atlas-SNP2 in building CGES because of their performance, widespread adoption and diverse but complementary algorithms. You can read more about CGES and its uses in this article from Science Life.
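To make the idea of a two-stage vote concrete, here is a minimal Python sketch. It is not the CGES implementation. This page does not say what each stage votes on, so the sketch assumes that stage 1 votes on which variant sites to keep and stage 2 votes on each sample's genotype at the kept sites. The function name, the nested-dictionary input layout and both thresholds are hypothetical and chosen only for illustration.

```python
from collections import Counter


def consensus_genotypes(calls, min_site_votes, min_genotype_votes):
    """Illustrative two-stage vote (assumed scheme, not the CGES source code).

    calls: {caller_name: {site: {sample: genotype}}}
    Returns {site: {sample: genotype or None}}, where None marks a genotype
    that did not reach the agreement threshold.
    """
    # Stage 1 (assumed): keep a site only if enough callers report it.
    site_votes = Counter(
        site for caller_calls in calls.values() for site in caller_calls
    )
    kept_sites = [s for s, votes in site_votes.items() if votes >= min_site_votes]

    consensus = {}
    for site in kept_sites:
        # Collect every sample that any caller genotyped at this site.
        samples = set()
        for caller_calls in calls.values():
            samples.update(caller_calls.get(site, {}))

        consensus[site] = {}
        for sample in sorted(samples):
            # Stage 2 (assumed): vote on the genotype among callers that made a call.
            votes = Counter(
                caller_calls[site][sample]
                for caller_calls in calls.values()
                if site in caller_calls and sample in caller_calls[site]
            )
            genotype, count = votes.most_common(1)[0]
            # Genotypes without enough agreement are set to missing.
            consensus[site][sample] = (
                genotype if count >= min_genotype_votes else None
            )
    return consensus


# Toy input: four callers, two sites, two samples (made-up values).
example_calls = {
    "GATK": {"chr1:100": {"A": "0/1", "B": "0/1"}, "chr1:200": {"A": "1/1"}},
    "SAMtools": {"chr1:100": {"A": "0/1", "B": "0/1"}, "chr1:200": {"A": "1/1"}},
    "FreeBayes": {"chr1:100": {"A": "0/1", "B": "0/1"}},
    "Atlas-SNP2": {"chr1:100": {"A": "0/1", "B": "1/1"}},
}
result = consensus_genotypes(example_calls, min_site_votes=3, min_genotype_votes=3)
```

With these toy thresholds (3 of 4 callers at both stages), the site reported by only two callers is dropped and sample B keeps the majority genotype. Raising `min_genotype_votes` to 4 would set B to missing instead. A threshold above half the number of callers also avoids ties in the genotype vote.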

Our partners in development at Globus Genomics have also made CGES available on Galaxy. For supported use, please contact Ravi Madduri.

CGES tool repository: Public Github repository containing source code for CGES and its Galaxy wrapper

CGES-QC tool repository: Public Github repository containing source code for CGES-QC and its Galaxy wrapper

Trubetskoy V, Rodriguez A, Dave U, Campbell N, Crawford EL, Cook EH, Sutcliffe JS, Foster I, Madduri R, Cox NJ, Davis LK. Consensus Genotyper for Exome Sequencing (CGES): improving the quality of exome variant genotypes. Bioinformatics. 2014;btu591.


Copy number integrated GWAS (cni-GWAS) is a method developed by Dr. Davis and Eric Gamazon to integrate SNP allelic content and copy number dosage in a single model and estimate their joint effects on phenotype. In contrast to the traditional eQTL mapping approach, which assumes diploidy at each candidate eQTL SNP or assumes no SNPs at a CNV locus, we assume that CNVs and SNPs may co-localize (genome-wide). We thus fit the following regression model:

Y = b1·C + b2·S + Xb + e

where Y is a gene expression trait, C is the CNV genotype, S is the SNP genotype, b1 is the CNV genotype effect, b2 is the SNP genotype effect, Xb is the effect of nongenotype covariates (e.g., age, sex, or principal components), and e is the residual. The residuals e are assumed to be independently and identically (normally) distributed. Note that, in the absence of a CNV, the model reduces to the simple model that tests only for the presence of a SNP effect. Furthermore, in the absence of a SNP at a CNV locus, the model reduces to a regression that tests for the presence of a CNV eQTL effect. Thus, this approach contains the traditional single variant approaches as special cases.
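As an illustration of the joint model described above, the following Python sketch fits Y on C, S and covariates by ordinary least squares with NumPy. It is a sketch under stated assumptions, not the authors' software. The function name `fit_cni_gwas`, the returned dictionary keys and the simulated data are hypothetical, and the intercept column is an assumption (treated here as part of the non-genotype term Xb).

```python
import numpy as np


def fit_cni_gwas(y, cnv, snp, covariates=None):
    """Least-squares fit of Y on CNV genotype, SNP genotype and covariates.

    Hypothetical helper for illustration only.
    y, cnv, snp: 1-D arrays of length n; covariates: optional (n,) or (n, k) array.
    """
    y = np.asarray(y, dtype=float)
    n = y.shape[0]
    # Design matrix columns: intercept (assumed), C, S, then covariates.
    columns = [
        np.ones(n),
        np.asarray(cnv, dtype=float),
        np.asarray(snp, dtype=float),
    ]
    if covariates is not None:
        cov = np.asarray(covariates, dtype=float).reshape(n, -1)
        columns.extend(cov.T)  # one design column per covariate
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return {
        "intercept": coef[0],
        "b1_cnv": coef[1],          # CNV genotype effect
        "b2_snp": coef[2],          # SNP genotype effect
        "covariate_effects": coef[3:],
    }


# Simulated data (made-up effect sizes) to show the call pattern.
rng = np.random.default_rng(0)
n = 500
cnv = rng.integers(1, 4, n)       # copy number at the locus
snp = rng.integers(0, 3, n)       # SNP genotype coded as allele count
age = rng.normal(50, 10, n)       # one non-genotype covariate
expression = 0.8 * cnv + 0.5 * snp + 0.01 * age + rng.normal(0, 0.1, n)
fit = fit_cni_gwas(expression, cnv, snp, covariates=age)
```

Here `fit["b1_cnv"]` and `fit["b2_snp"]` are the estimates of b1 and b2. Dropping the CNV column (or the SNP column) from the design matrix gives the single-variant special cases mentioned above.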

Gamazon ER, Cox NJ, Davis LK. Structural Architecture of SNP Effects on Complex Traits. American Journal of Human Genetics. 2014 Oct 7.

For more information about the method and paper, check out this write-up in the AJHG Editors’ Corner!