Laboratory for Genetic Epidemiology


http://www.genepi.org.au

SimHap: A comprehensive modelling framework and a multiple-imputation approach to haplotypic analysis of population-based data

Pamela A McCaskie, Kim W Carter, Lyle J Palmer

SimHap v1.0.0 is now available for download!

Figure: Front screen of SimHap

Uncertainty is inherent when inferring haplotypes for individuals with ambiguous phase (e.g. phase-unknown genotype data). SimHap uses biallelic SNP genotype data to impute haplotype frequencies at the individual level. SimHap also tests for haplotype associations with outcomes of interest while incorporating the uncertainty around inferred haplotypes into the modelling procedure.

SimHap allows both single-SNP and haplotype association analyses of Normal, binary, longitudinal and right-censored outcomes under a range of genetic models. SimHap can accommodate large data sets, and can model genetic and environmental effects, including complex haplotype:environment interactions. SimHap features cross-platform functionality via Java and a sophisticated graphical user interface (GUI), so you do not need a comprehensive knowledge of statistical modelling or command line operation to perform complex analyses. SimHap uses current expectation-maximisation based methods to estimate haplotypes from unphased genotype data [1] and incorporates multiple-imputation techniques to model haplotypic associations in population-based samples.
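The expectation-maximisation idea behind haplotype estimation [1] can be illustrated with a minimal Python sketch. This is not SimHap's code: the function names, the coding of genotypes as minor-allele counts (0, 1 or 2 per SNP) and the assumption of Hardy-Weinberg equilibrium are choices made here for illustration only.

```python
from itertools import product


def compatible_pairs(genotype):
    """Return the unordered haplotype pairs consistent with an unphased genotype.

    genotype: tuple of minor-allele counts (0, 1 or 2), one per biallelic SNP.
    """
    # Per-SNP choices for (allele on haplotype 1, allele on haplotype 2).
    per_snp = {0: [(0, 0)], 1: [(0, 1), (1, 0)], 2: [(1, 1)]}
    pairs = set()
    for combo in product(*(per_snp[g] for g in genotype)):
        h1 = tuple(a for a, _ in combo)
        h2 = tuple(b for _, b in combo)
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=200, tol=1e-10):
    """Estimate population haplotype frequencies by EM (hypothetical helper)."""
    n_snps = len(genotypes[0])
    haplotypes = list(product((0, 1), repeat=n_snps))
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    configs = [compatible_pairs(g) for g in genotypes]
    for _ in range(n_iter):
        counts = dict.fromkeys(haplotypes, 0.0)
        for pairs in configs:
            # E-step: posterior probability of each compatible pair,
            # assuming Hardy-Weinberg equilibrium.
            weights = [(1 if h1 == h2 else 2) * freq[h1] * freq[h2]
                       for h1, h2 in pairs]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts become the new frequencies.
        new_freq = {h: c / (2 * len(genotypes)) for h, c in counts.items()}
        change = max(abs(new_freq[h] - freq[h]) for h in haplotypes)
        freq = new_freq
        if change < tol:
            break
    return freq


if __name__ == "__main__":
    # Two SNPs; only the double heterozygotes (1, 1) have ambiguous phase.
    sample = [(0, 0)] * 4 + [(2, 2)] * 4 + [(1, 1)] * 2
    for hap, f in em_haplotype_frequencies(sample).items():
        print(hap, round(f, 4))
```

Enumerating every haplotype, as done here, is only practical for a handful of SNPs; the sketch favours readability over scale.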

SimHap will also perform association analysis on simpler epidemiological models, with or without the inclusion of single SNPs or haplotypes. The current implementation of SimHap uses a package written for the statistical computing environment R [2] to resolve haplotypes and provide their posterior probabilities. All possible haplotype configurations are resolved for each individual within the program itself, and the posterior probability of each configuration is calculated. This information is then passed into a generalised linear modelling, linear mixed-effects or Cox proportional hazards framework, where association tests are performed using multiple imputation to account for the uncertainty around the inferred haplotypes.
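As a rough illustration of the multiple-imputation step, the Python sketch below draws one haplotype configuration per individual from its posterior probabilities, fits a model to each imputed data set and pools the results. It is a sketch under stated assumptions rather than SimHap's implementation: a simple least-squares regression on the number of copies of one haplotype stands in for the modelling frameworks named above, pooling by Rubin's rules is assumed, and all function names, haplotype labels and data are hypothetical.

```python
import random
import statistics


def simple_linear_fit(x, y):
    """Least-squares slope of y on x and the slope's sampling variance."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return slope, rss / (n - 2) / sxx


def impute_and_pool(configs, outcome, target, m=20, seed=1):
    """Pool the haplotype effect over m imputed data sets (hypothetical helper).

    configs: one (pairs, probs) entry per individual, where pairs lists the
             possible haplotype pairs and probs their posterior probabilities.
    target:  the haplotype whose additive effect on the outcome is tested.
    """
    rng = random.Random(seed)
    estimates, variances = [], []
    for _ in range(m):
        # Impute: draw one configuration per individual from its posterior.
        copies = []
        for pairs, probs in configs:
            pair = rng.choices(pairs, weights=probs, k=1)[0]
            copies.append(sum(1 for h in pair if h == target))
        est, var = simple_linear_fit(copies, outcome)
        estimates.append(est)
        variances.append(var)
    # Pool (Rubin's rules): within- plus inflated between-imputation variance.
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)
    between = statistics.variance(estimates) if m > 1 else 0.0
    return pooled, within + (1 + 1 / m) * between


if __name__ == "__main__":
    # Made-up data: the last individual has two equally likely configurations.
    configs = [
        ([("AG", "AG")], [1.0]), ([("AG", "CT")], [1.0]),
        ([("CT", "CT")], [1.0]), ([("AG", "AG")], [1.0]),
        ([("AG", "CT")], [1.0]), ([("CT", "CT")], [1.0]),
        ([("AG", "CT"), ("AT", "CG")], [0.5, 0.5]),
    ]
    outcome = [1.0, 3.2, 4.9, 1.1, 2.9, 5.1, 3.0]
    effect, variance = impute_and_pool(configs, outcome, target="CT")
    print(effect, variance)
```

The between-imputation term is what carries the phase uncertainty into the final standard error: when every individual has a single configuration, it vanishes and the pooled result equals an ordinary fit.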

1. Excoffier L, Slatkin M. Maximum-likelihood estimation of molecular haplotype frequencies in a diploid population. Mol Biol Evol 1995; 12: 921-7.

2. Ihaka R, Gentleman R. R: A Language for Data Analysis and Graphics. Journal of Computational and Graphical Statistics 1996; 5(3): 299-314.

Version History

07/09/2007  SimHap v1.0.0 release - Release Notes
20/06/2007  SimHap Beta 2.1 download temporarily removed pending SimHap v1.0 release
21/03/2006  SimHap Beta 2.1 release - Release Notes
15/09/2005  SimHap Beta 2.0 release
05/04/2005  SimHap Beta 1.1 release
06/03/2005  SimHap Beta 1.0 release

Contact

For further information, please contact:

Dr Pamela McCaskie
simhap@genepi.org.au