PLINK: Whole genome data analysis toolset
Latest PLINK release is v1.02 (27-Mar-2008)

Whole genome association analysis toolset


1. Introduction
2. Basic information
3. Download and general notes
4. Command reference table
5. Basic usage/data formats
6. Data management
7. Summary stats
8. Inclusion thresholds
9. Population stratification
10. IBS/IBD estimation
11. Association
12. Family-based association
13. Permutation procedures
14. Multimarker tests
15. Conditional haplotype tests
16. Proxy association
17. Full imputation (beta)
18. LD-based results clumping
19. Epistasis
20. Copy Number Variation
21. R-plugins
22. SNP annotation lookup
23. Simulation tools
24. Profile scoring
25. Resources
26. Miscellaneous
27. FAQ & Hints
28. gPLINK
 

PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.

The focus of PLINK is purely on analysis of genotype/phenotype data, so there is no support for steps prior to this (e.g. study design and planning, generating genotype calls from raw data). Through integration with gPLINK and Haploview, there is some support for the subsequent visualization, annotation and storage of results.

PLINK (one syllable) is being developed by Shaun Purcell at the Center for Human Genetic Research (CHGR), Massachusetts General Hospital (MGH), and the Broad Institute of Harvard & MIT, with the support of others.  

 

Data management

  • Read data in a variety of formats
  • Recode and reorder files
  • Merge two or more files
  • Extract subsets (SNPs or individuals)
  • Flip strand of SNPs
  • Compress data in a binary file format
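
To give a sense of why a binary genotype format saves space, here is a minimal Python sketch that packs four genotypes into each byte using two bits apiece. The code table and the function names are assumptions made for this illustration only; they are not taken from PLINK and should not be read as a description of its actual file layout.

```python
# Illustrative 2-bit codes. These values are an assumption for this sketch,
# not PLINK's documented on-disk encoding.
CODES = {0: 0b00, 1: 0b01, 2: 0b10, None: 0b11}
DECODE = {code: genotype for genotype, code in CODES.items()}


def pack_genotypes(genotypes):
    """Pack genotypes (0, 1 or 2 allele copies, None = missing), four per byte."""
    packed = bytearray((len(genotypes) + 3) // 4)
    for i, g in enumerate(genotypes):
        # Each genotype occupies two bits; the position within the byte is i % 4.
        packed[i // 4] |= CODES[g] << (2 * (i % 4))
    return bytes(packed)


def unpack_genotypes(packed, n):
    """Recover the first n genotypes from a packed byte string."""
    return [DECODE[(packed[i // 4] >> (2 * (i % 4))) & 0b11] for i in range(n)]


example = [0, 1, 2, None, 2]
packed_example = pack_genotypes(example)
restored_example = unpack_genotypes(packed_example, len(example))
```

Five genotypes fit in two bytes here, whereas a text format would typically spend several characters on each genotype.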

Summary statistics for quality control

  • Allele and genotype frequencies, HWE tests
  • Missing genotype rates
  • Inbreeding, IBS and IBD statistics for individuals and pairs of individuals
  • Non-Mendelian transmission in family data
  • Sex checks based on X chromosome SNPs
  • Tests of non-random genotyping failure
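
As an illustration of the kind of per-SNP summary statistics listed above, the following Python sketch computes an allele frequency, a missing genotype rate and a one-degree-of-freedom chi-square test of Hardy-Weinberg equilibrium from genotype counts. It is a textbook calculation under stated assumptions, not necessarily the test PLINK applies (an exact test may be used instead), and the function names are hypothetical.

```python
import math


def missing_rate(n_called, n_missing):
    """Fraction of genotypes that failed to call for one SNP."""
    return n_missing / (n_called + n_missing)


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Return (frequency of allele A, chi-square statistic, asymptotic p-value)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, chi2, p_value


freq_a, chi2_hwe, p_hwe = hwe_chi_square(50, 0, 50)
```

A sample with no heterozygotes, as in the last line, departs strongly from Hardy-Weinberg proportions and yields a very small p-value.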

Population stratification detection

  • Complete linkage hierarchical clustering
  • Handles virtually unlimited numbers of SNPs
  • Multidimensional scaling analysis to visualise substructure
  • Significance test for whether two individuals belong to the same population
  • Constrain cluster solution by phenotype, cluster size and/or external matching criteria
  • Perform subsequent association analyses conditional on cluster solution
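
The clustering approach named above can be sketched in a few lines of Python: a pairwise identity-by-state (IBS) distance followed by naive complete linkage agglomerative clustering. This is a simplified illustration that assumes genotypes are coded as 0, 1 or 2 copies of an allele (None for missing). The function names are hypothetical, and the sketch omits the significance test and the clustering constraints that PLINK offers.

```python
def ibs_distance(g1, g2):
    """1 minus the proportion of alleles shared IBS over SNPs called in both."""
    shared = 0
    n_snps = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip SNPs missing in either individual
        shared += 2 - abs(a - b)  # 2, 1 or 0 alleles shared at this SNP
        n_snps += 1
    if n_snps == 0:
        raise ValueError("no SNPs genotyped in both individuals")
    return 1.0 - shared / (2.0 * n_snps)


def complete_linkage(genotypes, k):
    """Merge individuals into k clusters using complete linkage on IBS distance."""
    n = len(genotypes)
    dist = {(i, j): ibs_distance(genotypes[i], genotypes[j])
            for i in range(n) for j in range(i + 1, n)}

    def linkage(c1, c2):
        # Complete linkage: distance between the two most dissimilar members.
        return max(dist[(min(i, j), max(i, j))] for i in c1 for j in c2)

    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        _, a, b = min((linkage(clusters[a], clusters[b]), a, b)
                      for a in range(len(clusters))
                      for b in range(a + 1, len(clusters)))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


sample = [[0, 0, 0, 0], [0, 0, 0, 1], [2, 2, 2, 2], [2, 2, 1, 2]]
groups = complete_linkage(sample, 2)
```

This naive version recomputes every linkage at each merge, so it is only suitable for small examples.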

Basic association testing

  • Case/control
    • Standard allelic test
    • Fisher's exact test
    • Cochran-Armitage trend test
    • Mantel-Haenszel and Breslow-Day tests for stratified samples
    • Dominant/recessive and general models
    • Model comparison tests (e.g. general versus multiplicative)
  • Family-based association (TDT, sibship tests)
  • Quantitative traits, association and interaction
  • Association conditional on one or more SNPs
  • Asymptotic and empirical p-values
  • Flexible clustered permutation scheme
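
For readers unfamiliar with the standard allelic test, here is a minimal Python sketch of the usual 2x2 chi-square test on allele counts in cases and controls, together with the allelic odds ratio. It is a generic textbook calculation offered as an illustration; the function name and its return layout are assumptions, not PLINK's interface or output format.

```python
import math


def allelic_test(case_a, case_b, ctrl_a, ctrl_b):
    """Chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (chi-square statistic, asymptotic p-value, odds ratio).
    """
    n = case_a + case_b + ctrl_a + ctrl_b
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    denom = row_case * row_ctrl * col_a * col_b
    if denom == 0:
        raise ValueError("a row or column total is zero; test is undefined")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / denom
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    if case_b * ctrl_a == 0:
        odds_ratio = math.inf
    else:
        odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    return chi2, p_value, odds_ratio


# 50 cases and 50 controls, i.e. 100 alleles in each group.
chi2_assoc, p_assoc, or_assoc = allelic_test(60, 40, 40, 60)
```

An empirical p-value would instead be obtained by recomputing the statistic after repeatedly shuffling case/control labels, which this sketch does not attempt.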

Multimarker predictors, haplotypic tests

  • Impute a new dataset containing specific imputed haplotypes
  • Case/control and TDT association on the posterior distribution of haplotypes given genotypes
  • Suite of flexible, conditional haplotype tests
  • A set of "proxy association" methods to study single SNP associations in their local haplotypic context
  • Extension to full SNP imputation methods, to test untyped SNPs given a reference panel

Additional tests

  • Gene-based tests of association
    • Hotelling's T² test
    • Cumulative rank-order sum statistics
  • Screen for epistasis
  • Gene-environment interaction with continuous and dichotomous environments

Additional features

  • Extensible via R function plug-ins
  • Web-based SNP and gene annotation lookup feature
  • LD-based grouping of results across multiple studies
  • Simple SNP simulation feature
  • See the main documentation for a full list of features
 

This document last modified Thursday, 27-Mar-2008 23:50:00 EDT