PLINK: Whole genome data analysis toolset
Latest PLINK release is v1.07 (10-Oct-2009)

Whole genome association analysis toolset


1. Introduction
2. Basic information
3. Download and general notes
4. Command reference table
5. Basic usage/data formats
6. Data management
7. Summary stats
8. Inclusion thresholds
9. Population stratification
10. IBS/IBD estimation
11. Association
12. Family-based association
13. Permutation procedures
14. LD calculations
15. Multimarker tests
16. Conditional haplotype tests
17. Proxy association
18. Imputation (beta)
19. Dosage data
20. Meta-analysis
21. Annotation
22. LD-based results clumping
23. Gene-based report
24. Epistasis
25. Rare CNVs
26. Common CNPs
27. R-plugins
28. Annotation web-lookup
29. Simulation tools
30. Profile scoring
31. ID helper
32. Resources
33. Flow-chart
34. Miscellaneous
35. FAQ & Hints
36. gPLINK
 

PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner.

The focus of PLINK is purely on analysis of genotype/phenotype data, so there is no support for steps prior to this (e.g. study design and planning, generating genotype or CNV calls from raw data). Through integration with gPLINK and Haploview, there is some support for the subsequent visualization, annotation and storage of results.

PLINK (one syllable) is being developed by Shaun Purcell at the Center for Human Genetic Research (CHGR), Massachusetts General Hospital (MGH), and the Broad Institute of Harvard & MIT, with the support of others.  

 

New in 1.07: meta-analysis, result annotation and analysis of dosage data.  

Data management

  • Read data in a variety of formats
  • Recode and reorder files
  • Merge two or more files
  • Extract subsets (SNPs or individuals)
  • Flip strand of SNPs
  • Compress data in a binary file format

Summary statistics for quality control

  • Allele and genotype frequencies, HWE tests
  • Missing genotype rates
  • Inbreeding, IBS and IBD statistics for individuals and pairs of individuals
  • Non-Mendelian transmission in family data
  • Sex checks based on X chromosome SNPs
  • Tests of non-random genotyping failure
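
As an illustration of the first item above, the following is a minimal sketch of how an allele frequency and a Hardy-Weinberg equilibrium (HWE) statistic can be derived from genotype counts at one biallelic SNP. It uses the textbook chi-square approximation with 1 degree of freedom; this is an assumption made for illustration and is not claimed to be the method PLINK itself applies. The function name and the example counts are hypothetical.

```python
import math

def allele_freq_and_hwe(n_aa, n_ab, n_bb):
    """Return (frequency of allele A, HWE chi-square, p-value with 1 df).

    Illustrative only: textbook chi-square approximation,
    not necessarily the test PLINK reports.
    """
    n = n_aa + n_ab + n_bb                 # number of genotyped individuals
    p = (2 * n_aa + n_ab) / (2 * n)        # frequency of allele A
    q = 1.0 - p                            # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)  # HWE expectations
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, chi2, p_value

# Hypothetical counts: 30 AA, 40 AB, 30 BB
freq_a, hwe_chi2, hwe_p = allele_freq_and_hwe(30, 40, 30)
```

With these hypothetical counts the allele frequency is 0.5, the expected genotype counts are 25, 50 and 25, and the statistic is 4, so the example shows a modest deficit of heterozygotes.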

Population stratification detection

  • Complete linkage hierarchical clustering
  • Handles virtually unlimited numbers of SNPs
  • Multidimensional scaling analysis to visualise substructure
  • Significance test for whether two individuals belong to the same population
  • Constrain cluster solution by phenotype, cluster size and/or external matching criteria
  • Perform subsequent association analyses conditional on cluster solution

Basic association testing

  • Case/control
    • Standard allelic test
    • Fisher's exact test
    • Cochran-Armitage trend test
    • Mantel-Haenszel and Breslow-Day tests for stratified samples
    • Dominant/recessive and general models
    • Model comparison tests (e.g. general versus multiplicative)
  • Family-based association (TDT, sibship tests)
  • Quantitative traits, association and interaction
  • Association conditional on one or more SNPs
  • Asymptotic and empirical p-values
  • Flexible clustered permutation scheme
  • Analysis of genotype probability data and fractional allele counts (post-imputation)
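
To make the "standard allelic test" in the list above concrete, here is a minimal sketch of a Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts in cases and controls, together with the allelic odds ratio. The function name, argument names and counts are hypothetical, and the sketch assumes every cell of the table is non-zero; it shows the general technique, not PLINK's own code.

```python
import math

def allelic_test(case_a, case_b, ctrl_a, ctrl_b):
    """Return (chi-square, p-value, odds ratio) for a 2x2 allele-count table.

    Illustrative only; assumes all four counts are greater than zero.
    """
    observed = ((case_a, case_b), (ctrl_a, ctrl_b))
    n = case_a + case_b + ctrl_a + ctrl_b
    row_totals = (case_a + case_b, ctrl_a + ctrl_b)   # cases, controls
    col_totals = (case_a + ctrl_a, case_b + ctrl_b)   # allele A, allele B
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Survival function of a chi-square with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (case_a * ctrl_b) / (case_b * ctrl_a)
    return chi2, p_value, odds_ratio

# Hypothetical allele counts: cases 60 A / 40 B, controls 40 A / 60 B
chi2, p_value, odds_ratio = allelic_test(60, 40, 40, 60)
```

Because the test counts alleles rather than individuals, each person contributes two observations; the Cochran-Armitage trend test listed above is the usual alternative when that assumption is a concern.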

Multimarker predictors, haplotypic tests

  • Suite of flexible, conditional haplotype tests
  • Case/control and TDT association on the probabilistic haplotype phase
  • A set of "proxy association" methods to study single SNP associations in their local haplotypic context
  • Imputation heuristic, to test untyped SNPs given a reference panel

Copy number variant analysis

  • Joint SNP and CNV tests for common copy number variants
  • Filtering and summary procedures for segmental (rare) CNV data
  • Case/control comparison tests for global CNV properties
  • Permutation-based association procedure for identifying specific loci

Additional tests

  • Gene-based tests of association
  • Screen for epistasis
  • Gene-environment interaction with continuous and dichotomous environments

Meta-analysis

  • Automatically combine several generically-formatted summary files, for millions of SNPs
  • Fixed and random effects models
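
The difference between the two model types above can be sketched for a single SNP as follows. This example assumes inverse-variance weighting for the fixed-effects estimate and the DerSimonian-Laird estimator of between-study variance for the random-effects estimate; both are common choices, but they are assumptions here rather than a statement of what PLINK implements. The function name and the input values are hypothetical.

```python
import math

def meta_analyse(betas, ses):
    """Combine per-study effect sizes and standard errors for one SNP.

    Returns a dict with fixed-effects and random-effects estimates.
    Illustrative only: inverse-variance weights, DerSimonian-Laird tau^2.
    """
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    beta_fixed = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se_fixed = math.sqrt(1.0 / w_sum)

    # Cochran's Q measures heterogeneity around the fixed-effects estimate
    q = sum(w * (b - beta_fixed) ** 2 for w, b in zip(weights, betas))
    k = len(betas)
    c = w_sum - sum(w ** 2 for w in weights) / w_sum
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add the between-study variance to each study
    weights_re = [1.0 / (se ** 2 + tau2) for se in ses]
    w_re_sum = sum(weights_re)
    beta_random = sum(w * b for w, b in zip(weights_re, betas)) / w_re_sum
    se_random = math.sqrt(1.0 / w_re_sum)

    return {
        "beta_fixed": beta_fixed, "se_fixed": se_fixed,
        "q": q, "tau2": tau2,
        "beta_random": beta_random, "se_random": se_random,
    }

# Hypothetical input: two studies reporting the same SNP
result = meta_analyse([0.2, 0.4], [0.1, 0.1])
```

When the studies disagree more than their standard errors would predict, the estimated between-study variance is positive and the random-effects standard error is wider than the fixed-effects one, as it is for the two hypothetical studies above.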

Result annotation and reporting

  • Post-analysis annotation of result files
  • LD-based and region-based grouping of results across multiple studies

Additional features

  • Extensible via R function plug-ins
  • Web-based SNP and gene annotation lookup feature
  • Simple SNP simulation feature
  • ID helper tools, for tracking and working with project data
  • See the main documentation for the full list of features
 

This document last modified Saturday, 10-Oct-2009 10:18:10 EDT