PLINK: Whole genome data analysis toolset
Latest PLINK release is v1.03 (10-Jun-2008)

Whole genome association analysis toolset


Changelog

This page contains a version history recording changes and additions to PLINK.

V1.03 (10-Jun-2008)
  • Added teaching material/tutorial to resources section of web
  • Added --write-cluster, which can handle strings
  • Added --cnv-freq-exclude-exact and --cnv-freq-include-exact
  • Added --cnv-region-overlap
  • Displays type and score in --segment-group for CNVs
  • Fixed problem with --read-freq
  • Fixed problem with --hethom and X chromosome data
  • Fixed problem when --condition and --genotypic used together
  • Added --genome-minimal and --read-genome-minimal
  • Now possible to --filter on strings and lists of strings
  • Added --make-pheno command to generate a binary phenotype given string filter
  • Allow --keep and --remove files to have additional columns beyond two
  • Additional case/control statistics given in LOG after filtering
  • Fixed a bug in the --hap-tdt and --proxy-tdt analyses
  • Added the --make-set and --make-set-border commands
  • Added --lookup-kb and --lookup-gene-kb
  • Added --lookup-gene-list (to create a SET file)
  • Added additional output information to SNP and gene lookups
  • Added --ld-snp command to modify the behaviour of --r2
  • LD pruning now considers non-autosomal markers
  • Fixed some issues with non-human data and the IBS/IBD calculation (previously skipped chromosomes over 22)

V1.02 (27-Mar-2008)
  • Added beta versions of CNV and generic variant commands, described here
  • Created a PDF version of the web page
  • Added --hethom flag to modify --genotypic
  • Added --seed to specify a fixed random seed
  • Added --recode-allele to modify --recodeA
  • Fixed issue with --clump-index-first option
  • Enabled PED files to be input from standard input (--ped -)
  • Fixed potential error in --chap output when test not defined

V1.01 (28-Jan-2008)
  • Added --dummy-coding modifier for --write-covar
  • Added --update-map
  • Outputs phenotype names for --all-pheno if given
  • Reworked --mds-plot and --mds-cluster options to work with --within and without re-running the clustering
  • Fixed --qfam issues with permutation test
  • Changed defaults for --proxy-assoc and --proxy-impute
  • Changed direction of allele coding for proxy association options
  • Changed --proxy-r2-filter command (3 parameters) and naming
  • Changed syntax for proxy association options, --proxy-r2, etc
  • Added --proxy-glm method
  • Fixed problems with --hap-impute
  • Fixed problem with --hap-window
  • Addressed issue with hyphens in SNP names being treated as the range delimiter (--d)
  • Fixed issue with numeric chromosome codes greater than 22 and --file
  • Changed output format of TDT and CMH commands
  • Make monomorphic SNPs have missing alleles in output if forced
  • Fixed minor problem with --bmerge when more than 2 alleles seen per SNP
  • Physical position output correctly with --genotypic option
  • Changed threshold to print NA in logistic
  • Changed headers BETA or OR in GLM output for clarity
  • Now --recodeA and --recodeAD count number of minor alleles
  • Added --sheep option
  • Fixed problem with --homozyg
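
The V1.01 note that --recodeA and --recodeAD now count the number of minor alleles can be illustrated with a minimal sketch. This is not PLINK's code: the function name, the tuple-based genotype input, and the use of None for missing values are assumptions made for illustration only.

```python
from collections import Counter

def additive_dominance_coding(genotypes, missing="0"):
    """Code one SNP as (additive, heterozygote) pairs.

    genotypes: list of (allele1, allele2) tuples, one per individual.
    additive = number of minor alleles (0, 1 or 2);
    heterozygote = 1 if the two alleles differ, else 0.
    Genotypes with any missing allele are returned as (None, None).
    """
    # Count observed alleles to find the minor (less frequent) one.
    counts = Counter(a for g in genotypes for a in g if a != missing)
    minor = min(counts, key=counts.get) if len(counts) >= 2 else None

    coded = []
    for a1, a2 in genotypes:
        if missing in (a1, a2):
            coded.append((None, None))
            continue
        additive = (a1 == minor) + (a2 == minor)
        het = int(a1 != a2)
        coded.append((additive, het))
    return coded

example = [("A", "A"), ("A", "G"), ("G", "G"), ("A", "A"), ("0", "0")]
coded_example = additive_dominance_coding(example)
```

In this sketch the additive column alone corresponds to the idea behind --recodeA, and the pair of columns to the idea behind --recodeAD; the real output file layout is not reproduced here.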

V1.00 (4-Dec-2007)
  • Added conditional haplotype-based testing (--chap)
  • Added simple data simulation option (--simulate)
  • Added/extended SNP imputation functions (--proxy-assoc and --proxy-impute)
  • Added LD-based results clumping procedure (--clump)
  • Added option to select specific covariates (--values)
  • Added ability to specify lists and ranges of SNPs (--snps)
  • Added ability to select ranges based on regions (--range)
  • Added proxy selection features based on LD (--proxy-r2-filter)
  • Added simple "risk-profile" tool (--score)
  • Fixed issue with scaling of covariates in GLMs
  • Added --rerun option to repeat analysis given LOG file
  • Added --write-snplist option
  • Fixed direction-of-effect error in haplotypic QTL test
  • Enabled --fisher to work with --model
  • Made variance inflation factor default value less stringent
  • Fixed some problems with haplotype TDT
  • Fixed problem with slightly different p-values for QTL tests from --adjust
  • Fixed bug in --all-pheno option when used with disease traits
  • Fixed bug in --epistasis routine regarding handling of missing data
V0.99s (26-July-2007)
  • Added SNP annotation --lookup set of options
  • Added proxy association functions (--proxy-assoc, etc)
  • Added extensible R plugin functionality (--R)
  • Added --lfile option for long-format input
  • Fixed problem with all-male or all-female X chromosome test
  • Added r-squared calculation for two SNPs based on haplotype frequencies
  • Added geno-grouping speedup to E-M algorithm; fixed minor problem with treatment of missing genotype data
  • Added --oblig-missing and --oblig-cluster options, to specify obligatory-missing genotypes
  • Added --impute-sex option
  • Added concordance calculation to --merge-mode 6 and 7
  • Added haplotype support for X and haploid chromosomes
  • Added haplotype support for quantitative trait analysis
  • Mendel error filter now zeroes out the people implicated, as per the heuristic described here
  • Fixed output commands to use user-defined missing phenotype and genotype values
  • Added dominant and recessive models for --linear and --logistic
  • Improved convergence of EM haplotyping routine
  • Fixed minor bug in --parameters function
  • Added --lambda option to fix genomic control factor
  • Added --log10 option to change output in *.adjusted
  • Added --horse species option
  • Added --qq-plot function
  • Added --loop-assoc option
  • Added --distance-matrix option
  • Changed implementation and interface of the --homozyg-* methods
  • Enabled permutation and set-tests with --dfam
  • Added ability to constrain --cluster with --within
  • Added --recode-bimbam, --recode-fastphase and --recode-structure options
  • Fixed minor issue with --het command
  • Added --liability option
  • Fixed issue with --genotypic and --covar
  • Fixed issue with --dfam
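
The --lambda and --log10 entries above relate to genomic control and to reporting p-values on a -log10 scale. A minimal sketch of the usual textbook calculation follows, assuming 1-df chi-square test statistics; the function names and the floor of 1.0 on the estimated factor are assumptions for illustration and are not taken from PLINK.

```python
import math
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
MEDIAN_CHISQ_1DF = 0.4549364

def chisq1_pvalue(x):
    """Upper-tail p-value for a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(x / 2.0))

def genomic_control(chisq, fixed_lambda=None):
    """Return (lambda, adjusted p-values).

    If fixed_lambda is given it is used as-is (the idea behind fixing the
    inflation factor); otherwise lambda is estimated as
    median(chisq) / 0.4549364 and floored at 1.0 (an assumed convention).
    """
    if fixed_lambda is not None:
        lam = fixed_lambda
    else:
        lam = max(1.0, statistics.median(chisq) / MEDIAN_CHISQ_1DF)
    return lam, [chisq1_pvalue(x / lam) for x in chisq]

def neg_log10(p):
    """Express a p-value on the -log10 scale."""
    return -math.log10(p)

stats = [0.1, 0.5, 1.0, 2.0, 10.0]
lam_est, p_adj = genomic_control(stats)
lam_fixed, p_fixed = genomic_control([3.841459], fixed_lambda=1.0)
```

Dividing each statistic by the inflation factor before computing the p-value is the standard genomic-control adjustment; whether PLINK applies exactly this floor or constant is not stated in this changelog.
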
V0.99r (29-April-2007)
  • Added --parameters and --tests options
  • Added --zero-cluster option
  • Added --no-fid, --no-parents, --no-sex and --no-pheno options
  • Added --with-phenotype flag to modify --write-covar
  • Now gives a warning if fileroots contain a full stop/period character
  • Added --fisher for Fisher's exact test; use this in --test-missing
  • Added --set-test option
  • DFAM can include unrelateds (possibly in clusters) as well as families in a combined test
  • Improved multicollinearity check in linear model tests
  • Added --all-pheno option for some tests
  • Enabled permutation for --mh
  • Added XY and MT chromosome support
  • Fixed problem with --hap-window introduced in 0.99q
  • Fixed --homog for X and changed output format
  • Fixed problem with --out and --script introduced in 0.99q
V0.99q (3-March-2007)
  • Support for PED files larger than 4GB
  • Added --tfile to load transposed (row=SNP,column=person) files (i.e. as from --recode --transpose)
  • Added --recodeA option (like --recodeAD but only output additive components)
  • Added --write-covar option and also ability to include covariate files when recoding or making binary files
  • Added simple filters: --filter-cases, --filter-controls, --filter-males, --filter-females, --filter-founders and --filter-nonfounders
  • Added weighted multimarker tests with --whap
  • Added X chromosome and haploid models for --linear and --logistic with --xchr-model
  • Added --set-me-missing; by default, remaining Mendel errors (i.e. for SNPs/individuals not removed) are now not set to zero when recoding a file (--make-bed, etc) and filtering on --me
  • Fixed bug in loading of covariates which made missing phenotypes no longer missing
  • Changed implementation of --fast-epistasis
  • Fixed minor --bmerge issue with monomorphic alleles in offspring-only subsamples
  • Added --allele1234 and --alleleACGT options
  • Fixed CMH output to NA rather than -9
  • Added web-based context-specific warnings

V0.99p (16-January-2007)
  • Fixed bug in loading of covariates which made missing phenotypes no longer missing (e.g. -9 phenotype would have been treated literally as -9)
  • Fixed bug in --bmerge function when merged-in SNPs already exist
  • Added --transpose option to modify --recode
  • Fixed bug in --genotypic option that led to incorrect results
  • Added --test-all option for --linear and --logistic
  • Changed --fast-epistasis to use correlational test
  • Added --ci support for --linear and --logistic
  • Added --mds-plot option
  • Now allows --remove and --keep together (similarly for --extract and --exclude)
  • Added --genome-lists option to facilitate parallelization of --genome
  • Added lower pool size in pool segment output, with --pool-size option
  • Added odds ratio calculation for --model tests
  • Modified --qfam within test (only model W)
  • Added --check-sex option
  • Cleaned up excessive memory use issue when merging multiple files
  • Added speed-up and bug fixes to QFAM routines
  • Added gPLINK compatibility via --gplink flag
  • Now treats half-missing genotypes, e.g. A 0 as missing rather than giving an error (haploid genotypes should still be coded as homozygous)
  • Recode file options (--make-bed, --recode, etc) now do not automatically set haploid heterozygous genotypes to missing, unless --set-hh-missing specified
  • No longer sets p-values <1e-16 to 0
  • Now use t-statistic for QTL test
  • Improved verbose segmental output (separate files)
  • Added --filter and --mfilter options

V0.99o (27-November-2006)
  • Permutation applicable to --test-missing option
  • Added --twolocus output option
  • Added --overlap option
  • Added --logistic and --linear options
  • Added --genotypic and --interaction options
  • Reframed --homozyg tests
  • Added epistasis using linear (QT) and logistic regression models
  • Fixed bug in haplotype-based TDT test (counted transmissions to unaffecteds)
  • IBD estimation adjusted, and fixed a minor bug
V0.99n (11-October-2006)
  • Added option to print warning when duplicate individual or marker IDs are found
  • Added --read-segment option
  • Changed output format of HWE and genotypic/model association tests
  • Implemented new bias-corrected IBD estimators
  • Fixed minor bug that could cause problems when merging datasets on some platforms
  • Large restructuring of haplotype inference code
  • Added --test-mishap option
  • Added --indep-pairwise option
  • Added --hap-window option
  • Added --ld-window option
  • Added --plist option
  • Added --read-genome option
  • Added --map3 option
V0.99m (23-August-2006)
  • Added --gene extraction option
  • Fixed bug affecting labels after set pruning
  • Added --list output option
  • Added --counts option to modify --freq
  • Fixed bug in the Hotelling's T(2) test handling of missing genotypes
  • Added permutation options for --model
  • Fixed minor bug introduced in v0.99l that caused crash when attempting a set-based TDT analysis
  • Altered some field headers in various output files for greater consistency
V0.99l (27-July-2006)
  • Added --bmerge option to merge in a binary file
  • Added framework for QFAM test (option not yet available in release version)
  • Added Wigginton et al. (AJHG, 2005) exact Hardy-Weinberg calculation
  • Added --from-kb etc options to select regions
V0.99j (14-July-2006)
  • Added --window option to extract a +/- X kb region around a given SNP
  • Fixed bug which made set VIF pruning fail with a set containing a single SNP
  • Redirected ambiguous sex and no-non-missing-founders messages to files (plink.nosex and plink.nof) rather than to plink.log
  • Fixed bug in HWE tests which meant non-founders were included
V0.99i (5-July-2006)
  • Improved parsing of PED and haplotype specification files; fixed some minor bugs since 0.99h in this regard, mainly DOS versus UNIX issues
  • Fixed bug in haploTDT routine
  • Implemented gene-based canonical correlation test within PLINK (previously, an R script was generated and this analysis was performed externally)
  • Added feature to scan genome and extract a set of SNPs that are relatively uncorrelated with each other (sliding window based on VIF; implemented in the --indep option)
V0.99h (29-June-2006)
  • Added option to prune SNPs based on LD (i.e. select an independent subset of SNPs) using the --indep option
  • Fixed bug that occurred when creating a binary map file if a SNP had no non-missing alleles (i.e. previously one allele field was left blank, meaning that the file would not be properly read in subsequently)
  • Improved Hotelling's T(2) calculation -- now it better handles highly or completely correlated SNPs
  • Added singular value decomposition routines and variance inflation calculation
  • Added --allow-no-sex option to differently handle individuals with ambiguous sex codes
  • Fixed bug in --r and --r2 routines
V0.99g (20-June-2006)
  • Implemented web-based version checking
  • Fixed error in which families were counted twice when filtering on Mendel errors and also performing TDT
  • Added column count check for PED files
  • Allowed comments in PED files (lines starting #) for basic input and merge commands
  • Fixed specification of --gap for case-only epistasis tests -- using kb now, not bp
V0.99f (12-June-2006)
  • Fixed bug in the TDT in version 0.99e (but not prior versions) that meant that transmissions to unaffecteds as well as affecteds were counted
  • Improved parsing of --merge-list for end-of-file
V0.99e (9-June-2006)
  • Improved efficiency of haplotype phase routine
  • Added nearest neighbour identification in --neighbour routine, and fixed a minor bug
  • Added support for haplotypic TDT test
  • Fixed error in homozygosity-run analysis
  • Fixed error in handling of monomorphic variants when creating a binary map file
  • Added --snp option to select single SNPs
  • Added out-of-memory warning
V0.99c (23-May-2006)
  • Fixed error in conversion from SNP-major to individual-major data representations that affected Mendel error check routines
V0.99b (16-May-2006)
  • Fixed error in Hardy-Weinberg calculations for quantitative traits
  • Implemented --nudge and --impossible features for IBD calculation
V0.99 (30-Apr-2006)
  • Major internal restructuring to hold data in either row-major or column-major formats, depending on choice of analysis (i.e. order genotypes either by individual or by SNP in memory).
  • Added ability to stratify summary statistics by a cluster variable
  • Improved parsing of haplotypes (creates .mishap file for mis-specified haplotypes)
  • Fixed bug in CMH tests (problem with individuals who were not assigned to a cluster)
  • Fixed problem with extracting SNPs and individuals with binary PED files
V0.98 (19-Apr-2006)
  • Added support for adjusted significance test calculation (Bonferroni, FDR, Sidak, etc)
  • Added --script feature to allow long command lines
  • Added --1 feature to allow for 0/1 coding of affection variables
  • Added --tab feature to control field delimiters in recoded PED files
  • Added proper support for combined label-swapping and gene-dropping permutation (--swap-parents, --swap-sibs and --swap-unrel)
  • Corrected bug in filters for binary files that aren't in genomic order (i.e. those that result from merge operations).
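
The V0.98 entry on adjusted significance tests names Bonferroni, FDR and Sidak corrections. As a rough illustration of what such adjustments compute, here is a minimal sketch of the textbook single-step Bonferroni and Sidak formulas and the Benjamini-Hochberg FDR step-up procedure. The function names are hypothetical, and this is not a description of which variants PLINK implements or how it reports them.

```python
def bonferroni(pvals):
    """Single-step Bonferroni: multiply by the number of tests, cap at 1."""
    n = len(pvals)
    return [min(1.0, p * n) for p in pvals]

def sidak_single_step(pvals):
    """Single-step Sidak: 1 - (1 - p)^n."""
    n = len(pvals)
    return [1.0 - (1.0 - p) ** n for p in pvals]

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg FDR-adjusted p-values (step-up)."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted

raw = [0.01, 0.04, 0.03, 0.5]
bonf = bonferroni(raw)
sidak = sidak_single_step(raw)
fdr = benjamini_hochberg(raw)
```

Results are returned in the same order as the input p-values, so they can be lined up against the original list of SNPs.
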
V0.97 (10-Apr-2006)
  • Added Hotelling's T(2) test for multilocus SNP data
  • Added a test for interaction with quantitative traits and a dichotomous covariate
  • Added --merge-list option to merge more than two filesets simultaneously
  • Fixed bug in quantitative trait association test (not dealing with missing phenotypes properly)
  • Fixed some minor bugs with parsing the command line
V0.96 (30-Mar-2006)
  • Fixed bug in --remove option
  • Added Breslow-Day test of homogeneous odds ratios
  • Added option to skip nearby SNPs in case-only epistasis test
  • Added time/date stamps to output
  • Records output in *.log file; most remaining output echoed to STDOUT instead of STDERR (aside from errors and warnings)
  • Improved parsing of command lines (checking numeric inputs, etc)
  • Added --mcc option to specify number of cases:controls in clustering, e.g. for 3:1 matching of cases to controls
V0.95 (20-Mar-2006)
  • Added Cochran-Mantel-Haenszel tests (2x2xK and IxJxK)
  • Added homogeneity of odds ratio between clusters test (partitioning chi-square)
  • Added support for gene-dropping simulation
V0.94 (7-Mar-2006)
  • Added feature to perform error checking of command line options (scan for unused options)
  • Ability to include external matching criteria for --cluster added
  • Ability to specify merge modes and a diff function for PED files
V0.93 (1-Mar-2006)
  • X chromosome support added for basic association test & quantitative traits
  • Threshold for --genome output based on pi-hat exceeding --min
V0.92 (22-Feb-2006)
  • X chromosome support added for case/control tests, quantitative trait association, TDT, genotypic correlations, allele frequency statistics. Not yet implemented for the population stratification, inbreeding or epistasis tests.
  • --chr and --from X and --to X options added
  • Some problems with the --merge option corrected
  • Now only considers founders for the allele frequency and HWE tests
 
This document last modified Tuesday, 17-Jun-2008 19:55:00 EDT