SNP scoring routine
PLINK provides a simple means to generate scores
or profiles for individuals based on a simple allelic
scoring system involving one or more SNPs. One potential use
is to assign each individual a single quantitative index of genetic
load, perhaps to build simple multi-SNP prediction models.
Note: This is an advanced function intended for
exploratory analyses, and it is still in a beta development
phase. If the point of this routine isn't clear to you, you
should probably just ignore this entire feature.
Basic usage
The basic command to generate a score is the --score option, e.g.
./plink --bfile mydata --score myprofile.raw
which takes as a parameter the name of a file (here myprofile.raw)
that describes the scoring system. This file consists of one or more lines,
each with exactly three fields:
SNP ID
Reference allele
Score (numeric)
for example
SNPA A 1.95
SNPB C 2.04
SNPC T -0.98
SNPD A -0.24
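As an illustration of the three-field layout above, here is a minimal Python sketch that reads such a scoring file into a dictionary. This is not part of PLINK; the function name parse_score_file is hypothetical, and the choices to split on whitespace and to skip blank lines are assumptions of this sketch rather than documented PLINK behaviour.

```python
import io


def parse_score_file(handle):
    """Parse lines of 'SNP_ID REF_ALLELE SCORE' into {snp_id: (ref_allele, score)}.

    Hypothetical helper for illustration only; not PLINK's own parser.
    """
    weights = {}
    for line_no, line in enumerate(handle, start=1):
        fields = line.split()
        if not fields:
            continue  # assumption: blank lines are ignored
        if len(fields) != 3:
            raise ValueError(
                f"line {line_no}: expected 3 fields, got {len(fields)}"
            )
        snp_id, ref_allele, score = fields
        weights[snp_id] = (ref_allele, float(score))  # score must be numeric
    return weights


# The example scoring file from the text, held in memory instead of on disk.
example = io.StringIO("SNPA A 1.95\nSNPB C 2.04\nSNPC T -0.98\nSNPD A -0.24\n")
weights = parse_score_file(example)
print(weights["SNPC"])
```

A line with a missing or extra field raises an error here, which mirrors the "exactly three fields" requirement.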
These scores can be based on whatever you want. One choice might be the log of the odds ratio for significantly
associated SNPs, for example. Then, running the command above would generate a file
plink.profile
with one individual per row and the fields:
FID Family ID
IID Individual ID
PHENO Phenotype for that individual
CNT Number of non-missing SNPs used for scoring
SCORE Total score for that individual
The score is simply a sum across SNPs of the number of reference alleles (0, 1 or 2) at that SNP multiplied by the score
for that SNP. For example,
Genotype          A/A         G/G         A/T         0/0
# ref alleles     2           0           1           n/a
Score             2*1.95   +  0*2.04   +  1*-0.98                ->  2.92
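The arithmetic in this example can be reproduced with a short Python sketch. It is an illustration of the stated rule (count of reference alleles times the per-SNP score, summed over non-missing SNPs), not PLINK's actual implementation; the function name profile_score, the in-memory data layout, and the treatment of "0" as the missing-allele code are assumptions taken from the example above.

```python
def profile_score(genotypes, weights, missing="0"):
    """Return (total score, number of non-missing SNPs used).

    genotypes: {snp_id: (allele1, allele2)}
    weights:   {snp_id: (ref_allele, score)}
    Hypothetical helper; a sketch of the rule described in the text.
    """
    total = 0.0
    count = 0
    for snp_id, (ref_allele, weight) in weights.items():
        genotype = genotypes.get(snp_id)
        if genotype is None or missing in genotype:
            continue  # missing genotypes contribute nothing and are not counted
        n_ref = sum(1 for allele in genotype if allele == ref_allele)  # 0, 1 or 2
        total += n_ref * weight
        count += 1
    return total, count


# Scoring weights and genotypes from the worked example.
weights = {
    "SNPA": ("A", 1.95),
    "SNPB": ("C", 2.04),
    "SNPC": ("T", -0.98),
    "SNPD": ("A", -0.24),
}
genotypes = {
    "SNPA": ("A", "A"),
    "SNPB": ("G", "G"),
    "SNPC": ("A", "T"),
    "SNPD": ("0", "0"),  # missing genotype
}

total, count = profile_score(genotypes, weights)
print(round(total, 2), count)           # total score and SNPs used
print(round(total / count, 4))          # average score per non-missing SNP
```

Returning the count alongside the total makes it easy to form the per-SNP average, which is comparable across individuals with different amounts of missing data.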
The average score per non-missing SNP, here 2.92/3 (about 0.97), could then be used, e.g. as a covariate,
or as a predictor of disease, provided it is scored in a sample that is independent of the one used
to generate the original scoring weights. A score profile based on some effect
size measure from a large number of SNPs will necessarily be highly correlated with the
phenotype in the original sample, so it provides no (straightforward) additional
statistical evidence for associations in that sample.