Conditional analyses tutorial

This page illustrates how to use whap to perform a series of conditional tests. "Conditional" is (confusingly) used in two senses in whap. The first refers to tests based on a retrospective likelihood (because we condition on trait values in that case). In contrast, this tutorial considers tests of genetic variation that are conditional on local genetic variation.
PRACTICAL EXERCISE: FIRST PART, DETECTING THE EFFECT

Omnibus haplotype test (all six markers)
whap --file dataACGT
Single SNP test, e.g. first SNP
whap --file dataACGT --alt 1
Single SNP test of the second SNP, with permutation p-value
(500 permutations here; in practice you should use more)
whap --file dataACGT --alt 2 --perm 500
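
To see why more permutations help, here is a minimal Python sketch of an empirical p-value. It assumes the common (r + 1) / (n + 1) convention, which may not be exactly how whap computes its value; the function names and the example statistics are hypothetical and are not whap output.

```python
def empirical_p(observed, permuted):
    """Empirical p-value using the (r + 1) / (n + 1) convention (an assumption)."""
    # r counts permuted statistics at least as large as the observed one.
    r = sum(1 for stat in permuted if stat >= observed)
    return (r + 1) / (len(permuted) + 1)


def smallest_attainable_p(n_perm):
    """Smallest p-value that n_perm permutations can report under this convention."""
    return 1 / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical test statistics, for illustration only.
    permuted = [0.5, 1.2, 3.4, 0.1, 2.2, 0.9, 4.1, 1.7, 0.3]
    print(empirical_p(3.0, permuted))
    print(smallest_attainable_p(500))
```

Under this convention, 500 permutations cannot report a p-value below 1/501, so a very strong association needs many more permutations to be resolved.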
All six single SNP tests, and permute data to obtain empirical p-values:
the P_MAX p-value is the significance of the best single SNP
result after correcting for multiple testing by permutation.
whap --file dataACGT --alt 1 --window --perm 100
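
The idea behind a "best result, corrected by permutation" p-value such as P_MAX can be sketched in Python as follows. This is an illustration of the general max-statistic approach, assuming the (r + 1) / (n + 1) convention; whap's own calculation may differ in detail, and the function name and inputs are hypothetical.

```python
def p_max(observed_stats, permuted_stats):
    """Corrected p-value for the best observed statistic (max-statistic approach).

    observed_stats: one test statistic per SNP from the real data.
    permuted_stats: one list per permutation, each with one statistic per SNP.
    """
    best_observed = max(observed_stats)
    # For each permuted data set, keep only the best statistic across all SNPs.
    best_permuted = [max(stats) for stats in permuted_stats]
    # Count permutations whose best statistic matches or beats the observed best.
    r = sum(1 for stat in best_permuted if stat >= best_observed)
    return (r + 1) / (len(best_permuted) + 1)


if __name__ == "__main__":
    # Hypothetical statistics for three SNPs and four permutations.
    observed = [1.0, 6.0, 2.5]
    permuted = [[0.5, 1.0, 7.0], [2.0, 0.1, 0.3], [1.1, 5.9, 0.2], [0.4, 0.4, 0.4]]
    print(p_max(observed, permuted))
```

Because each permutation contributes its best statistic over all SNPs, the comparison accounts for the number of tests performed.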
Omnibus test, excluding rare haplotypes: e.g. only consider haplotypes with frequency > 10%
whap --file dataACGT --at 10
Haplotype-specific tests
whap --file dataACGT --hs
Selecting subsets of SNPs for haplotype tests:
e.g. haplotypes formed by SNP 3 and 4
note: use comma between the marker numbers, no spaces
whap --file dataACGT --alt 3,4
Give haplotype frequencies in cases and controls: haplotypes based on all 6 SNPs
whap --file dataACGT --cc-freqs
or based only on SNPs 3 and 4, for example:
whap --file dataACGT --alt 3,4 --cc-freqs
Output estimated haplotype phases for each individual: e.g. for all six SNPs
whap --file dataACGT --phase
e.g. for only 3rd and 4th SNP
whap --file dataACGT --alt 3,4 --phase
Get confidence intervals on estimates: note that the first extra row is the CI for the intercept and
should be ignored. The remaining five lines ([2] to [6]) refer to the CIs for the haplotypes (excluding the
first, most common haplotype: as the reference haplotype, its effect is fixed to zero).
whap --file dataACGT --ci 0.95
QUESTION: What do you conclude about this
association? What SNPs / haplotypes are associated?
Do they increase or decrease risk? What are the effect sizes?
EXTRA EXERCISE (for bonus points): use Haploview to analyse these data. NOTE: load the file data1234.ped into Haploview instead of dataACGT.ped -- the files contain the same data except the SNP coding is numeric (1,2,3,4) instead of text (A,C,G,T), as Haploview can only handle numeric allele codes.

SECOND PART: DISSECTING THE EFFECT

All the tests below are appropriate when a (large) omnibus haplotype association has been detected in a particular region. Often multiple SNPs and haplotypes all have significant p-values. We can perform the following analyses to try to get a clearer picture of the association.

Does SNP X have any effect after controlling for everything else?

That is, SNP X is dropped from the null model, e.g. for SNP 1:
whap --file dataACGT --alt 1,2,3,4,5 --null 2,3,4,5
For SNP 2
whap --file dataACGT --alt 1,2,3,4,5 --null 1,3,4,5
etc. Perform this test for every SNP. A significant p-value indicates
that the SNP still has an independent effect.
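
Typing one command per SNP is error-prone, so a small Python sketch like the following can generate the full set. It only builds command strings in the pattern shown above (it does not run whap); the function name and its default arguments are hypothetical, with five SNPs assumed to match the example commands.

```python
def drop_one_commands(n_snps=5, fileroot="dataACGT"):
    """Build one whap command per SNP, dropping that SNP from the null model."""
    snps = list(range(1, n_snps + 1))
    alt = ",".join(str(s) for s in snps)
    commands = []
    for dropped in snps:
        # The null model keeps every SNP except the one being tested.
        null = ",".join(str(s) for s in snps if s != dropped)
        commands.append(f"whap --file {fileroot} --alt {alt} --null {null}")
    return commands


if __name__ == "__main__":
    for command in drop_one_commands():
        print(command)
```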
NOTE: look at the haplotype estimate numbers under the null compared to the alternate, and see how they relate to the test.

NOTE: this test is not possible for all SNPs -- you will get a df of 0 and LRT of 0, meaning that the test was not possible. Why would this be the case?

Does everything else still have an effect after controlling for SNP X?

Here we ask whether or not the omnibus test remains significant after controlling for one SNP at a time (the null model contains only that SNP). If the p-value remains significant then you conclude that this SNP cannot explain the entire omnibus result. If the p-value is no longer significant, this might suggest that the single SNP can explain the total association. For example, for SNP 1:
whap --file dataACGT --alt 1,2,3,4,5 --null 1
and the same for SNP 2
whap --file dataACGT --alt 1,2,3,4,5 --null 2
etc.
Does everything else still have an effect after controlling for haplotype H?

Finally, we can ask whether or not there is any evidence of association (i.e. omnibus test) after controlling for a single haplotype at a time. This requires using the --constrain command to manually specify the haplotype parameters under the alternate and null models. For example, the omnibus test is manually specified as follows (i.e. under the alternate every haplotype has a unique effect; under the null all haplotypes have the same effect, i.e. no association):
whap --file dataACGT --constrain 1,2,3,4,5,6/1,1,1,1,1,1
A haplotype-specific test, e.g. for the 2nd haplotype, would be specified:
(i.e. under alternate we estimate only 2 parameters, one for the second
haplotype, one for all others; under the null, we specify no association
as for the omnibus test)
whap --file dataACGT --constrain 1,2,1,1,1,1/1,1,1,1,1,1
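
One way to read a --constrain specification is to count the distinct haplotype groups on each side of the slash. The Python sketch below does this, on the assumption that the degrees of freedom of the test equal the difference between the two counts; check this against the df that whap reports. The function name is hypothetical.

```python
def constrain_groups(spec):
    """Return (alt_groups, null_groups, difference) for an 'alt/null' specification."""
    alt_part, null_part = spec.split("/")
    alt = alt_part.split(",")
    null = null_part.split(",")
    if len(alt) != len(null):
        raise ValueError("alternate and null must list the same number of haplotypes")
    # Haplotypes sharing a label are constrained to have the same effect.
    n_alt = len(set(alt))
    n_null = len(set(null))
    return n_alt, n_null, n_alt - n_null


if __name__ == "__main__":
    print(constrain_groups("1,2,3,4,5,6/1,1,1,1,1,1"))  # omnibus test
    print(constrain_groups("1,2,1,1,1,1/1,1,1,1,1,1"))  # haplotype-specific test
```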
Some notes on the use of the constrain option:
The conditional tests all involve a null hypothesis different from the simple null of no association. For example, to test whether there is an effect after controlling for haplotype 1 (i.e. the null constrains the 2nd to 6th haplotypes to have the same effect):
whap --file dataACGT --constrain 1,2,3,4,5,6/1,2,2,2,2,2
Testing for whether there is an effect controlling for haplotype 2:
whap --file dataACGT --constrain 1,2,3,4,5,6/1,2,1,1,1,1
Testing for whether there is an effect controlling for haplotype 3:
whap --file dataACGT --constrain 1,2,3,4,5,6/1,1,2,1,1,1
etc. Perform this for all six haplotypes, one at a time.
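
The pattern in these specifications can be generated for any haplotype with a short Python sketch. It only builds command strings (it does not run whap), labels groups in order of first appearance so that its output matches the three examples above, and assumes six haplotypes as in this data set. The function name and defaults are hypothetical.

```python
def conditional_constrain(h, n_haps=6):
    """Build the 'alt/null' specification for a test controlling for haplotype h."""
    if not 1 <= h <= n_haps:
        raise ValueError("h must be between 1 and n_haps")
    # Alternate: every haplotype has its own effect.
    alt = ",".join(str(i) for i in range(1, n_haps + 1))
    # Null: haplotype h has its own effect; all others share a single effect.
    labels = {}
    null = []
    for i in range(1, n_haps + 1):
        group = "controlled" if i == h else "rest"
        if group not in labels:
            labels[group] = len(labels) + 1  # number groups by first appearance
        null.append(str(labels[group]))
    return alt + "/" + ",".join(null)


if __name__ == "__main__":
    for h in range(1, 7):
        print(f"whap --file dataACGT --constrain {conditional_constrain(h)}")
```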
QUESTION: Compare these results to the single SNP and haplotype-specific results. What do they tell you?

The files beginning cvACGT.* are the same data, except with an extra variant included that represents the true causal variant, which sits on a single haplotype (AACTA). The causal variant is a C/T SNP where the T allele increases risk, located between the 2nd and 3rd SNPs in the file used previously, i.e. the full disease haplotype is AATCTA.

Follow-up exercise (time allowing): perform the same tests as above (omnibus, single SNP, haplotype-specific and three classes of conditional tests) on the data files
cvACGT.ped cvACGT.dat cvACGT.map
For example,
whap --file cvACGT
etc. Interpret the test results in light of the fact that this file contains an extra variant that is the functional causal variant itself.

Created by Shaun Purcell; Last updated by Lori Thomas: March 2006