PGA Readme File
Introduction:
PGA is a package of algorithms and graphical user interfaces developed in Matlab for power and sample size calculation in case-control genetic association studies. The software covers a wide variety of genetic models and statistical constraints and may therefore facilitate decision making for case-control association studies of candidate genes, fine-mapping studies, and whole-genome scans.
Download and installation:
- To install PGA, save the pga.exe file in an appropriate folder on your disk, then double-click it to extract the folder to a designated location on your hard drive.
- Users without Matlab should first install the MATLAB Component Runtime (MCR) on their computers. To install the MCR, double-click the ‘MCRInstaller.exe’ file and follow the installation instructions.
- Ensure that the MCR is installed on your computer in C:\Program Files\MATLAB\MATLAB_Component_Runtime\v76 or in the folder you selected during installation. Once the MCR is installed, you can download and run the different PGA stand-alone GUIs (pga1.exe, pga2.exe and edf.exe).
** The MCR needs to be installed only once.
Software description:
PGA1:
PGA1 calculates and plots the relation between statistical power and sample size for a variety of genetic and statistical parameters. The user can set the following parameters:
- Type of genetic variation – SNP or Haplotype.
- Genetic mode of inheritance – Recessive, Dominant, Co-dominant(1df) or Co-dominant(2df).
- Relative risk (RR) – The relative risk conferred by the disease-predisposing allele. A second relative risk (RR2) is applicable only in the Co-dominant model with 2 degrees of freedom (2df).
- Linkage disequilibrium (LD) – The linkage disequilibrium value (in the form of r² or D’) between the causative SNP and the genotyped marker.
- Disease prevalence.
- Disease allele frequency - the frequency of the disease predisposing allele.
- Marker allele frequency – the allele frequency of the genotyped marker.
- Effective degrees of freedom (EDF) - accounts for multiple testing in the study.
- Alpha (Type I error).
- Control to Case ratio.
- Maximum sample size – the maximum number of cases to be considered in the calculations.
** The parameter values in the legend of the graph are ordered according to their order in the GUI.
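The relation between these parameters and power can be illustrated with a minimal Python sketch. It is not PGA's own computation: it assumes Hardy-Weinberg genotype frequencies, complete LD between marker and causative SNP, a dominant or recessive model only, and a normal-approximation test comparing the frequency of the risk genotype in cases and controls. The function names exposure_frequencies and approx_power are hypothetical and are not part of PGA.

```python
from statistics import NormalDist


def exposure_frequencies(daf, prevalence, rr, model="dominant"):
    """Return (P(risk genotype | case), P(risk genotype | control)).

    Assumes Hardy-Weinberg proportions and that the genotyped marker
    is the causative SNP itself (complete LD).
    """
    q = 1.0 - daf
    # Genotype frequencies for 0, 1 and 2 copies of the risk allele.
    geno = [q * q, 2.0 * daf * q, daf * daf]
    if model == "dominant":
        risk, exposed = [1.0, rr, rr], [1, 2]
    elif model == "recessive":
        risk, exposed = [1.0, 1.0, rr], [2]
    else:
        raise ValueError("this sketch covers 'dominant' and 'recessive' only")
    # Baseline penetrance chosen so that the overall prevalence is matched.
    f0 = prevalence / sum(g * r for g, r in zip(geno, risk))
    pen = [f0 * r for r in risk]
    if max(pen) > 1.0:
        raise ValueError("RR and prevalence imply a penetrance above 1")
    cases = sum(geno[i] * pen[i] for i in exposed) / prevalence
    controls = sum(geno[i] * (1.0 - pen[i]) for i in exposed) / (1.0 - prevalence)
    return cases, controls


def approx_power(n_cases, ratio, p1, p0, alpha=0.05):
    """Normal-approximation power of a two-sided two-proportion test."""
    n_controls = ratio * n_cases
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    pbar = (n_cases * p1 + n_controls * p0) / (n_cases + n_controls)
    se_null = (pbar * (1.0 - pbar) * (1.0 / n_cases + 1.0 / n_controls)) ** 0.5
    se_alt = (p1 * (1.0 - p1) / n_cases + p0 * (1.0 - p0) / n_controls) ** 0.5
    return NormalDist().cdf((abs(p1 - p0) - z * se_null) / se_alt)


# Power as a function of the number of cases, up to a maximum sample size.
p1, p0 = exposure_frequencies(daf=0.2, prevalence=0.05, rr=1.5)
for n in range(200, 1001, 200):
    print(n, round(approx_power(n, ratio=1.0, p1=p1, p0=p0), 3))
```

To mimic a correction for multiple testing in this sketch, a smaller alpha can be passed to approx_power; how PGA1 combines alpha with the EDF value is not reproduced here.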
PGA2:
PGA2 calculates and plots the minimum detectable relative risk (MDRR) for genotyped markers with different allele frequencies. The user can determine the following parameters:
- Type of genetic variation – SNP or Haplotype.
- Genetic mode of inheritance – Recessive, Dominant, Co-dominant(1df) or Co-dominant(2df).
- Relative risk ratio (RR1/RR2) – The ratio between the two relative risks of the disease predisposing genetic alleles. This parameter is applicable only in the Co-dominant(2df) model.
- Linkage disequilibrium (LD) – The linkage disequilibrium value (in the form of r² or D’) between the causative SNP and the genotyped marker.
- Effective degrees of freedom (EDF) - accounts for multiple testing in the study.
- Disease allele frequency - the frequency of the disease predisposing allele.
- Case number – the number of cases in the study.
- Control to Case ratio.
- Disease prevalence.
- Power (1- Type II error).
- Alpha (Type I error).
- Maximum sample size – the maximum number of cases to be considered in the calculations.
** The parameter values in the legend of the graph are ordered according to their order in the GUI.
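The idea of a minimum detectable relative risk can be sketched as a search for the smallest relative risk that reaches the requested power. The Python example below is an illustration under simplifying assumptions, not PGA2's algorithm: a dominant model, Hardy-Weinberg proportions, complete LD, a normal-approximation test, and parameter values for which every penetrance stays below 1. The names dominant_power and mdrr are hypothetical.

```python
from statistics import NormalDist


def dominant_power(rr, daf, prevalence, n_cases, ratio, alpha):
    """Approximate power of a two-sided carrier vs. non-carrier test."""
    carrier = 1.0 - (1.0 - daf) ** 2                      # carrier frequency
    f0 = prevalence / ((1.0 - carrier) + carrier * rr)    # baseline penetrance
    p1 = carrier * rr * f0 / prevalence                   # carriers among cases
    p0 = carrier * (1.0 - rr * f0) / (1.0 - prevalence)   # carriers among controls
    n_controls = ratio * n_cases
    pbar = (n_cases * p1 + n_controls * p0) / (n_cases + n_controls)
    se_null = (pbar * (1.0 - pbar) * (1.0 / n_cases + 1.0 / n_controls)) ** 0.5
    se_alt = (p1 * (1.0 - p1) / n_cases + p0 * (1.0 - p0) / n_controls) ** 0.5
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf((abs(p1 - p0) - z * se_null) / se_alt)


def mdrr(daf, prevalence, n_cases, ratio=1.0, alpha=0.05,
         target=0.8, rr_max=20.0, tol=1e-4):
    """Smallest RR in (1, rr_max] reaching the target power, or None."""
    args = (daf, prevalence, n_cases, ratio, alpha)
    if dominant_power(rr_max, *args) < target:
        return None                     # not detectable within the search range
    lo, hi = 1.0, rr_max
    while hi - lo > tol:                # bisection; hi always meets the target
        mid = (lo + hi) / 2.0
        if dominant_power(mid, *args) >= target:
            hi = mid
        else:
            lo = mid
    return hi


# MDRR for markers with different allele frequencies, 500 cases and 500 controls.
for freq in (0.05, 0.1, 0.2, 0.3, 0.4, 0.5):
    print(freq, mdrr(freq, prevalence=0.05, n_cases=500))
```

Bisection is used here because, in this simplified setting, power rises with the relative risk over the search range; a different search would be needed if that did not hold.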
EDF:
EDF calculates the effective degrees of freedom for a particular set of SNPs in linkage disequilibrium. It accepts SNP genotype data files from HapMap (http://www.hapmap.org) or tab-delimited text files with SNP genotypes (in columns) coded as 0/1/2 for major homozygous, heterozygous and minor homozygous, respectively, with missing data encoded as NaN. Please see the example input files (hapmap_example.pdf, genotype_example.pdf). From these data, it computes a summary measure of the EDF (Nyholt et al., 2004) and produces a map of linkage disequilibrium patterns (r2) for the SNPs in the dataset. SNPs can be filtered by allele frequency by setting a minor allele frequency (MAF) threshold.
The resulting EDF value can be incorporated into PGA1 and PGA2 computations.
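As an illustration of this kind of summary measure, the Python sketch below applies the Nyholt (2004) estimate Meff = 1 + (M - 1)(1 - Var(lambda)/M), where lambda are the eigenvalues of the M x M SNP correlation matrix. Several things here are assumptions rather than a description of the EDF program: Pearson correlation between 0/1/2 genotype codes is used as the LD measure, missing values are dropped pairwise, and the function names are hypothetical. Because the eigenvalues of a correlation matrix sum to M and their squares sum to the sum of the squared matrix entries, the estimate reduces to M + 1 - (sum of squared correlations)/M, so no eigen-decomposition is needed.

```python
import math


def maf(column):
    """Minor allele frequency of one SNP coded 0/1/2, ignoring NaN."""
    called = [g for g in column if not math.isnan(g)]
    if not called:
        return 0.0
    p = sum(called) / (2.0 * len(called))
    return min(p, 1.0 - p)


def correlation(x, y):
    """Pearson correlation over individuals genotyped at both SNPs.

    Assumes both SNPs vary among those individuals (nonzero variance).
    """
    pairs = [(a, b) for a, b in zip(x, y)
             if not (math.isnan(a) or math.isnan(b))]
    n = len(pairs)
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy)


def effective_tests(columns, maf_threshold=0.05):
    """Nyholt-style effective number of tests for SNP columns.

    SNPs below the MAF threshold are removed first; a positive threshold
    also removes monomorphic SNPs, whose correlation is undefined.
    """
    kept = [c for c in columns if maf(c) >= maf_threshold]
    m = len(kept)
    if m == 0:
        return 0.0
    squared_sum = sum(correlation(a, b) ** 2 for a in kept for b in kept)
    return m + 1.0 - squared_sum / m


# Three SNPs (one list per column); NaN marks a missing genotype.
nan = float("nan")
snps = [
    [0, 1, 2, 1, 0, 2, 1, 0],
    [0, 1, 2, 1, 0, 2, nan, 0],
    [2, 0, 1, 0, 1, 0, 2, 1],
]
print(effective_tests(snps))
```

The estimate equals 1 when all retained SNPs are perfectly correlated and M when they are mutually uncorrelated, which are the two limiting cases of the formula.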