PGA Readme File

Introduction:

PGA is a package of algorithms and graphical user interfaces developed in Matlab for power and sample size calculation in case-control genetic association studies. The software covers a wide variety of genetic models and statistical constraints, and can therefore support decision making for case-control association studies of candidate genes, fine-mapping studies, and whole-genome scans.

Download and installation:

  • To install PGA, save the pga.exe file in an appropriate folder on your disk, then double-click it to extract the PGA folder to a location of your choice on your hard drive.
  • Users without Matlab software should first install the MATLAB Component Runtime (MCR) on their computers. To install the MCR, double-click the ‘MCRInstaller.exe’ file and follow the installation instructions.
  • Confirm that the MCR is installed in C:\Program Files\MATLAB\MATLAB_Component_Runtime\v76 or in the folder you selected during installation. Once the MCR is installed, you can download and run the PGA stand-alone GUIs (pga1.exe, pga2.exe and edf.exe).

** The MCR needs to be installed only once.

Software description:

PGA1:

PGA1 calculates and plots the relation between statistical power and sample size for a variety of genetic and statistical parameters. The user can set the following parameters:

  • Type of genetic variation – SNP or Haplotype
  • Genetic mode of inheritance – Recessive, Dominant, Co-dominant(1df) or Co-dominant(2df).
  • Relative risk (RR) – The relative risk of the disease predisposing alleles. A second relative risk (RR2) is applicable only in Co-dominant model with 2 degrees of freedom (2df).
  • Linkage disequilibrium (LD) – The linkage disequilibrium value (in the form of r2 or D’) between the causative SNP and the genotyped marker.
  • Disease prevalence.
  • Disease allele frequency - the frequency of the disease predisposing allele.
  • Marker allele frequency – the allele frequency of the genotyped marker.
  • Effective degrees of freedom (EDF) - accounts for multiple testing in the study.
  • Alpha (Type I error).
  • Control to Case ratio.
  • Maximum sample size – the maximum number of cases to be considered in the calculations.

** The parameter values in the legend of the graph are ordered according to their order in the GUI.
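
To make the roles of relative risk, disease prevalence, disease allele frequency and mode of inheritance more concrete, the sketch below derives expected genotype frequencies in cases and controls from those four inputs. It is not PGA's own code. The function names, the assumption of Hardy-Weinberg proportions, and the mapping of Co-dominant(1df) to a multiplicative model (heterozygote risk RR, homozygote risk RR squared) are all illustrative assumptions, and the sketch ignores LD between the causative SNP and the genotyped marker.

```python
def genotype_relative_risks(model, rr, rr2=None):
    """Relative risks for 0, 1 and 2 copies of the risk allele.

    The model names and their mapping are assumptions for illustration.
    """
    if model == "dominant":
        return (1.0, rr, rr)
    if model == "recessive":
        return (1.0, 1.0, rr)
    if model == "codominant_1df":
        # Assumed multiplicative: each risk allele multiplies risk by rr.
        return (1.0, rr, rr * rr)
    if model == "codominant_2df":
        if rr2 is None:
            raise ValueError("codominant_2df needs a second relative risk")
        return (1.0, rr, rr2)
    raise ValueError("unknown model: %s" % model)


def case_control_genotype_freqs(model, rr, prevalence, allele_freq, rr2=None):
    """Expected genotype frequencies (0/1/2 risk alleles) in cases and controls."""
    p = allele_freq
    # Population genotype frequencies under Hardy-Weinberg (assumption).
    population = ((1 - p) ** 2, 2 * p * (1 - p), p ** 2)
    risks = genotype_relative_risks(model, rr, rr2)
    # Baseline penetrance chosen so the average penetrance equals prevalence.
    baseline = prevalence / sum(g * r for g, r in zip(population, risks))
    penetrance = [baseline * r for r in risks]
    if max(penetrance) > 1.0:
        raise ValueError("parameters imply a penetrance above 1")
    cases = [g * f / prevalence for g, f in zip(population, penetrance)]
    controls = [g * (1 - f) / (1 - prevalence)
                for g, f in zip(population, penetrance)]
    return cases, controls


cases, controls = case_control_genotype_freqs(
    "dominant", rr=2.0, prevalence=0.1, allele_freq=0.2)
print(cases, controls)
```

The difference between the case and control frequencies is what an association test detects. It grows with the relative risk, which is why power increases with RR at a fixed sample size.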

PGA2:

PGA2 calculates and plots the minimum detectable relative risk (MDRR) for genotyped markers with different allele frequencies. The user can set the following parameters:

  • Type of genetic variation – SNP or Haplotype.
  • Genetic mode of inheritance – Recessive, Dominant, Co-dominant(1df) or Co-dominant(2df).
  • Relative risk ratio (RR1/RR2) – The ratio between the two relative risks of the disease predisposing genetic alleles. This parameter is applicable only in the Co-dominant(2df) model.
  • Linkage disequilibrium (LD) – The linkage disequilibrium value (in the form of r2 or D’) between the causative SNP and the genotyped marker.
  • Effective degrees of freedom (EDF) - accounts for multiple testing in the study.
  • Disease allele frequency - the frequency of the disease predisposing allele.
  • Case number – the number of cases in the study.
  • Control to Case ratio.
  • Disease prevalence.
  • Power (1- Type II error).
  • Alpha (Type I error).
  • Maximum sample size – the maximum number of cases to be considered in the calculations.

** The parameter values in the legend of the graph are ordered according to their order in the GUI.

EDF:

EDF calculates the effective degrees of freedom for a set of SNPs in linkage disequilibrium. It accepts SNP genotype data files from HapMap (http://www.hapmap.org) or tab-delimited text files with SNP genotypes (in columns) coded as 0/1/2 for major homozygous, heterozygous and minor homozygous, respectively, with missing data encoded as NaN. Please see the example input files (hapmap_example.pdf, genotype_example.pdf). From these data, it computes a summary measure of the EDF (Nyholt et al., 2004) and produces a map of linkage disequilibrium patterns (r2) for the SNPs in the dataset. SNPs can be filtered by allele frequency by setting a minor allele frequency (MAF) threshold.

The resulting EDF value can be used as the EDF parameter in PGA1 and PGA2 computations.
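
The steps described above (MAF filtering, pairwise r2, and a summary EDF) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the EDF program's implementation. It assumes the Nyholt-style estimate Meff = 1 + (M - 1)(1 - Var(lambda)/M), where lambda are the eigenvalues of the SNP correlation matrix, and it assumes missing genotypes are handled by pairwise deletion. The function names and the default threshold are hypothetical.

```python
import math


def minor_allele_frequency(column):
    """MAF of one SNP column coded 0/1/2, ignoring NaN entries."""
    observed = [g for g in column if not math.isnan(g)]
    if not observed:
        return 0.0
    freq = sum(observed) / (2.0 * len(observed))
    return min(freq, 1.0 - freq)


def pearson_r(x, y):
    """Pearson correlation over samples where both genotypes are present."""
    pairs = [(a, b) for a, b in zip(x, y)
             if not (math.isnan(a) or math.isnan(b))]
    n = len(pairs)
    if n < 2:
        return 0.0
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def effective_tests(columns, maf_threshold=0.05):
    """Return (Meff, r2 matrix) for SNP columns passing the MAF threshold."""
    kept = [c for c in columns if minor_allele_frequency(c) >= maf_threshold]
    m = len(kept)
    if m == 0:
        return 0.0, []
    r2 = [[1.0 if i == j else pearson_r(kept[i], kept[j]) ** 2
           for j in range(m)] for i in range(m)]
    if m == 1:
        return 1.0, r2
    # For a symmetric correlation matrix the eigenvalues sum to m and their
    # squares sum to the sum of all squared correlations, so the sample
    # variance of the eigenvalues needs no eigendecomposition.
    sum_r2 = sum(sum(row) for row in r2)
    var_lambda = (sum_r2 - m) / (m - 1)
    meff = 1.0 + (m - 1) * (1.0 - var_lambda / m)
    return meff, r2


nan = float("nan")
snps = [
    [0, 0, 2, 2, nan],   # SNP A, one missing genotype
    [0, 0, 2, 2, 1],     # SNP B, in perfect LD with A where both are observed
    [0, 2, 0, 2, 1],     # SNP C, uncorrelated with A and B
]
meff, r2 = effective_tests(snps)
print(meff)
```

In this toy input, two of the three SNPs carry the same information, so the estimate falls between 1 and 3. Independent SNPs give Meff equal to the number of SNPs, and SNPs in complete LD give Meff equal to 1.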
