PGA Readme File

Introduction:

PGA is a package of algorithms and graphical user interfaces developed in Matlab for power and sample size calculation in case-control genetic association studies. The software covers a wide variety of genetic models and statistical constraints and hence may facilitate decision making for case-control association studies of candidate genes, fine-mapping studies, and whole-genome scans.

Download and installation:

  • To install PGA, save the pga.exe file in an appropriate folder on your disk. Click on it to extract the folder to a designated location on your hard drive.
  • Users without Matlab software should first install the MATLAB Component Runtime (MCR) on their computers. To install the MCR, double-click on the ‘MCRInstaller.exe’ file and follow the installation instructions.
  • Ensure that the MCR is installed on your computer in C:\Program Files\MATLAB\MATLAB_Component_Runtime\v76 or in the folder you selected during the installation process. Once the MCR is installed, you can download and run the different PGA stand-alone GUIs (pga1.exe, pga2.exe and edf.exe).

** The MCR needs to be installed only once.

Software description:

PGA1:

PGA1 calculates and plots the relation between statistical power and sample size for a variety of genetic and statistical parameters. The user can specify the following parameters:

  • Type of genetic variation – SNP or Haplotype.
  • Genetic mode of inheritance – Recessive, Dominant, Co-dominant(1df) or Co-dominant(2df).
  • Relative risk (RR) – The relative risk of the disease predisposing alleles. A second relative risk (RR2) is applicable only in the Co-dominant model with 2 degrees of freedom (2df).
  • Linkage disequilibrium (LD) – The linkage disequilibrium value (in the form of r² or D’) between the causative SNP and the genotyped marker.
  • Disease prevalence.
  • Disease allele frequency – the frequency of the disease predisposing allele.
  • Marker allele frequency – the allele frequency of the genotyped marker.
  • Effective degrees of freedom (EDF) – accounts for multiple testing in the study.
  • Alpha (Type I error).
  • Control to Case ratio.
  • Maximum sample size – the maximum number of cases to be considered in the calculations.

** The parameter values in the legend of the graph are ordered according to their order in the GUI.
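
As a rough illustration of how these parameters combine into a power curve, the sketch below approximates the power of a two-sided allelic test under a normal approximation. It is not PGA's own algorithm. It assumes Hardy-Weinberg equilibrium in the population, that the genotyped marker is the causative SNP itself (so LD is ignored), that Co-dominant(1df) means a multiplicative model, and that the EDF enters as a simple division of alpha. The function names (genotype_risks, allele_frequencies, power) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def genotype_risks(mode, rr, rr2=None):
    """Relative risks (heterozygote, risk homozygote) for a mode of inheritance.

    Treating Co-dominant(1df) as multiplicative is an assumption of this sketch.
    """
    if mode == "dominant":
        return rr, rr
    if mode == "recessive":
        return 1.0, rr
    if mode == "codominant_1df":
        return rr, rr * rr
    if mode == "codominant_2df":
        return rr, rr2
    raise ValueError(f"unknown mode: {mode}")


def allele_frequencies(p, prevalence, rr_het, rr_hom):
    """Risk-allele frequency in cases and in controls, assuming HWE."""
    geno = [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]  # 0, 1, 2 risk alleles
    rel = [1.0, rr_het, rr_hom]
    baseline = prevalence / sum(g * r for g, r in zip(geno, rel))
    pen = [baseline * r for r in rel]  # penetrance of each genotype
    if max(pen) > 1.0:
        raise ValueError("parameters imply a penetrance above 1")
    cases = [g * f / prevalence for g, f in zip(geno, pen)]
    controls = [g * (1 - f) / (1 - prevalence) for g, f in zip(geno, pen)]
    return cases[1] / 2 + cases[2], controls[1] / 2 + controls[2]


def power(n_cases, p, prevalence, rr_het, rr_hom,
          alpha=0.05, edf=1.0, ratio=1.0):
    """Approximate power; ratio is the control-to-case ratio."""
    p_case, p_ctrl = allele_frequencies(p, prevalence, rr_het, rr_hom)
    a = 2 * n_cases          # alleles sampled from cases
    b = 2 * n_cases * ratio  # alleles sampled from controls
    pooled = (a * p_case + b * p_ctrl) / (a + b)
    se_null = sqrt(pooled * (1 - pooled) * (1 / a + 1 / b))
    se_alt = sqrt(p_case * (1 - p_case) / a + p_ctrl * (1 - p_ctrl) / b)
    # Assumed multiple-testing correction: divide alpha by the EDF.
    z_crit = NormalDist().inv_cdf(1 - alpha / edf / 2)
    return NormalDist().cdf((abs(p_case - p_ctrl) - z_crit * se_null) / se_alt)


# Power versus number of cases, up to a maximum sample size of 1000.
rr_het, rr_hom = genotype_risks("codominant_1df", 1.5)
for n in range(200, 1001, 200):
    print(n, round(power(n, 0.2, 0.05, rr_het, rr_hom, edf=10), 3))
```

The loop at the end mirrors what the PGA1 plot shows: one power value per candidate number of cases, here for a disease allele frequency of 0.2, a prevalence of 0.05 and an EDF of 10.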

PGA2:

PGA2 calculates and plots the minimum detectable relative risk (MDRR) for genotyped markers with different allele frequencies. The user can specify the following parameters:

  • Type of genetic variation – SNP or Haplotype.
  • Genetic mode of inheritance – Recessive, Dominant, Co-dominant(1df) or Co-dominant(2df).
  • Relative risk ratio (RR1/RR2) – The ratio between the two relative risks of the disease predisposing genetic alleles. This parameter is applicable only in the Co-dominant(2df) model.
  • Linkage disequilibrium (LD) – The linkage disequilibrium value (in the form of r² or D’) between the causative SNP and the genotyped marker.
  • Effective degrees of freedom (EDF) – accounts for multiple testing in the study.
  • Disease allele frequency – the frequency of the disease predisposing allele.
  • Case number – the number of cases in the study.
  • Control to Case ratio.
  • Disease prevalence.
  • Power (1- Type II error).
  • Alpha (Type I error).
  • Maximum sample size – the maximum number of cases to be considered in the calculations.

** The parameter values in the legend of the graph are ordered according to their order in the GUI.
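
The idea of a minimum detectable relative risk can be sketched as a search for the smallest RR that reaches the requested power. The example below is only an illustration, not PGA's implementation. It assumes a multiplicative (per-allele) risk model, a two-sided allelic test under a normal approximation, a marker that is itself the causative SNP, and alpha divided by the EDF. The names power_multiplicative and mdrr are hypothetical, and the code does not check that the implied penetrances stay below 1 across the whole search range.

```python
from math import sqrt
from statistics import NormalDist


def power_multiplicative(rr, n_cases, p, prevalence, alpha, ratio=1.0):
    """Approximate power of a two-sided allelic test at per-allele risk rr."""
    p_case = p * rr / (1 - p + p * rr)
    # The population frequency is a prevalence-weighted mix of cases and controls.
    p_ctrl = (p - prevalence * p_case) / (1 - prevalence)
    a = 2 * n_cases          # alleles sampled from cases
    b = 2 * n_cases * ratio  # alleles sampled from controls
    pooled = (a * p_case + b * p_ctrl) / (a + b)
    se_null = sqrt(pooled * (1 - pooled) * (1 / a + 1 / b))
    se_alt = sqrt(p_case * (1 - p_case) / a + p_ctrl * (1 - p_ctrl) / b)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf((abs(p_case - p_ctrl) - z_crit * se_null) / se_alt)


def mdrr(n_cases, p, prevalence, target_power=0.8, alpha=0.05,
         edf=1.0, ratio=1.0, rr_max=10.0):
    """Smallest RR in [1, rr_max] reaching target_power, or None if none does."""
    adj_alpha = alpha / edf  # assumed multiple-testing correction

    def reaches_target(rr):
        return power_multiplicative(
            rr, n_cases, p, prevalence, adj_alpha, ratio) >= target_power

    if not reaches_target(rr_max):
        return None
    lo, hi = 1.0, rr_max
    for _ in range(60):  # bisection; power rises with rr in this setting
        mid = (lo + hi) / 2
        if reaches_target(mid):
            hi = mid
        else:
            lo = mid
    return hi


# MDRR across marker allele frequencies for 1000 cases and 1000 controls.
for freq in (0.05, 0.1, 0.2, 0.3, 0.4, 0.5):
    print(freq, round(mdrr(1000, freq, 0.01), 3))
```

Bisection is used here because power increases with RR for a risk allele, so the threshold RR is bracketed between 1 and rr_max.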

EDF:

EDF calculates the effective degrees of freedom for a particular set of SNPs in linkage disequilibrium. It accepts SNP genotype data files from HapMap (http://www.hapmap.org) or tab-delimited text files with SNP genotypes (in columns) coded as 0/1/2 for major homozygous, heterozygous and minor homozygous, respectively, with missing data encoded as NaN. Please see the example input files (hapmap_example.pdf, genotype_example.pdf). From these data, it computes a summary measure of the EDF (Nyholt et al., 2004) and produces a map of linkage disequilibrium patterns (r²) for the SNPs in the dataset. It also allows SNPs to be filtered by allele frequency by setting a minor allele frequency (MAF) threshold.

The resulting EDF value can be incorporated into PGA1 and PGA2 computations.
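
To make the EDF computation more concrete, the sketch below applies the effective-number formula from the cited reference, Meff = 1 + (M - 1)(1 - Var(λ)/M), where λ are the eigenvalues of the SNP correlation matrix, to 0/1/2-coded genotypes. It may differ in detail from what the EDF program does. It assumes a tab-delimited input without a header row, handles NaN by pairwise deletion, and obtains the eigenvalue variance from the sum of squared correlations instead of an explicit eigendecomposition. All function names are hypothetical.

```python
import math


def parse_genotypes(text):
    """Read tab-delimited rows (one per individual) into one list per SNP."""
    rows = [[float(v) for v in line.split("\t")]
            for line in text.strip().splitlines()]
    return [list(col) for col in zip(*rows)]


def minor_allele_frequency(snp):
    """MAF of one SNP column coded 0/1/2, ignoring NaN entries."""
    observed = [g for g in snp if not math.isnan(g)]
    freq = sum(observed) / (2 * len(observed))
    return min(freq, 1 - freq)


def correlation(x, y):
    """Pearson correlation over individuals genotyped at both SNPs."""
    pairs = [(a, b) for a, b in zip(x, y)
             if not (math.isnan(a) or math.isnan(b))]
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def effective_df(snps, maf_threshold=0.0):
    """Effective number of independent SNPs after MAF filtering."""
    kept = [s for s in snps if minor_allele_frequency(s) >= maf_threshold]
    m = len(kept)
    if m < 2:
        return float(m)
    # Sum of squared correlations over the full M x M matrix (diagonal = 1).
    sum_r2 = m + 2 * sum(correlation(kept[i], kept[j]) ** 2
                         for i in range(m) for j in range(i + 1, m))
    # The eigenvalues of a correlation matrix have mean 1 and their squares
    # sum to sum_r2, which gives their sample variance directly.
    var_eig = (sum_r2 - m) / (m - 1)
    return 1 + (m - 1) * (1 - var_eig / m)


# Three SNPs, four individuals; the third SNP has one missing genotype.
example = "0\t0\t0\n0\t2\t0\n2\t0\t2\n2\t2\tNaN\n"
print(effective_df(parse_genotypes(example), maf_threshold=0.05))
```

Fully correlated SNPs give a value of 1 and mutually uncorrelated SNPs give the number of SNPs, so the result always lies between those two extremes.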
