Analysis Tools

  • Adaptive Rank Truncated Product - Version 2 (ARTP2)

    ARTP2 is an R package for biological pathway analysis or pathway meta-analysis of genome-wide association studies (GWAS). It also provides tools for gene-level tests as a special case. ARTP2 is an enhanced version of two previously released packages, ARTP and AdaJoint.

  • Age Period Cohort Analysis Web Tool

    A panel of easy-to-interpret estimable APC functions and corresponding Wald tests in R code that can be accessed through a user-friendly web tool.

  • Association analysis based on SubSETs (ASSET)

    A subset-based approach that improves power and interpretation for combined analysis of genetic association studies of heterogeneous traits.

  • BaDGE (Bayesian model for Detecting Gene Environment interaction)

    Bayesian model for Detecting Gene Environment interaction

  • Bayesian Subset Regression (BSR)

    BSR (Bayesian Subset Regression) is an R package that implements the Bayesian subset modeling procedure for high-dimensional generalized linear models.

  • Biomarker Tools

    The Biomarker Tools set estimates risk stratification from early biomarker data and provides strategies to advance biomarkers or other risk measures identified through case-control studies to clinical or public health applications.

  • CBRM

    An R package for testing Calibration of Binary Risk Model (CBRM) using different goodness-of-fit statistics

  • CNVfam

    CNVfam is a software package for jointly detecting copy number variations (CNV) in nuclear families genotyped using the Illumina platform.

  • CompareTests

    CompareTests is an R package to estimate agreement and diagnostic accuracy statistics for two diagnostic tests when one is conducted on only a subsample of specimens. A standard test is observed on all specimens.

  • CRaVe

    Software package designed to perform a range of association tests between sets of SNPs and a phenotype.

  • CGEN R Package

    CGEN (Case-control.Genetics) is an R package for analyzing genetic data on case-control samples, with particular emphasis on novel methods for detecting Gene-Gene and Gene-Environment interactions.

  • Extremely small Pvalue Evaluation for Resampling-based Test

    EXPERT is an R package for rapid evaluation of extremely small p-values for resampling-based tests.

  • ezQTL: Web Platform for Interactive Visualization and Colocalization of Quantitative Trait Loci (QTL) and GWAS

    Drs. Tongwu Zhang and Jiyeon Choi developed ezQTL, a web-based bioinformatic application to interactively visualize and analyze genetic association data such as GWAS and molecular QTLs under different linkage disequilibrium (LD) patterns. ezQTL facilitates mapping disease susceptibility regions and assists researchers in characterizing and prioritizing functional genes and variants based on the genotype-phenotype associations.

  • iCARE: An R Package to Compute Individualized Coherent Absolute Risk Estimators

    The iCARE R Package allows researchers to quickly build models for absolute risk, and apply them to estimate an individual's risk of developing disease during a specified time interval, based on a set of user defined input parameters.

  • INPower

    IN.power is an R package for estimating the number of susceptibility SNPs and power of future studies.

  • Indirect Relative Risk Adjustment (IRRAD)

    IRRAD is an Excel spreadsheet that extends the “Axelson adjustment” for binary variables to adjust observed RRs for a confounding variable when exposure and confounder have multiple categories or to define those characteristics that a confounder must have to explain an observed association.

  • Interactive Radioepidemiological Computer Program (IRCP)

    Background explanation of the Interactive Radioepidemiological Computer Program

  • KinCohort

    Different approaches for handling varied error structures in studies of irradiated populations

  • LDlink

    LDlink is a suite of web-based applications to easily and efficiently interrogate linkage disequilibrium in population groups, with query flexibility and interactive visualization of results.

  • MultAssoc

    MultiAssoc is a MATLAB software package for testing the association of a disease with a group of SNPs after accounting for their interaction with another group of SNPs or environmental exposures.

  • Nested Cohort

    NestedCohort is an R software package for fitting Kaplan-Meier and Cox Models to estimate standardized survival and attributable risks for studies where covariates of interest are observed on only a sample of the cohort.

  • PCAmatchR: Match Cases to Controls Based on Genotype Principal Components

    PCAmatchR is an open-source R software package that enables users to match cases and controls for more accurate GWAS analyses. It converts user-provided principal components (PC) into a Mahalanobis distance metric for selecting a set of well-matched controls for each case.

  • POWER V3.0 Software

    POWER V3.0 Software is used for computing sample size and power for binary outcome studies.

  • Power for Genetic Association Analyses (PGA)

    PGA is a software package containing algorithms and graphical user interfaces developed in Matlab for power and sample size calculation under various genetic models and statistical constraints.

  • Prevalence Incidence Mixture Models

    The R package and web tool fit Prevalence Incidence Mixture models to left-censored and irregularly interval-censored time-to-event data, which are commonly found in screening cohorts assembled from electronic health records. Absolute and relative risks can be estimated under simple random sampling and stratified sampling (for target populations, both the superpopulation and finite-population approaches are supported). Non-parametric (absolute risks only), semi-parametric, weakly parametric (using B-splines), and some fully parametric (such as the logistic-Weibull) models are supported.


    A subregion-based burden test for simultaneous identification of susceptibility loci and subregions in rare-variant association studies.

  • segCNV

    SegCNV is a software package, implemented in C++, to detect germline copy number variations in SNP array data.

  • Semiparametric Kernel Independence Test (SKIT)

    Semiparametric Kernel Independence Test (SKIT) is an R package that conducts the test of independence between two vectors when there are excess zeros.

  • Sample size calculations for case-control studies

    This R package can be used to calculate the required sample size for unconditional multivariate analyses of unmatched case-control studies. The sample sizes are for a scalar exposure effect, such as a binary, ordinal, or continuous exposure. Sample sizes can also be computed for scalar interaction effects. The analyses account for the effects of potential confounder variables that are also included in the multivariate logistic model.

  • Spatial Power: Estimate Statistical Power of Spatial Clusters

    Spatial Power is a suite of web-based applications designed to perform power calculations of spatial statistics easily and efficiently, to support accurate study design for cancer epidemiology studies.

  • subHMM

    subHMM is an R package that implements a hidden Markov modeling approach for identifying tumor subclones in next-generation sequencing studies.

  • SUITOR: Selecting the Number of Mutational Signatures through Cross-Validation

    SUITOR is an R package that provides an unsupervised cross-validation tool for selecting the optimal number of signatures in mutational signature analysis.

  • The PLCO Atlas, GWAS Explorer

    The PLCO Atlas is an interactive tool created to facilitate data sharing with the public. It enables researchers to search for, visualize, and download aggregated association results from the PLCO genome-wide association analyses (GWAS).

  • TREAT (TREe-based Association Test)

    TREAT is an R package for detecting complex joint effects in case-control studies. The test statistic is derived from a tree-structured model by recursively partitioning the data. An ultra-fast algorithm evaluates the significance of the association between a candidate gene and the disease outcome.

  • WeightCalibSurvival R Package

    The WeightCalibSurvival R package uses weight calibration to improve efficiency when estimating pure risks from additive and Cox hazards models under two-phase designs such as the nested case-control design.
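
The PCAmatchR entry above describes converting principal components (PCs) into a Mahalanobis distance metric for selecting well-matched controls for each case. The following Python sketch illustrates that general idea only; it is not PCAmatchR's implementation. It assumes the PCs are uncorrelated, so the covariance matrix is diagonal and the Mahalanobis distance reduces to a variance-scaled Euclidean distance, and it uses a simple greedy assignment. The function names and the sample data are hypothetical.

```python
import math
from statistics import pvariance


def diagonal_mahalanobis(a, b, variances):
    """Mahalanobis distance under the assumption of a diagonal covariance matrix."""
    return math.sqrt(sum((x - y) ** 2 / v for x, y, v in zip(a, b, variances)))


def match_controls(cases, controls, k=1):
    """Greedily assign each case its k nearest unused controls.

    cases, controls: dicts mapping a sample id to a list of PC scores.
    Returns a dict mapping each case id to a list of matched control ids.
    """
    samples = list(cases.values()) + list(controls.values())
    n_pcs = len(samples[0])
    # Per-PC variance over all samples (assumption: PCs are uncorrelated
    # and every PC has nonzero variance).
    variances = [pvariance([s[j] for s in samples]) for j in range(n_pcs)]

    available = dict(controls)
    matches = {}
    for case_id, case_pcs in cases.items():
        # Rank the remaining controls by distance to this case.
        ranked = sorted(
            available,
            key=lambda cid: diagonal_mahalanobis(case_pcs, available[cid], variances),
        )
        matches[case_id] = ranked[:k]
        # Matching is without replacement: remove the chosen controls.
        for cid in ranked[:k]:
            del available[cid]
    return matches


# Hypothetical PC scores for two cases and three candidate controls.
cases = {"case1": [0.0, 0.0], "case2": [5.0, 5.0]}
controls = {"ctrlA": [0.1, -0.1], "ctrlB": [4.8, 5.2], "ctrlC": [10.0, -10.0]}
matches = match_controls(cases, controls, k=1)
print(matches)
```

Greedy matching depends on the order in which cases are processed, so an optimal-matching algorithm may be preferable when controls are scarce.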