
Tools & Resources

Biostatistics Branch investigators develop statistical and computational tools for epidemiologic and laboratory studies and distribute those tools to collaborators and the greater scientific community.

BB Descriptive Epidemiology Resources

Age Period Cohort Analysis Web Tool

A panel of easy-to-interpret estimable APC functions and corresponding Wald tests in R code that can be accessed through a user-friendly web tool.

BB Risk Assessment Tools

BCRA R Package

BCRA is an R package that projects absolute risk of invasive breast cancer according to NCI’s Breast Cancer Risk Assessment Tool (BCRAT) algorithm for specified race/ethnic groups and age intervals.
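As a rough illustration of what projecting absolute risk over an age interval involves, the Python sketch below accumulates risk across one-year intervals with piecewise-constant hazards while accounting for competing mortality. It is not the BCRAT algorithm or the BCRA package's code: the function name `absolute_risk`, its parameters, and the hazard values in the usage lines are hypothetical, and the real tool relies on published age- and race-specific rates and a relative-risk model that this sketch does not reproduce.

```python
from math import exp


def absolute_risk(cause_hazards, competing_hazards, relative_risk=1.0):
    """Probability of developing the disease across consecutive one-year
    intervals, given piecewise-constant hazards and competing mortality.

    cause_hazards     -- baseline disease hazard per interval (hypothetical input)
    competing_hazards -- hazard of death from other causes per interval
    relative_risk     -- multiplier applied to the baseline disease hazard
    """
    surviving = 1.0  # P(alive and disease-free at the start of the interval)
    risk = 0.0
    for h1, h2 in zip(cause_hazards, competing_hazards):
        h1 = h1 * relative_risk
        total = h1 + h2
        if total > 0:
            # P(any event in interval) times the share attributable to the disease
            risk += surviving * (h1 / total) * (1.0 - exp(-total))
        surviving *= exp(-total)
    return risk


# Illustrative values only; these are not published incidence or mortality rates.
five_year_risk = absolute_risk([0.002] * 5, [0.01] * 5, relative_risk=1.5)
print(f"Projected 5-year absolute risk: {five_year_risk:.4f}")
```

Accounting for the competing hazard is the reason an absolute risk projection is lower than the cumulative incidence implied by the disease hazard alone.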

Breast Cancer Risk Assessment SAS Macro (Version 4, Gail Model)

A SAS macro (commonly referred to as the Gail Model) that projects absolute risk of invasive breast cancer according to NCI’s Breast Cancer Risk Assessment Tool (BCRAT) algorithm for specified race/ethnic groups and age intervals.

Breast Cancer with Mammographic Density Risk Assessment SAS Macro

A SAS macro that projects absolute invasive breast cancer risk for white women based on measurements of mammographic density and other risk factors.

Breast, Endometrial, and Ovarian Risk Assessment SAS Macros

Software that projects absolute risk for breast, endometrial, and ovarian cancer in Caucasian and African American women.

BLM (Binomial Linear Model) - R Package

BLM is an R package for estimating absolute risk and risk differences from cohort data with a binomial linear or LEXPIT regression model.

Asian Pacific Islander American Women - Breast Cancer Risk Assessment SAS Macro

Software that projects absolute breast cancer risk over defined age intervals for Asian and Pacific Islander American women with specific risk factors.

CARE Model SAS Macro: Breast Cancer Risk Assessment for African American Women

The CARE Model is a SAS macro that allows researchers to estimate an African American woman's risk of developing invasive breast cancer over specified age intervals.

Colon Cancer Risk Assessment - Gauss Program

An executable file (in GAUSS) that projects absolute colon cancer risk (with confidence intervals) according to NCI’s Colorectal Cancer Risk Assessment Tool (CCRAT) algorithm. GAUSS is not needed to run the program.

Colon Cancer Risk Assessment - SAS Macro

A SAS macro that projects absolute risk of colon cancer according to NCI’s Colorectal Cancer Risk Assessment Tool (CCRAT) algorithm.

Lung Cancer Risk Models for Screening (R package: lcrisks)

The R package lcrisks calculates individual risks of lung cancer and lung cancer death, in both the absence and presence of screening, based on the following covariates: age, education, sex, race, smoking intensity/duration/quit-years, body mass index, family history of lung cancer, and self-reported emphysema. In the presence of CT screening akin to the NLST (3 yearly screens, 5 years of follow-up), it uses the covariates to estimate risk of false-positive CT screen as well as the reduction in risk of lung cancer death and increase in risk of lung cancer screening.

lcmodels

The R package lcmodels provides individual risks of lung cancer and lung cancer death based on various published papers: Bach et al., 2003; Spitz et al., 2007; Cassidy et al., 2008 (LLP); Hoggart et al., 2012; Tammemagi et al., 2013; Marcus et al., 2015 (LLPi); Wilson and Weissfeld, 2015 (Pittsburgh); Katki et al., 2016 (LCRAT, LCDRAT, and versions constrained to a few variables); Katki et al., 2018. The package also estimates the Life Years Gained From Screening-CT (LYFS-CT) as per Cheung et al., 2019, which requires the same variables as LCDRAT plus 12 additional comorbidities and the year of patient assessment.

Thyroid Cancer Risk Assessment Tool (tcrat)

The R package thyroid implements a risk prediction model developed by NCI researchers to calculate the absolute risk of developing a second primary thyroid cancer (SPTC) in individuals who were diagnosed with a cancer during their childhood.

BB Analysis Tools

Age Period Cohort Analysis Web Tool

A panel of easy-to-interpret estimable APC functions and corresponding Wald tests in R code that can be accessed through a user-friendly web tool.

Adaptive Rank Truncated Product - Version 2 (ARTP2)

ARTP2 is an R package for biological pathway analysis or pathway meta-analysis in genome-wide association studies (GWAS). It also provides tools for gene-level tests as a special case. ARTP2 is an enhanced version of two previously released packages, ARTP and AdaJoint.

Association analysis based on SubSETs (ASSET)

A subset-based approach that improves power and interpretation for combined analysis of genetic association studies of heterogeneous traits.

BaDGE (Bayesian model for Detecting Gene Environment interaction)

Bayesian model for Detecting Gene Environment interaction

Bayesian Subset Regression (BSR)

BSR (Bayesian Subset Regression) is an R package that implements the Bayesian subset modeling procedure for high-dimensional generalized linear models.

CBRM

An R package for testing Calibration of Binary Risk Model (CBRM) using different goodness-of-fit statistics

CGEN R Package

CGEN (Case-control.Genetics) is an R package for analyzing genetic data on case-control samples, with particular emphasis on novel methods for detecting Gene-Gene and Gene-Environment interactions.

CNVfam

CNVfam is a software package for jointly detecting copy number variations (CNV) in nuclear families genotyped using the Illumina platform.

CompareTests

CompareTests is an R package to estimate agreement and diagnostic accuracy statistics for two diagnostic tests when one is conducted on only a subsample of specimens. A standard test is observed on all specimens.

CRaVe

Software package designed to perform a range of association tests between sets of SNPs and a phenotype.

Extremely small Pvalue Evaluation for Resampling-based Test

EXPERT is an R package for rapid evaluation of extremely small p-values in resampling-based tests.

Het-Tree

This is the R package implementing the testing procedure described in the referred manuscript

iCARE: An R package to compute Individualized Coherent Absolute Risk Estimators

The iCARE R package allows researchers to quickly build models for absolute risk and apply them to estimate an individual's risk of developing disease during a specified time interval, based on a set of user-defined input parameters.

INPower

IN.power is an R package for estimating the number of susceptibility SNPs and power of future studies.

KinCohort

Different approaches for handling varied error structures in studies of irradiated populations

metapack

An R Package for Bayesian Meta-Analysis and Network Meta-Analysis with a Unified Formula Interface

MultAssoc

MultiAssoc is a MATLAB software package for testing the association of a disease with a group of SNPs after accounting for their interaction with another group of SNPs or environmental exposures.

Nested Cohort

NestedCohort is an R software package for fitting Kaplan-Meier and Cox Models to estimate standardized survival and attributable risks for studies where covariates of interest are observed on only a sample of the cohort.

Prevalence Incidence Mixture Models

The R package and web tool fit Prevalence Incidence Mixture models to left-censored and irregularly interval-censored time-to-event data of the kind commonly found in screening cohorts assembled from electronic health records. Absolute and relative risk can be estimated under simple random sampling and stratified sampling (both superpopulation and finite-population approaches are supported for target populations). Non-parametric (absolute risks only), semi-parametric, weakly parametric (using B-splines), and some fully parametric (such as the logistic-Weibull) models are supported.

segCNV

SegCNV is a software package, implemented in C++, to detect germline copy number variations in SNP array data.

SUITOR: Selecting the Number of Mutational Signatures through Cross-Validation

SUITOR is an R package, an unsupervised cross-validation tool to select the optimal number of signatures for mutational signature analysis.

TREAT (TREe-based Association Test)

TREAT is an R package for detecting complex joint effects in case-control studies. The test statistic is derived from a tree-structured model built by recursively partitioning the data. An ultra-fast algorithm evaluates the significance of the association between a candidate gene and the disease outcome.

BB Study Design & Planning Tools

POWER V3.0 Software

POWER V3.0 Software is used for computing sample size and power for binary outcome studies.
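To give a sense of the kind of calculation involved in sample size and power for a binary outcome, the Python sketch below uses the standard normal-approximation formulas for a two-sided comparison of two independent proportions with equal group sizes. It is a textbook approximation, not the method implemented in POWER V3.0; the function names and the example proportions are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions.

    Ignores the negligible probability of rejecting in the wrong direction.
    """
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    se_null = sqrt(2 * p_bar * (1 - p_bar) / n_per_group)  # SE under H0
    se_alt = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_group)  # SE under H1
    return _Z.cdf((abs(p1 - p2) - z_alpha * se_null) / se_alt)


def sample_size_two_proportions(p1, p2, power=0.8, alpha=0.05):
    """Smallest per-group sample size reaching the target power (same approximation)."""
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    z_beta = _Z.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Hypothetical design: outcome proportions of 10% and 20% in the two groups.
n = sample_size_two_proportions(0.10, 0.20, power=0.8)
print(n, round(power_two_proportions(0.10, 0.20, n), 3))
```

Dedicated software typically supports more designs (for example, matched or unequal-allocation studies) than this approximation covers.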

Power for Genetic Association Analyses (PGA)

PGA is a software package containing algorithms and graphical user interfaces developed in MATLAB for power and sample size calculation under various genetic models and statistical constraints.
