
Hormuzd A. Katki, Ph.D.

Senior Investigator

NCI Shady Grove | Room 7E592



Hormuzd A. Katki received a B.S. in math from the University of Chicago and an M.S. in statistics from Carnegie Mellon University. He earned a Ph.D. in biostatistics from Johns Hopkins University in 2006, where he received the Margaret Merrell Award for research by a biostatistics doctoral student. Dr. Katki joined NCI in 1999, became a principal investigator in 2009, and was appointed senior investigator upon receiving NIH scientific tenure in 2015.

Research Interests

Dr. Katki’s research focuses on understanding how epidemiologic findings could be used for cancer screening and prevention.  He is particularly interested in developing individualized risk-based approaches to cancer screening. His methodologic research focuses on estimating individual absolute risk, strategies for risk-based screening and management, and metrics for evaluating risk models and biomarkers. Dr. Katki mentors both statisticians interested in cancer research and epidemiologists who want to use state-of-the-art quantitative and computational methods.

Lung Cancer Screening

In spite of the definitive National Lung Screening Trial (NLST) and USPSTF guidelines recommending screening, CT lung-cancer screening is still not widespread. This is partly due to the inefficiency of screening. To make screening more efficient, Dr. Katki conducts research on the use of risk calculations to better identify those who would benefit the most from lung screening and to propose risk-based management options during the course of screening.

Dr. Katki developed validated individualized risk models for lung cancer incidence (LCRAT: Lung Cancer Risk Assessment Tool) and mortality (LCDRAT: Lung Cancer Death Risk Assessment Tool). Using these models to select ever-smokers at highest risk should improve screening effectiveness and efficiency versus current USPSTF guidelines. To empower doctors and patients with risk information needed to decide about undergoing screening, Dr. Katki collaborates with Dr. William Klein to improve the NCI lung cancer screening risk tool, the Risk-based NLST Outcomes Tool (RNOT).

The R package lcmodels estimates risk from nine published lung cancer models: LCRAT, LCDRAT, Bach, PLCOM2012, Spitz, Hoggart, LLP, LLPi, and Pittsburgh. The R package lcrisks provides the risk calculators that are used by RNOT.

Dr. Katki is conducting research on a Markov model for updating individual lung cancer risk with CT image findings during the course of screening. This model may be useful to extend screening intervals for those at sufficiently low risk of developing lung cancer.

Risk Models for Epidemiology

Dr. Katki is interested in developing models for individualized risk estimation.

Dr. Katki has developed risk models for screening data, in which some disease is already present at baseline (left-censored), some disease occurs between consecutive visits (interval-censored), and for some disease it is unknown whether it was prevalent or incident. These models, the logistic-Weibull and logistic-Cox models, are available in the R package PImixture. The models allow sampling weights.
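
To illustrate how a model of this kind can combine prevalent and incident disease into one cumulative risk, here is a minimal sketch. It assumes a logistic model for disease already present at baseline and a Weibull distribution for time to incident disease among those disease-free at baseline. It is not the PImixture interface: the function name, the single covariate, the parameterization, and all coefficient values are hypothetical, and for simplicity the covariate affects only the prevalence part.

```python
import math


def cumulative_risk(t, x, prev_intercept, prev_slope, weibull_shape, weibull_scale):
    """Cumulative disease risk by time t under an assumed mixture form.

    risk(t) = p + (1 - p) * F(t), where p is the logistic probability of
    prevalent disease at baseline and F is a Weibull CDF for incident disease.
    """
    # Logistic part: probability that disease is already present at baseline.
    p_prevalent = 1 / (1 + math.exp(-(prev_intercept + prev_slope * x)))
    # Weibull part: probability of incident disease by time t,
    # among those who were disease-free at baseline.
    f_incident = 1 - math.exp(-((t / weibull_scale) ** weibull_shape))
    return p_prevalent + (1 - p_prevalent) * f_incident


# Hypothetical coefficients, for illustration only.
risk_at_baseline = cumulative_risk(0, 1, -5.0, 1.2, 1.5, 60.0)
risk_at_5_years = cumulative_risk(5, 1, -5.0, 1.2, 1.5, 60.0)
print(risk_at_baseline, risk_at_5_years)
```

At t = 0 the risk equals the prevalent-disease probability alone, and it rises from there as incident disease accumulates.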

He has helped develop methods and software for calculating absolute risk in case-cohort studies and in case-control studies nested within cohorts (also known as “two-phase sampling”); the software is available in the R package NestedCohort.

He has proposed a hybrid risk regression model called “LEXPIT” that allows for both additive and multiplicative effects in logistic regression, and allows sampling weights. LEXPIT is in the R package blm.
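
As an illustration only, the sketch below evaluates a risk of the form "linear term plus inverse-logit (expit) term", which is assumed here to be the LEXPIT structure. It does not use or reproduce the blm package API, and every name and coefficient is hypothetical.

```python
import math


def expit(u):
    """Inverse logit: maps any real number into (0, 1)."""
    return 1 / (1 + math.exp(-u))


def lexpit_risk(additive_x, additive_beta, logistic_z, logistic_gamma):
    """Risk under the assumed form: sum(beta * x) + expit(sum(gamma * z)).

    Covariates in additive_x shift risk by a fixed amount (risk difference);
    covariates in logistic_z act multiplicatively on the odds of the expit part.
    """
    linear_part = sum(b * x for b, x in zip(additive_beta, additive_x))
    logistic_part = expit(sum(g * z for g, z in zip(logistic_gamma, logistic_z)))
    risk = linear_part + logistic_part
    if not 0 <= risk <= 1:
        raise ValueError("coefficients give a risk outside [0, 1]")
    return risk


# Hypothetical example: one exposure with an additive effect of 0.02,
# plus an intercept and one covariate in the logistic part.
unexposed = lexpit_risk([0], [0.02], [1, 0], [-3.0, 0.5])
exposed = lexpit_risk([1], [0.02], [1, 0], [-3.0, 0.5])
print(exposed - unexposed)
```

In this form the exposure adds the same 0.02 to risk whatever the values of the logistic covariates, which is what makes the additive coefficient interpretable as a risk difference.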

Dr. Katki is conducting research on improving the external validity of epidemiologic cohort analyses by including data from nationally representative surveys.

Dr. Katki is also helping with research to develop individualized models of years of life gained by screening to select people for screening. Years of life gained is a measure of the benefit of screening, and as such is more relevant than simply using risk to select people for screening.

Metrics for evaluating diagnostic tests and risk prediction models

Dr. Katki is interested in all aspects of evaluating the potential of new biomarkers for clinical use.

In particular, he has done research to quantify risk stratification, the ability of a test or model to separate those at high risk from those at low risk. His metric, Mean Risk Stratification (MRS), is the average change in a person’s risk that is revealed by using a risk model or test. MRS better compares tests across populations with different disease prevalence by interpreting AUC in the context of prevalence. He has used MRS to compare the risk stratification provided by cervical screening tests and by risk models that identify who in a family carries a variant in BRCA1/2. The MRS web tool is part of the Biomarker Tools Suite.
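
For a binary test, the idea of an "average change in risk" can be made concrete with a short sketch. It assumes MRS is the average absolute difference between a person's post-test risk and the population prevalence, weighted by the probability of each test result. The function names and example inputs are illustrative and are not taken from the Biomarker Tools Suite.

```python
def mean_risk_stratification(sensitivity, specificity, prevalence):
    """Average absolute change from pre-test to post-test risk (binary test).

    Assumed definition: MRS = P(M+) * |PPV - prev| + P(M-) * |cNPV - prev|,
    where cNPV = 1 - NPV is the disease risk after a negative test.
    """
    # Probability of a positive test in the population.
    p_pos = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
    p_neg = 1 - p_pos
    # Post-test risks from Bayes' rule.
    ppv = sensitivity * prevalence / p_pos
    cnpv = (1 - sensitivity) * prevalence / p_neg
    return p_pos * abs(ppv - prevalence) + p_neg * abs(cnpv - prevalence)


def mrs_closed_form(sensitivity, specificity, prevalence):
    """Algebraically equivalent form: 2 * |J| * prev * (1 - prev)."""
    youden_j = sensitivity + specificity - 1
    return 2 * abs(youden_j) * prevalence * (1 - prevalence)


# Same test (sensitivity 0.9, specificity 0.8) at two prevalences.
for prev in (0.01, 0.10):
    print(prev, mean_risk_stratification(0.9, 0.8, prev))
```

Under these assumptions the same test gives an MRS of about 0.014 at 1% prevalence and about 0.126 at 10% prevalence. Because the AUC of a binary test is (sensitivity + specificity) / 2, Youden's J equals 2 × AUC − 1, so the closed form shows how a fixed AUC translates into different amounts of risk stratification as prevalence changes.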

Dr. Katki has developed methods for calculating diagnostic accuracy and agreement statistics under verification bias, when one test is conducted on only a sub-sample of specimens, in R package CompareTests.

Cervical Cancer Screening and HPV-related cancers

Dr. Katki led a team that used the logistic-Weibull model to calculate cervical cancer risks from data on 1.4 million women at Kaiser Permanente Northern California (KPNC). These data enabled the development of clinical practice guidelines to ensure “equal management of women at equal risk of cancer.” The resulting 2012 ASCCP Guidelines and the eight reports with the supporting data were published in a 2013 supplement of the Journal of Lower Genital Tract Disease.

He developed the “Risk Bar” for the risk-based App for the Consensus Guidelines for the Management of Abnormal Cervical Cancer Screening Tests and Cancer Precursors, based on a patient’s history of HPV, Pap test, and biopsy results.

Dr. Katki collaborates with Dr. Anil Chaturvedi on oral HPV and oropharyngeal cancer, conducting research on natural history with an eye towards future prevention.

Population-based mutation screening

Dr. Katki is developing risk-based approaches to help propose screening programs for variants in high-risk genes, such as BRCA1 and BRCA2.



R Software

  • CompareTests to correct for verification bias in diagnostic accuracy and agreement
  • NestedCohort for survival analysis for case-cohort studies or case-control studies nested in cohorts
  • blm for the LEXPIT binary risk regression model that handles both logistic and additive effects
  • lcmodels to calculate risks from nine published lung cancer risk models
  • lcrisks to calculate risks from LCRAT, LCDRAT, and a risk model for false-positive lung CT screens
  • PImixture to calculate risks from screening program data using the logistic-Weibull and logistic-Cox models

Press Contacts

To request an interview with NCI researchers, contact the NCI Office of Media Relations. | 240-760-6600