
Comparing Alternatives for Estimation from Nonprobability Samples - Dr. Valliant

Biostatistics Branch Seminar Series

January 22, 2020 | 10:30 AM – 11:30 AM

NCI Shady Grove 1W032/034 Rockville, Maryland


Richard Valliant, Ph.D.
Research Professor Emeritus
Universities of Michigan & Maryland


Three approaches to estimation from non-probability samples are quasi-randomization, super-population modeling, and doubly-robust estimation. In the first, the sample is treated as if it were obtained via a probability mechanism but, unlike in probability sampling, that mechanism is unknown. Pseudo-probabilities of being in the sample are estimated by combining the sample with an external data set that covers the desired population. In the super-population approach, observed values of analysis variables are treated as if they had been generated by some model. The model is estimated from the sample and, along with external population control data, is used to project the sample to the population. The specific techniques are the same as or similar to those commonly employed for estimation from probability samples and include binary regression, regression trees, and calibration. Combining quasi-randomization with super-population modeling is referred to as doubly-robust estimation. This paper reviews some of the estimation options and compares them in a series of simulation studies.
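As a rough illustration of how the three approaches in the abstract differ, the following minimal Python sketch estimates a population mean in each of the three ways. It is not taken from the talk: the function names, the cell-based pseudo-probability model (sample count divided by an external population count in each cell), and the one-covariate linear outcome model are all assumptions chosen for brevity. In practice the pseudo-probabilities would more typically come from a binary regression or regression tree.

```python
import numpy as np


def quasi_randomization_mean(y, cell, pop_counts):
    """Weighted mean using inverse estimated pseudo-inclusion probabilities.

    Assumption: the pseudo-probability in each cell is estimated as
    (sample count in cell) / (external population count in cell).
    """
    n_cell = np.bincount(cell, minlength=len(pop_counts))
    pseudo_prob = n_cell / pop_counts
    w = 1.0 / pseudo_prob[cell]  # only sampled cells are indexed here
    return np.sum(w * y) / np.sum(w), w


def superpopulation_mean(y, x, pop_x_mean):
    """Fit y = b0 + b1 * x on the sample, then project to the population.

    Assumption: the population mean of x is known from external controls.
    """
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[0] + beta[1] * pop_x_mean, beta


def doubly_robust_mean(y, x, w, beta, pop_x_mean, pop_size):
    """Model projection plus a weighted correction from sample residuals."""
    resid = y - (beta[0] + beta[1] * x)
    return beta[0] + beta[1] * pop_x_mean + np.sum(w * resid) / pop_size


if __name__ == "__main__":
    # Hypothetical toy data: three sampled units, two population cells.
    y = np.array([1.0, 1.0, 4.0])
    x = np.array([0.0, 1.0, 2.0])
    cell = np.array([0, 0, 1])
    pop_counts = np.array([50, 50])  # external control counts (made up)

    qr, w = quasi_randomization_mean(y, cell, pop_counts)
    sp, beta = superpopulation_mean(y, x, pop_x_mean=1.5)
    dr = doubly_robust_mean(y, x, w, beta, 1.5, pop_counts.sum())
    print(qr, sp, dr)
```

The doubly-robust form adds a weighted sum of sample residuals to the model projection, so the correction term vanishes whenever the outcome model fits the sample exactly.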

**The mission of the Biostatistics Branch (BB) is to be an outstanding biostatistics unit that can contribute to the understanding of cancer etiology and to improve public health by the development and application of quantitative methods. The BB Investigators develop statistical methods and data resources to strengthen observational studies, intervention trials, and laboratory investigations of cancer.**