Power of Cohorts: Public Health Advances from Prospective Cohort Studies

by Jennifer K. Loukissas, M.P.P.

DCEG's Commitment to Collaboration

Etiologic discovery of the causes of cancer and other chronic diseases depends upon the power of the prospective cohort study. The intermingled factors that influence risk—heredity, environment, occupation, lifestyle—are challenging enough to tease apart without the limitations of the other primary approach, case-control studies. While swift to produce results, case-control studies offer limited insight and have been documented to produce the wrong results for many exposures.

The multi- and inter-disciplinary, collaborative approaches that cohort studies require are the hallmark of DCEG research. Teams of epidemiologists, geneticists, biostatisticians, and other experts employ various tools to uncover the causes of cancer.

As part of the intramural research program at the National Cancer Institute, DCEG is a natural incubator for high-risk, high-reward, time-intensive projects, such as cohort studies, that depend on stable, long-term funding. Collaborations are key to their success. Partnerships across the Division and with extramural investigators across the country and around the world have greatly expanded the value of these resources.

Among the tremendous discoveries and significant public health advances to come from such undertakings are the benefits of exercise for cancer prevention and the association of various exposures with elevated cancer risk, including the determination that smoking causes lung cancer. Cohort studies have informed recommendations like those in Healthy People 2030; regulatory guidelines for population-level exposures to potential or known carcinogens; safety procedures in the workplace; programs to prevent infections and chronic disease; and clinical management following a cancer diagnosis.

The length of longitudinal studies, which may continue for 20 to 30 years, allows researchers to track changes in exposures, lifestyle, or health status over time. Participants contribute maximally when they remain active in the study for decades, providing detailed information repeatedly, from various sources, such as lengthy questionnaires, blood samples, linkage to wearable digital devices, and clinic or home visits for collection of biological samples, like urine. Future studies have plans to collect stool, which will be valuable for examining the microbiome and other metabolic factors.

Participant samples become time capsules. Vials of frozen material stored in biobanks increase in value as the years go by until a future investigator with a novel assay discovers biomarkers unimaginable at the time of collection. For example, ‘omics’ technologies in use today are being applied to data and biological samples collected a generation ago.

The following is an overview of some U.S.-based longitudinal cohort studies utilized and maintained by investigators in DCEG and news about two new, exciting, modern cohorts. Many of these studies have pooled their data as part of the NCI Cohort Consortium and other consortia.

General Population Cohorts Inform Population-Level Prevention

Historically, cancer has been a disease of aging. As such, most cohorts enroll individuals in mid-life or later. Two of the most celebrated, both launched in the 1990s, are the NIH-AARP Diet and Health Study and PLCO, the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Study. In addition to the wealth of knowledge generated by DCEG investigators, the broader scientific community can access data and biospecimens from these studies for their own investigations. Details on accessing that information follow each study description.

NIH-AARP Diet and Health Study

The NIH-AARP Diet and Health Study recruited participants from the membership rolls of AARP, formerly the American Association of Retired Persons, to amass what was then the largest cohort study in the world. Thirty years on, data collected from those half-million individuals are still being analyzed and new findings continue to improve our understanding of patterns of behavior in mid-life and their effect on future cancer risk.

Detailed information from multiple questionnaires has enabled over 900 project proposals resulting in over 600 publications. Using dietary information, investigators in the Metabolic Epidemiology Branch, along with colleagues, have observed many important patterns, such as the safety of coffee consumption—even at five or six cups per day. Other critical observations from this cohort: there is no safe level of exposure to tobacco smoke; even low-intensity smokers benefit from cessation. By mapping the residential histories provided by study participants to air pollution data, investigators in the Occupational and Environmental Epidemiology Branch (OEEB) linked elevated levels of ultrafine particulate exposure with increased risk of adenocarcinoma of the lung. Similarly, high levels of fine particulate air pollution were associated with increased breast cancer incidence. The effort to map participant residences also led to the important observation of an association between industrial emissions of ethylene oxide and in situ breast cancer. A similar pattern was described for ambient dioxin emissions and the risk of non-Hodgkin lymphoma. These studies demonstrate the power of residence history mapping (i.e., geocoding), an important add-on to the cohort. 

Learn more about the NIH-AARP Diet and Health Study and see a summary of select findings.

Researchers interested in accessing data can use the NIH-AARP STARS portal to learn about the process and submit their proposals.

PLCO: A Screening Trial that Became a Prospective Cohort

The Prostate, Lung, Colorectal, and Ovarian Cancer Screening Study (PLCO) Cohort began as a screening trial. DCEG investigators and colleagues from NCI and the 10 participating centers transformed it into a large observational cohort study that is still producing critical findings today.

Survey data and serial biological samples allowed for over 2,000 projects resulting in over 1,400 publications, including the identification of novel biomarkers. For example, investigators in the Infections and Immunoepidemiology Branch (IIB) discovered that human papillomavirus (HPV) antibodies in the blood could be used to predict risk of HPV-positive oropharyngeal cancer years before diagnosis.

In the 2020s, DCEG investigators completed genotyping of more than 110,000 PLCO participants on high-density arrays with imputation and made multi-trait genome-wide association study (GWAS) results publicly available through The PLCO Atlas – GWAS Explorer.

Investigators seeking access to complete cancer incidence, mortality data, and biospecimens for each subject in the PLCO trial can enter those requests online.

A Modern Cohort: Connect for Cancer Prevention

Exposures and lifestyles change with time among individuals and at the population level. To protect the health of today's generation and prevent the cancers of the future, epidemiologists must embark upon the construction of new cohorts. In the mid-2010s, DCEG investigators began planning the Connect for Cancer Prevention Study. To sow the seeds of future research discoveries, they are recruiting members of Generation X and Millennials, whose lifestyles include entirely novel practices and experiences compared with those of their parents in the Silent and Baby Boom generations.

Connect began enrollment in 2021 and, as of June 2024, had surpassed 40,000 participants. The aim is to enroll 200,000 adults between the ages of 30 and 70 who have not previously been diagnosed with cancer and who receive their health care from one of 10 partner health care systems. With this latter criterion, participants can readily share access to their electronic medical records (EMRs), a component missing from the general population cohorts described above. Furthermore, EMR integration will aid with long-term follow-up and increase the completeness of the participant data.

Consented participants in Connect complete extensive, online questionnaires and biospecimen collections—blood, urine, and saliva—at enrollment and periodically throughout the duration of follow-up. Over the course of the study, tissue collected from biopsies and invasive cancers will also be shared with Connect investigators for molecular studies. Passive follow-up via tumor registries, the National Death Index, and EMRs will provide additional outcome information for cancers and their precursors.

Connect is a digital-first cohort, built with a Findable, Accessible, Interoperable, and Reusable (FAIR) data infrastructure that allows for sharing and collaboration on scales legacy cohorts could not achieve. This efficient, flexible, and integrated infrastructure makes the most of modern interoperability standards to serve as a research resource for future generations of scientists at the NCI and across the broader scientific community.

Additionally, by incorporating a diverse Participant Advisory Board and partnering with health systems that serve diverse communities, Connect can enhance recruitment of populations typically underrepresented in research. While there are several studies in the U.S. that have sought to address these gaps, including the Southern Community Cohort, the Multiethnic Cohort, and the Black Women’s Health Study, historically, most cohorts recruited from a relatively narrow segment of the population—predominately White, cis-gender, well-educated, higher-income individuals—limiting the generalizability of the findings.

Learn more about Connect on the GitHub site.

Browse the Connect for Cancer Prevention Study participant recruitment website.

Exposure-based Cohorts

DCEG also prioritizes research in populations with unique exposures, such as workers exposed in occupational settings, or individuals with unique health conditions or medical exposures. The discoveries from such investigations benefit not only the populations studied but also the general population, which may experience similar exposures, though typically at lower rates or doses. 

Cohorts to Study the Health of Workers

Experts in OEEB and the Radiation Epidemiology Branch (REB) have studied worker populations for over 40 years. These cohorts provided some of the earliest data on the potential harms from industrial chemicals and ionizing radiation.

Industry and Manufacturing: Formaldehyde, Diesel Exhaust, and Acrylonitrile

Workers whose jobs involve the use of toxic chemicals and other potentially harmful substances are often exposed at levels well above those of the general public. With well-designed questionnaires, reliable exposure assessment, careful participant recruitment, and long-term follow up, occupational cohort studies can profoundly influence safety in the workplace and regulations to protect public health.

For example, OEEB investigators have conducted many studies resulting in important discoveries, in populations ranging from dry cleaners exposed to solvents to workers whose jobs involved exposure to formaldehyde. Data from these cohorts informed the classification of those exposures as carcinogenic to humans by the International Agency for Research on Cancer (IARC) Monograph Programme and the National Toxicology Program Report on Carcinogens.

The Diesel Exhaust in Miners Study (DEMS), launched in 1992 in collaboration with the National Institute for Occupational Safety and Health, enrolled workers at eight non-metal mines across the country. DEMS captured comprehensive exposure and lifestyle data, which allowed the investigators to clarify the relationship between exposure to diesel engine exhaust and the risk of death from lung cancer. The findings played a critical role in the classification of diesel exhaust as a Group 1 carcinogen by IARC in 2012 and have important implications for miners, tens of millions of workers in the U.S. and worldwide who are exposed to diesel exhaust in the workplace, and people who live in cities with high levels of diesel exhaust.

Acrylonitrile is a chemical used in the production of synthetic fibers and many other products. While results from animal bioassays suggested it might cause cancers at multiple sites, findings from early epidemiologic studies were inconsistent and inconclusive due to small sample size and poor exposure characterization. The NCI Acrylonitrile Cohort, the largest to date, found workers with the highest cumulative exposure experienced excess lung cancer more than 20 years after first exposure. An additional 21 years of mortality data showed an exposure-response relationship for lung cancer death and positive associations for death from bladder cancer and for non-malignant respiratory disease. IARC’s Monograph Programme evaluated this exposure in June 2024.

Farmers and Pesticide Applicators

The Agricultural Health Study (AHS) works to understand how agricultural, lifestyle, and genetic factors affect the health of farming populations. Since its inception, AHS investigators have evaluated agricultural practices and pesticide use, other occupational exposures, and a broad range of factors as they relate to risk for cancer and other outcomes. Data from the AHS have contributed to determinations of carcinogenicity of agricultural exposures as well as regulatory decisions in the U.S. and internationally.

More recently, DCEG investigators have led a molecular epidemiologic initiative known as the Biomarkers of Exposure and Effect in Agriculture (BEEA) study. Within BEEA, biospecimens and updated exposure information are being used to investigate the biologic mechanisms underlying associations between agricultural exposures and risk of cancer and other chronic diseases.

More information about AHS and BEEA can be found on the AHS website.

Medical Radiation Workers

The U.S. Radiologic Technologists Study (USRT or Rad Tech) has expanded our understanding of the radiation-related health effects for medical workers who administer diagnostic and therapeutic medical exams. This nationwide study began in 1982 with more than 110,000 current and former radiologic technologists, certified by the American Registry of Radiologic Technologists, who completed one or more questionnaires about their work history, health status, and other factors.

The Rad Tech Study has yielded important findings related to health risks from repeated exposure to relatively low doses of ionizing radiation, including associations between cumulative lifetime radiation exposure and risks of female breast cancer, lung cancer, and cataracts. Additional analysis demonstrated that cataract risk was particularly high for technologists who were positioned closer to the radiation source, while risk was much lower for those who used personal protective equipment (room shields, lead glasses). The cohort has also been a valuable resource for investigating the effects of ultraviolet light exposure and other lifestyle factors on cancer and other health outcomes.

Individuals with Specific Medical Exposures or Diagnoses

The DES Story: Lessons Learned

Dr. Robert Hoover discusses a follow-up study of diethylstilbestrol (DES), a drug once prescribed to pregnant women. (Video produced and edited by Natalie Giannosa)

Multi-Generation Study of DES-Exposed Individuals

From the mid-1940s through the early 1970s, diethylstilbestrol (DES)—the first synthetic estrogen—was given to millions of pregnant women, exposing daughters and sons while in utero. It was thought to prevent miscarriage. Instead, DES was later identified as a human carcinogen and the first known trans-placental carcinogen.

In 1971, the first study was published connecting a mother’s prescription for DES during pregnancy with the occurrence of vaginal cancer in her daughter, prompting the FDA to advise against the use of DES in pregnant women. Several field studies were launched across the country. In 1992, NCI investigators and collaborators brought together those individual study centers to create the NCI Follow-up of Combined DES Cohorts. With the greater statistical power of the combined studies, investigators identified a constellation of adverse health outcomes in three generations, including an increased frequency of problems of the reproductive tract, changes in the tissue of the vagina, infertility, and poor pregnancy outcomes in daughters. As DES-exposed offspring reach the age when cancer rates begin to rise, it is important to continue to monitor the long-term risk of cancer and other adverse health outcomes in this unique population.

Cancer Survivors

Over 18 million Americans are survivors of one or more cancers. Survivors of cancer are at risk for a second primary malignancy because of their exposures in life, genetic predisposition, or adverse effects of their treatment.

To investigate these risks, investigators in REB and the Integrative Tumor Epidemiology Branch convened a retrospective record-linkage cohort, the Kaiser Permanente (KP) Breast Cancer Survivors Cohort, a transdisciplinary resource to investigate treatment patterns over time and the risk of second cancers, cardiovascular disease, and mortality.

Among their findings to date, they observed that breast cancer patients who received radiotherapy, had breast-conserving surgery, and had a history of hypertension or diabetes at the time of their breast cancer diagnosis had elevated risks for thoracic angiosarcoma.

A New Cohort of Children Treated for Cancer

As therapies to treat cancer continue to evolve, it is important to monitor short- and long-term adverse health outcomes. The Pediatric Proton and Photon Therapy Comparison Cohort, supported by the Childhood Cancer Data Initiative since 2020, is a multi-center retrospective cohort to compare the risk of second cancers among childhood cancer patients treated with proton radiotherapy to those treated with photon radiotherapy.

Investigators in REB and collaborators from Massachusetts General Hospital are assembling patient and radiotherapy treatment data from participating study centers across the United States and Canada. REB experts are developing state-of-the-art dosimetry methods to quantify radiation doses to exposed organs and tissues. Investigators will examine dose-response relationships and assess dose-volume effects for the most common and radiosensitive second cancer sites (brain tumors, sarcomas, breast and thyroid cancer). The study is expected to continue for decades in order to capture the range of the late effects that may be associated with these therapies.

International Cohorts

The encyclopedic breadth of research within and across cohort studies in the Division could not begin to fit in the length of this article; the focus here was limited to projects in the United States. In collaboration with international partners, we have assembled many cohorts of truly unique populations. For example, the Shanghai Women’s Health Study, in collaboration with Vanderbilt University and the Shanghai Cancer Center, is a population-based prospective cohort of about 75,000 mostly never-smoking women recruited between 1997 and 2000 with blood and urine sample collection and followed via multiple in-person surveys and record linkages with population-based registries.

In Costa Rica, where DCEG has been studying cervical cancer for over 40 years, the Guanacaste HPV Natural History Study has followed over 10,000 women since 1993. It has yielded many critical insights into HPV natural history, including the evidence to establish the performance of then-novel HPV and cytologic screening techniques.

Collaborations in Ukraine have advanced our understanding of the health effects of low-dose exposure to ionizing radiation. Among a cohort of individuals exposed to radioactive fallout following the accident at the Chernobyl power plant, investigators have quantified the relationship between internal exposure to radiation in childhood and thyroid cancer detected through standardized screening procedures.

The Alpha-Tocopherol, Beta-Carotene Cancer Prevention (ATBC) Study, conducted in southwestern Finland, has been an integral research resource for NCI for over three decades. It was designed to test nutritional approaches to cancer prevention and the biological and anti-neoplastic properties of two antioxidant micronutrients, beta-carotene and vitamin E, among nearly 30,000 male smokers.

See an inventory of cohorts in DCEG on our website.