Using Geographic Information Systems to Improve Exposure Assessment in Epidemiological Studies of Cancer
by Cora A. Hersh
For decades, DCEG scientists have been mapping cancer rates to describe spatial patterns of cancer incidence and mortality and to generate hypotheses about the environmental causes of cancer. Now, a new generation of epidemiologic studies is using geographic information system (GIS) technology, recently available data resources, and novel analytic methods to home in on environmental causes of cancer. Mary Ward, Ph.D., and Rena Jones, Ph.D., both in the Occupational and Environmental Epidemiology Branch (OEEB), are leading the DCEG Geographic Analysis Working Group to facilitate wider use of these approaches across the Division.
Spatial Variation Presents Challenges
To determine whether an environmental exposure is a cancer risk factor, researchers must first evaluate who has been exposed, for how long, and at what levels. Typically, retrospective surveys ask participants to recall their residential and occupational histories and may also ask about daily activities at home and work, diet, and lifestyle. However, surveys are limited by participants’ knowledge of the exposures in their environment, so independent exposure assessment using GIS together with regulatory and other monitoring data is often used to complement them. One of the main challenges of this endeavor stems from the spatial variation observed in many environmental exposures. GIS technology provides a toolkit for characterizing this complexity and linking it to participants in epidemiologic studies. Early epidemiologic studies were often limited to evaluating relationships based on large geographic areas such as state, county, or ZIP code, which could result in inaccurate associations between exposures and disease.
Concentrations of outdoor air pollutants are one example of how much environmental exposures can vary within a geographic area. Some of these pollutants, like ultrafine particulate matter (UFP) emitted from vehicle exhaust, are hypothesized to increase risk for lung cancer and other diseases. However, UFP concentrations are very high at emission sources, such as freeways and other major roads, and drop off dramatically at short distances downwind. As a result, an individual whose home is adjacent to and downwind of a major road is likely to have a considerably different exposure to UFP than a close neighbor living upwind of the source. The Southern California Ultrafines Study, led by Dr. Jones and Debra Silverman, Sc.D., Chief of OEEB, is currently incorporating state-of-the-art GIS techniques to investigate the health effects of such pollutants.
Advances in Exposure Assessment
In recent years, as developments in technology and data resources have dramatically improved GIS science, DCEG investigators and others in the field have employed these tools in epidemiologic studies of cancer. Over the past 25 years or more, the increasing availability of electronic databases with accurate geolocations of historical air and water monitoring data, satellite imagery, and other geographic datasets (e.g., locations of industrial facilities that emit dioxins, pesticide application registries) has allowed scientists to reconstruct residential exposures over a substantial portion of a person’s lifetime. These long-term exposure data are crucial to the study of cancer, which can take decades to develop. Accuracy is especially important when estimating exposures suspected to make small individual contributions to risk. When an address history is obtained, investigators can use these tools and databases to characterize exposures across multiple residential addresses. Furthermore, studies are increasingly drawing on global positioning systems (GPS) and commercial address databases such as LexisNexis®.
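One common way to summarize exposure across several residences is a time-weighted average, in which each address contributes in proportion to the years lived there. The Python sketch below illustrates that idea only; it is not a description of the method used in any DCEG study. The function name, the tuple layout, and the example concentrations are all hypothetical.

```python
def time_weighted_exposure(residences, start_year, end_year):
    """Average exposure over [start_year, end_year), weighted by years at each address.

    residences: list of (move_in_year, move_out_year, concentration) tuples;
    concentration is None when no estimate is available for that address.
    Returns (average, fraction_of_window_covered); average is None if no
    address with an estimate overlaps the window.
    """
    weighted_sum = 0.0
    covered_years = 0.0
    for move_in, move_out, concentration in residences:
        # Years this residence overlaps the exposure window of interest.
        overlap = min(move_out, end_year) - max(move_in, start_year)
        if overlap <= 0 or concentration is None:
            continue
        weighted_sum += concentration * overlap
        covered_years += overlap
    if covered_years == 0:
        return None, 0.0
    return weighted_sum / covered_years, covered_years / (end_year - start_year)


# Hypothetical address history; concentrations are made-up illustrative values.
history = [
    (1970, 1985, 4.0),   # first home
    (1985, 1995, 10.0),  # second home
    (1995, 2005, None),  # third home, no exposure estimate available
]
average, coverage = time_weighted_exposure(history, 1975, 2005)
print(f"time-weighted average: {average:.2f}, window covered: {coverage:.0%}")
```

Reporting the covered fraction alongside the average makes gaps in the address history visible, rather than silently treating missing years as unexposed.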
In one of the first studies to use modern GIS methods, the NCI-SEER Non-Hodgkin Lymphoma Study in the late 1990s, interviewers took a GPS reading of the home location at the time of the interview and collected lifetime residential histories. The residences were located on crop maps derived from satellite imagery, and acreage of crops around the homes was evaluated as a predictor of pesticide concentrations in dust samples obtained from homes of a subset of the study participants. Increasing acres of corn and soybeans within 750 meters of homes were associated with higher concentrations of agricultural herbicides in house dust samples, regardless of whether the occupants were farmers. Similar methods were used in a subsequent study of childhood leukemia in California, in which home locations were linked to a geographic database of agricultural pesticide use. For five of the seven pesticides evaluated, increasing agricultural pesticide use within 1,250 meters in the previous year was associated with significantly higher pesticide concentrations than in homes with little or no pesticide use nearby. These approaches to relating geo-referenced crop and pesticide use data with measured residential pesticide concentrations demonstrate how GIS tools can be used to assess participant exposures without the time and cost involved in sampling tens of thousands of homes.
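The buffer calculation behind a metric such as "acres of corn within 750 meters of the home" can be sketched in a few lines of Python. This is a minimal illustration, not the studies' actual workflow. It assumes coordinates are already in a projected system measured in meters, represents the crop map as a hypothetical dictionary of cell centres, and counts a cell as inside the buffer when its centre is. The 100-meter cell size and the toy crop layout are invented for the example.

```python
import math

SQ_METERS_PER_ACRE = 4046.8564224


def crop_acres_within(home_xy, crop_cells, cell_size_m, radius_m=750.0):
    """Sum acreage per crop for raster cells whose centres lie within radius_m of the home.

    home_xy: (x, y) of the residence in projected meters.
    crop_cells: dict mapping (x, y) cell-centre coordinates to a crop label.
    """
    home_x, home_y = home_xy
    acres_per_cell = cell_size_m ** 2 / SQ_METERS_PER_ACRE
    totals = {}
    for (cell_x, cell_y), crop in crop_cells.items():
        if math.hypot(cell_x - home_x, cell_y - home_y) <= radius_m:
            totals[crop] = totals.get(crop, 0.0) + acres_per_cell
    return totals


# Toy crop map (hypothetical): 100 m cells over a 2 km square,
# corn east of the home and soybeans west of it.
CELL = 100.0
crop_map = {}
for i in range(-10, 10):
    for j in range(-10, 10):
        centre = ((i + 0.5) * CELL, (j + 0.5) * CELL)
        crop_map[centre] = "corn" if centre[0] > 0 else "soybeans"

acres = crop_acres_within((0.0, 0.0), crop_map, CELL)
for crop, value in sorted(acres.items()):
    print(f"{crop}: {value:.1f} acres within 750 m")
```

A production analysis would more likely intersect the buffer polygon with the crop raster in GIS software, so that cells straddling the buffer edge are counted in part. The centre-in-buffer rule used here is a simplification that becomes more accurate as cell size shrinks.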
An investigation of drinking water nitrate exposure and cancer risk is underway within the Agricultural Health Study (AHS) cohort, an effort led by Dr. Ward. Participants provided their home addresses and drinking water source at the enrollment interview in the mid-1990s. Over 60% of the cohort used private wells as their water source, which in agricultural areas can have higher nitrate concentrations than public water supplies due to their proximity to nitrogen sources and shallower depth. Because private wells are not routinely monitored, the investigators developed a geospatial model that linked wells with measurement data to historical crop maps, the locations of animal feeding operations, nitrogen fertilizer use, geology, soils, and other factors. “We use these techniques to predict exposures in locations where measurements are not available, or where it would be too expensive to collect samples,” said Dr. Ward. “GIS-based modeling, in this case random forest modeling, can provide an estimate of likely contaminants, such as nitrates, in their wells without direct testing of the water.”
For the New England Bladder Cancer Study, a similar modeling approach was used to estimate arsenic concentrations when it was not possible to obtain water samples from participants’ current or past homes. Geological data on the bedrock features of the region, which are known to contain natural arsenic, together with information on well depth and other factors associated with arsenic in the groundwater, were used to develop a GIS-based model in the landmark study, which found that arsenic in drinking water may be responsible for the long-standing bladder cancer excess in the region.
DCEG scientists are part of a broad community working on developing and applying GIS techniques to cancer research. In September 2016, the NCI Division of Cancer Control and Population Sciences sponsored the Conference on Geospatial Approaches to Cancer Control and Population Sciences. Dr. Jones presented her work on the accuracy of residence locations and environmental exposure assessment, and Pavel Chernyavskiy, Ph.D., of the Radiation Epidemiology and Biostatistics Branches presented his work on spatial analysis of mortality in the U.S. Conferences such as these are important because of the highly collaborative nature of GIS-based research—sharing information about data resources and analysis techniques strengthens epidemiologic research nationwide.