Helicobacter pylori Genome Project Publishes Findings

By Jennifer K. Loukissas, M.P.P.

False-colored green H. pylori in front of stylized genomics data.

An international and multidisciplinary team of over 250 scientists has sequenced the genomes and mapped the population structure of over one thousand strains of Helicobacter pylori (H. pylori), the only confirmed bacterial carcinogen, collected from individuals from 50 countries around the world. In 2020, H. pylori was responsible for more than 850,000 new cancer cases, most notably non-cardia gastric tumors. The results of this massive undertaking were published December 11, 2023, in the journal Nature Communications. 

Gastric cancer is the fourth leading cause of cancer death worldwide. In the US, there are persistent racial and ethnic disparities in gastric cancer mortality rates. While there are several known risk factors for gastric cancer, chronic infection with H. pylori is an established cause of the non-cardia gastric cancer subtype. It also causes a subset of cardia gastric cancers as well as non-Hodgkin’s lymphomas in the stomach. About fifty percent of the world’s population carries this infection in their stomachs. Overall, fewer than 1 in 100 individuals infected with H. pylori will go on to develop gastric cancer. Humans have been infected and colonized by this bacterium for at least the past 100,000 years. The microbe has evolved with humans, and its population structure maps to human migratory patterns.

The H. pylori Genome Project (HpGP) research team was co-led by DCEG investigators M. Constanza Camargo, Ph.D., investigator in the Metabolic Epidemiology Branch, and Charles S. Rabkin, M.D., senior investigator in the Infections and Immunoepidemiology Branch. The team compared the resulting dataset to a reference set of 255 genomes with known population ancestry, which allowed them to quantify, at high resolution, the inferred ancestral sources of H. pylori subpopulations and the recent and ongoing admixture among those subpopulations.

There were several novel observations. For H. pylori isolated from people in northern Europe, the researchers found evidence of a substantial contribution from a north Asian population of H. pylori, specifically a subpopulation that infects Uralic speakers of the Khanty and Nenets ethnicities in northwestern Siberia. The genomes of H. pylori isolated from northern and southern Indigenous Americans had surprisingly different ancestries. Bacteria isolated in northern Indigenous communities were more similar to H. pylori found in north Asia, while bacteria isolated in southern Indigenous communities were more closely related to H. pylori found in east Asia. The researchers also found a highly related yet geographically dispersed North American subpopulation.

Ongoing analyses by the HpGP Research Network are comparing strains from patients with different gastric diseases to identify genetic and epigenetic bacterial features that determine human pathogenicity. Many critical questions about the bacterium's biology are under investigation, with findings expected to be published in the coming year. Other studies are addressing prophages, plasmids, key virulence factors, and antibiotic resistance. This publicly available worldwide collection of complete genomes and epigenomes with high-quality metadata will become a major asset for H. pylori genomics and gastric cancer research.


Thorell K and Muñoz-Ramírez ZY, et al. The Helicobacter pylori Genome Project: Insights into H. pylori population structure from analysis of a worldwide collection of complete genomes. Nat Commun. 2023.

Data Availability