Understanding the epidemiology and pathogenesis of Mycobacterium tuberculosis with non-redundant pangenome of epidemic strains in China
Zhou, Yang; Anthony, Richard; Wang, Shengfen; Xia, Hui; Ou, Xichao; Zhao, Bing; Song, Yuanyuan; Zheng, Yang; He, Ping; Liu, Dongxin; and 2 more authors
Open Access
Type
Journal Article
Language
en
Date of publication
2025-05-19
Published in
PLoS One 2025; 20(5):e0324152
Abstract
Tuberculosis is a major public health threat that causes more than one million deaths every year. Many challenges remain in defeating this deadly infectious disease, which underscores the importance of a thorough understanding of the biology of its causative agent, Mycobacterium tuberculosis (MTB). We generated a non-redundant pangenome of 420 epidemic MTB strains from China, comprising 344 Lineage 2 strains, 69 Lineage 4 strains, six Lineage 3 strains, and one Lineage 1 strain. We estimate that MTB strains have a pangenome of 4,278 genes encoding 4,183 proteins, of which 3,438 are core genes. However, because of 99,694 interruptions in 2,447 coding genes, we can confidently confirm that only 1,651 of these genes are translated in all samples. Of these interruptions, 67,315 (67.52%) could be attributed to genetic variations detected by currently available tools, and more than half of them are due to structural variations, mostly small indels. Even assuming that a proportion of these interruptions are artifacts, the number of active core genes would still be much lower than 3,438. We further describe differential evolutionary patterns of genes under the influences of selective pressure, population structure, and purifying selection. While selective pressure is ubiquitous among these coding genes, evolutionary adaptations are concentrated in 1,310 genes. Genes involved in cell wall biogenesis are under the strongest selective pressure, while the biological process of disruption of host organelles indicates the direction of the most intensive positive selection. This study provides a comprehensive view of the genetic diversity and evolutionary patterns of coding genes in MTB, which may deepen our understanding of its epidemiology and pathogenicity.
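The counts reported in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the numbers are copied from the abstract, while the variable and function names are assumptions introduced here and do not come from the study or its pipeline.

```python
# Figures copied from the abstract; all names below are illustrative assumptions.
LINEAGE_COUNTS = {"Lineage 1": 1, "Lineage 2": 344, "Lineage 3": 6, "Lineage 4": 69}
TOTAL_STRAINS = 420
PANGENOME_GENES = 4278
CORE_GENES = 3438
INTERRUPTIONS_TOTAL = 99694
INTERRUPTIONS_CLASSIFIED = 67315


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)


def summarize() -> dict:
    """Recompute the summary figures that follow from the reported counts."""
    return {
        # Do the per-lineage counts add up to the stated number of strains?
        "lineage_sum_matches_total": sum(LINEAGE_COUNTS.values()) == TOTAL_STRAINS,
        # Share of the collection that belongs to Lineage 2.
        "lineage2_share_pct": percent(LINEAGE_COUNTS["Lineage 2"], TOTAL_STRAINS),
        # Share of interruptions attributed to a detected genetic variation.
        "classified_interruptions_pct": percent(
            INTERRUPTIONS_CLASSIFIED, INTERRUPTIONS_TOTAL
        ),
        # Share of pangenome genes that are core genes.
        "core_gene_share_pct": percent(CORE_GENES, PANGENOME_GENES),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

The recomputed share of classified interruptions agrees with the 67.52% stated in the abstract, and the four lineage counts sum to the 420 strains reported.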
