Author affiliation: The University of Osaka Thailand–Japan Research Collaboration Center on Emerging and Re-emerging Infections, Mueang, Thailand (K. Okada, A. Roobthaisong); The University of Osaka Research Institute for Microbial Diseases, Suita, Japan (K. Okada, T. Iida, S. Hamada); Ministry of Public Health, Muang (W. Wongboot, P. Doung-ngern, W. Bhunyakitikorn, P.A. Okada); Maesot General Hospital, Maesot, Thailand (T. Wongchai, W. Swaddiwudhipong)
Cholera, caused by Vibrio cholerae O1, is a potentially life-threatening diarrheal disease and remains a major global public health threat (1). In recent years, large-scale outbreaks have occurred in several countries in Africa and the Middle East, including Sudan and Yemen, where conflict, population displacement, and limited access to clean water have contributed to increased transmission (2,3). South Asia, particularly the Bengal Delta, serves as a hub for genetically diverse pandemic strains and contributes to their global dissemination (4,5). Outbreak occurrence and severity are influenced by environmental and socio-economic factors, and V. cholerae can persist in aquatic environments, although long-term local persistence varies by region (6–9).
Cholera has resurged periodically in Thailand, notably beginning in 2007, when a large-scale outbreak affected 50 provinces and peaked in northeastern regions between September and November. Analyses during 2007–2010 using pulsed-field gel electrophoresis and multilocus variable-number tandem-repeat analysis indicated that those outbreaks were driven by repeated introductions of V. cholerae O1, particularly in border areas (10). Independent evidence further suggested regional transmission or shared infection sources among Thailand, Laos, and Vietnam. Previous studies have revealed high clonal diversity among Thailand isolates collected during 1999–2002 (11), whereas isolates of the El Tor serotype Ogawa strain from 2007 displayed a distinct ribotype, consistent with the introduction of a new clone in northeastern Thailand. Despite those insights, understanding of the genomic relationships between historical and more recent Thailand isolates and global pandemic strains remains incomplete.
We analyzed 498 Vibrio cholerae O1 isolates from Thailand collected from 2007 through April 2025 and selected 157 representative isolates for whole-genome sequencing: 85 from 2007–2012, 52 from 2015–2016, and 20 from 2023–2025 (including 72 newly sequenced in this study). We chose our isolate pool to reflect the genetic diversity observed in the multilocus variable-number tandem-repeat analysis and to ensure variation in the collection site and isolation period (Appendix 1 Tables 1, 2). Cholera case numbers and their geographic distribution in Thailand, obtained from reports by the Division of Epidemiology (Appendix 2 Figure 1), showed a marked nationwide decline in outbreak size and frequency. From 2017 to early 2025, the annual number of cases in Thailand remained
We conducted a series of analyses to investigate the evolutionary dynamics of V. cholerae O1 isolates in Thailand during 2007–2025, clarifying their origins, transmission patterns, and the relationships between isolates from Thailand and those from outside the country. We sequenced genomic DNA from the isolates using an Illumina MiSeq platform (https://www.illumina.com) and mapped the reads to the V. cholerae N16961 reference genome. For global comparison, we curated V. cholerae O1 genomes from the National Center for Biotechnology Information Sequence Read Archive (https://www.ncbi.nlm.nih.gov/sra) and GenBank on the basis of published studies (Appendix 1 Table 3), ensuring balanced geographic and temporal representation. We quality-filtered data from the Sequence Read Archive using fastp software (https://github.com/opengene/fastp; -q parameter), retaining only high-quality sequences. We then identified core genome single-nucleotide polymorphisms (SNPs) using Snippy (https://github.com/tseemann/snippy), excluding regions associated with prophages (identified using PHAST [https://github.com/CshlSiepelLab/phast]), repeats (identified using NUCmer [https://github.com/nf-core/modules/tree/master/modules/nf-core/nucmer]), and recombination (identified using Gubbins [https://github.com/nickjcroucher/gubbins]). We masked these regions using VCFtools (https://github.com/vcftools/vcftools) before phylogenetic analysis. We reconstructed maximum-likelihood phylogenetic trees using RAxML software (https://cme.h-its.org/exelixis/web/software/raxml) and analyzed temporal signals using root-to-tip regression. We estimated divergence times using Bayesian molecular clock analysis. We used pairwise SNP comparisons to identify the 5 closest non-Thailand isolates for each outbreak cluster (Table). We used CholeraeFinder (Center for Genomic Epidemiology, https://www.genomicepidemiology.org) to detect virulence genes and mobile genetic elements.
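To illustrate the pairwise SNP comparison step, the following minimal Python sketch counts SNP differences between aligned core-genome sequences and ranks the isolates from outside a cluster by their distance to it. It is not the pipeline used in this study. The function names (snp_distance, closest_outside), the isolate labels, and the toy sequences are hypothetical, and the sketch assumes that masked or ambiguous positions (anything other than A, C, G, or T) are skipped.

```python
from typing import Dict, List, Tuple

VALID_BASES = set("ACGT")


def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count positions where two aligned sequences differ.

    Positions where either sequence has a gap, N, or other ambiguous
    symbol are skipped (an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in VALID_BASES and b in VALID_BASES and a != b
    )


def closest_outside(
    cluster: Dict[str, str], outside: Dict[str, str], k: int = 5
) -> List[Tuple[str, int]]:
    """Return the k outside isolates nearest to any member of the cluster.

    Each outside isolate is scored by its minimum SNP distance to the
    cluster; ties are broken alphabetically by isolate name.
    """
    scores = {
        name: min(snp_distance(seq, member) for member in cluster.values())
        for name, seq in outside.items()
    }
    return sorted(scores.items(), key=lambda item: (item[1], item[0]))[:k]


# Toy data (hypothetical labels and sequences, for illustration only).
cluster = {"TH_a1": "ACGTACGT", "TH_a2": "ACGTACGA"}
outside = {
    "X1": "ACGTACGT",
    "X2": "ACGAACGA",
    "X3": "TTTTACGT",
    "X4": "ACGNACGT",
}
nearest = closest_outside(cluster, outside, k=2)
```

In practice, the sequences would come from the masked core-genome alignment, and k would be set to 5 to match the number of closest isolates reported per cluster.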
We also used Mantel tests as a complementary metric to assess broad congruence but drew the primary conclusions from the phylogenetic topologies to account for evolutionary structure.
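For readers unfamiliar with the Mantel test, the sketch below shows a generic permutation-based version in plain Python. The text does not state which matrices were compared or which software was used, so this is only an assumption-laden illustration: it takes 2 symmetric distance matrices over the same isolates, uses the Pearson correlation of their upper triangles as the statistic, and computes a 1-sided permutation p-value. All names and the toy matrices are hypothetical.

```python
import math
import random
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def upper_triangle(matrix: Matrix) -> List[float]:
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(
    d1: Matrix, d2: Matrix, permutations: int = 999, seed: int = 0
) -> Tuple[float, float]:
    """Return (r, p): observed correlation and 1-sided permutation p-value.

    Rows and columns of d2 are permuted together so that the dependence
    among distances sharing an isolate is preserved under the null.
    """
    n = len(d1)
    rng = random.Random(seed)
    x = upper_triangle(d1)
    r_obs = pearson(x, upper_triangle(d2))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[d2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        # Small tolerance guards against floating-point rounding.
        if pearson(x, upper_triangle(permuted)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Toy distance matrix (hypothetical values), compared with itself.
toy = [
    [0.0, 1.0, 4.0, 6.0],
    [1.0, 0.0, 3.0, 5.0],
    [4.0, 3.0, 0.0, 2.0],
    [6.0, 5.0, 2.0, 0.0],
]
r_value, p_value = mantel(toy, toy, permutations=199)
```

Because a Mantel correlation ignores tree structure, it is reasonable to treat it as a coarse check of congruence rather than as the basis for inference, which is consistent with its complementary role here.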
Comparative genomic analysis of 157 Thailand isolates and global genomes identified 2 lineages: the seventh pandemic El Tor (7PET; 150 isolates) and the El Tor sister group (ST75; 7 isolates) (Figure 1). The 7PET clade comprised 4 distinct groups (TH1–TH4), corresponding to 3 outbreak periods (Figure 2). Period I (2007–2012) included TH1 (clusters a and b, Ogawa, CTX-3) and TH2 (cluster c, Inaba, CTX-3), with characteristic deletions in Vibrio seventh pandemic island II (VSP-II). Period II (2015–2016) included TH3 (cluster d, Ogawa), characterized by CTX-3, VPI-1ΔVC0819, ΔhlyA, and PLE1, whose presence varied by the geographic region of cholera cases. Period III (2024–2025) was dominated by TH4 (cluster e, Ogawa), carrying CTX-3b (ctxB7), OmpU G325D, and deletions in VSP-II. Although deletions in canonical pathogenicity islands (e.g., VSP-II, VPI) could theoretically influence fitness, we observed no obvious association with clinical severity in the study dataset.
We conducted temporal analysis (Appendix 2 Figure 2) and Bayesian molecular clock modeling (8,137 recombination-filtered SNPs) to estimate the temporal dynamics and evolutionary divergence of V. cholerae clades circulating in Thailand and worldwide (Figure 3). The clades exhibited clear evolutionary patterns over time, and the model parameters were adequately sampled (effective sample size >200), supporting the reliability of the estimated divergence times. Thailand clades were closely related to clades from South Asia: TH1 corresponded to BD-2 (MAB004 and MAB006), TH2 to BD-1 (MAB001), TH3 to BD-2 (HCIS-055B, BGD133, and IDH-7956), and TH4 to BD-1.2 (DMAVC-4, -8, -17, -18, and -19). In Bangladesh, BD-1 and BD-2 have been the 2 most prominent clades during the past 2 decades; BD-1.2 emerged more recently and was responsible for a massive 2022 cholera outbreak (12).
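The temporal-signal check that precedes molecular clock modeling, root-to-tip regression, can be sketched as an ordinary least-squares fit of each isolate's root-to-tip distance against its sampling date. The Python example below is illustrative only and is not the analysis performed in this study. The function name and the toy dates and distances are hypothetical, and the distances are assumed to be expressed in SNPs, so the slope is in SNPs per genome per year.

```python
from typing import Sequence, Tuple


def root_to_tip_regression(
    dates: Sequence[float], distances: Sequence[float]
) -> Tuple[float, float, float]:
    """Fit distance = slope * date + intercept by least squares.

    Returns (slope, root_date, r_squared), where slope approximates the
    substitution rate and root_date (the x-intercept) approximates the
    date of the most recent common ancestor.
    """
    if len(dates) != len(distances) or len(dates) < 3:
        raise ValueError("need at least 3 paired observations")
    n = len(dates)
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    if sxx == 0 or syy == 0 or sxy == 0:
        raise ValueError("no temporal signal can be estimated from these data")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    root_date = -intercept / slope
    return slope, root_date, r_squared


# Toy data (hypothetical): distances grow by exactly 3 SNPs per year
# from a root placed in 1970.
toy_dates = [2007.0, 2010.0, 2015.0, 2024.0]
toy_distances = [3.0 * (year - 1970.0) for year in toy_dates]
rate, root_year, fit = root_to_tip_regression(toy_dates, toy_distances)
```

A positive slope with a reasonably high coefficient of determination is usually taken as evidence that a dataset carries enough temporal signal for molecular clock analysis; real data would show scatter around the fitted line rather than the perfect fit of this toy example.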
This study confirmed that Thailand experienced multiple independent waves of cholera during 2007–2025, primarily associated with repeated introductions of V. cholerae O1 strains from South Asia. Outbreak clades in Thailand during period I (2007–2012) were most closely related to the prominent BD-1 and BD-2 clades in Bangladesh. During period II (2015–2016), 3 distinct waves emerged in both the northwestern and southern regions of Thailand, and the southern isolates exhibited multilocus variable-number tandem-repeat analysis profiles identical to those of strains predominant in Mandalay, Myanmar, 6 months earlier (13), suggesting direct cross-border introductions. By period III (2024–2025), most cholera cases in Thailand were imported, primarily from Myanmar, which reported >2,000 cases in 2024 (1). Of note, CTX-3b isolates in Thailand were first detected in period III and belonged to the recently emerged BD-1.2 clade, which was responsible for a massive 2022 cholera outbreak in Bangladesh. Although our phylogenetic data highlight repeated introductions as the primary driver of outbreaks, the potential role of long-term environmental persistence cannot be ruled out owing to a lack of systematic environmental genomic data.
Our analysis also revealed non-7PET (El Tor sister, sequence type [ST] 75) isolates: 6 of the 7 we detected lacked cholera toxin genes and 1 was the toxigenic strain MS6 previously described in a border area in 2008 (14). During 2010–2020, ST75, rather than ST69, the major seventh pandemic lineage, emerged as the dominant clonal group in China and South Africa (15).
Although the geographic spread and origins of ST75 in Thailand remain unclear, our findings demonstrate the importance of continuous genomic surveillance for elucidating cholera transmission dynamics and informing outbreak response strategies. To translate our data into public health action, implementing routine genomic sequencing at border checkpoints and establishing a real-time data-sharing framework with regional partners, particularly Myanmar, Bangladesh, and India, are essential for the early detection and mitigation of future cholera waves in the South and Southeast Asia region.
Dr. Okada is an associate professor at The University of Osaka whose research interests focus on the prevention and control of diarrheal diseases caused by Vibrio cholerae and other enteropathogens of significant public health concern.