Table_3_The Hidden Genomic Diversity, Specialized Metabolite Capacity, and Revised Taxonomy of Burkholderia Sensu Lato.xlsx
Burkholderia sensu lato is a collection of closely related genera within the family Burkholderiaceae that includes species of environmental, industrial, biotechnological, and clinical importance. Multiple species within the complex are sources of diverse specialized metabolites, many of which have been identified through genome mining of their biosynthetic gene clusters (BGCs). However, the full genomic diversity of these species and genera, and their biosynthetic capacity, have not been investigated. This study sought to cluster and classify over 4000 Burkholderia sensu lato genome assemblies into distinct genomic taxa representing named and uncharacterized species. We delineated 235 species groups by average nucleotide identity (ANI) analyses that formed seven distinct phylogenomic clades, representing the genera of Burkholderia sensu lato: Burkholderia, Paraburkholderia, Trinickia, Caballeronia, Mycetohabitans, Robbsia, and Pararobbsia. A total of 137 genomic taxa aligned with named species possessing a sequenced type strain, while 93 uncharacterized species groups were demarcated. The 95% ANI threshold proved capable of delineating most genomic species and was only increased to resolve several closely related species. These analyses enabled the assessment of species classifications of over 4000 genomes, and the correction of over 400 genome taxonomic assignments in public databases into existing and uncharacterized genomic species groups. These species groups were genome mined for BGCs, their specialized metabolite capacity was calculated per species and genus, and the number of distinct BGCs per species was estimated through k-mer-based de-replication. Mycetohabitans species dedicated a larger proportion of their relatively small genomes to specialized metabolite biosynthesis, while Burkholderia species harbored more BGCs on average per genome and possessed the most distinct BGCs per species compared to the remaining genera.
Exploring the hidden genomic diversity of this important multi-genus complex contributes to our understanding of their taxonomy and evolutionary relationships, and supports future efforts toward natural product discovery.
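
The idea of delineating species groups at an ANI threshold can be illustrated with a minimal sketch. It is not the pipeline used in the study, which this abstract does not describe. It assumes pairwise ANI values are already available as a dictionary, and it assumes single-linkage grouping: any two genomes at or above the threshold end up in the same group. The function name `cluster_by_ani`, the genome labels, and the ANI values are all hypothetical.

```python
from itertools import combinations


def cluster_by_ani(genomes, ani, threshold=95.0):
    """Group genomes into single-linkage clusters at an ANI threshold.

    genomes: list of genome labels.
    ani: dict mapping (genome_a, genome_b) to percent ANI; pairs may be
         stored in either order, and missing pairs are treated as unlinked.
    """
    # Union-find structure: every genome starts as its own group.
    parent = {g: g for g in genomes}

    def find(g):
        # Walk up to the group representative, shortening the path as we go.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b in combinations(genomes, 2):
        value = ani.get((a, b), ani.get((b, a)))
        if value is not None and value >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for g in genomes:
        groups.setdefault(find(g), []).append(g)
    return sorted(sorted(members) for members in groups.values())


# Hypothetical genome labels and ANI values, for illustration only.
genomes = ["g1", "g2", "g3", "g4"]
ani = {
    ("g1", "g2"): 98.7,
    ("g1", "g3"): 95.4,
    ("g2", "g3"): 95.6,
    ("g3", "g4"): 88.1,
}

species_groups = cluster_by_ani(genomes, ani, threshold=95.0)
stricter_groups = cluster_by_ani(genomes, ani, threshold=96.0)
```

With these made-up values, the 95% threshold places `g1`, `g2`, and `g3` in one group, while raising it to 96% splits `g3` off. This mirrors, in miniature, how a raised threshold can separate closely related species.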