Data Sheet 1: Evolution of the Natural Transformation Protein, ComEC, in Bacteria
Natural transformation enables the incorporation of exogenous DNA into host genomes and plays a fundamental role in the evolution of microbial populations. At the center of the natural transformation machinery, the ComEC protein mediates DNA import and may also function in DNA recognition and single-strand degradation. Despite its importance, the evolution of ComEC is not fully understood. Here, we aim to fill this knowledge gap by surveying putative ComEC proteins across 5,574 bacteria that span diverse phyla. We first inferred the presence of a universal, core Competence domain by analyzing ComEC proteins from known naturally competent species. Building on this observation, we identified Competence domain-containing proteins (CDCPs) in all surveyed bacteria and used them as putative ComEC proteins for evolutionary analysis. CDCPs were nearly universal, with 89% of the proteomes and 96% of the genomes encoding a single CDCP or a CDCP-like fragment. Two domains, DUF4131 and Lactamase_B, commonly co-occurred with the Competence domain. Ancestral state reconstruction of CDCPs over the bacterial species phylogeny suggested a Competence-only domain profile at the origin, with multiple gains and losses of the DUF4131 and Lactamase_B domains among diverse bacterial lineages.
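
The CDCP identification step described above can be illustrated with a minimal sketch. It assumes that domain annotations are already available for every protein as a list of domain names. The input layout, the function names (`domain_profile`, `summarize`), and the toy data are hypothetical and serve only as illustration; they are not the pipeline or the data used in the study. The order of domains in the profile string is a display convention only.

```python
from collections import Counter

# Domain names as given in the text; everything else here is hypothetical.
CORE = "Competence"
ACCESSORY = ("DUF4131", "Lactamase_B")


def domain_profile(domains):
    """Return a profile string such as 'DUF4131+Competence',
    or None if the protein lacks the core Competence domain (not a CDCP)."""
    present = set(domains)
    if CORE not in present:
        return None
    # Fixed display order: first accessory domain, core, second accessory domain.
    ordered = (ACCESSORY[0], CORE, ACCESSORY[1])
    return "+".join(d for d in ordered if d in present)


def summarize(proteomes):
    """proteomes maps proteome ID -> {protein ID -> list of domain names}.

    Returns the fraction of proteomes with at least one CDCP and a
    Counter of CDCP domain profiles across all proteomes."""
    profiles = Counter()
    with_cdcp = 0
    for proteins in proteomes.values():
        found = [p for p in map(domain_profile, proteins.values()) if p is not None]
        if found:
            with_cdcp += 1
        profiles.update(found)
    fraction = with_cdcp / len(proteomes) if proteomes else 0.0
    return fraction, profiles


# Toy, made-up annotations for illustration only.
toy = {
    "proteome_A": {"p1": ["DUF4131", "Competence", "Lactamase_B"], "p2": ["ABC_tran"]},
    "proteome_B": {"p3": ["Competence"]},
    "proteome_C": {"p4": ["Lactamase_B"]},
}
fraction, profiles = summarize(toy)
```

In this toy input, two of the three proteomes contain a CDCP, and the two CDCPs have different domain profiles. A protein carrying only Lactamase_B is not counted, because the Competence domain is the defining criterion.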