Data_Sheet_3_Genetic Diversity of Non-O157 Shiga Toxin-Producing Escherichia coli Recovered From Patients in Michigan and Connecticut.PDF
Shiga toxin-producing Escherichia coli (STEC) are important foodborne pathogens, and non-O157 serotypes have been gradually increasing in frequency. The non-O157 STEC population is diverse and is often characterized using serotyping and/or multilocus sequence typing (MLST). Although spacers within clustered regularly interspaced short palindromic repeat (CRISPR) regions were shown to comprise horizontally acquired DNA elements, this region does not actively acquire spacers in STEC. Hence, it is useful for further characterizing non-O157 STEC and examining relationships between strains. Our study goal was to evaluate the genetic relatedness of 41 clinical non-O157 isolates identified in Michigan between 2001 and 2005 and to compare them with 114 isolates collected in Connecticut during an overlapping time period. Whole genome sequencing (WGS) was performed, and sequences were extracted for serotyping, MLST and CRISPR analysis. Phylogenetic analysis of MLST and CRISPR data was performed using the neighbor-joining and unweighted pair group method with arithmetic mean (UPGMA) algorithms, respectively. In all, 29 serogroups were identified; eight were unique to Michigan and 13 to Connecticut. “Big-six” serogroup frequencies were similar by state (Michigan: 73.2%, Connecticut: 81.6%), though STEC O121 was not found in Michigan. The distribution of sequence types (STs) and CRISPR profiles was also similar across states. Interestingly, big-six serogroups such as O103 and O26 grouped into different STs located on distinct branches of the phylogeny, further confirming that serotyping alone is not adequate for evaluating strain relatedness. By comparison, the CRISPR analysis identified 361 unique spacers that grouped into 80 different CRISPR profiles. CRISPR spacers 231 and 317 were detected in 79.2% (n = 118) and 59.1% (n = 88) of strains, respectively, regardless of serogroup and ST. Spacer profiles clustered according to the MLST analysis, though some discrepancies were noted. Indeed, use of both MLST and CRISPR typing enhanced the discriminatory power when compared to the use of each tool separately. These data highlight the genetic diversity of clinical STEC from different locations and show that CRISPR profiling can be used alongside MLST to discriminate related strains. Use of targeted sequencing approaches is particularly helpful for sites without WGS capabilities and can help define which strains require additional characterization using more discriminatory methods.
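To illustrate how UPGMA can group strains by their CRISPR spacer content, the following is a minimal sketch, not the study's actual pipeline. It assumes that each strain is represented as a set of spacer identifiers and that profiles are compared with the Jaccard distance; the abstract does not state which distance measure was used. The strain names and most spacer numbers are hypothetical (only spacers 231 and 317 are mentioned above).

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Jaccard distance between two spacer sets (0 = identical, 1 = disjoint)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def upgma(profiles):
    """Cluster strains by average linkage (UPGMA).

    Returns a list of merges in the order they occur; each merge is
    (members_of_cluster_1, members_of_cluster_2, distance_at_merge).
    """
    # Each cluster is a tuple of strain names; start with singletons.
    clusters = [(name,) for name in sorted(profiles)]
    # Pairwise distances between the original strains.
    base = {
        frozenset((x, y)): jaccard_distance(profiles[x], profiles[y])
        for x, y in combinations(profiles, 2)
    }

    def average_distance(c1, c2):
        # UPGMA distance: unweighted mean over all cross-cluster strain pairs.
        total = sum(base[frozenset((x, y))] for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    merges = []
    while len(clusters) > 1:
        # Join the two closest clusters.
        c1, c2 = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(*pair),
        )
        merges.append((c1, c2, average_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical spacer profiles (assumed data, for illustration only).
profiles = {
    "strain_A": {231, 317, 5},
    "strain_B": {231, 317, 5, 9},
    "strain_C": {231, 40, 41},
    "strain_D": {40, 41, 42},
}

merges = upgma(profiles)
for left, right, dist in merges:
    print(left, "+", right, "at distance", round(dist, 3))
```

In this toy input, strains that share most spacers are joined first, and the merge distances give the branch heights of the resulting dendrogram. A real analysis would likely rely on an established phylogenetics or clustering package instead of this hand-written loop.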