Data_Sheet_1_Development and Validation of a Reference Data Set for Assigning Staphylococcus Species Based on Next-Generation Sequencing of the 16S-23S rRNA Region.docx
Many members of the Staphylococcus genus are clinically relevant opportunistic pathogens that warrant accurate and rapid identification for targeted therapy. The aim of this study was to develop a careful assignment scheme for staphylococcal species based on next-generation sequencing (NGS) of the 16S-23S rRNA region. All reference staphylococcal strains were identified at the species level using Sanger sequencing of the 16S rRNA, sodA, tuf, and rpoB genes and NGS of the 16S-23S rRNA region. To broaden the database, an additional 100 staphylococcal strains, representing 29 species, were identified by routine diagnostic methods, 16S rRNA Sanger sequencing, and NGS of the 16S-23S rRNA region. The results enabled development of reference sequences encompassing the 16S-23S rRNA region for 50 species (including one newly proposed species) and 6 subspecies of the Staphylococcus genus. The study showed that the sodA and rpoB targets were the most discriminative, but NGS of the 16S-23S rRNA region was more discriminative than tuf gene sequencing and much more discriminative than 16S rRNA gene sequencing. Almost all Staphylococcus species could be distinguished when the max score was 99.0% or higher and the difference in sequence similarity between the best and second-best species was at least 0.2% (min. 9 nucleotides). This study allowed development of reference sequences for 21 staphylococcal species and enrichment for 29 species for which sequences were already publicly available. We confirmed the usefulness of NGS of the 16S-23S rRNA region by identifying the whole species content in 45 clinical samples and comparing the results to those obtained using routine diagnostic methods. Based on the developed reference database, all staphylococcal species can be reliably detected from their 16S-23S rRNA sequences, both in single-species samples and in more complex polymicrobial communities.
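The threshold rule stated in the abstract (top score of 99.0% or higher, and a gap of at least 0.2% to the second-best species) can be illustrated with a minimal sketch. This is not the authors' pipeline. The function name `assign_species`, the input format (a list of species name and percent similarity pairs), and the handling of a single candidate species are assumptions made here for illustration, and the additional minimum of 9 nucleotides is not modeled.

```python
from typing import Dict, List, Optional, Tuple

# Thresholds taken from the abstract; everything else is hypothetical.
MIN_TOP_SIMILARITY = 99.0  # percent, required for the best-matching species
MIN_MARGIN = 0.2           # percentage points between best and second-best species


def assign_species(hits: List[Tuple[str, float]]) -> Optional[str]:
    """Return the assigned species, or None if the rule is not satisfied.

    `hits` is an assumed input format: (species name, percent similarity)
    pairs for one query sequence against a reference database.
    """
    # Keep only the best similarity per species, so that several reference
    # sequences of the same species do not compete with each other.
    best_per_species: Dict[str, float] = {}
    for species, similarity in hits:
        if similarity > best_per_species.get(species, float("-inf")):
            best_per_species[species] = similarity

    ranked = sorted(best_per_species.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return None

    top_species, top_similarity = ranked[0]
    if top_similarity < MIN_TOP_SIMILARITY:
        return None

    # Assumption: with only one candidate species there is no margin to check.
    if len(ranked) > 1 and top_similarity - ranked[1][1] < MIN_MARGIN:
        return None

    return top_species
```

With made-up similarity values (not data from the study), `assign_species([("S. aureus", 99.8), ("S. argenteus", 98.9)])` returns `"S. aureus"`, whereas a second-best species at 99.7% would leave the query unassigned because the gap falls below 0.2%.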
This study supports the introduction of a novel diagnostic tool that improves the reliability of species identification in polymicrobial samples. Adoption of this new method is hindered by a lack of reference sequences for the 16S-23S rRNA region for many bacterial species. The results will allow identification of all Staphylococcus species, which are clinically relevant pathogens.