Data_Sheet_1_Emerging Severe Acute Respiratory Syndrome Coronavirus 2 Mutation Hotspots Associated With Clinical Outcomes and Transmission.PDF
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is the cause of the ongoing coronavirus disease 2019 (COVID-19) pandemic. Understanding how mutations in the SARS-CoV-2 genome influence clinical outcomes is critical for treatment and prevention. Here, we analyzed all high-coverage complete SARS-CoV-2 sequences in the GISAID database spanning January 1, 2020, to January 1, 2021, to mine the mutation hotspots associated with clinical outcome, and we developed a model to predict clinical outcome in different epidemic strains. When we explored the causes of mutation in terms of RNA-dependent RNA polymerase (RdRp) and RNA-editing enzymes, mutations were more likely to occur in severe and mild cases than in asymptomatic cases, especially A > G, C > T, and G > A mutations. The mutations associated with asymptomatic outcome were mainly in the open reading frame 1ab (ORF1ab) and N genes; in particular, the R6997P and V30L mutations occurred together and were correlated with asymptomatic outcome at high prevalence. The D614G, Q57H, and S194L mutations were correlated with mild and severe outcomes at high prevalence. Interestingly, in severe cases the single-nucleotide variant (SNV) frequency was higher, with a high percentage of the nt14408 mutation in RdRp. The expression of ADAR and APOBEC was associated with clinical outcome. The model showed that the asymptomatic percentage increased over time, whereas the symptomatic percentage was high in the Alpha, Beta, and Gamma variants. These findings suggest that mutations in the SARS-CoV-2 genome may be directly associated with clinical outcomes and the pandemic. Our results and model may help to predict the prevalence of epidemic strains and to further study the mechanisms by which mutations cause severe disease.
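To make the substitution classes named above (A > G, C > T, G > A) concrete, the following minimal Python sketch tallies single-nucleotide substitutions between a reference and a sample sequence. It is an illustration only and not the pipeline used in this study: the function name `count_substitutions` is hypothetical, the two sequences are toy strings rather than SARS-CoV-2 data, and the sketch assumes the sequences are already aligned to the same length.

```python
from collections import Counter


def count_substitutions(reference: str, sample: str) -> Counter:
    """Tally substitutions (e.g. "C>T") between two pre-aligned sequences.

    Hypothetical helper for illustration; assumes equal-length, aligned input.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")

    valid_bases = set("ACGT")
    counts = Counter()
    for ref_base, alt_base in zip(reference.upper(), sample.upper()):
        # Skip positions that match, and positions with gaps or
        # ambiguous bases such as N, which are not substitutions.
        if ref_base == alt_base:
            continue
        if ref_base not in valid_bases or alt_base not in valid_bases:
            continue
        counts[f"{ref_base}>{alt_base}"] += 1
    return counts


# Toy sequences (assumed for illustration, not real viral data).
reference = "ACGTACGTAC"
sample = "GTGTATGTNC"

for substitution, n in sorted(count_substitutions(reference, sample).items()):
    print(substitution, n)
```

Comparing such tallies between outcome groups (asymptomatic, mild, severe) would additionally require per-sample clinical labels and an appropriate statistical test, neither of which is sketched here.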