Data_Sheet_1_In-silico Antigenicity Determination and Clustering of Dengue Virus Serotypes.docx

Emerging or re-emerging dengue virus (DENV) causes dengue fever epidemics worldwide. Current DENV serotypes are defined by genetic clustering, yet discrepancies are frequently observed between genetic clustering and the results of antigenicity experiments. Rapid, high-throughput antigenicity determination of DENV mutants is critical for vaccine selection and epidemic prevention during early outbreaks, but accurate prediction methods have seldom been reported for DENV. Here, a highly accurate and efficient in-silico model was built for DENV based on possible antigenicity-dominant positions (ADPs) of the envelope (E) protein. Independent testing showed high performance, with an AUC of 0.937 and an accuracy of 0.896 using a quantitative linear regression (LR) model. More importantly, our model successfully detected cross-reactions between inter-serotype strains, which current genetic clustering failed to identify. Predictive clustering of 1,143 historical strains revealed new DENV clusters, and we propose that DENV2 be further classified into two subgroups. Thus, DENV serotyping may need to be reconsidered antigenically rather than genetically. As the first algorithm tailor-made for DENV antigenicity measurement based on mutated sequences, our model may provide a fast-responding tool for antigenicity surveillance of DENV variants and for potential vaccine studies.