Data_Sheet_1_Origin Sample Prediction and Spatial Modeling of Antimicrobial Resistance in Metagenomic Sequencing Data.ZIP (28.21 MB)
Download file

Data_Sheet_1_Origin Sample Prediction and Spatial Modeling of Antimicrobial Resistance in Metagenomic Sequencing Data.ZIP

Download (28.21 MB)
dataset
posted on 08.03.2021, 14:39 by Maya Zhelyazkova, Roumyana Yordanova, Iliyan Mihaylov, Stefan Kirov, Stefan Tsonev, David Danko, Christopher Mason, Dimitar Vassilev

The steady elaboration of the Metagenomic and Metadesign of Subways and Urban Biomes (MetaSUB) international consortium project raises important new questions about the origin, variation, and antimicrobial resistance of the collected samples. CAMDA (Critical Assessment of Massive Data Analysis, http://camda.info/) forum organizes annual challenges where different bioinformatics and statistical approaches are tested on samples collected around the world for bacterial classification and prediction of geographical origin. This work proposes a method which not only predicts the locations of unknown samples, but also estimates the relative risk of antimicrobial resistance through spatial modeling. We introduce a new component in the standard analysis as we apply a Bayesian spatial convolution model which accounts for spatial structure of the data as defined by the longitude and latitude of the samples and assess the relative risk of antimicrobial resistance taxa across regions which is relevant to public health. We can then use the estimated relative risk as a new measure for antimicrobial resistance. We also compare the performance of several machine learning methods, such as Gradient Boosting Machine, Random Forest, and Neural Network to predict the geographical origin of the mystery samples. All three methods show consistent results with some superiority of Random Forest classifier. In our future work we can consider a broader class of spatial models and incorporate covariates related to the environment and climate profiles of the samples to achieve more reliable estimation of the relative risk related to antimicrobial resistance.

History

References