Data_Sheet_1_Interpreting Non-coding Genetic Variation in Multiple Sclerosis Genome-Wide Associated Regions.docx

Multiple sclerosis (MS) is the most common neurological disorder in young adults. Despite extensive studies, only a fraction of MS heritability has been explained, and association studies have focused primarily on protein-coding genes, largely because non-coding features are difficult to interpret. However, non-coding RNAs (ncRNAs) and functional elements such as super-enhancers (SEs) are crucial regulators of many pathways and cellular mechanisms, and they have been implicated in a growing number of diseases. In this work, we searched for enrichment of non-coding elements at MS genome-wide associated loci, with the aim of highlighting their possible involvement in susceptibility to the disease. We first reconstructed the linkage disequilibrium (LD) structure of the Italian population using data on 727,478 single-nucleotide polymorphisms (SNPs) from 1,668 healthy individuals. The genomic coordinates of the resulting LD blocks were intersected with those of the top hits identified in previously published MS genome-wide association studies (GWAS). Using a bootstrapping approach, we then demonstrated a striking enrichment of non-coding elements, especially circular RNAs (circRNAs), mapping to the 73 LD blocks harboring MS-associated SNPs. In particular, we found a total of 482 circRNAs (annotated in publicly available databases) vs. a mean of 194 ± 65 in random sets of LD blocks over 1,000 iterations. As a proof of concept of the possible functional relevance of this observation, we experimentally verified that the expression levels of a circRNA derived from an MS-associated locus, i.e., hsa_circ_0043813 from the STAT3 gene, can be modulated by the three genotypes at the disease-associated SNP. Finally, by evaluating RNA-seq data from two cell lines, SH-SY5Y and Jurkat, representing tissues relevant to MS, we identified 18 circRNAs (two of them novel) derived from MS-associated genes.
In conclusion, this work showed for the first time that MS-GWAS top hits map to LD blocks enriched in circRNAs, suggesting that circRNAs may be novel contributors to disease pathogenesis.
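
The resampling comparison summarized above (482 circRNAs in the 73 MS-associated LD blocks vs. a null distribution built from 1,000 random sets of LD blocks) can be illustrated with a minimal sketch. This is not the authors' pipeline. The function name `bootstrap_enrichment`, its parameters, and the toy per-block counts are hypothetical. The sketch also assumes that each random set contains as many blocks as the observed set and is drawn without replacement from all LD blocks; the abstract does not state whether the random sets were matched on block size or other features.

```python
import random
from statistics import mean, stdev


def bootstrap_enrichment(block_counts, observed_total, n_blocks,
                         n_iter=1000, seed=0):
    """Compare an observed circRNA total against random sets of LD blocks.

    block_counts   : circRNA count for every LD block genome-wide
                     (hypothetical input, one integer per block)
    observed_total : circRNAs found in the disease-associated blocks
    n_blocks       : number of disease-associated blocks (73 in the abstract)
    n_iter         : number of random sets (1,000 in the abstract)
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    # Each iteration draws n_blocks distinct blocks and sums their counts.
    null_totals = [
        sum(rng.sample(block_counts, n_blocks)) for _ in range(n_iter)
    ]

    # One-sided empirical p-value with the usual +1 correction,
    # so the estimate is never exactly zero.
    n_at_least = sum(total >= observed_total for total in null_totals)
    empirical_p = (n_at_least + 1) / (n_iter + 1)

    return {
        "null_mean": mean(null_totals),
        "null_sd": stdev(null_totals),
        "empirical_p": empirical_p,
    }


if __name__ == "__main__":
    # Toy data only: 1,000 blocks carrying 0 to 4 circRNAs each.
    toy_counts = [i % 5 for i in range(1000)]
    result = bootstrap_enrichment(toy_counts, observed_total=292, n_blocks=73)
    print(result)
```

If the reported 194 ± 65 is read as mean ± standard deviation (an assumption, since the abstract does not say which dispersion measure is used), the observed total of 482 lies about (482 − 194) / 65 ≈ 4.4 standard deviations above the mean of the random sets.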