10.3389/fgene.2019.00387.s001 Keith R. Shockley Shuva Gupta Shawn F. Harris Soumendra N. Lahiri Shyamal D. Peddada Data_Sheet_1_Quality Control of Quantitative High Throughput Screening Data.pdf Frontiers 2019 ANOVA clustering concentration-response potency quantitative high throughput screening toxicological response 2019-05-09 07:14:58 Dataset https://frontiersin.figshare.com/articles/dataset/Data_Sheet_1_Quality_Control_of_Quantitative_High_Throughput_Screening_Data_pdf/8099792 <p>Quantitative high throughput screening (qHTS) experiments can generate thousands of concentration-response profiles to screen compounds for potentially adverse effects. However, potency estimates for a single compound can vary considerably in study designs incorporating multiple concentration-response profiles for each compound. We introduce an automated quality control procedure based on analysis of variance (ANOVA) to identify and filter out compounds with multiple-cluster response patterns and improve potency estimation in qHTS assays. Our approach, called Cluster Analysis by Subgroups using ANOVA (CASANOVA), clusters compound-specific response patterns into statistically supported subgroups. Applying CASANOVA to 43 publicly available qHTS data sets, we found that only about 20% of compounds with response values outside the noise band have single-cluster responses. The error rates for incorrectly separating true clusters and incorrectly clumping disparate clusters were both less than 5% in extensive simulation studies. Simulation studies also showed that the bias and variance of concentration at half-maximal response (AC<sub>50</sub>) estimates were usually within 10-fold when using a weighted average approach for potency estimation. In short, CASANOVA effectively sorts out compounds with “inconsistent” response patterns and produces trustworthy AC<sub>50</sub> values.</p>