Image_2_Cataloging Human PRDM9 Allelic Variation Using Long-Read Sequencing Reveals PRDM9 Population Specificity and Two Distinct Groupings of Related Alleles.JPEG
The PRDM9 protein determines sites of meiotic recombination in humans by directing meiotic DNA double-strand breaks (DSBs) to specific loci. Targeting specificity is encoded by a long array of C2H2 zinc fingers that bind to DNA. This zinc finger array is hypervariable, and each resulting allele has a potentially different DNA binding preference. Assessing PRDM9 diversity is important for understanding the complexity of human population genetics, inheritance linkage patterns, and predisposition to genetic disease. Because the PRDM9 zinc finger array is repetitive, large-scale sequencing of human PRDM9 is challenging. We therefore developed a long-read sequencing strategy to infer the diploid PRDM9 zinc finger array genotype in a high-throughput manner. From an unbiased study of PRDM9 allelic diversity in 720 individuals from seven human populations, we detected 69 PRDM9 alleles. Several alleles differ in frequency among human populations, and 32 alleles had not been identified by previous studies, which were heavily biased toward European populations. PRDM9 alleles are distinguished by their DNA binding site preferences and fall into two major categories related to the most common alleles, PRDM9-A and PRDM9-C. We also found that inter-conversion between allele types is likely rare. By mapping meiotic DSBs in the testis, we found that small variations in PRDM9 can substantially alter the meiotic recombination landscape, demonstrating that minor PRDM9 variants may play an under-appreciated role in shaping patterns of human recombination. In summary, our data greatly expand knowledge of PRDM9 diversity in humans.