Data_Sheet_3_Full-Length Genome of an Ogataea polymorpha Strain CBS4732 ura3Δ Reveals Large Duplicated Segments in Subtelomeric Regions.ZIP (1.94 MB)
Download file

Data_Sheet_3_Full-Length Genome of an Ogataea polymorpha Strain CBS4732 ura3Δ Reveals Large Duplicated Segments in Subtelomeric Regions.ZIP

Download (1.94 MB)
dataset
posted on 06.04.2022, 08:32 authored by Jia Chang, Jinlong Bei, Qi Shao, Hemu Wang, Huan Fan, Tung On Yau, Wenjun Bu, Jishou Ruan, Dongsheng Wei, Shan Gao
Background

Currently, methylotrophic yeasts (e.g., Pichia pastoris, Ogataea polymorpha, and Candida boindii) are subjects of intense genomics studies in basic research and industrial applications. In the genus Ogataea, most research is focused on three basic O. polymorpha strains-CBS4732, NCYC495, and DL-1. However, the relationship between CBS4732, NCYC495, and DL-1 remains unclear, as the genomic differences between them have not be exactly determined without their high-quality complete genomes. As a nutritionally deficient mutant derived from CBS4732, the O. polymorpha strain CBS4732 ura3Δ (named HU-11) is being used for high-yield production of several important proteins or peptides. HU-11 has the same reference genome as CBS4732 (noted as HU-11/CBS4732), because the only genomic difference between them is a 5-bp insertion.

Results

In the present study, we have assembled the full-length genome of O. polymorpha HU-11/CBS4732 using high-depth PacBio and Illumina data. Long terminal repeat retrotransposons (LTR-rts), rDNA, 5′ and 3′ telomeric, subtelomeric, low complexity and other repeat regions were exactly determined to improve the genome quality. In brief, the main findings include complete rDNAs, complete LTR-rts, three large duplicated segments in subtelomeric regions and three structural variations between the HU-11/CBS4732 and NCYC495 genomes. These findings are very important for the assembly of full-length genomes of yeast and the correction of assembly errors in the published genomes of Ogataea spp. HU-11/CBS4732 is so phylogenetically close to NCYC495 that the syntenic regions cover nearly 100% of their genomes. Moreover, HU-11/CBS4732 and NCYC495 share a nucleotide identity of 99.5% through their whole genomes. CBS4732 and NCYC495 can be regarded as the same strain in basic research and industrial applications.

Conclusion

The present study preliminarily revealed the relationship between CBS4732, NCYC495, and DL-1. Our findings provide new opportunities for in-depth understanding of genome evolution in methylotrophic yeasts and lay the foundations for the industrial applications of O. polymorpha CBS4732, NCYC495, DL-1, and their derivative strains. The full-length genome of O. polymorpha HU-11/CBS4732 should be included into the NCBI RefSeq database for future studies of Ogataea spp.

History