DataSheet_1_Species Identification of Dracaena Using the Complete Chloroplast Genome as a Super-Barcode.docx
The taxonomy and nomenclature of Dracaena plants are much disputed, particularly for several Dracaena species in Asia. Neither morphological features nor commonly used DNA regions are ideal for identifying Dracaena spp. Moreover, although multiple Dracaena spp. are sources of the rare traditional medicine dragon’s blood, the Pharmacopoeia of the People’s Republic of China defines Dracaena cochinchinensis as the only source plant. Inaccurate identification of Dracaena spp. will inevitably affect the clinical efficacy of dragon’s blood, so a better method for distinguishing these species is needed. Here, we report the complete chloroplast (CP) genomes of six Dracaena spp.: D. cochinchinensis, D. cambodiana, D. angustifolia, D. terniflora, D. hokouensis, and D. elliptica, obtained through high-throughput Illumina sequencing. These CP genomes exhibited the typical circular quadripartite structure, and their sizes ranged from 155,055 bp (D. elliptica) to 155,449 bp (D. cochinchinensis). The GC content of each CP genome was 37.5%. Each CP genome contained 130 genes, including 84 protein-coding genes, 38 tRNA genes, and 8 rRNA genes. No individual coding or non-coding region was able to distinguish these six species, but the maximum likelihood tree of the six Dracaena spp. and other related species revealed that the whole CP genome can be used as a super-barcode to identify these Dracaena spp. This study provides not only invaluable data for species identification and safe medical application of Dracaena but also an important reference and foundation for species identification and phylogeny of Liliaceae plants.
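The summary statistics quoted in the abstract (genome length, GC content, gene tally) are simple to recompute from a sequence. The following is a minimal sketch in plain Python, not the authors' pipeline: the function name `gc_content` and the toy sequence are illustrative assumptions, and only the gene counts are taken from the abstract.

```python
def gc_content(sequence: str) -> float:
    """Return the GC percentage of a nucleotide sequence.

    Ambiguous symbols (e.g. N) are ignored, so the percentage is
    computed over unambiguous A, C, G, and T bases only.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * gc / (gc + at)


# Toy sequence for illustration only (hypothetical, not from the dataset).
toy_sequence = "GGCCATATAT"
toy_gc = gc_content(toy_sequence)

# Gene tally as reported in the abstract for each CP genome.
gene_counts = {"protein_coding": 84, "tRNA": 38, "rRNA": 8}
total_genes = sum(gene_counts.values())

print(f"length = {len(toy_sequence)} bp, GC = {toy_gc:.1f}%")
print(f"total genes = {total_genes}")
```

Applied to a full chloroplast genome sequence read from a FASTA file, the same function would give the genome-wide GC percentage; the gene tally confirms that the three reported categories sum to the stated 130 genes.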