

Assembly and Comparative Analysis of Complete Mitogenome of Silybum marianum (L.) Gaertner
Mitochondrial genome assembly and comparative analysis of milk thistle (Silybum marianum (L.) Gaertner.)
Korean J. Breed. Sci. 2022;54(4):294-304
Published online December 1, 2022
© 2022 Korean Society of Breeding Science.

Jeongwoo Lee1, Yedomon Ange Bovys Zoclanclounon1, Hwajin Jung1, Taeho Lee1, Jeonggu Kim1, Guhwang Park1, Keunpyo Lee1, Kwanghoon An2, Jeehyoung Shim2, Joonghyoun Chin3, and Suyoung Hong1*

1Genomics Division, National Institute of Agricultural Sciences, RDA, 370 Nongsaengmyeong-ro, Jeonju, 54874, Republic of Korea
2EL&I, Co, Ltd., 17, Jangdoek-ri, Namyang-eup, Gyeonggi-do, Hwaseong, 18281, Republic of Korea
3Department of Integrative Biological Sciences and Industry, Sejong University, 209 Neungdong-ro, Seoul, 05006, Republic of Korea
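1Genomics Division, National Institute of Agricultural Sciences, Rural Development Administration, 2EL&I Co., Ltd. (agricultural corporation), 3Department of Integrative Biological Sciences and Industry, Sejong University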
Correspondence to: Tel: +82-63-238-4563, Fax: +82-63-238-4554
Received September 30, 2022; Revised October 26, 2022; Accepted November 4, 2022.
This is an Open-Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Silybum marianum (L.) Gaertner. (milk thistle) is a member of the Asteraceae family. Silymarin, which has hepatoprotective effects, accumulates at high levels in the external cover of milk thistle seeds and is composed of flavonolignan isomers. In the present study, we assembled and annotated the mitogenome of Silybum marianum. The mitogenome is 407,123 base pairs long, with an overall base composition of A, 27.41%; T, 27.33%; G, 22.72%; and C, 22.54%. Seventy-four unique genes were identified in the Silybum marianum mitogenome based on annotation results, including 27 protein-coding genes, 44 tRNA genes, and 3 rRNA genes. Common protein-coding genes of 11 Asteraceae reference mitogenomes and four outgroup (Campanulaceae and Solanaceae) mitogenomes were used to construct a phylogenetic tree. The phylogenetic tree revealed that the Silybum marianum mitogenome is closely related to three reference mitogenomes (Arctium tomentosum, Arctium lappa, and Saussurea costus), and the flower morphology of Silybum marianum was similar to that of these three species. This report describes unique features of the Silybum marianum mitogenome relative to the three related reference mitogenomes. In addition, a more specific analysis of the phylogenetic relationships of Silybum marianum could be envisaged using additional Asteraceae mitogenomes.
Keywords : Asteraceae family, Silybum marianum (L.) Gaertner., Arctium lappa, Arctium tomentosum, Saussurea costus

Silybum marianum (L.) Gaertner. (also known as milk thistle) is a species of the Asteraceae family that is native to the Mediterranean area; it is an annual or biennial, self-fertile plant that grows wild throughout the region (Hetz et al. 1993, Leng-Peschlow 1996). Milk thistle is a serious weed in many countries (LeRoy et al. 1997). It grows preferentially in fertile soils, but it can also grow successfully in sandy soils and heavier clay soils (Khan et al. 2009, Karkanis et al. 2011). It tends to occupy areas and eliminate other plant species through competition (Berner et al. 2002). Silybum marianum contains silymarin, which has hepatoprotective effects. Silymarin accumulates at high levels in the external cover of Silybum marianum seeds and is composed of flavonolignan isomers (silybin, isosilybin, silychristin, isosilychristin, and silydianin) (Deep et al. 2008, Valková et al. 2021). Silybin is the principal active compound (Saller et al. 2001). Silybum marianum is a troublesome weed, but it can also be cultivated as a medicinal plant because of its silymarin components.

Mitochondria are membrane-bound cell organelles that play key roles in apoptosis regulation and energy production (Susin et al. 1999). The mitogenome of plants shows high rates of gene loss, accompanying gene transfer to the nucleus, intron acquisition by cross-species horizontal transfer (Palmer et al. 2000), and genetic variability in terms of repetitive sequences, non-coding regions, large introns, frequent duplications, and intergenic alterations (Kenji et al. 1992, Unseld et al. 1997). Therefore, plant mitogenomes show considerable variations in their length, gene order, and gene content (Richardson et al. 2013). In angiosperms, the mitogenome size ranges from 200 to 750 kilobases (kb) (Gualberto et al. 2014). Animal mitogenomes, which are approximately 16.5 kb long, are smaller than those of plants, but the number of encoded genes in plant mitogenomes is smaller than that in animals. Moreover, although the mitogenome size differs between plants, the number of genes in each genome is similar (Morley & Neilsen 2017). The genes encoded by the mitogenome fall into different functional classes such as respiration, oxidative phosphorylation, rRNAs, tRNAs, ribosomal proteins, elongation factor Tu (EF-Tu), RNA maturation, protein import, maturation, and transcription (Burger et al. 2003).

The GenBank Organelle Genome Resource contains approximately 7460 and 450 reference chloroplast and mitochondrion genomes, respectively. By searching mitochondrial reference genomes, we found that the land plant subgroup accounted for approximately 337 (74%) of the deposited mitogenome sequences and that a subgroup related to green algae accounted for the remaining mitogenome sequences. The land plant subgroup is divided into bryophytes and tracheophytes. Approximately 262 (78%) of the public land plant mitogenome sequences belong to the tracheophyte division. In the tracheophyte division, reference mitogenomes are available for 18 species in the Asteraceae family that are members of 8 different genera, including Ageratum (NC_053927.1, Ageratum conyzoides), Arctium (NC_058644.1, Arctium lappa; NC_058643.1, Arctium tomentosum), Bidens (NC_060635.1, Bidens bipinnata; NC_062672.1, Bidens biternata; NC_062670.1, Bidens parviflora; NC_062673.1, Bidens pilosa; NC_062671.1, Bidens tripartita), Chrysanthemum (NC_039757.1, Chrysanthemum boreale), Diplostephium (NC_034354.1, Diplostephium hartwegii), Helianthus (NC_023337.1, Helianthus annuus; NC_051989.1, Helianthus grosseserratus; NC_058584.1, Helianthus occidentalis; NC_051990.1, Helianthus strumosus; NC_058585.1, Helianthus tuberosus), Lactuca (NC_042406.1, Lactuca saligna; NC_042756.1, Lactuca sativa), and Saussurea (NC_059793.1, Saussurea costus). The complete chloroplast genome of Silybum marianum was derived from a plant of ‘SMAR20150709’ (unpublished) and deposited under Accession Number NC_028027.1, but the mitogenome has not been reported. In this study, the mitogenome features of Silybum marianum were analyzed and compared with the published reference mitogenomes of plants in the Asteraceae family.

Materials and Methods

Mitogenome assembly and annotation

Silybum marianum DNA was extracted from plants with an unknown genetic source (‘912036’) provided by EL&I, Co., Ltd. in Gyeonggi-do, Korea (Shim et al. 2020). The DNA was sequenced using long-read and short-read sequencing. For long-read sequencing, the Oxford Nanopore PromethION platform was used with the FLO-PRO002 flow cell type, and the libraries were prepared using the SQK-LSK110 Kit. For short-read sequencing, Illumina sequencing libraries were prepared using the TruSeq Nano DNA Kit and sequenced using the Illumina HiSeq X platform (151 base-pair [bp] paired-end reads). Long-read sequencing generated 3,063,041 reads, with a read-length N50 value of 39,844 bp and a mean read length of 25,239.3 bp, for a total of 77,308,942,612 bp. Short-read sequencing generated 195,748,964 reads containing 29,558,093,564 bp. The Oxford Nanopore long reads were used to assemble the genome of Silybum marianum using NextDenovo software (version 2.3.1), and Illumina short reads were used to correct the assembled data using NextPolish software (version 1.3.1). The default parameters of the NextDenovo and NextPolish software tools were used. The assembled and corrected data comprised 705,967,878 bp in 67 contigs with an average length of 10,536,834 bp, a maximum length of 49,987,051 bp, an N50 value of 27,691,683 bp (n=11 contigs), an N70 value of 22,464,351 bp (n=17), and an N90 value of 12,772,082 bp (n=25). To identify the mitogenome, these contigs were compared with the reference mitogenomes of plants deposited in the National Center for Biotechnology Information (NCBI) by performing BLASTn analysis. Based on matches with ≥99% sequence identity and ≥5 kb in length, a self-looping contig was selected as the potential mitogenome of Silybum marianum.
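The contig statistics quoted above (N50, N70, N90, and the number of contigs needed to reach each) can be recomputed from a list of contig lengths. The following is a minimal sketch under the usual definition of Nx (the length of the contig at which the cumulative sum of length-sorted contigs first reaches x% of the total); the function name `nx` and the toy lengths are hypothetical and are not taken from the assembly described here.

```python
def nx(lengths, fraction):
    """Return (Nx length, number of contigs needed) for a fraction such as 0.5."""
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= target:
            return length, count
    raise ValueError("lengths must be a non-empty list of positive numbers")


# Hypothetical contig lengths (bp); NOT the contigs from this assembly.
toy_contigs = [50, 40, 30, 20, 10]
n50_length, n50_count = nx(toy_contigs, 0.5)
n90_length, n90_count = nx(toy_contigs, 0.9)

# Consistency check on figures quoted in the text:
# total assembled bases divided by the number of contigs.
mean_contig_length = 705_967_878 / 67
```

Dividing the reported total (705,967,878 bp) by the reported contig count (67) reproduces the reported average contig length of 10,536,834 bp.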

The potential mitogenome was initially annotated using a publicly available web-based tool (MITOFY; cgi-bin/mitofy/mitofy.cgi) to identify genes. Subsequently, the web-based tool GeSeq (which employs tRNAscan-SE software, version 2.0.7) was used for annotation by comparison with three reference mitogenomes from NCBI GenBank (NC_058644.1, Arctium lappa; NC_058643.1, Arctium tomentosum; NC_059793.1, Saussurea costus). The three reference mitogenomes were selected based on ≥99% sequence identity and ≥20 kb of identical matches with the contig using BLASTn. The GenBank file resulting from the GeSeq annotation was edited based on the MITOFY annotation results and used to draw a circular mitogenome map with OGDRAW.
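The identity and length thresholds used to select reference mitogenomes can be applied programmatically to BLASTn results. The sketch below assumes the hits were exported in BLAST+ tabular format (`-outfmt 6`, whose third and fourth columns are percent identity and alignment length); the article does not state which output format was used, and the function name `filter_hits` and the example rows are hypothetical.

```python
def filter_hits(lines, min_identity=99.0, min_length=20_000):
    """Keep tabular BLAST hits that meet both the identity and length thresholds."""
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        pident = float(fields[2])  # column 3: percent identity
        length = int(fields[3])    # column 4: alignment length
        if pident >= min_identity and length >= min_length:
            kept.append((fields[0], fields[1], pident, length))
    return kept


# Hypothetical rows: query, subject, pident, length, then the remaining columns.
example = [
    "contig_x\tref_A\t99.60\t25000\t90\t10\t1\t25000\t1\t25000\t0.0\t45000",
    "contig_x\tref_B\t99.90\t4000\t4\t0\t1\t4000\t1\t4000\t0.0\t7300",
    "contig_x\tref_C\t97.20\t30000\t800\t40\t1\t30000\t1\t30000\t0.0\t48000",
]
selected = filter_hits(example)
```

Passing `min_length=5_000` instead would correspond to the looser ≥5 kb criterion used elsewhere in the methods.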

Phylogenetic inference

Eleven reference mitogenomes (NC_058643.1, Arctium tomentosum; NC_058644.1, Arctium lappa; NC_059793.1, Saussurea costus; NC_058584.1, Helianthus occidentalis; NC_058585.1, Helianthus tuberosus; NC_051990.1, Helianthus strumosus; NC_051989.1, Helianthus grosseserratus; NC_039757.1, Chrysanthemum boreale; NC_034354.1, Diplostephium hartwegii; NC_042756.1, Lactuca sativa; NC_042406.1, Lactuca saligna) for the Asteraceae family were selected based on ≥99% sequence identity and ≥5 kb length of identical matches when the contig (i.e., the potential mitogenome) was analyzed by BLASTn. Additionally, four reference mitogenomes (NC_037949.1, Codonopsis lanceolata; NC_035958.1, Platycodon grandiflorus; NC_006581.1, Nicotiana tabacum; NC_035963.1, Solanum lycopersicum) were used as the outgroup. The amino acid sequences of common protein-coding genes in the 16 mitogenomes (the Silybum marianum mitogenome, 11 reference mitogenomes for Asteraceae members, and 4 outgroup mitogenomes) were used to construct a phylogenetic tree; the reference sequences were downloaded from NCBI GenBank. The amino acid sequences corresponding to each protein-coding gene of the milk thistle and reference plant mitogenomes were aligned using MAFFT (version 7.505) (Katoh & Standley 2013). TrimAl (version 1.4.rev15) was used to trim the aligned amino acid sequences and remove spurious sequences or poorly aligned regions. The aligned and trimmed amino acid sequences were used as input data for the IQ-TREE tool (version 2.2.0). The IQ-TREE analysis generated a tree in NEWICK format, which was visualized using the FigTree tool (version 1.4.4).

Analysis of repeat elements

Simple-sequence repeats (SSRs) in the reference mitogenomes and the Silybum marianum mitogenome were identified using the MISA web tool, with motif sizes of one to six nucleotides and minimum repeat numbers of 8, 4, 4, 3, 3, and 3, respectively. The Tandem Repeats Finder program (version 4.07b) was used with the default parameters to analyze tandem repeats as additional repeat elements (Benson 1999).
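To illustrate what the SSR thresholds above mean in practice, the following sketch scans a sequence for perfect repeats of 1 to 6 nucleotide motifs using the same minimum repeat numbers (8, 4, 4, 3, 3, and 3). It is not MISA itself and does not reproduce MISA's handling of compound or overlapping SSRs; the function names and the toy sequence are hypothetical.

```python
# Minimum number of repeats per motif size, as given in the methods.
MIN_REPEATS = {1: 8, 2: 4, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq):
    """Return sorted (start, end, motif, repeats) tuples with 1-based inclusive coordinates."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        i = 0
        while i + size <= len(seq):
            motif = seq[i:i + size]
            if not set(motif) <= set("ACGT") or not is_primitive(motif):
                i += 1
                continue
            repeats = 1
            # Extend while the next unit is an exact copy of the motif.
            while seq[i + repeats * size:i + (repeats + 1) * size] == motif:
                repeats += 1
            if repeats >= min_rep:
                hits.append((i + 1, i + repeats * size, motif, repeats))
                i += repeats * size
            else:
                i += 1
    return sorted(hits)


# Hypothetical toy sequence containing A x8, AT x4, and CAG x4.
toy = "GG" + "A" * 8 + "CC" + "AT" * 4 + "TT" + "CAG" * 4 + "T"
ssrs = find_ssrs(toy)
```

Skipping non-primitive motifs prevents a mononucleotide run such as AAAAAAAA from also being counted as an (AA)4 dinucleotide repeat.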

Analysis of nucleotide diversity

In previous studies, nucleotide-diversity (Pi) values were used to evaluate nucleotide differences between multiple sequences (Mehmetoglu et al. 2022; Zhang et al. 2009). The DNA sequence files of the reference mitogenomes (NC_058644.1, Arctium lappa; NC_058643.1, Arctium tomentosum; NC_059793.1, Saussurea costus) that formed a cluster with the Silybum marianum mitogenome in the phylogenetic tree were downloaded from NCBI GenBank. These mitogenome sequences and the Silybum marianum mitogenome were aligned using MAFFT (version 7.505), and the aligned sequences were used as input data. Pi values were calculated along the alignment of the reference mitogenomes and the Silybum marianum mitogenome by performing DNA-polymorphism analysis with the DnaSP software package (version 5.10.01). Positions with high and low Pi values were examined to identify the corresponding coding genes of the Silybum marianum mitogenome. A 100-bp sliding window with a 25-bp step size was used to summarize Pi for visualization purposes.
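The sliding-window summary described above can be sketched as follows. Pi is computed here as the mean proportion of differing sites over all sequence pairs within each window, and alignment columns containing gaps or ambiguous bases are skipped; that gap handling is an assumption of this sketch and may differ from DnaSP's exact treatment. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def pi_window(aligned, start, width):
    """Mean pairwise difference per site over columns start..start+width-1 (0-based)."""
    end = min(start + width, len(aligned[0]))
    # Use only columns where every sequence has an unambiguous base.
    cols = [i for i in range(start, end) if all(s[i] in "ACGT" for s in aligned)]
    if not cols:
        return 0.0
    pairs = list(combinations(aligned, 2))
    diffs = sum(sum(a[i] != b[i] for i in cols) for a, b in pairs)
    return diffs / (len(pairs) * len(cols))


def sliding_pi(aligned, width=100, step=25):
    """Return (1-based window start, Pi) for each window along the alignment."""
    length = len(aligned[0])
    starts = range(0, max(length - width, 0) + 1, step)
    return [(start + 1, pi_window(aligned, start, width)) for start in starts]


# Hypothetical 4-sequence alignment; a tiny window is used so the toy data fit.
toy_alignment = ["ACGTACGT", "ACGTACGA", "ACGTTCGT", "AC-TACGT"]
windows = sliding_pi(toy_alignment, width=4, step=2)
```

With the defaults (`width=100`, `step=25`) the function follows the window and step sizes stated in the methods.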

Estimating nucleotide-substitution rates

Common protein-coding genes found in the three reference mitogenomes (NC_058644.1, Arctium lappa; NC_058643.1, Arctium tomentosum; NC_059793.1, Saussurea costus) and the Silybum marianum mitogenome were used to estimate nucleotide-substitution rates. The nucleotide-substitution rates, including the non-synonymous-substitution rate (Ka) and synonymous-substitution rate (Ks), as well as the Ka: Ks ratio of the protein-coding genes were estimated using the KaKs Calculator (version 2.0). Pairwise Ka: Ks ratios were plotted using the pheatmap package of R software.


Results

Genomic features of the Silybum marianum mitogenome

The mitogenomes of Asteraceae family members deposited in NCBI GenBank had an average size of 266,432.61 bp and the following average base compositions: A, 27.38%; T, 27.32%; G, 22.64%; and C, 22.66%. The assembled Silybum marianum mitogenome generated in this study had a typical circular structure with a size of 407,123 bp (Fig. 1). The overall base compositions were as follows: A, 27.41%; T, 27.33%; G, 22.72%; and C, 22.54%. Seventy-four unique genes were identified in the Silybum marianum mitogenome based on the annotation results. These genes included 27 protein-coding genes, 44 tRNA genes, and 3 rRNA genes (Table 1). The 27 protein-coding genes could be divided into seven classes, including ATP synthases (atp1, atp4, atp6, atp8, and atp9), Cytochrome c biogenesis (ccmB, ccmFc, and ccmFn), Cytochrome c oxidases (cox1, cox2, and cox3), NADH dehydrogenases (nad1, nad2, nad3, nad4L, nad5, nad6, and nad7), Large ribosomal subunits (rpl5, rpl10, and rpl16), Small ribosomal subunits (rps3, rps4, rps12, rps13, and rps14), and Succinate dehydrogenase (sdh4). Of the protein-coding genes, ccmFc, cox2, nad2, nad5, nad7, and rps3 contain introns: three genes (ccmFc, cox2, and rps3) harbor one intron each, two genes (nad2 and nad5) harbor two introns each, and one gene (nad7) harbors four introns. Of the tRNA-coding genes, trnQ-UUG and trnT-UGU each contain one intron. rrn5, nad5, nad6, and nad7 were annotated in more than one region. Three copies of rrn5 were detected. The annotations for nad6 and nad7 revealed that two copies of these protein-coding genes were present in the mitogenome, whereas the annotation for nad5 revealed two regions of different sizes for this gene.
In previous studies, the nad5 gene of higher plant mitochondria was shown to require trans-splicing for maturation of the mRNA, and the nad5 coding sequence was split into three or five exons at distant regions in the mitochondria of higher plants such as wheat, maize, Oenothera, and Arabidopsis (Glanz & Kück 2009; Knoop et al. 1991; Pereira de Souza et al. 1991). In the Silybum marianum mitogenome, nad5 was annotated in two distant genomic regions. One nad5 region was 1,501 bp long (bp 61,454-62,954) and had one intron of 960 bp. The other nad5 region was 2,286 bp long (bp 266,361-268,646) and had one intron of 834 bp. Whether these nad5 coding regions require trans-splicing remains to be confirmed, given the annotation results for nad5.
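The base-composition percentages reported earlier in this section can be obtained from a sequence by simple counting. The sketch below shows one way to do this and checks that the reported values for the Silybum marianum mitogenome sum to 100%; the function name and the toy sequence are hypothetical, and the GC content is derived here from the reported percentages rather than quoted from the article.

```python
from collections import Counter


def base_composition(seq):
    """Return the percentage of A, T, G, and C among unambiguous bases."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ATGC")
    return {base: round(100 * counts[base] / total, 2) for base in "ATGC"}


# Hypothetical toy sequence (not part of the mitogenome).
toy_composition = base_composition("AATTGGCCAT")

# Percentages reported in the text for the Silybum marianum mitogenome.
reported = {"A": 27.41, "T": 27.33, "G": 22.72, "C": 22.54}
reported_total = round(sum(reported.values()), 2)
reported_gc = round(reported["G"] + reported["C"], 2)
```

From the reported values, G and C together account for 45.26% of the mitogenome.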

Table 1

Gene composition of the Silybum marianum mitogenome.

Group of genes | Name of genes
ATP synthases | atp1, atp4, atp6, atp8, atp9
Cytochrome c biogenesis | ccmB, ccmFc, ccmFn
Cytochrome c oxidases | cox1, cox2, cox3
NADH dehydrogenases | nad1, nad2, nad3, nad4L, nad5, nad6, nad7
Large ribosomal subunits | rpl5, rpl10, rpl16
Small ribosomal subunits | rps3, rps4, rps12, rps13, rps14
Succinate dehydrogenase | sdh4
Ribosomal RNAs | rrn5, rrn18, rrn26
Transfer RNAs | trnY-GUA, trnY-AUA, trnW-CCA, trnV-GAC, trnV-CAC, trnT-UGU, trnT-GGU, trnT-AGU, trnS-UGA, trnS-GGA, trnS-GCU, trnS-CGA, trnS-AGA, trnR-UCU, trnR-UCG, trnQ-UUG, trnQ-CUG, trnP-UGG, trnP-CGG, trnP-AGG, trnO-CUA, trnN-GUU, trnM-CAU, trnL-UAG, trnL-UAA, trnL-GAG, trnL-CAG, trnK-UUU, trnK-CUU, trnI-UAU, trnI-AAU, trnH-GUG, trnG-UCC, trnG-GCC, trnG-CCC, trnF-GAA, trnF-AAA, trnE-CUC, trnE-UUC, trnD-GUC, trnC-GCA, trnC-ACA, trnA-CGC, trnseC-UCA

Fig. 1. Circular map of the Silybum marianum mitogenome. The genomic features of both strands are drawn clockwise and counter-clockwise on the inside and outside of the circle, respectively. Different functional gene groups are color-coded, as indicated in the legend.

After assembly and annotation of the Silybum marianum mitogenome, 11 reference mitogenomes of the Asteraceae family and four reference mitogenomes of the outgroup were compared to identify common protein-coding genes shared with Silybum marianum. Sixteen common protein-coding genes were identified between the reference mitogenomes and the Silybum marianum mitogenome (Fig. 2).

Fig. 2. Gene contents in the 12 Asteraceae and 4 outgroup mitogenomes. Twenty-seven common protein-coding genes were compared. Members of the Campanulaceae (NC_035958.1, Platycodon grandiflorus; NC_037949.1, Codonopsis lanceolata) and Solanaceae (NC_006581.1, Nicotiana tabacum; NC_035963.1, Solanum lycopersicum) families were used as the outgroups.
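Identifying the protein-coding genes shared by every mitogenome in a comparison amounts to a set intersection over the per-genome gene lists. The sketch below illustrates this with hypothetical genome labels and abbreviated toy gene lists; it does not reproduce the actual gene contents shown in Fig. 2.

```python
def common_genes(gene_sets):
    """Return the sorted genes present in every genome's gene list."""
    sets = [set(genes) for genes in gene_sets.values()]
    return sorted(set.intersection(*sets)) if sets else []


# Hypothetical, abbreviated gene lists keyed by placeholder genome labels.
toy_gene_sets = {
    "genome_1": ["atp1", "atp6", "cox1", "nad3", "rps4", "sdh4"],
    "genome_2": ["atp1", "atp6", "cox1", "nad3", "rps4"],
    "genome_3": ["atp1", "cox1", "nad3", "rps4", "rpl5"],
}
shared = common_genes(toy_gene_sets)
```

A gene missing from even one genome (such as `atp6` in `genome_3` here) drops out of the shared set, which is why the number of common genes can be much smaller than the number annotated in any single mitogenome.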

Phylogenetic analysis of common genes

Phylogenetic analysis of the amino acid sequences of 16 common protein-coding genes (atp1, atp6, atp9, ccmB, ccmFc, ccmFn, cox1, cox3, nad3, nad4L, nad6, nad7, rps3, rps4, rps12, and rps13) of the 16 mitogenomes yielded separations between the Asteraceae family group and the outgroup (Fig. 3). In the outgroup, the mitogenomes of the Campanulaceae family (NC_037949.1, Codonopsis lanceolata; NC_035958.1, Platycodon grandiflorus) and the Solanaceae family (NC_006581.1, Nicotiana tabacum; NC_035963.1, Solanum lycopersicum) separated into clusters. In the Asteraceae family group, the mitogenomes of identical genera (Arctium, Helianthus, and Lactuca) formed distinct clusters. The three mitogenomes of Arctium tomentosum (NC_058643.1), Arctium lappa (NC_058644.1), and Saussurea costus (NC_059793.1) were closely related to the mitogenome of Silybum marianum.

Fig. 3. Phylogenetic tree of Silybum marianum with 11 other Asteraceae and 4 outgroup plants. The tree was constructed based on the common protein-coding sequences. The numbers on the tree are bootstrap values.

Members of the Asteraceae family have a capitulum, and capitula commonly have two types of florets (ray and disc florets). Ray florets are characterized by three fused, protruding ventral petals, whereas disc florets have radial symmetry with five evenly sized petals (Figs. 4A-4C) (Zoulias et al. 2019). Elomaa et al. (2018) described Asteraceae flower heads as either heterogamous or homogamous. In heterogamous flower heads, the margin of the capitulum is occupied by ray flowers and the center is occupied by disc flowers. For example, sunflower plants (Helianthus) have sterile marginal ray flowers and perfect central disc flowers. In contrast, homogamous flower heads are formed from a single flower type. For example, the heads of lettuce (Lactuca) are composed of only ray flowers, whereas the discoid heads of thistles develop only disc flowers. Based on this morphological feature of Asteraceae flowers (Fig. 4), Silybum marianum was similar to the three reference species of the Asteraceae family. These three species are the same as those whose mitogenomes were used as references for the Silybum marianum annotation.

Fig. 4. Capitulum morphologies of members of the Asteraceae family. Matricaria inodora capitula (A and B) and phyllaries and florets (C) (Zoulias et al. 2019). Inflorescences of Gerbera (D; Elomaa et al. 2018), Cirsium canescens Nutt. (E; Ackerfield et al. 2020), and Silybum marianum (L.) Gaertn. (genetic source: ‘912036’) (F). (G) Ray floret. (H) Phyllary. (I) Disc floret.

Comparing the genomic features of mitogenomes from closely related species

The SSR results for the four mitogenomes (the Silybum marianum mitogenome assembled in this study; NC_058643.1, Arctium tomentosum; NC_058644.1, Arctium lappa; NC_059793.1, Saussurea costus) revealed similar numbers of repeats that were one to six nucleotides long (Fig. 5). The mitogenomes of Arctium tomentosum and Arctium lappa had identical SSR results, and Silybum marianum was the only mitogenome in which di-nucleotide repeats outnumbered mono-nucleotide repeats. In terms of the distributions of perfect tandem repeats, Silybum marianum, Arctium tomentosum, Arctium lappa, and Saussurea costus had eight, four, four, and seven tandem repeats, respectively (Table 2). The mitogenomes of Arctium tomentosum and Arctium lappa were found to have identical tandem repeat sequences, but the positions of the repeat sequences were slightly different. The repeat sequence ‘GAAAAGGGTATGAAATAGGTTGCTTGT’ is shared among three mitogenomes (Silybum marianum, Arctium tomentosum, and Arctium lappa), and it is located in two regions of the Silybum marianum mitogenome. The two repeat sequences ‘TGAGAGATTCTATAGTTCCTGAGCT’ and ‘AGGTAAAACAGTACGCCCACT’ are shared between the Silybum marianum and Saussurea costus mitogenomes.

Table 2

Distributions of perfect tandem repeats.

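Plant species | No. | Size (bp) | Start | Stop | Repeat sequence (5'-3')
Silybum marianum (mitogenome used in this study) | 1 | 27 | 41,104 | 41,157 | GAAAAGGGTATGAAATAGGTTGCTTGT (×2)
Silybum marianum (mitogenome used in this study) | 3 | 21 | 101,162 | 101,205 | AGGTAAAACAGTACGCCCACT (×2)
Silybum marianum (mitogenome used in this study) | 4 | 19 | 237,447 | 237,483 | AACCAGGCAAATCTCTCTG (×2)
Silybum marianum (mitogenome used in this study) | 6 | 25 | 315,262 | 315,311 | TGAGAGATTCTATAGTTCCTGAGCT (×2)
Silybum marianum (mitogenome used in this study) | 7 | 19 | 365,938 | 365,974 | AACCAGGCAAATCTCTCTG (×2)
Arctium tomentosum (NC_058643.1) | 1 | 27 | 31,306 | 31,359 | GAAAAGGGTATGAAATAGGTTGCTTGT (×2)
Arctium tomentosum (NC_058643.1) | 2 | 16 | 116,399 | 116,429 | CTTGAATCTTATAGCA (×2)
Arctium tomentosum (NC_058643.1) | 3 | 25 | 160,161 | 160,210 | AGCTCAGGAACTATAGAATCTCTCA (×2)
Arctium tomentosum (NC_058643.1) | 4 | 25 | 234,416 | 234,465 | AGCTCAGGAACTATAGAATCTCTCA (×2)
Arctium lappa (NC_058644.1) | 1 | 27 | 31,306 | 31,359 | GAAAAGGGTATGAAATAGGTTGCTTGT (×2)
Arctium lappa (NC_058644.1) | 2 | 16 | 116,394 | 116,424 | CTTGAATCTTATAGCA (×2)
Arctium lappa (NC_058644.1) | 3 | 25 | 160,156 | 160,205 | AGCTCAGGAACTATAGAATCTCTCA (×2)
Arctium lappa (NC_058644.1) | 4 | 25 | 234,411 | 234,460 | AGCTCAGGAACTATAGAATCTCTCA (×2)
Saussurea costus (NC_059793.1) | 1 | 13 | 12,844 | 12,869 | AGAATCTAAAATC (×2)
Saussurea costus (NC_059793.1) | 3 | 21 | 152,667 | 152,710 | AGGTAAAACAGTACGCCCACT (×2)
Saussurea costus (NC_059793.1) | 4 | 20 | 205,751 | 205,791 | GAATAAAGATAAGTACAAAA (×2)
Saussurea costus (NC_059793.1) | 5 | 21 | 272,092 | 272,153 | CTATAAGATGCTAGCTGAAAT (×2)
Saussurea costus (NC_059793.1) | 6 | 21 | 285,818 | 285,879 | TTTCAGCTAGCATCTTATAGA (×2)
Saussurea costus (NC_059793.1) | 7 | 25 | 306,355 | 306,404 | AGCTCAGGAACTATAGAATCTCTCA (×2)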
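When comparing repeat units between mitogenomes, a repeat may be reported on the opposite strand. For instance, the 25-bp unit listed in Table 2 for Silybum marianum (TGAGAGATTCTATAGTTCCTGAGCT) is the reverse complement of the 25-bp unit listed for Saussurea costus. The sketch below checks for shared units on either strand using only the sequences printed in Table 2; the function names are hypothetical, and this is not necessarily how the comparison was performed in the study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def shared_repeats(repeats_a, repeats_b):
    """Repeat units from A that occur in B on either strand."""
    b = set(repeats_b)
    return [r for r in repeats_a if r in b or reverse_complement(r) in b]


# Repeat units as printed in Table 2.
silybum = [
    "GAAAAGGGTATGAAATAGGTTGCTTGT",
    "AGGTAAAACAGTACGCCCACT",
    "AACCAGGCAAATCTCTCTG",
    "TGAGAGATTCTATAGTTCCTGAGCT",
]
saussurea = [
    "AGAATCTAAAATC",
    "AGGTAAAACAGTACGCCCACT",
    "GAATAAAGATAAGTACAAAA",
    "CTATAAGATGCTAGCTGAAAT",
    "TTTCAGCTAGCATCTTATAGA",
    "AGCTCAGGAACTATAGAATCTCTCA",
]
shared = shared_repeats(silybum, saussurea)
```

Matching on both strands recovers the two units described in the text as shared between the Silybum marianum and Saussurea costus mitogenomes.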

Fig. 5. Comparison of simple-sequence repeats (SSRs) among the four indicated mitogenomes. Each column represents different repeat types. The numbers of repeats in each category are shown on the top of the corresponding columns.

In this study, the Pi values of the four mitogenomes varied at different positions (Fig. 6). Eight regions had Pi values of >0.5 (nucleotide positions 34,711-34,964, 198,523-198,775, 207,021-207,400, 208,117-208,242, 209,467-209,616, 210,848-210,988, 214,087-214,217, and 216,649-216,770), and two regions had Pi values of <0.01 (nucleotide positions 322,723-333,182 and 353,080-356,650). In the Silybum marianum mitogenome, the coding genes trnS-UGA, rps4, and cox2 mapped to regions with Pi values higher than 0.5, and the coding genes trnK-UUU, nad6, trnL-GAG, trnE-UUC, trnV-CAC, and trnsec-UCA mapped to regions with Pi values less than 0.01.

Fig. 6. Sliding-window plot showing nucleotide differences as evaluated by determining the nucleotide diversity (Pi) for the indicated four mitogenomes. The X-axis shows the positions of mitogenomes in 50-kb increments, whereas the Y-axis shows Pi values. The color density reflects the corresponding Pi values, as indicated in the legend.

Nucleotide-substitution rates (Ka: Ks ratios) are used to understand the evolutionary dynamics of protein-coding genes in closely related species (Fay & Wu 2003). The Ka: Ks ratios can be interpreted to indicate evolutionary selective pressure: neutral evolution when the Ka: Ks ratio is 1, positive selection when the Ka: Ks ratio is >1, and negative selection when the Ka: Ks ratio is <1 (Zhang et al. 2006). Twenty common protein-coding genes (atp1, atp4, atp6, atp8, atp9, ccmB, ccmFc, ccmFn, cox1, cox3, nad3, nad4L, nad6, nad7, rpl5, rpl10, rps3, rps4, rps12, and rps13) were selected based on sequence-size similarity among the four mitogenomes. The nucleotide-substitution rates of common protein-coding genes were estimated for the Silybum marianum mitogenome and three reference mitogenomes (Fig. 7). The Ka: Ks ratio of ccmB was >1, and those of cox1, rps13, rps12, atp4, nad4L, atp6, rpl5, nad3, rpl10, ccmFc, atp8, cox3, and rps3 were <1, suggesting that both positive and negative selection occurred during evolution. The numbers of protein-coding genes with Ka: Ks ratios of <1 were 18, 16, and 15 in the comparisons with the mitogenomes of Arctium tomentosum, Arctium lappa, and Saussurea costus, respectively. In this study, the average Ka: Ks ratios of most protein-coding genes were <1. These Ka: Ks ratios indicate that negative selection acted to conserve those genes during evolution.

Fig. 7. The nucleotide-substitution rates (Ka: Ks ratios) of 20 common protein-coding genes for the indicated four mitogenomes. Each column in the heatmap represents Ka: Ks ratios determined by comparison with the Silybum marianum mitogenome. The color density of the heatmap reflects the Ka: Ks ratio, as indicated in the legend.
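The interpretation rule for Ka: Ks ratios stated above (1 for neutral evolution, >1 for positive selection, <1 for negative selection) can be written as a small helper. The sketch below is illustrative only: the function name, the tolerance parameter, and the example values are hypothetical, and the Ka and Ks estimates themselves would come from a tool such as the KaKs Calculator.

```python
def classify_selection(ka, ks, tolerance=1e-9):
    """Classify selective pressure from Ka and Ks using the Ka/Ks = 1 threshold."""
    if ks <= 0:
        return "undefined"  # the ratio cannot be computed when Ks is zero
    ratio = ka / ks
    if abs(ratio - 1.0) <= tolerance:
        return "neutral"
    return "positive" if ratio > 1.0 else "negative"


# Hypothetical (Ka, Ks) pairs keyed by placeholder gene labels.
toy_rates = {
    "gene_a": (0.020, 0.010),
    "gene_b": (0.001, 0.050),
    "gene_c": (0.030, 0.030),
    "gene_d": (0.000, 0.000),
}
calls = {gene: classify_selection(ka, ks) for gene, (ka, ks) in toy_rates.items()}
```

Genes with no synonymous substitutions (Ks = 0) are reported as undefined rather than forced into a category, since identical or nearly identical sequences give no usable ratio.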

In this study, we assembled and annotated the mitogenome of Silybum marianum and compared common protein-coding genes between the mitogenomes of Silybum marianum and reference plants (11 members of the Asteraceae family and four outgroup plants) to construct a phylogenetic tree. Phylogenetic analysis using common protein-coding genes showed that the Silybum marianum mitogenome was closely related to the mitogenomes of three reference Asteraceae family plants (Arctium tomentosum, Arctium lappa, and Saussurea costus). Genomic features, including repeat elements, were compared among the four mitogenomes (Silybum marianum, Arctium tomentosum, Arctium lappa, and Saussurea costus). The SSR values were similar in the four mitogenomes. In terms of perfect tandem repeats, the tandem repeat sequences of two mitogenomes (Arctium tomentosum and Arctium lappa) were identical, and the mitogenome of Silybum marianum shared one or two identical tandem repeat sequences with each of the three mitogenomes. These four mitogenomes were used to evaluate nucleotide differences, which revealed regions of relatively high and low nucleotide diversity along the mitogenomes. When analyzing the nucleotide-substitution rates, we found that the Ka: Ks ratios between the mitogenome of Silybum marianum and the three reference mitogenomes were mostly <1 for the common protein-coding genes. These values suggest that common protein-coding genes in the mitogenome of Silybum marianum were conserved during evolution.

Plant mitogenomes can be used to analyze phylogenetic relationships among plant species. Zervas et al. (2019) compared the substitution rates of holoparasitic, hemiparasitic, and autotrophic plants by constructing a phylogenetic tree for angiosperms. In that study, the mitogenomes of Viscaceae were unique among parasitic plants with regard to their mitogenome contents and evolutionary substitution rates. In another study, Chang et al. (2013) studied the genome structures of soybean plants and gene evolution at the intercellular and phylogenetic levels. They used the mitogenome of representative soybean species with conserved genes to construct a phylogenetic tree to analyze mitogenome evolution. Their phylogenetic tree showed that intercellular transfer (loss or acquisition) occurred with genes of the soybean mitogenome and implied that gene loss from the mitogenome in seed plants may be considered a form of evolutionary compaction.

Cirsium, a genus of the Asteraceae family that comprises annual or perennial herbs, is distributed throughout northern Africa, Asia, Central and North America, and Europe (Song & Kim 2007). This genus comprises approximately 250-300 species worldwide, which can be delineated by their characteristics in terms of cypsela size, color, surface ornamentation, and pericarp and testa structures (Ghimire et al. 2018). Cirsium japonicum var. maackii (Maxim.) Matsum is known as Korean milk thistle and contains flavonoid compounds with pharmacological effects in various parts of the plant (Lee et al. 2017). The results of a study conducted by Jung et al. (2017) demonstrated that the methanol extracts and flavonoids from Cirsium japonicum var. maackii (Maxim.) Matsum protected human hepatocellular carcinoma (HepG2) cells against oxidative damage and that they are potential natural antioxidative biomarkers of oxidative stress-induced hepatotoxicity. Park et al. (2004) described the pharmacological properties of a methanol extract and hispidulin 7-O-neohesperidoside isolated from Cirsium japonicum var. ussuriense. Their results showed that the extract and compound decreased hepatic lipid peroxidation and increased hepatic levels of reduced glutathione, suggesting that the plant may affect alcoholic toxicity by enhancing ethanol oxidation and inhibiting lipid peroxidation. Additionally, other findings, including transcriptomic analyses, have supported the hepatoprotective efficacy of flavonoids from the Cirsium genus (Mok et al. 2011, Yoo & Bae 2012, Park et al. 2020). The authors of those studies concluded that Cirsium is similar to Silybum marianum in terms of its hepatoprotective properties.

The results of this study demonstrate that the mitogenome of Silybum marianum is closely related to three reference mitogenomes (Arctium tomentosum, Arctium lappa, and Saussurea costus) and that the phylogenetic relationships inferred from the mitogenome are consistent with the flower morphology of these Asteraceae family members. Previous data suggest that Silybum marianum is morphologically and pharmacologically similar to Cirsium (Ma et al. 2016, Nam et al. 2018). Therefore, additional Asteraceae mitogenomes, particularly from plants in the Cirsium genus, may be needed to specifically analyze the phylogenetic relationship of Silybum marianum within the Asteraceae family based on mitogenomes.


Acknowledgments

This research was supported by the Rural Development Administration of South Korea under Project Number PJ015988.

  1. Ackerfield J, Susanna A, Funk V, Kelch D, Park DS, Thornhill AH, Yildiz B, Arabaci T, Dirmenci T. 2020. A prickly puzzle: generic delimitations in the Carduus‐Cirsium group (Compositae: Cardueae: Carduinae). Taxon 69: 715-738.
  2. Benson G. 1999. Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res 27: 573-580.
  3. Berner DK, Paxson LK, Bruckart WL, Luster DG, McMahon M, Michael JL. 2002. First report of Silybum marianum as a host of Puccinia punctiformis. Plant Dis 86: 1271-1271.
  4. Burger G, Gray MW, Lang BF. 2003. Mitochondrial genomes: anything goes. Trends Genet 19: 709-716.
  5. Chang S, Wang Y, Lu J, Gai J, Li J, Chu P, Guan R, Zhao T. 2013. The mitochondrial genome of soybean reveals complex genome structures and gene evolution at intercellular and phylogenetic levels. PLoS One 8: e56502.
  6. Deep G, Oberlies NH, Kroll DJ, Agarwal R. 2008. Identifying the differential effects of silymarin constituents on cell growth and cell cycle regulatory molecules in human prostate cancer cells. Int J Cancer 123: 41-50.
  7. Elomaa P, Zhao Y, Zhang T. 2018. Flower heads in Asteraceae-recruitment of conserved developmental regulators to control the flower-like inflorescence architecture. Hortic Res 5: 1-10.
  8. Fay JC, Wu CI. 2003. Sequence divergence, functional constraint, and selection in protein evolution. Annu Rev Genomics Hum Genet 4: 213-235.
  9. Ghimire B, Suh GU, Lee CH, Heo K, Jeong MJ. 2018. Cypsela morphology of Cirsium species (Asteraceae) and its taxonomic implications. Flora 249: 40-52.
  10. Glanz S, Kück U. 2009. Trans‐splicing of organelle introns-a detour to continuous RNAs. Bioessays 31: 921-934.
  11. Gualberto JM, Mileshina D, Wallet C, Niazi AK, Weber-Lotfi F, Dietrich A. 2014. The plant mitochondrial genome: dynamics and maintenance. Biochimie 100: 107-120.
  12. Hetz E, Liersch R, Schieder O. 1993. The ratio of auto-and xenogamy in Silybum marianum. Planta Med 59(S1): A702-A702.
  13. Jung HA, Abdul QA, Byun JS, Joung EJ, Gwon WG, Lee MS, Kim HR, Choi JS. 2017. Protective effects of flavonoids isolated from Korean milk thistle Cirsium japonicum var. maackii (Maxim.) Matsum on tert-butyl hydroperoxide-induced hepatotoxicity in HepG2 cells. J Ethnopharmacol 209: 62-72.
  14. Karkanis A, Bilalis D, Efthimiadou A. 2011. Cultivation of milk thistle (Silybum marianum L. Gaertn.), a medicinal weed. Ind Crops Prod 34: 825-830.
  15. Katoh K, Standley DM. 2013.
  16. Kenji O, Yamato K, Ohta E, Nakamura Y, Takemura M, Nozato N, Akashi K, Kanegae T, Ogura Y, Kohchi Y, Ohyama K. 1992. Gene organization deduced from the complete sequence of liverwort Marchantia polymorpha mitochondrial DNA: a primitive form of plant mitochondrial genome. J Mol Biol 223: 1-7.
  17. Khan MA, Blackshaw RE, Marwat KB. 2009. Biology of milk thistle (Silybum marianum) and the management options for growers in north‐western Pakistan. Weed Biol Manag 9: 99-105.
  18. Knoop V, Schuster W, Wissinger B, Brennicke A. 1991. Trans splicing integrates an exon of 22 nucleotides into the nad5 mRNA in higher plant mitochondria. EMBO J 10: 3483-3493.
  19. Lee J, Rodriguez JP, Lee KH, Park JY, Kang KS, Hahm DH, Huh CK, Lee SC, Lee S. 2017. Determination of flavonoids from Cirsium japonicum var. maackii and their inhibitory activities against aldose reductase. Appl Biol Chem 60: 487-496.
  20. Leng-Peschlow E. 1996. Properties and medical use of flavonolignans (silymarin) from Silybum marianum. Phytother Res 10: s25-s26.
  21. LeRoy H, Doll J, Holm E, Pancho JV, Herberger JP. 1997. World weeds: Natural histories and distribution. 1st ed. John Wiley & Sons. p. 775-786.
  22. Ma Q, Wang LH, Jiang JG. 2016. Hepatoprotective effect of flavonoids from Cirsium japonicum DC on hepatotoxicity in comparison with silymarin. Food Funct 7: 2179-2184.
  23. Mehmetoglu E, Kaymaz Y, Ates D, Kahraman A, Tanyolac MB. 2022. The complete chloroplast genome sequence of Cicer echinospermum, genome organization and comparison with related species. Sci Hortic 296: 110912.
  24. Mok JY, Kanh HJ, Cho JK, Jeon IH, Kim HS, Park JM, Jeong SI, Shim JS, Jang SI. 2011. Antioxidative and anti-inflammatory effects of extracts from different organs of Cirsium japonicum var. ussuriense. Kor J Herbol 26: 39-47.
  25. Morley SA, Nielsen BL. 2017. Plant mitochondrial DNA. Front Biosci 22: 1023-1032.
  26. Nam SH, Lee BH, Kim YJ. 2018. Silymarin contents and liver protection effects of six domestic cultivated thistles. Trends Agric Life Sci 56: 55-62.
  27. Palmer JD, Adams KL, Cho Y, Parkinson CL, Qiu YL, Song K. 2000. Dynamic evolution of plant mitochondrial genomes: mobile genes and introns and highly variable mutation rates. Proc Natl Acad Sci 97: 6960-6966.
  28. Park JC, Hur JM, Park JG, Kim SC, Park JR, Choi SH, Choi JW. 2004. Effects of methanol extract of Cirsium japonicum var. ussuriense and its principle, hispidulin‐7‐O‐neohesperidoside on hepatic alcohol‐metabolizing enzymes and lipid peroxidation in ethanol‐treated rats. Phytother Res 18: 19-24.
  29. Park YJ, Baek SA, Kim JK, Park SU. 2020. Integrated analysis of transcriptome and metabolome in Cirsium japonicum Fisch ex DC. ACS Omega 5: 29312-29324.
  30. Pereira de Souza A, Jubier MF, Delcher E, Lancelin D, Lejeune B. 1991. A trans-splicing model for the expression of the tripartite nad5 gene in wheat and maize mitochondria. Plant Cell 3: 1363-1378.
  31. Richardson AO, Rice DW, Young GJ, Alverson AJ, Palmer JD. 2013. The "fossilized" mitochondrial genome of Liriodendron tulipifera: ancestral gene content and order, ancestral editing sites, and extraordinarily low mutation rate. BMC Biol 11: 1-17.
  32. Saller R, Meier R, Brignoli R. 2001. The use of silymarin in the treatment of liver diseases. Drugs 61: 2035-2063.
  33. Shim J, Han JH, Shin NH, Lee JE, Sung JS, Yu Y, Lee S, Ahn KH, Chin JH. 2020. Complete chloroplast genome of a milk thistle (Silybum marianum) Acc.'912036'. Plant Breed Biotechnol 8: 439-444.
  34. Song MJ, Kim H. 2007. Taxonomic study on Cirsium Miller (Asteraceae) in Korea based on external morphology. Kor J Plant Taxon 37: 17-40.
  35. Susin SA, Lorenzo HK, Zamzami N, Marzo I, Snow BE, Brothers GM, Mangion J, Jacotot E, Costantini P, Loeffler M, Larochette N, Goodlett DR, Aebersold R, Siderovski DP, Penninger JM, Kroemer G. 1999. Molecular characterization of mitochondrial apoptosis-inducing factor. Nature 397: 441-446.
  36. Unseld M, Marienfeld JR, Brandt P, Brennicke A. 1997. The mitochondrial genome of Arabidopsis thaliana contains 57 genes in 366,924 nucleotides. Nat Genet 15: 57-61.
  37. Valková V, Ďúranová H, Bilcikova J, Habán M. 2021. Milk thistle (Silybum marianum): a valuable medicinal plant with several therapeutic purposes. J Microbiol Biotechnol Food Sci 9: 836-843.
  38. Yoo SK, Bae YM. 2012. Phylogenetic and chemical analyses of Cirsium pendulum and Cirsium setidens inhabiting Korea. J Life Sci 22: 1120-1125.
  39. Zervas A, Petersen G, Seberg O. 2019. Mitochondrial genome evolution in parasitic plants. BMC Evol Biol 19: 1-14.
  40. Zhang H, Gong H, Zhou X. 2009. Molecular characterization and pathogenicity of tomato yellow leaf curl virus in China. Virus Genes 39: 249-255.
  41. Zhang Z, Li J, Zhao XQ, Wang J, Wong GK, Yu J. 2006. KaKs_Calculator: calculating Ka and Ks through model selection and model averaging. Genom Proteom Bioinform 4: 259-263.
  42. Zoulias N, Duttke SHC, Garcês H, Spencer V, Kim M. 2019. The role of auxin in the pattern formation of the Asteraceae flower head (capitulum). Plant Physiol 179: 391-401.
