ISSN : 2287-5174(Online)
DOI : https://doi.org/10.9787/KJBS.2012.44.4.490
Development of EST-SSRs in Brassica napus
Molecular markers have found versatile applications in plant genetics, breeding, and ecology over the past three decades. Among the many molecular markers, SSRs (simple sequence repeats), or microsatellites, are often the preferred choice because the SSR marker system offers high variability, easy detection, codominant inheritance, high transferability among related species, and repeatability between research groups (see Jones et al. 2009 for a review, and references therein). An SSR is a tandem DNA repeat with a unit of 1–10 nucleotides, and SSRs are widely distributed in all eukaryotes. Repeat numbers are prone to vary through replication slippage (Strand et al. 1993) or unequal crossing over (Jeffreys et al. 1998). This hypervariability makes SSRs robust markers for many aspects of genetics and breeding programs in crops and wild species (Moe et al. 2012). The distribution and density of SSRs vary among species, for example one per 8.57 kb in Arabidopsis thaliana and one per 7.25 kb in Brassica species (Parida et al. 2010). SSRs can be located in genic or intergenic regions, including 5’- and 3’-untranslated regions (UTRs), introns, and exons (Subramanian et al. 2003, Zhang et al. 2004).
Traditionally, SSRs have been developed from genomic libraries containing small inserts, which is laborious because the frequency of colonies containing SSR motifs is relatively low. Alternatively, an SSR-enrichment step can be incorporated into the protocol before library construction to bypass the low-frequency problem (Kwon et al. 2005, Lee et al. 2008). Nevertheless, SSRs from genomic libraries derive not only from genic regions but also from gene-poor heterochromatic regions. In contrast to the mixed origin of library-derived genomic SSRs, EST-SSRs (expressed sequence tag-SSRs) are derived from functional genes (Park et al. 2009). With high-throughput sequencing technologies, the volume of transcriptome and EST sequences in GenBank has grown rapidly, and these sequences are good sources for SSR mining (Gao et al. 2011, Nakatsuji et al. 2011, Lee et al. 2011, Dhandapani et al. 2012). While non-genic SSRs are generally deemed to be evolutionarily neutral, genic EST-SSRs may have played adaptive roles in the host species. Thus, variation in genic EST-SSRs can reflect past selection that affects population structure. This is particularly useful for characterizing crop landraces, which have undergone deliberate human selection for many generations (Holderegger et al. 2006).
Brassica napus, also known as rapeseed or oilseed rape, is an important oil crop worldwide. The limited availability of fossil fuel and the demand for clean energy have drawn considerable attention from both the scientific and public sectors to rapeseed oil as a source of diesel oil. SSRs in Brassica species have been reported by numerous researchers (Lowe et al. 2004, Hasan et al. 2006, Batley et al. 2007, Gao et al. 2011, Ge et al. 2011). B. napus (2n=4x=38, AACC) is an allotetraploid species derived from interspecific hybridization between Brassica rapa (2n=2x=20, AA) and B. oleracea (2n=2x=18, CC). Mutations at SSR loci promote genetic diversity, which often leads to adaptation and subsequent speciation (Kashi and King, 2006). Allopolyploidization can induce SSR evolution, and the genetic diversity created by SSR mutation can increase host adaptability to environmental challenges (Tang et al. 2009). B. napus has a high degree of self-incompatibility, so information on SSR variation among breeding lines may help in planning B. napus breeding programs to induce maximum heterosis.
This study has two aims: (1) to identify and characterize SSRs from the EST database in GenBank; and (2) to analyze SSR variation among rapeseed breeding lines in Korea for breeding purposes.
MATERIALS AND METHODS
EST clones with SSR motifs longer than 20 bp (>10 repeats in di-nucleotide repeat motifs and >7 repeats in tri-nucleotide repeat motifs) were selected from the B. napus EST sequences at the National Center for Biotechnology Information (NCBI) using a Perl computer program (Temnykh et al. 2001). Redundant sequences were removed from the mined ESTs with the protocol of Huang et al. (2010) at a cutoff of 90% identity. Primers were designed with the ARGOS software program (Kim, 2004) to produce amplicons of 150-400 bp at melting temperatures of 53-65°C.
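The repeat-number screen described above can be illustrated with a short sketch. This is not the Perl program used in the study; it is a minimal regular-expression search, and it assumes that the thresholds mean "at least 10" di-nucleotide and "at least 7" tri-nucleotide repeats, since the repeat numbers reported later start at 10 and 7. The function name `find_ssrs` and the example sequence are hypothetical.

```python
import re

# Assumed thresholds: minimum number of repeat units per motif length.
# The text's ">10" and ">7" are read here as "at least 10" and "at least 7".
MIN_REPEATS = {2: 10, 3: 7}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect di-/tri-nucleotide SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one repeat unit; \1{n-1,} demands n-1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                # Skip homopolymer runs such as AAAA..., which are not di/tri SSRs.
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return hits

# Hypothetical EST fragment: 12 x AG and 8 x GAA embedded in flanking sequence.
example = "GGCT" + "AG" * 12 + "TTCC" + "GAA" * 8 + "CC"
for start, motif, count in find_ssrs(example):
    print(start, motif, count)
```

A regular expression only finds perfect repeats and reports the motif in whatever register the match starts, so interrupted or compound SSRs would need a more tolerant search.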
Plant materials and DNA extraction
Ten rapeseed breeding lines or cultivars (Sunmang, Tammi, Tamla, Naehan, Youngsan, Halla, Mokpo68, Mokpo111, Mokpo113, and Mokpo114) developed at the Oil Crop Research Centre, RDA, Mokpo, Korea were used to check PCR amplification and polymorphism. Genomic DNA was isolated from young seedlings with the DNeasy Plant DNA Maxi Kit (Qiagen, USA) following the supplier's protocol. The DNA concentration was adjusted to 50 ug/ul by comparison with a lambda genomic DNA control on a 0.8% agarose gel.
PCR and electrophoresis
PCR was conducted in a 25 ul reaction mix containing 2.5 ul of 10x reaction buffer (50 mM KCl, 20 mM Tris-HCl, pH 8.0, and 2.0 mM MgCl2), 2.5 mM of each dNTP, 0.1 uM primers, 20 ng of template DNA, and 0.5 unit of Taq DNA polymerase (Intron Bio, Korea). The PCR program consisted of 94°C for 2 minutes; 35 cycles of 94°C for 30 seconds, 55°C for 30 seconds, and 72°C for 1 minute; and a final extension at 72°C for 10 minutes. The amplified products were electrophoresed in a 6% denaturing polyacrylamide gel in a conventional PAGE system for 2 hours at 1,800 volts. The separated DNA fragments were visualized by silver staining (Promega, USA).
EST-SSR mining from B. napus database
We isolated 7,802 EST-SSRs from the 643,947 B. napus EST sequences in NCBI. The numbers of EST-SSRs with di-nucleotide and tri-nucleotide repeat motifs were 3,823 and 3,979, respectively (Fig. 1). Among the di-nucleotide EST-SSRs, 94 contained >10 repeats; one of these clones was redundant, so 93 EST-SSRs were suitable for primer design. Among the tri-nucleotide EST-SSRs, 220 contained >7 repeats; 10 of these clones were redundant, so 210 tri-nucleotide EST-SSRs were used for primer design.
Fig. 1. The strategy for developing the EST-SSRs in B. napus.
Although the numbers of di- and tri-nucleotide EST-SSRs did not differ much among the 7,802 EST-SSR sequences, tri-nucleotide EST-SSRs were more than twice as frequent as di-nucleotide EST-SSRs (210 vs. 93) after the cut-off screening for repeat motif lengths of >20 bp.
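The counts in this section can be checked with a few lines of arithmetic. The sketch below uses only the numbers reported above; the variable names are ours, and the percentage of ESTs carrying an SSR is a derived figure that is not stated in the text.

```python
# Counts reported in the text.
total_ests = 643_947
di_all, tri_all = 3_823, 3_979          # all di- and tri-nucleotide EST-SSRs
di_long, di_redundant = 94, 1           # di-nucleotide SSRs passing the cut-off
tri_long, tri_redundant = 220, 10       # tri-nucleotide SSRs passing the cut-off

ssr_total = di_all + tri_all
di_final = di_long - di_redundant
tri_final = tri_long - tri_redundant
primer_candidates = di_final + tri_final

print("EST-SSRs in total:", ssr_total)
print("ESTs with an SSR (%):", round(100 * ssr_total / total_ests, 2))
print("Primer candidates:", di_final, "+", tri_final, "=", primer_candidates)
print("tri : di ratio before cut-off:", round(tri_all / di_all, 2))
print("tri : di ratio after cut-off:", round(tri_final / di_final, 2))
```

The ratio of tri- to di-nucleotide EST-SSRs moves from close to 1 before the cut-off to above 2 after it, which is the shift described in the paragraph above.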
Characterization of the EST-SSRs containing longer than 20 bp in the repeat motifs
As explained in the previous section, the cut-off for primer design for the di- and tri-nucleotide motif EST-SSRs was a repeat motif length of 20 bp (>10 repeats in di-nucleotide motifs and >7 repeats in tri-nucleotide motifs). Table 1 shows the distribution of the EST-SSRs by number of repeats and by repeat motif sequence. Of the 16 possible di-nucleotide combinations, only three motif types (AC/GT, AG/CT, and AT/TA) were present, and the AG/CT repeat motif was the most predominant, at 67%. The number of repeats was mostly in the range of 10 to 15.
Of the 64 possible tri-nucleotide motifs, 27 were observed among the 210 tri-nucleotide repeat EST-SSRs. The most frequent tri-nucleotide motifs were AGA/TCT, GAA/TTC, and AAG/CTT. These three types account for 35.7% of the isolated tri-nucleotide motif EST-SSRs. The number of repeats was mostly in the range of 7 to 9.
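How motifs are paired into types such as AG/CT depends on the equivalence rule applied. The sketch below is an illustration under our own assumptions, not the classification used for Table 1: it groups repeat units either by reverse complement alone or by reverse complement plus cyclic shifts (the same tandem repeat read in a different register). The function names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(motif):
    """Motif as read on the opposite DNA strand."""
    return motif.translate(COMPLEMENT)[::-1]

def canonical(motif, cyclic=True):
    """Alphabetically smallest spelling among equivalent repeat units."""
    forms = {motif, reverse_complement(motif)}
    if cyclic:
        # A tandem repeat of AGA is also a tandem repeat of GAA and AAG.
        forms = {f[i:] + f[:i] for f in forms for i in range(len(f))}
    return min(forms)

def motif_classes(k, cyclic=True):
    """Distinct non-homopolymer motif classes of unit length k."""
    units = ("".join(p) for p in product("ACGT", repeat=k))
    return sorted({canonical(u, cyclic) for u in units if len(set(u)) > 1})

print(motif_classes(2))                    # di-nucleotide classes
print(len(motif_classes(3)))               # tri-nucleotide classes, with shifts
print(len(motif_classes(3, cyclic=False))) # reverse-complement pairs only
print({m: canonical(m) for m in ("AGA", "GAA", "AAG", "TCT", "TTC", "CTT")})
```

Under the cyclic rule, the three most frequent tri-nucleotide types reported above (AGA/TCT, GAA/TTC, and AAG/CTT) fall into a single class, so whether they are counted as one motif or three is a matter of convention rather than biology.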
Table 1. Distribution of B. napus EST-SSRs and efficiency of marker development.
PCR amplification and polymorphisms of the 303 EST-SSRs among 10 rapeseed cultivars
PCR amplification of the screened EST-SSRs was tested in 10 rapeseed cultivars or breeding lines, which are the major breeding lines developed at the Oilseed Research Station, RDA, Korea. Of the 303 primer pairs for the EST-SSRs, 234 primer pairs showed amplification in all 10 control breeding lines and cultivars, and 142 primer pairs revealed polymorphisms (Table 2). The sequences of the 234 PCR-amplifiable SSR primer pairs are shown in Table 3.
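As a quick check on these figures, the rates can be computed directly from the three reported counts. Only the counts come from the text; the two polymorphism percentages are derived here and depend on which denominator is chosen.

```python
# Counts reported in the text.
designed = 303      # primer pairs designed
amplified = 234     # primer pairs amplifying in all 10 lines
polymorphic = 142   # primer pairs revealing polymorphism

def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

amplification_rate = percent(amplified, designed)
polymorphic_of_designed = percent(polymorphic, designed)
polymorphic_of_amplified = percent(polymorphic, amplified)

print("Amplification rate (%):", amplification_rate)
print("Polymorphic, of designed pairs (%):", polymorphic_of_designed)
print("Polymorphic, of amplified pairs (%):", polymorphic_of_amplified)
```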
Table 2. Frequency and distribution of different SSR motif types used to design EST-SSR primer pairs in B. napus.
Table 3. EST-SSR primer pairs developed from B. napus.
When the polymorphism rate was compared with SSR repeat length, the repeat length did not appear to be related to polymorphism (Fig. 2a). However, the repeat number showed a clear relationship with polymorphism (Fig. 2b). EST-SSRs with more than 12 repeats were more polymorphic than EST-SSRs with fewer than 12 repeats.
Fig. 2. Relationship between SSR length and polymorphism rate (a), and between repeat number and polymorphism rate (b).
The SSR system is a highly preferred molecular marker system because of its high repeatability and other advantages (Jones et al. 2009, Park et al. 2009, Moe et al. 2012). SSRs in Brassica species have been described in several reports (Lowe et al. 2004, Hasan et al. 2006, Batley et al. 2007, Gao et al. 2011, Ge et al. 2011). Recently, Cheng et al. (2009) reported 627 SSR markers derived from a BamHI BAC library of B. napus. The EST-SSRs in the current report thus add further markers to the SSR resources available for B. napus. The SSR motif distribution reported by Cheng et al. differed considerably from our result. While they reported di-, tri-, tetra-, and penta-nucleotide motifs, we only observed di- and tri-nucleotide motifs. Among di-nucleotide repeats, they also found AC/GT, AG/CT, and AT/TA motifs, but the AT/TA motif was the most frequent, in contrast to the highest frequency of AG/CT in our result. In a survey of SSR motifs in 18 different plant species, Park et al. (2009) also reported the AC/GT, AG/CT, and AT/TA repeat motifs, and the frequency of each motif differed among species. Among tri-nucleotide motif SSRs, the 27 types of repeat motifs in our dataset are more diverse than the 10 types of Cheng et al. (2009). While our result shows that AGA/TCT, GAA/TTC, and AAG/CTT are most frequent, Cheng’s group reported AAG/CTT as the most frequent, and AGA/TCT and GAA/TTC repeats were not present in their report. This difference may arise because our SSRs were derived from an EST database, whereas theirs were from genomic sequences. Although SSRs are ubiquitous in genic and intergenic regions, including 5’- and 3’-untranslated regions (UTRs), introns, and exons (Subramanian et al. 2003, Zhang et al. 2004), EST-SSRs represent only the exonic regions. Tóth et al. (2000) reported that di-nucleotide motifs are more abundant than tri-nucleotide motifs in intronic SSRs, and vice versa in exonic SSRs. They also showed that the frequencies of individual sequence motifs, for both di- and tri-nucleotide repeats, differ between intronic and exonic SSRs across eukaryotic species.
Our result showed that 77.2% of the SSR primer sets amplified in the 10 control rapeseed cultivars or breeding lines, which is lower than the 94% reported by Cheng et al. (2009). Amplification success rates similar to ours were observed in radish (80.6%; Nakatsuji et al. 2011), peanut (86.5%; Liang et al. 2009), and coffee (Aggarwal et al. 2007). Amplification failure of EST-SSR primers might be due to introns, because ESTs do not contain intron sequences whereas the genomic DNA template does. If a long intron or multiple introns lie between a designed primer pair, PCR may not amplify the long target effectively. Another possibility is that an intron interrupts the primer binding site itself in genomic DNA, which results in amplification failure. In Cheng’s report, the number of polymorphic primers was higher for di-nucleotide than for tri-nucleotide repeat motif SSRs. In contrast, for the tri-nucleotide motif SSRs, the frequency of polymorphic primer pairs in our result was higher than that of their genomic SSRs, which is congruent with results in other plant species (Aggarwal et al. 2007, Liang et al. 2009, Nakatsuji et al. 2011). Extension or contraction of a tri-nucleotide repeat does not interrupt the coding frame, whereas repeat-number variation in di-nucleotide motifs causes frame-shifts in coding sequences and is therefore likely to have been removed by purifying selection, which accounts for the higher frequency of tri-nucleotide repeats.
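The reading-frame argument can be made concrete with a small sketch. It assumes that the SSR lies within a coding sequence and that alleles differ by a whole number of repeat units; the function name `frame_effect` is hypothetical.

```python
def frame_effect(unit_len, repeat_change):
    """Classify the effect of gaining or losing repeat units on a reading frame.

    unit_len: length of the repeat unit in bp (2 for di-, 3 for tri-nucleotide).
    repeat_change: number of units gained (positive) or lost (negative).
    """
    shift = (unit_len * repeat_change) % 3  # net change in bp, modulo one codon
    return "in-frame" if shift == 0 else "frameshift (%d bp)" % shift

for unit_len in (2, 3):
    for change in (-2, -1, 1, 2, 3):
        print(unit_len, change, frame_effect(unit_len, change))
```

Any gain or loss of tri-nucleotide units leaves the frame intact, whereas di-nucleotide repeats keep the frame only when the change is a multiple of three units (6 bp), so most di-nucleotide alleles in coding sequence would be frame-shifting.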
The lack of a relationship between polymorphism and repeat length contrasts with reports in ryegrass (Jones et al. 2001), barley (Ramsay et al. 2000), and common bean (Yu et al. 2000). The positive relationship between repeat number and polymorphism does not match the report on genomic SSRs of B. napus (Cheng et al. 2009), but is congruent with results in O. sativa (Yang et al. 1994). The molecular mechanisms underlying SSR polymorphism are DNA replication slippage (Strand et al. 1993) and unequal crossing over (Jeffreys et al. 1998). Thus, it is probable that a higher number of repeats results in a higher incidence of both DNA replication slippage and unequal crossing over.
B. napus has become an increasingly important crop, not only for its oil but also for amenity purposes. Because it is an out-crossing crop owing to self-incompatibility, molecular markers are highly desirable for germplasm management and breeding programs. Although SSRs in B. napus have been reported previously (Szewe-McFadden et al. 1996, Cheng et al. 2009), the current report describes the development and utility of EST-SSRs in Korean breeding lines of B. napus, which will be informative for germplasm management and experimental design in rapeseed breeding in Korea.
The authors would like to thank the anonymous reviewers of the manuscript. This research was supported by a grant to NSK and KSK from the Rural Development Administration (Project No. PJ907062).
2.Batley J, Hopkins C, Cogan N, Hand M, Jewell E, Kaur J, Kaur S, Li X, Ling A, Love C. 2007. Identification and characterization of simple sequence repeat markers from Brassica napus expressed sequences. Mol. Ecol. Notes 7: 886-889.
3.Cheng X, Xu J, Xia S, Gu J, Yang Y, Fu J, Qian X, Zhang S, Wu J, Liu K. 2009. Development and mapping of microsatellite markers from genome survey sequences in Brassica napus. Theor. Appl. Genet. 118: 1121-1131.
4.Dhandapani V, Choi SR, Paul P, Kim Y-K, Ramchiary N, Hur Y-K, Lim YP. 2012. Development of EST database and transcriptome analysis in the leaves of Brassica rapa using a newly developed pipeline. Genes Genom. 34: in press
5.Gao C, Tang Z, Yin J, An Z, Fu D, Li J. 2011. Characterization and comparison of gene-based simple sequence repeats across Brassica species. Mol. Genet. Genomics 286: 161-170.
6.Ge Y, Ramchiary N, Wang T, Liang C, Wang N, Wang Z, Choi S-R, Lim Y-P, Piao ZY. 2011. Development and linkage mapping of unigene-derived microsatellite markers in Brassica rapa L. Breed. Sci. 61: 160-167.
7.Hasan M, Seyis F, Badani A, Pons-Kühnemann J, Friedt W, Lühs W, Snowdon R. 2006. Analysis of genetic diversity in the Brassica napus L. gene pool using SSR markers. Genet. Resour. Crop Evol. 53: 793-802.
8.Holderegger R, Kamm U, Gugerli F. 2006. Adaptive vs. neutral genetic diversity: Implications for landscape genetics. Landsc. Ecol. 21: 797-807.
9.Huang T-Y, Niu B, Gao Y, Fu L, Li W. 2010. CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics 26: 680-682.
10.Jeffreys AJ, Murray J, Neumann R. 1998. High resolution mapping of crossovers in human sperm defines a minisatellite-associated recombination hotspot. Mol. Cell 2: 267-273.
11.Jones ES, Dupal MP, Kolliker R, Drayton MC, Forster JW. 2001. Development and characterization of simple sequence repeat (SSR) markers for perennial ryegrass (Lolium perenne L.). Theor. Appl. Genet. 102: 405-415.
12.Jones N, Ougham H, Thomas H, Pasakinskiene I. 2009. Markers and mapping revisited: finding your gene. New Phytol. 183: 935-966.
13.Kashi Y, King DG. 2006. Simple sequence repeats as advantageous mutators in evolution. Trends Genet. 22: 253-259.
14.Kim KY. 2004. Developing one step program (SSR Manager) for rapid identification of clones with SSRs and primer designing. MS Thesis, Seoul National University, Seoul, Korea.
15.Kwon SJ, Lee JK, Kim NS, Yu JW, Dixit A, Cho EG, Park YJ. 2005. Isolation and characterization of microsatellite markers in Perilla frutescens Brit. Mol. Ecol. Notes 5: 455-457.
16.Lee SI, Park KC, Song YS, Son JH, Kwon SJ, Na J-K, Kim JH, Kim NS. 2011. Development of expressed sequence tag derived-simple sequence repeats in the genus Lilium. Genes Genom. 33: 727-733.
17.Lee JR, Hong GY, Dixit A, Chung JW, Ma KH, Lee JH, Kang HK, Cho YH, Gwag JG, Park YJ. 2008. Characterization of microsatellite loci developed for Amaranthus hypochondriacus and their cross-amplifications in wild species. Conserv. Genet. 9: 243-246.
18.Liang X, Chen X, Hong Y, Liu H, Zhou G, Li S, Guo B. 2009. Utility of EST-derived SSR in cultivated peanut (Arachis hypogaea L.) and Arachis wild species. BMC Plant Biol. 9: 35.
19.Lowe A, Moule C, Trick M, Edwards K. 2004. Efficient large-scale development of microsatellites for marker and mapping applications in Brassica species. Theor. Appl. Genet. 108: 1103-1112.
20.Moe KT, Kwon SW, Park YJ. 2012. Trends in genomics and molecular marker systems for the development of some underutilized crops. Genes Genom. 34:
21.Nakatsuji R, Hashida T, Matsumoto N, Tsuro M, Kubo N, Hirai M. 2011. Development of genomic and EST-SSR markers in radish (Raphanus sativus L.). Breed. Sci. 61: 413-419.
22.Park YJ, Lee JK, Kim NS. 2009. Simple sequence repeat polymorphisms (SSRPs) for evaluation of molecular diversity and germplasm classification of minor crops. Molecules 14: 4546-4569.
23.Parida SK, Yadava DK, Mohapatra T. 2010. Microsatellites in Brassica unigenes: relative abundance, marker design, and use in comparative physical mapping and genome analysis. Genome 429: 80-89.
24.Ramsay L, Macaulay M, degli Ivanissevich S, MacLean K, Cardle L, Fuller J, Edwards KJ, Tuvesson S, Morgante M, Massari A, Maestri E, Narmirori E, Marmiroli N, Sjakste T, Ganal M, Powell W, Waugh R. 2000. A simple sequence repeat-based linkage map of barley. Genetics 156: 1997-2005.
25.Strand M, Prolla TA, Liskay RM, Petes TD. 1993. Destabilization of tracts of simple repetitive DNA in yeast by mutations affecting DNA mismatch repair. Nature 365: 274-276.
26.Subramanian S, Madgula VM, George R, Kumar S, Pandit MW, Singh L. 2003. SSRD: simple sequence repeats database of the human genome. Comp. Funct. Genomics 4: 342-345.
27.Szewe-McFadden AK, Kresovich S, Bliek SM, Mitchell SE, McFerson JR. 1996. Identification of polymorphic, conserved simple sequence repeats (SSRs) in cultivated Brassica species. Theor. Appl. Genet. 93: 534-538.
28.Tang Z, Fu S, Ren Z, Zou Y. 2009. Rapid evolution of simple sequence repeat induced by allopolyploidization. J. Mol. Evol. 69: 217-228.
29.Temnykh S, DeClerck G, Lukashova A, Lipovich L, Cartinhour S, McCouch S. 2001. Computational and experimental analysis of microsatellites in rice (Oryza sativa L.): Frequency, length variation, transposon associations, and genetic marker potential. Genome Res. 11: 1441–1452.
30.Tóth G, Gáspári Z, Jurka J. 2000. Microsatellites in different eukaryotic genomes: survey and analysis. Genome Res. 10: 967-981.
31.Yang GP, Saghai Maroof MA, Xu CG, Zhang Q, Biyashev RM. 1994. Comparative analysis of microsatellite DNA polymorphism in landraces and cultivars of rice. Mol. Gen. Genet. 245: 187-194.
32.Yu K, Park SJ, Poysa V, Gepts P. 2000. Integration of simple sequence repeat (SSR) marker into a molecular linkage map of common bean (Phaseolus vulgaris L.). J. Hered. 91: 429-434.
33.Zhang LD, Yuan DJ, Yu SW, Li ZG, Cao YF, Miao ZQ, Qian HM, Tang KX. 2004. Preference of simple sequence repeats in coding and non-coding regions of Arabidopsis thaliana. Bioinformatics 20: 1081-1086.