
Genetic diversity and structure of tea plant in Qinba area in China by three types of molecular markers



The Qinba area has a long history of tea planting and is one of the northernmost regions in China where Camellia sinensis L. is grown. To provide basic data for the selection and optimization of molecular markers for tea plants, 118 markers (40 EST-SSR, 40 SRAP and 38 SCoT) were used to evaluate the genetic diversity and assess the population structure of 50 tea plant (Camellia sinensis) samples collected from Qinba tea germplasm.


In this study, a total of 414 alleles were obtained using 38 SCoT primers, with an average of 10.89 alleles per primer. The percentage of polymorphic bands (PPB), polymorphism information content (PIC), resolving power (Rp), effective multiplex ratio (EMR), average band informativeness (Ibav), and marker index (MI) were 96.14%, 0.79, 6.71, 10.47, 0.58, and 6.07, respectively. The 40 SRAP primer pairs amplified 338 alleles (8.45 per primer pair), with PPB, PIC, Rp, EMR, Ibav, and MI values of 89.35%, 0.77, 5.11, 7.55, 0.61, and 4.61, respectively. The 40 EST-SSR primer pairs detected 320 alleles (8.00 per primer pair), with PPB, PIC, Rp, EMR, Ibav, and MI values of 94.06%, 0.85, 4.48, 7.53, 0.56, and 4.22, respectively. These results indicated that SCoT markers had higher marker efficiency than SRAP and EST-SSR markers.

A Mantel test was used to compare the genetic distance matrices generated by EST-SSRs, SRAPs and SCoTs. The correlation between the EST-SSR-based and SRAP-based matrices was the weakest (r = 0.01), followed by SCoT and SRAP (r = 0.17) and SCoT and EST-SSR (r = 0.19).

The 50 tea samples were divided into two sub-populations using STRUCTURE, the neighbor-joining (NJ) method and principal component analysis (PCA). The results produced by STRUCTURE were completely consistent with those of PCA. Furthermore, there was no obvious relationship between sub-population membership and geographical origin.


Among the three types of markers, SCoT markers have advantages in terms of NPB, PPB, Rp, EMR, and MI. The PIC values, however, showed a different trend, with the highest value obtained with EST-SSR, followed by SCoT and SRAP. The average band informativeness likewise did not favor SCoT: it was highest for SRAP (0.61), followed by SCoT (0.58) and EST-SSR (0.56). The correlations between the genetic distances produced by the three molecular markers were very small; it is therefore not recommended to rely on a single marker type to evaluate genetic diversity and population structure, and a combination of different marker types is suggested instead. It also seems important to screen out core markers of Camellia sinensis for each marker type. This study indicated that genes of exotic plant varieties have been continually integrated into the gene pool of Qinba area tea. A low level of genetic diversity was observed, as shown by an average genetic similarity coefficient of 0.74.


Evaluation of genetic diversity and population structure has significant implications for genetic improvement in plant breeding. It is well established that the genetic basis of organisms lies in the genome sequence, and that base-pair substitutions, insertions, deletions, and other alterations give rise to genetic diversity; this diversity is manifested through phenotypic, chromosomal and proteomic differences. DNA molecular markers, which offer stable performance and high polymorphism, are increasingly employed in taxonomic, evolutionary, breeding, and cloning studies. Different molecular markers, and different primers for the same marker, may amplify distinct regions of the genome. Theoretically, the more polymorphic markers are used, the wider the amplified regions, the better the coverage of the genome, and the more accurate the results.

EST-SSR (expressed sequence tag-simple sequence repeat) molecular markers have been widely used in many species and for many applications, such as genetic linkage mapping, comparative mapping, and evaluation of genetic diversity [1,2,3,4,5]. SRAP (sequence-related amplified polymorphism) was first used on Brassica in 2001 by Li and Quiros [6]. Genetic diversity and population structure analyses of Camellia sinensis by SRAP have already been reported [7,8,9,10,11,12]. The SCoT (start codon targeted polymorphism) marker was designed according to the Kozak sequence pattern and was developed after the discovery that the sequences flanking the initiation codon ATG (+ 1, + 2, + 3) are conserved: positions + 4, + 7, + 8, and + 9 are occupied by nucleotides G, A, C, and C, respectively. These seven nucleotides are generally conserved. At positions − 3, − 6, and − 9, G is the usual nucleotide. Primers can therefore be designed according to the conservation of the initiation sequence. The SCoT marker allows single-primer amplification of the region between two genes. Collard and Mackill first applied this marker to Oryza sativa [13]. More recently, SCoT markers have been used to assess the genetic diversity of plant species such as Saccharum spontaneum L. [14], Dactylis glomerata [15], Mangifera indica [16], Arachis hypogaea [17], Saccharum officinarum [18], Podocarpus macrophyllus [19] and Paeonia suffruticosa [20]. Nevertheless, no similar study has been conducted on Camellia sinensis. The tea plant is an allogamous species; theoretically, after prolonged spontaneous hybridization, its genetic background should become increasingly complex.

China is one of the main sources of tea germplasm. Currently, the tea planting area covers 1,100,000 ha, with different regions growing different types and varieties of tea according to their topographic, soil, and climatic characteristics. Xinan, Huanan, Jiangnan, and Jiangbei are the four main tea planting districts in China. The Qinba area belongs to the Jiangbei district. In this research, 50 tea varieties, including varieties collected from different districts, common tea plant varieties, and local varieties of the Qinba area, were genotyped with EST-SSR, SRAP, and SCoT markers. We constructed three molecular marker datasets, which were used for diversity analysis, marker efficiency analysis, and correlation analysis among the marker systems. Our study established the population structure of the collection and provides insights into the selection of molecular markers for tea plant breeding.

Results and discussion

Marker efficiency analysis

In this study, three types of molecular markers were used to differentiate tea plant accessions. A total of 1072 bands were produced using 118 primers or primer pairs. The 38 SCoT, 40 SRAP and 40 EST-SSR primers were selected on the basis of the percentage of polymorphic bands (PPB), polymorphism information content (PIC) and band clarity in a preliminary screen of six genotypes (Table 1). From the 50 test materials, a total of 414, 338, and 320 bands were obtained using SCoT, SRAP and EST-SSR markers, respectively, of which 398, 302, and 301 were polymorphic, giving PPBs of 96.14%, 89.35%, and 94.06%. Comparisons of the three types of markers are shown in Table 2. SCoT markers have a higher marker efficiency and are well suited to the appraisal of polymorphic loci, although their polymorphism information content is lower than that of EST-SSR markers.
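The summary statistics can be re-derived from the band counts reported above. The following is a minimal sketch, not the authors' Excel workflow. It assumes that EMR is the mean number of polymorphic bands per primer and that MI is the product of EMR and Ibav; both assumptions reproduce the reported values to within rounding. The dictionary layout and the function name `summarize` are illustrative only.

```python
# Band counts and Ibav values as reported in the text; the layout is illustrative.
MARKERS = {
    "SCoT":    {"primers": 38, "bands": 414, "polymorphic": 398, "ib_av": 0.58},
    "SRAP":    {"primers": 40, "bands": 338, "polymorphic": 302, "ib_av": 0.61},
    "EST-SSR": {"primers": 40, "bands": 320, "polymorphic": 301, "ib_av": 0.56},
}


def summarize(marker):
    """Derive PPB, bands per primer, EMR and MI for one marker type."""
    ppb = 100.0 * marker["polymorphic"] / marker["bands"]
    bands_per_primer = marker["bands"] / marker["primers"]
    # Assumption: EMR = mean number of polymorphic bands per primer.
    emr = marker["polymorphic"] / marker["primers"]
    # Assumption: MI = EMR * Ibav.
    mi = emr * marker["ib_av"]
    return {"ppb": ppb, "bands_per_primer": bands_per_primer, "emr": emr, "mi": mi}


SUMMARY = {name: summarize(m) for name, m in MARKERS.items()}

if __name__ == "__main__":
    for name, s in SUMMARY.items():
        print(f"{name}: PPB={s['ppb']:.2f}%  bands/primer={s['bands_per_primer']:.2f}  "
              f"EMR={s['emr']:.2f}  MI={s['mi']:.2f}")
```

Because MI is computed here from the unrounded EMR, the last digit can differ slightly from a value obtained by multiplying the rounded figures.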

Table 1 Amplification results of EST-SSR, SRAP, and SCoT primers
Table 2 Comparison of the efficiency of EST-SSR, SRAP, and SCoT primers

Correlation analysis among genetic distance matrices from the three marker datasets

Mantel tests [21] were used to measure the correlation between the genetic distance matrices generated by the SCoT, SRAP and EST-SSR markers. Correlation coefficients of r ≥ 0.9, 0.8 ≤ r < 0.9, 0.7 ≤ r < 0.8, and r < 0.7 were taken to represent significant correlation, moderate correlation, weak correlation, and no correlation, respectively. In the present study, the correlation coefficients (r) between the genetic distance matrices of SCoT and EST-SSR markers, SCoT and SRAP markers, and SRAP and EST-SSR markers were 0.19, 0.17, and 0.01, respectively (Fig. 1), so none of the three pairs of matrices was correlated by this criterion. Different molecular markers, and different primers of the same marker, all yielded distinct amplification products reflecting the polymorphism of different genomic regions; hence, different marker design strategies will produce different results. Theoretically, the validity of the results should improve with increasing numbers of markers and increasing coverage of the genome. Therefore, we employed three types of molecular markers to generate 1072 bands for the genetic constitution analyses.

Fig. 1

The correlation between the genetic distance matrices using Mantel tests

Genetic constitution analysis

Analysis using STRUCTURE

One thousand seventy-two polymorphic bands with MAF (minor allele frequency) < 5% were used to elucidate the population structure of the entire pool of tea germplasm. STRUCTURE 2.3.4, which applies a Bayesian clustering algorithm, was used to simulate the population genetic structure under the assumption that the 1072 loci were independent. Using a membership probability threshold of 0.60, K values from 1 to 10 were simulated with 20 iterations for each K, each using a burn-in period of 10,000 followed by 10,000 Markov chain Monte Carlo iterations, in order to estimate the most probable number of populations. Delta K was plotted against K; the best number of clusters was determined following the method proposed by Evanno et al. [22] and obtained via the Structure Harvester platform. Delta K reached its maximum at K = 2, suggesting that the 50 tea germplasm accessions were best divided into two subgroups (Fig. 2).
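The Delta K criterion of Evanno et al. [22] can be sketched as follows. This is an illustrative re-implementation, not the Structure Harvester code: it assumes Delta K is taken as the absolute second difference of the mean ln P(D) across replicate runs, divided by the standard deviation of ln P(D) at that K. The ln P(D) values in `TOY_RUNS` are invented for the example and are not results from this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Return {K: Delta K} for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps each K to the list of ln P(D) values from its replicate runs.
    """
    avg = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in avg or k + 1 not in avg:
            continue  # Delta K is undefined at the ends of the K range
        spread = stdev(lnp_by_k[k])
        if spread == 0:
            continue  # avoid division by zero when all runs agree exactly
        second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
        delta[k] = second_diff / spread
    return delta


# Invented ln P(D) values (three replicate runs per K), for illustration only.
TOY_RUNS = {
    1: [-5000.0, -5010.0, -4990.0],
    2: [-4500.0, -4505.0, -4495.0],
    3: [-4480.0, -4470.0, -4490.0],
    4: [-4470.0, -4460.0, -4480.0],
}

DELTA_K = evanno_delta_k(TOY_RUNS)
BEST_K = max(DELTA_K, key=DELTA_K.get)

if __name__ == "__main__":
    print(DELTA_K, "best K:", BEST_K)
```

Delta K cannot be computed for the smallest and largest K tested, which is why the range of K simulated should extend beyond the values of interest.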

Fig. 2

STRUCTURE analysis of the number of populations (K). The number of subpopulations (K) was identified based on maximum likelihood and K values. The most likely value of K identified by STRUCTURE was K = 2. Note: green bands, Group 1; red bands, Group 2. The proportion of each color reflects the probability that each test material (numbered from 1 to 50) belongs to the corresponding group

UPGMA clustering

A dendrogram was constructed by cluster analysis using the unweighted pair-group method with arithmetic means (UPGMA), which showed that the 50 genotypes could be clearly divided into two groups (Fig. 3). Group I included 27 varieties and group II contained 23 varieties. The average similarity coefficient was 0.74. The two most closely related materials were 15 and 16, which formed a sister pair with a genetic similarity coefficient of 0.93.

Fig. 3

Cluster dendrogram of 50 tea genotypes constructed based on UPGMA by EST-SSR, SRAP and SCoT

Principal components analysis

The top three principal components were used to analyze population structure. Principal component analysis was conducted in NTSYS-pc2.10e [23]. The three PCs had contribution rates of 15.97%, 8.50% and 6.17%, respectively. PCA separated the 50 genotypes into two major groups (Fig. 4), consistent with the STRUCTURE and UPGMA results. Group I consisted of 18 genotypes (Fig. 4, left), and the other 32 genotypes belonged to group II (Fig. 4, right).

Fig. 4

PCA plots based on the first three components

The analyses performed using STRUCTURE, UPGMA and PCA yielded similar results, clustering the 50 genotypes into two sub-populations. Of note, the PCA results agreed well with those from STRUCTURE. The results generated using UPGMA differed slightly from those of STRUCTURE and PCA (Table 3); bold numbers in group 1 of the UPGMA column mark the accessions whose assignment differs between UPGMA and STRUCTURE/PCA.

Table 3 Comparison of the clustering by STRUCTURE, PCA and UPGMA


This is the first report of the use of SCoT markers to analyze the genetic diversity of tea germplasm. The results showed that SCoT markers revealed high genetic diversity among tea resources. In the future, we plan to select core SCoT markers. Different kinds of molecular markers can reveal different and complementary information about the same genome. Thus, we highly recommend using several marker types for a comprehensive evaluation of genetic diversity and structure. The 50 accessions were clustered into two sub-populations based on STRUCTURE, UPGMA and PCA; there were no obvious differences between imported and local germplasm. The genes of exotic varieties have been continually integrated into the gene pool of Qinba tea through long-term (20–25 years) tea breeding and production activities. The selection of varieties with economically valuable traits was emphasized during breeding, resulting in the loss of some tea resources and a decrease in genetic diversity; thus, it is necessary to introduce new tea resources in order to broaden the genetic diversity.


Plant materials

A total of 50 tea plant genotypes, representing most tea germplasm of the Qinba area in China, were collected from the tea experimental farm of the Hanzhong Institute of Agricultural Sciences during the 2016 growing season (Table 4).

Table 4 The 50 tea plant samples used for marker (EST-SSR, SRAP and SCoT) genotyping

DNA extraction and marker genotyping

Genomic DNA was extracted from fresh leaves of each individual using a modified CTAB method and checked by 0.8% agarose gel electrophoresis. Each PCR reaction contained 2 × Taq Master Mix (7.5 μL), forward and reverse primers (1 μL each; 2 μL for SCoT primers), RNase-free water (3.5 μL), and tea genomic DNA (2 μL). To improve amplification, a program with a changing annealing temperature was used: initial denaturation at 94.0 °C for 5 min; 10 cycles of denaturation at 94.0 °C for 1 min, annealing at 60.0 °C for 1 min, and extension at 72.0 °C for 1 min; then 35 cycles of denaturation at 94.0 °C for 30 s, annealing at 35 °C for 30 s, and extension at 72.0 °C for 1 min; followed by an extension of 10 min and storage at 4.0 °C. The selected primers were synthesized by Shanghai Sangon Biological Engineering Technology and Service Company (Shanghai, China). Initially, six germplasms (LongJing, ShanCha1, ChunBoLu, BeiBa11–6, Ning13–6, ZaoBaiJian) were used to screen markers for high polymorphism. Then, 40 pairs each of clear and highly polymorphic EST-SSR and SRAP primers and 38 SCoT primers were selected from 154 EST-SSR pairs, 154 SRAP pairs, and 125 SCoT primers. Electrophoresis was performed on 8% non-denaturing polyacrylamide gels at 160 V; the bands were visualized by silver staining.

Genetic variation and marker efficiency analysis

Following electrophoresis, each amplified band was taken to correspond to a primer hybridization locus and was considered an effective molecular marker. Each polymorphic band detected by a given primer represented an allelic variant. To generate the molecular data matrices, clear bands were scored in every accession for each primer pair and recorded as 1 (presence of a fragment), 0 (absence of a fragment), or 9 (complete absence of band). Excel was used to compute the marker index (MI) of the three types of markers, and the marker frequencies of the three types of markers were compared. MI values were obtained from the average band informativeness (Ibav) and the effective multiplex ratio (EMR); EMR represents the number of polymorphic loci and Ibav is given by the following formula:

$$ {Ib}_{av}=\frac{1}{n}\sum \limits_{i=1}^n\left(1-\left(2\left|0.5-{P}_i\right|\right)\right), $$

where Pi represents the proportion of samples in which the ith amplified locus (band) is present and n represents the total number of amplified loci. Using the method reported by Smith et al. [24], the value of the polymorphism information content (PIC) was calculated with the formula:

$$ PIC=1-\sum \limits_{i=1}^n{P_i}^2-\sum \limits_{i=1}^{n-1}\sum \limits_{j=i+1}^n2{P_i}^2{P_j}^2, $$

where, for a given locus, Pi and Pj represent the frequencies of the ith and jth alleles and n is the number of alleles at that locus. The value of PIC varies from 0 to 1, with 0 indicating an absence of polymorphism at a given locus and values approaching 1 reflecting many alleles at a given locus. The level of polymorphism of each marker was assessed by the polymorphism information content (Botstein et al. [25]), which measures the extent of genetic variation: PIC values smaller than 0.25 indicate a low level of polymorphism at a locus, PIC values between 0.25 and 0.5 indicate a moderate level of polymorphism, and PIC values greater than 0.5 indicate a high level of polymorphism.
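The two formulas above translate directly into code. The sketch below is illustrative: the function names are hypothetical, the input frequencies are invented, and the handling of the exact boundary values 0.25 and 0.5 in `classify_pic` is an assumption, since the text does not say which class they fall into.

```python
def average_band_informativeness(band_freqs):
    """Ibav = (1/n) * sum(1 - 2*|0.5 - p_i|) over the n amplified loci."""
    return sum(1 - 2 * abs(0.5 - p) for p in band_freqs) / len(band_freqs)


def pic(allele_freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2 for one locus."""
    n = len(allele_freqs)
    homozygosity = sum(p * p for p in allele_freqs)
    cross_term = sum(
        2 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    return 1 - homozygosity - cross_term


def classify_pic(value):
    """Polymorphism classes from the text; boundary handling is an assumption."""
    if value < 0.25:
        return "low"
    if value <= 0.5:
        return "moderate"
    return "high"


if __name__ == "__main__":
    # Invented frequencies, for illustration only.
    print(average_band_informativeness([0.5, 1.0, 0.25]))
    print(pic([0.5, 0.5]), classify_pic(pic([0.5, 0.5])))
```

A band present in exactly half of the samples contributes the maximum informativeness of 1, while a band present in all or none of the samples contributes 0.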

Correlation analysis among genetic distance matrices from the three marker datasets

The Mantel test was carried out using the batch file of the NTSYS-pc2.10e software.
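For readers without access to NTSYS-pc, the idea of the Mantel test can be sketched in a few lines. This is a generic permutation version and not the NTSYS implementation: it correlates the upper triangles of two distance matrices and estimates a one-sided p-value by jointly permuting the rows and columns of one matrix. The function names, the number of permutations, and the toy matrices are assumptions for illustration; `classify_r` encodes the interpretation thresholds stated in the Results.

```python
import random
from math import sqrt


def upper_triangle(matrix, order=None):
    """Flatten the upper triangle, optionally after reordering rows/columns."""
    n = len(matrix)
    idx = list(range(n)) if order is None else order
    return [matrix[idx[i]][idx[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel(d1, d2, permutations=999, seed=0):
    """Return (r, one-sided permutation p-value) for two distance matrices."""
    rng = random.Random(seed)
    x = upper_triangle(d1)
    r_obs = pearson(x, upper_triangle(d2))
    order = list(range(len(d2)))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)
        if pearson(x, upper_triangle(d2, order)) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


def classify_r(r):
    """Interpretation thresholds as stated in the text."""
    if r >= 0.9:
        return "significant correlation"
    if r >= 0.8:
        return "moderate correlation"
    if r >= 0.7:
        return "weak correlation"
    return "no correlation"


# Toy 4 x 4 distance matrix, invented for illustration.
D_A = [[0, 1, 2, 3], [1, 0, 1, 2], [2, 1, 0, 1], [3, 2, 1, 0]]

if __name__ == "__main__":
    print(mantel(D_A, D_A, permutations=99))
    print(classify_r(0.19), classify_r(0.17), classify_r(0.01))
```

Permuting rows and columns together, rather than shuffling individual distances, preserves the dependence among the distances that share an accession.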

Genetic constitution analysis

STRUCTURE v2.3.4 was used to assess the population structure of the 50 tea genotypes with 1072 loci. The number of sub-populations (K) was set from 1 to 10 under the admixture model with correlated band frequencies. Genetic similarity coefficients were computed using the SM (simple matching) function of the NTSYS-pc2.10e software, cluster analysis was conducted using the UPGMA method, and principal component analysis was performed using the batch file of the NTSYS-pc2.10e software.
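The simple matching coefficient between two accessions is the fraction of scored loci at which their band states agree. The sketch below is an illustration rather than the NTSYS routine; it assumes that loci coded 9 in either accession are skipped, which is a common convention but is not stated in the text. The function name and example vectors are hypothetical.

```python
def simple_matching(a, b, missing=9):
    """Fraction of loci with identical 0/1 scores in accessions a and b.

    Assumption: loci coded as `missing` in either accession are ignored.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x != missing and y != missing]
    if not pairs:
        raise ValueError("no loci scored in both accessions")
    matches = sum(1 for x, y in pairs if x == y)
    return matches / len(pairs)


if __name__ == "__main__":
    # Invented band scores for two accessions, for illustration only.
    acc_1 = [1, 0, 1, 1, 9, 0]
    acc_2 = [1, 0, 0, 1, 1, 0]
    print(simple_matching(acc_1, acc_2))
```

Applying this function to every pair of accessions yields the similarity matrix on which UPGMA clustering operates.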



EMR: Effective multiplex ratio

EST-SSR: Expressed sequence tag-simple sequence repeat

Ibav: Average band informativeness

MAF: Minor allele frequency

MI: Marker index

PCA: Principal component analysis

PIC: Polymorphism information content

PPB: Percentage of polymorphic bands

Rp: Resolving power

SCoT: Start codon targeted polymorphism

SRAP: Sequence-related amplified polymorphism

UPGMA: Unweighted pair group method with arithmetic mean

1. Yao MZ, Chen L, Ma CL, et al. Comparative analysis of genetic diversity among tea cultivars from China, Japan and Kenya by ISSR and EST-SSR. Mol Plant Breed. 2009;7(5):897–903.

2. Yao MZ, Qiao TT, Ma CL, et al. The association analysis of phenotypic traits with EST-SSR markers in tea plants. J Tea Sci. 2010;30(1):45–51.

3. Liu B, Sun X, Li Y, et al. Analysis of genetic diversity of tea plants by using EST-SSR and ISSR markers. Chin J Trop Crop. 2009;30(11):1577–83.

4. Qiao TT, Ma CL, Zhou YH, et al. EST-SSR genetic diversity and population structure of tea landraces and developed cultivars (lines) in Zhejiang Province, China. Acta Agron Sin. 2010;36(5):744–53.

5. Li SJ, Wang X, Duan JH, Dong LJ, Zhang SG. Genetic diversity and genetic structure of 16 tea cultivars based on SSR markers. Hunan Agric Sci. 2011;12(23):6–9.

6. Li G, Quiros CF. Sequence-related amplified polymorphism (SRAP), a new marker system based on a simple PCR reaction: its application to mapping and gene tagging in Brassica. Theor Appl Genet. 2001;103(2):455–61.

7. Shen CW, Ning ZX, Huang JA, et al. Genetic diversity of Camellia sinensis germplasm in Guangdong Province based on morphological parameters and SRAP markers. Chin J Appl Ecol. 2009;20(7):1551–8.

8. Shen CW, Huang JA, Zhao SH, Ning ZX, Li JX, Zhao CY, Chen D. Analysis of genetic diversity of Camellia sinensis germplasm in Guangdong Province by SRAP and ISSR markers. J Nucl Agric Sci. 2010;24(5):948–55.

9. Xi CY, Tang Q, Wu YS, Xu JY, Chen H, Wu Q. Genetic diversity and relationship of 30 tea plant germplasms in Sichuan revealed by SRAP marker. Guizhou Agric Sci. 2013;41(2):6–9.

10. Liu Z, Zhao Y, Yang PD, Chen Y, Ning J, Yang Y. Comparison of parents identification for tea variety based on SSR, SRAP and ISSR markers. J Tea Sci. 2014;34(6):617–24.

11. Chen XJ, Zhou KH, Zong HX, Fang R. Genetic diversity of Capsicum frutescens in China as revealed by SRAP and SSR markers. Acta Bot Bor-Occid Sin. 2012;32(11):2201–5.

12. Xia FG, Zhong XW, Wu F, et al. SRAP marker analysis of genetic diversity and relationship in Wuyi rock tea germplasm resources. J Tea Sci. 2017;37(1):78–85.

13. Collard BCY, Mackill DJ. Start codon targeted (SCoT) polymorphism: a simple, novel DNA marker technique for generating gene-targeted markers in plants. Plant Mol Biol Rep. 2009;27(1):86–93.

14. Luo T, Yang HX, Cen HF, Liu XH, Gao YJ, Duan WX, et al. Application of SCoT molecular marker in construction of molecular genetic linkage map of Saccharum spontaneum L. J Plant Genet Resour. 2013;14(4):704–10.

15. Jiang LF, Zhang XQ, Huang LK, Ma X, Yan DF, Hu Q, et al. Analysis of genetic diversity in a cocksfoot (Dactylis glomerata) variety using SCoT markers. Acta Pratacult Sin. 2014;23(1):229–38.

16. Luo C. Study on SCoT marker and analysis on genes of stress-related and important flowering time in mango. Nanning, Guangxi: Guangxi University; 2012.

17. Xiong FQ, Jiang J, Zhong RC, Han ZQ, He LQ, Li Z, Zhuang WJ, Tang RH. Application of SCoT molecular marker in genus Arachis. Acta Agron Sin. 2010;36(12):2055–61.

18. Chen S. Study on genetic diversity and smut resistance evaluation of sugarcane parents. Guangzhou: South China Agricultural University; 2016.

19. Wei YL, He XH, Luo C, Chen H. Genetic diversity of Podocarpus by SCoT markers. Guihaia. 2012;32(1):90–3.

20. Hou XG, Wang J, Jia T, Zhang YQ, Hou J, Li JJ. Orthogonal optimization of SCoT-PCR system and primer screening of tree peony. Acta Agriculturae Boreali-Sinica. 2011;26(5):92–6.

21. Mantel N. The detection of disease clustering and a generalized regression approach. Cancer Res. 1967;27(2):209–20.

22. Evanno G, Regnaut S, Goudet J. Detecting the number of clusters of individuals using the software STRUCTURE: a simulation study. Mol Ecol. 2005;14(8):2611–20.

23. Rohlf FJ. NTSYS-pc - numerical taxonomy and multivariate analysis system; 1998. p. 2.1.

24. Smith S, Helentjaris T. DNA fingerprinting and plant variety protection. In: Paterson AH, editor. Genome mapping in plants. Texas: Landes Company; 1996. p. 95–110.

25. Botstein D, White RL, Skolnick M, et al. Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am J Hum Genet. 1980;32(3):314–31.


This study was supported by the earmarked fund for the Natural Science Basic Research Project of Shaanxi Province (2013JZ008), the Sci-technological Project of Shaanxi Province (2016KTCQ02–06) and the Qinling-Bashan Mountains Bioresources Comprehensive Development C. I. C (QBXT-17-5).

Author information




YZ analyzed the data, and wrote the manuscript. XZ edited and revised the manuscript. XC, WS and JL performed the experiments, and all authors approved the manuscript.

Corresponding author

Correspondence to Yu Zhang.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated.


About this article


Cite this article

Zhang, Y., Zhang, X., Chen, X. et al. Genetic diversity and structure of tea plant in Qinba area in China by three types of molecular markers. Hereditas 155, 22 (2018).



  • Camellia sinensis
  • Marker efficiency
  • Correlation coefficient
  • Genetic diversity
  • Population structure