Development of novel EST microsatellite markers for genetic diversity analysis and correlation analysis of velvet antler growth characteristics in Sika deer

Background Sika deer is one of the most popular and valued animals in China. However, few studies have been conducted on the microsatellite of Sika deer, which has hampered the progress of genetic selection breeding. To develop and characterize a set of microsatellites for Sika deer which provide helpful information for protection of Sika deer natural resources and effectively increase the yield and quantity of velvet antler. Results We conducted a transcriptome survey of Sika deer using next-generation sequencing technology. One hundred eighty-two thousand two hundred ninety-five microsatellite markers were identified in the transcriptome, 170 of 200 loci were successfully amplified across panels of 140 individuals from Shuangyang Sika deer population. And 29 loci were found to be obvious polymorphic. Number of alleles is from 3 to 14. The expected heterozygosity ranged from 0.3087 to 0.7644. The observed heterozygosity ranged from 0 to 0.7698. The polymorphism information content values of those microsatellites varied ranged from 0.2602 to 0.7507. The marker-trait association was tested for 6 important and kernel characteristics of two-branched velvet antler in Shuangyang Sika deer through one-way analysis of variance. The results showed that marker-trait associations were identified with 8 different markers, especially M009 and M027. Conclusions This study not only provided a large scale of microsatellites which were valuable for future genetic mapping and trait association in Sika deer, but also offers available information for molecular breeding in Sika deer.


Background
Molecular markers are commonly used to test the genetic characteristics of species within or among populations [1]. These molecular markers mainly include microsatellite, single nucleotide polymorphisms (SNPs), random amplified polymorphic DNA (RAPD), amplified fragment length polymorphisms (AFLPs), sequencerelated amplified polymorphisms (SRAPs) and so on [2]. Microsatellite markers are tandem repeats of 1-6 bp, usually associated with copy slippage and DNA repair, and are distributed in the genome of all organisms [3]. Microsatellite markers are characterized by high polymorphism, high information content, specificity, dominance, and reproducibility [4]. They can be classified into genomic microsatellite and expressed sequence tag microsatellite (EST microsatellite) based on their locations [5]. EST microsatellite was found in selectively more constrained regions of the genome [6]. Although compared with genomic microsatellite, EST microsatellite has some disadvantages: (1) The polymorphism level of EST microsatellite was lower than that of genomic microsatellite; (2) Due to a consequence of the undetected presence of introns in flanking regions, the amplicon size may be different from expected [7]. However, this was balanced by the following advantages over genomic microsatellite: (1) EST microsatellite was used to detect the variation in the expressed portion of the genome, which may give the concerned marker-trait associations; (2) EST microsatellite could be developed at no cost from the EST databases which become a fast, efficient and low-cost choice; (3) Unlike genomic microsatellite, once developed, EST microsatellite may be highly conserved among a number of related species, resulting in a high level of transferability [6,8].
Due to the lack of available genomic information, it is difficult to develop ideal microsatellite. Transcriptome sequencing is an alternative to whole-genome sequencing for obtaining microsatellite, which are essential for developing abundant EST microsatellite and identifying target genomic regions for important characteristics [9]. In the last decade, deep transcriptome sequencing has been generated in model and non-model organisms, which have greatly accelerated the collection of genetic resources and the understanding of the interest of gene expression, regulation and networks [10][11][12]. Because on these advantages, transcriptome sequencing has been widely used in non-model animals for EST microsatellite development, including the Siberian tiger, the ridgetail white prawn Exopalaemon carinicauda, the Yellow catfish and so on [13][14][15].
Sika deer is one of the most popular and valued animals in China [16]. The velvet antler has long been a traditional tonic or alternative medicine, according to ancient Chinese pharmaceutical monographs [17]. It is also a useful model for the study of the mechanisms of organ regeneration, rapid growth and mineralization in mammals [18]. At present, the key problem of raising economic benefits of deer breeding is to improve the yield and quality of velvet antler in China. The growth characteristics of velvet antler are closely related to genetic factors. Despite substantial age-and environmentrelated variation, velvet antler size was also heritable [19]. Conventional breeding cycles are long, and it is difficult to make early selection. The research on the velvet antler growth characteristics by using molecular genetic markers can effectively overcome the shortcomings of traditional methods and improve the velvet antler yield of Sika deer. Due to the lack of sufficient molecular markers, Sika deer breeding technology is relatively backward. Although Zu et al. (2001) developed 200 microsatellite markers in Sika deer, most microsatellite markers were borrowed from cattle and the polymorphism was relatively low [20]. Until recently, 22,479 microsatellite markers were obtained from 125 M bp genomic data of Sika deer by using IlluminaHiSeq™2500 sequencing technology. Among them, 29 microsatellite markers were identified to be polymorphic, of which 10 loci could be used for paternity testing of Sika deer [21]. So far, only genomic microsatellite markers have been developed and EST microsatellite markers have not been reported. In addition, microsatellite markers have been mainly applied to genetic diversity, genetic structure analysis and paternity testing in Sika deer [22][23][24]. Only a few microsatellite markers related to the production performance of the velvet antler have been reported. Li et al. (2011) found that BE, CE genotype at BM1225 and AA genotype at T172 could be regarded as linkage genetic marker of velvet antler weight of Xingkaihu Sika deer [25]. Yang et al. (2014) found that the second exon C629G of melatonin receptor 1A gene showed significant association with velvet antler yield, and the velvet antler weight of CC genotype was higher than that of GC genotype in Wusan Sika deer [26]. However, complex characteristics could not be completely controlled by a single gene, and it required a series of genes.
To solve this problem, through Illumina sequencing and bioinformatics analysis, approximately 289 gigabytes (289G) high-quality reads were obtained and were assembled into 1,348,618 unigenes. In our study, the frequency and distribution of 182,295 potential EST microsatellite markers in the Sika deer transcriptome were analyzed. Two hundred EST microsatellite markers loci which expressed in antler were selected for primer design, of these, 170 loci were successfully amplified across panels of 140 individuals from Shuangyang Sika deer population. And 29 loci were found to be obvious polymorphic. Finally, the marker-trait association was tested for important and kernel characteristics of velvet antler in Shuangyang Sika deer. We found 8 EST microsatellites, especially M009 and M027, which can be used as molecular markers in the breeding process of weight of the two-branched velvet antler. The development of tremendous Sika deer molecular markers could help to protect germplasm resources. At the same time, it can effectively increase the yield and quantity of velvet antler. The present study thus represents the first comprehensive application of transcriptome sequencing in the development of Sika deer EST microsatellite markers.

Characterization of EST microsatellites in the Sika deer Transcriptome
A total of 1,348,618 unigenes covering 0.74 GB and 182, 295 EST microsatellite loci were identified from 182,295 sequences of unigenes. The average density of EST microsatellite was 9.53 kb. In total, 22,261 unigenes (12.21%) contained more than one EST microsatellite loci.
In this study, the length of less than 12 bp microsatellite was removed. The total number of EST microsatellites screened from Sika deer was 72,316 (ranging in length from 12 bp to 313 bp). Among them, the lengths of 56,632 EST microsatellites (90.76%) were 12-20 bp, and the lengths of 6684 EST microsatellites (9.24%) were longer than 20 bp.

Genetic diversity analysis
To verify the identified microsatellite markers, 18,692 primer pairs, which removed the length of less than 12 bp and the mono-nucleotide duplication microsatellites, were designed from 58,503 microsatellites expressed in velvet antler. Then, a total of 200 EST microsatellites of velvet antler were selected for validation. In this study, TP-M13 microsatellite PCR method was used to screen for primer. One hundred seventy of two hundred primer pairs successfully PCR-amplified, producing products of the expected size. Of the 170 primer pairs, 29 pairs showed polymorphisms among 140 shuangyang Sika deer samples (N a ≥ 2). In total, 153 alleles were detected across the collected individuals, with the number of alleles per locus varying from 3 to 14. The average N a was 5.6 and the average N e was 2.6. The number of N a of microsatellite M082 and M027 were 14 and 12 respectively. The number of N e of microsatellite M005, M084, M099, M121, M131 and M132 were the least, all of which were 3 ( Table 2).
Among the 29 EST microsatellite loci, the H e varied from 0.3087 to 0.7644, with an average of 0.6018 ± 0.1190, while the H o varied from 0 to 0.7698, with an average of 0.5127 ± 0.2010. The PIC values ranged from 0.2602 to 0.7507, with an average of 0.5450 ± 0.1262 (Table 2). These 29 microsatellites exhibited medium and high levels of PIC (PIC > 0.25). The effective candidate microsatellite markers developed in this study would greatly promote the genetic diversity studies of varieties in Sika deer.

Analysis of correlation between EST microsatellite and velvet antler characteristics
Characteristics of the model and coefficients that resulted from the multiple linear regressions were detailed in Table 3. The model suggested that main beam circle, main beam length, mouth circle, first tine circle, mouth length explain 86.9% of the normalized velvet antler weight (R 2 = 0.869). Main beam circle (ß = 0.071, P < 0.001), main beam length (ß = 0.027, P < 0.001), mouth circle (ß = 0.043, P < 0.001), mouth length (ß = 0.016, P < 0.05), first tine circle (ß = 0.019, P < 0.05) were positively correlated with velvet antler weight. The correlation between EST microsatellite markers and the above growth characteristics of velvet antler of Shuangyang Sika deer was analyzed by one-way analysis of variance. The results showed that 8 EST microsatellite loci (M001, M009, M027, M084, M093, M113, M136, and M159) were significantly correlated with the above characteristics (Table 4).
Of particular concern was M009, which screened 9 genotypes of Sika deer (Fig. 2). Among them, genotype 357/363 had the highest mean value of velvet antler weight and main beam length. On the contrary, genotype 361/361 had the lowest mean values of velvet antler weight and main beam length (Table 5, Fig. 3a, b). The correlation between M009 alleles and velvet antler traits of Sika deer were further analyzed. The result showed that locus 363 and 357 had the highest mean value of velvet antler weight and main beam length, and there was no significant difference in contribution between them. However, locus 361 was the opposite (Table 6). Combined with other genotype individuals, it could be speculated that both the locus 357 and 363 of genotype 357/363 had a positive effect, while the locus 361 of genotype 361/361 had a negative impact on velvet antler weight and main beam length of two-branched velvet antler of shuangyang Sika deer.
Another concern was M027, which screened 19 genotypes of Sika deer (Fig. 4). Among them, genotype 289/ 321 had the highest mean value of velvet antler weight and main beam circle. On the contrary, genotype 263/ 263 had the lowest mean values of velvet antler weight and main beam circle (Table 7, Fig. 3c, d). The correlation between M027 alleles and velvet antler traits of Sika deer were further analyzed. The result showed that locus 295 and 321 had the highest mean value of velvet

Discussion
EST microsatellite were powerful molecular markers for analyzing population genetic diversity, molecular breeding and functions [27]. However, to date, few studies about EST microsatellite markers had been reported from Sika deer. Development of EST microsatellite markers based on transcriptome data was still an economic and efficient development strategy of DNA molecular markers. Our study was the first attempt to obtain transcriptional information of Sika deer for the EST microsatellite and to further examine the diversity and velvet antler yield characters. We successfully and efficiently isolated 182,295 EST microsatellite markers by IlluminaHiSeq™2500 sequencing technology. So far, we have obtained the largest number of Sika deer microsatellite databases to facilitate future research. The distribution of EST microsatellites with 1-6 bp repeat motif showed that, mono-nucleotide is the most abundant motif type, followed by di-nucleotide, tri-nucleotide, tetra-nucleotide, penta-nucleotide and hexa-nucleotide repeats. This was similar to the previous study on the development of microsatellites by reducedrepresentation genome sequencing in Sika deer [21]. The difference was that the number of microsatellites obtained by transcriptome sequencing was far more than that obtained by genome sequencing. However, the number of microsatellites highly repetitive motifs (≥10) from tri-nucleotide to hexa-nucleotide obtained by genome sequencing was more than that obtained by transcriptome sequencing. In this study, both mononucleotide and di-nucleotide repeats have been found to be the predominant repeat types. A/T, AC/GT, AGC/ CTG were the most abundant repeat among mono, di and tri-nucleotide types, which were similar to previous studies by reduced-representation genome sequencing [21]. However, AATG/ATTC was the most frequent repeat among tetra-nucleotide types, unlike AAAC previously reported in Sika deer [21]. Thus, the EST microsatellite types obtained by transcriptome sequencing and reduced-representation genome sequencing without reference genomes were different. The microsatellite loci obtained from transcriptome showed lower polymorphism compared with genome but were more likely to be related to functional loci.
As we all know, Shuangyang Sika deer was the first breed that had been identified by the state in China. It had excellent traits such as high yield, strong adaptability, early maturity and so on. Among all the breeds, the breeding scale and social benefits of Shuangyang Sika deer were the largest. In order to further study the genetics of Shuangyang Sika deer and to screen microsatellites related to velvet antler growth characteristics, 29 highly polymorphic EST microsatellites were developed from 170 microsatellites expressed in velvet antler, which showed that transcriptome sequencing technology could be used to efficiently develop microsatellites. The indicators reflecting the genetic diversity of the population mainly include He, Ho, PIC, Ne and so on. Firstly,  . The result was mainly due to the relatively conservative coding region, the polymorphism of the EST microsatellite derived from the transcriptome was lower than that of the G microsatellite derived from the genome. Then it was caused by the difference in geographical location and breeding degree. Finally, the Na of some EST microsatellites, such as M027 (Na = 12, Ne = 2.1), M082 (Na = 14, Ne = 3.0) and M176 (Na = 10, Ne = 3.8), was much larger than the Ne. However, the average Na of 29 EST microsatellites was close to Ne (Na = 5.6, Ne = 2.6). This indicated that some of the microsatellite alleles were unevenly distributed in the population, but the overall distribution remains uniform.
The reason for this phenomenon may be that during the long-term breeding process of Shuangyang Sika deer, some loci were subject to strong selection pressures associated with velvet antler growth characteristics, resulting in uneven distribution of alleles. Multiple linear regression analysis identified that velvet antler weight was positively correlated with main beam circle, main beam length, mouth circle, mouth length, first tine circle. Main beam circle, main beam length, mouth circle, mouth length and first tine circle have the potential to predict 86.9% of the variability of velvet antler weight. Surprisingly, there was no linear relationship between first tine length and velvet antler weight. Therefore, the data of first tine length have not been further analyzed. One of most important implications of this study was that 8 EST microsatellites were found can be used in correlation analysis of velvet antler growth characteristics. In particular, M009 and M027 can be used as molecular markers in the breeding process of weight of the two-branched velvet antler. Velvet antler weight was very important for the production performance of Chinese Sika deer [28]. Therefore, it was necessary to develop large-scale molecular markers associated with velvet antler weight by sequencing. Hu et al. (2019) found that 94 SNP genetic variations were related to the threebranched velvet antler weight [28]. There were two significant differences between our study and that of   In the same column, those with the same letters indicate that there was no significant difference between the two genotypes and those with different letters indicate that there was significant difference between the two genotypes two-branched velvet antler had lowered the rate of ossification, more organic components and greater medicinal value [29]. Secondly, most of the SNP sites developed by Hu et al. (2019) were in the intron region of the gene. However, the microsatellites we developed were in the exon region of the gene, which may obtain the concerned functional genes possibly associated with the growth and development of velvet antler. As we all know, complex characteristics could not be completely controlled by a single gene, and it required a series of genes. So far, screening weight related molecular markers of velvet antlers were still very limited. Molecular markers must be developed by using highthroughput omics data. In this study, microsatellite databases of Sika deer were developed by using transcriptome sequences. We investigated candidate EST microsatellite related to the characteristics in twobranched velvet antler. The result would facilitate further studies breeding of Sika deer and genetic mechanism of velvet antler weight difference.

Conclusion
In this study, we conducted a transcriptome survey of sika deer using next-generation sequencing technology. We obtained useful data of EST microsatellite, such as the frequency and distribution. Secondly, EST microsatellites were selected and further characterized for polymorphism analysis. Finally, the marker-trait association was tested for important and kernel characteristics of velvet antler in sika deer. The development of a large number of sika deer molecular markers could help to breeding and culture.

Sample collection
Twelve male Sika deer (three for each of the four developmental stages) were exsanguinated after general anesthesia. Ten types of tissue (adrenal, velvet antler, brain, heart, kidney, lung, liver, skeletal muscle, spleen and testes) from one-year-old (juvenile), 3 years old (adolescence), 5 years old (adult), and 10 years old (aged) Sika deer were then collected. Fresh samples of these tissues were immediately frozen in liquid nitrogen, and then stored at − 80°C for RNA extraction. One hundred forty healthy 2-year-old male Shuangyang Sika deer were randomly selected. All Sika deer came from the commercial farm under the same living conditions (Jilin Province). The samples were blooded from jugular vein of Sika deer after general anesthesia. Anticoagulant blood was stored in − 20°C until DNA extraction. The price of two-branched velvet antler was higher than that of three-branched velvet antler in Asian market. It took about 45 days for velvet antler to grow  In the same column, those with the same letters indicate that there was no significant difference between the two genotypes and those with different letters indicate that there was significant difference between the two genotypes into two-branched velvet antler. Therefore, after individual anesthesia, velvet antlers were harvested for measuring (velvet antler weight, main beam length, main beam circle, first tine length, first tine circle, mouth length, mouth circle) and sample collection at 50 days of growth.

Development of EST microsatellite markers derived from transcriptome of Sika deer
We have carried out Illumina Hiseq 2500 technology to perform high-throughput deep sequencing of the Sika deer transcriptome, with a cDNA library constructed by one RNA pool which had an equal quantity of total RNA extracted from adrenal, velvet antler, brain, heart, kidney, lung, liver, skeletal muscle, spleen and testes of Sika deer (Specific methods referred to the articles which we have published [3]. By comparing the professional software such as picard-tools and samtools, the duplicate reads were removed, and the original results were filtered. All types of microsatellite from mononucleotides to hexa-nucleotides were identified from the unigenes using MISA software under default parameter settings: a minimum of ten repeats for mono-nucleotide, six repeats for di-nucleotide microsatellites, five repeats for tri-nucleotide, tetra-nucleotide, penta-nucleotide and hexa-nucleotide microsatellites. Finally, in order to verify the identified microsatellite markers, all the microsatellites data need to be further screened. Polymorphism was one of the important criteria for judging the usability of microsatellite markers. The length of microsatellite fragment was one of the important factors which affecting its polymorphism. The polymorphism was high when the length of microsatellite was more than 20 bp, medium when the length of microsatellite was between 12 and 20 bp, and very low when the length of microsatellite was less than 12 bp [30]. Therefore, in this study,  the length of less than 12 bp microsatellite was removed. Among the mono-nucleotide motifs, the most common motif was A/T. It should be noted that mono-nucleotide duplication was prone to mismatches, leading to sequencing failure, so this study was not an option. A total of 200 EST microsatellites expressed in velvet antler were selected as candidate molecular markers.

EST microsatellites primer pair design
In this study, TP-M13-microsatellite PCR method was used to screen for primer [31]. Three primers were designed for PCR amplification: the first primer was F + M13, i.e. the 5'end of the EST microsatellite forward primer and M13 (5′-TGTAAAACGACGCCAGT-3′) to be linked together. The second primer was the EST microsatellite reverse primer, and the third primer was M13 primer fluorescently labeled at the 5'end with Cy5.

DNA extraction, EST microsatellite markers validation and polymorphism examination
The total blood genomic DNA was extracted from 140 Sika deer following the traditional proteinase K and phenol-chloroform extraction method and verified by electrophoresis on 1% agarose gel. DNA was stored at − 20°C until used for PCR amplification. Fluorescence PCR amplification was carried out in a 20 μL volume containing 8 μL 10× Buffer I, 0.4 μL TP -M13(5 M), 2 μL specific primer(5 M), 0.2 μL Taq DNA Polymerase, 2 μL DNA, 7.6 μL ddH 2 O. Taq DNA Polymerase was purchased from Takara Co.,Ltd. The primers were synthesized from Beijing Microread Genetics Co., Ltd. The PCR program consisted of an initial step of 95°C for 5 min, followed by 30 cycles of 94°C for 30 s, 56°C for 45 s, and 72°C for 45 s, followed by 10 cycles of 94°C for 30 s, 53°C for 45 s, 72°C for 45 s, and a final extension at 72°C for 12 min. The amplified products were evaluated on the ABI 3730XL DNA capillary sequencer with GeneScan 500 ROX Size Standard (Applied Biosystems, USA). The criterion for accepting a peak as polymorphic was that one peak was greater than or equal to one tenth of the other.

Data analysis
GenAlEx version 6.5 was employed for the allele data processing, which included the number of expected heterozygosity (H e ), observed heterozygosity (H o ), polymorphism information content (PIC), number of alleles (N a ), effective number of alleles (N e ) [32]. GENEPOP software was used to investigate linkage disequilibrium and to determine deviation from Hardy-Weinberg equilibrium [33]. The correlation between velvet antler weight and main beam length, main beam circle, first tine length, first tine circle, mouth length, mouth circle of velvet antler of Shuangyang Sika deer was analyzed by  In the same column, those with the same letters indicate that there was no significant difference between the two genotypes and those with different letters indicate that there was significant difference between the two genotypes multiple linear regressions. One-way analysis of variance (one-way ANOVA)and post hoc Bonferroni tests were performed on significant analysis of EST microsatellite markers with growth characteristics (velvet antler weight, main beam length, main beam circle, first tine length, first tine circle, mouth length, mouth circle) of 140 individuals in velvet antler. A p value≤0.05 was considered statistically significant difference.
Additional file 1. Distribution of EST microsatellite type in transcriptome of Sika deer.