Skip to main content

Regulation of DNA methylation on key parasitism genes of Cysticercus cellulosae revealed by integrative epigenomic-transcriptomic analyses



The life cycle of Taenia solium is characterized by different stages of development, requiring various kinds of hosts that can appropriately harbor the eggs (proglottids), the oncospheres, the larvae and the adults. Similar to other metazoan pathogens, T. solium undergoes transcriptional and developmental regulation via epigenetics during its complex lifecycle and host interactions.


In the present study, we integrated whole-genome bisulfite sequencing and RNA-seq technologies to characterize the genome-wide DNA methylation and its effect on transcription of Cysticercus cellulosae of T. solium. We confirm that the T. solium genome in the cysticercus stage is epigenetically modified by DNA methylation in a pattern similar to that of other invertebrate genomes, i.e., sparsely or moderately methylated. We also observed an enrichment of non-CpG methylation in defined genetic elements of the T. solium genome. Furthermore, an integrative analysis of both the transcriptome and the DNA methylome indicated a strong correlation between these two datasets, suggesting that gene expression might be tightly regulated by DNA methylation. Importantly, our data suggested that DNA methylation might play an important role in repressing key parasitism-related genes, including genes encoding excretion-secretion proteins, thereby raising the possibility of targeting DNA methylation processes as a useful strategy in therapeutics of cysticercosis.


Cysticercus cellulosae, the larval stage of T. solium, resides in the central nervous system, skeletal muscle, and other organs of both pigs and humans [1, 2], resulting in the high prevalence of cysticercosis worldwide. As a neglected tropical disease prioritized by the World Health Organization, serious human disease burden [3, 4] and annual economic losses in livestock are caused by infection with this pork tapeworm. To better control this disease, the mechanisms of transcriptional and developmental regulation during its complex lifecycle and host interactions should be better understood.

DNA methylation, i.e., 5-methylcytosine (m5C) is an important epigenetic mechanism that is present in the genomes of Trichinella spiralis [5] and Platyhelminthes (Schistosoma mansoni [6]) parasitic nematodes. Via regulating gene transcription, DNA methylation plays an important role in parasitism. Similar to T. solium, S. mansoni belongs the phylum of Platyhelminthes. A previous study showed that S. mansoni contains conserved DNA methyltransferase 2 (DNMT2) and methyl-CpG binding proteins (MBD) [6, 7] and it is suggest that pharmacological inhibition of SmMBD2/3 and/or SmCBX biology could prove useful in the development of future schistosomiasis control strategies [8]. Importantly, demethylation induced by 5-azacytidine can disrupt egg production and maturation, indicating an essential role for DNA methylation in the normal development of this parasitic worms in this phylum [6]. On other side, schistosomiasis also induces the host’s persistent DNA methylation and tuberculosis specific immune changes [9].

In the present study, we aimed to identify functional DNA methylation machinery and detect cytosine methylation,which can be seen fromthe characterize of methylation patterns, the relationship between methylation and gene expression, and the DNA methylation on key parasitism genes, in the cysticercus cellulosae of T. solium based on a draft genome that has been sequenced and annotated previously [10]. To achieve this aim, we applied the whole-genome bisulfite sequencing (WGBS) method to characterize the genome-wide DNA methylation pattern at single-base resolution [11]. Based on this unbiased characterization, our results confirm that in the cysticercus stage, the T. solium genome [10] is epigenetically modified by DNA methylation in a pattern similar to that of other invertebrate genomes, i.e., sparsely or moderately methylated [12, 13]. We also observed an enrichment of non-CpG methylation in defined genetic elements of T. solium genome, which is a pattern different from mammalian methylomes [12, 13]. Furthermore, we applied RNA-seq technology to profile gene expression. An integrative analysis on both the transcriptome and DNA methylome indicated a strong correlation between these two datasets, suggesting that gene expression might be tightly regulated by DNA methylation. Importantly, our data suggested that DNA methylation might play an important role in repressing key parasitism-related genes, including genes encoding excretion–secretion proteins. In summary, for the first time, we provide data to characterize the DNA methylome and the transcriptome of the T. solium cysticercus cellulosae. Our data will be valuable to the community and will allow researchers to provide new insights into the mechanism of methylation in cysticercosis in future studies.

Materials and methods

Ethics statement

Animals were treated according to the guidelines of the National Institute of Health (publication No. 85–23, revised 1996). Animal protocols have been reviewed and approved by the Ethical Committee of the Jilin University affiliated with the Provincial Animal Health Committee, Jilin Province, China (Ethical Clearance number IZ-2009-08).

Sample collection and nuclei acid extraction

Masseter part of Individual cysticerci were isolated from a single, naturally infected Landrace (Inner Mongolia, China). The cysts of Cysticercus can be identified by eyes, which have milky oval cysts and filled with liquid and milky scolex on the wall, unbroken cysts sampling in aseptic conditions environment, and rinsed thoroughly several times with phosphate-buffered saline. The cysticerci were first frozen in liquid nitrogen and then finely ground to a powder-like texture. Genomic DNA was extracted using the phenol chloroform extraction method, and total RNA was purified using Trizol reagent (Invitrogen, CA, USA) according to the manufacturer’s instructions. The quantity and quality of the DNA and RNA were tested by ultraviolet-Vis spectrophotometry with a NanoDrop 2000 (Thermo Scientific CA, USA).

BlastP searches and phylogenetic analysis of DNMTs

Reciprocal BlastP comparisons were first performed to identify DNMTs and MBD orthologs.

Briefly, raw amino acid sequence of were blasted with NR library, Significant hits were defined as those satisfying the following criteria: the result with E-value < 1e-5 and aligned segments covering at least 30% of the sequence length of the hit were preliminary selected, and then, the hits for each gene with maximum of alignment length and with higher identity were used for further analysis. For phylogenetic analysis, multiple sequence alignment was performed by Clustal W [14]. The MEGA7 with the neighbor-joining method [15, 16] based on the JTT + G (Jones-Taylor-Thornton and Gamma Distribution) model was applied to reconstruct the phylogenetic tree with default parameters.

MethylC-seq library construction and sequencing

Prior to library construction, 5 μg of genomic DNA extracted from a cysticercosis body was spiked with 25 ng unmethylated lambda DNA (Promega, Madison, WI, USA) and fragmented using a Covarias sonication system to a mean size of approximately 200 bp. After fragmentation, libraries were constructed according to the Illumina Paired-End protocol with some modifications. Briefly, purified randomly fragmented DNA was treated with a mix of T4 DNA polymerase, Klenow fragment and T4 polynucleotide kinase to repair blunt ends and phosphorylate the ends. The blunt DNA fragments were subsequently 3’ adenylated using Klenow fragment (3’-5’ exo-), followed by ligation to adaptors synthesized with 5’-methylcytosine instead of cytosine using T4 DNA ligase. After each step, DNA was purified using a QIAquick PCR purification kit (Qiagen, Shanghai, China). Next, a ZYMO EZ DNA Methylation-Gold Kit^TM (ZYMO Research, Irvine, CA, USA) was employed to convert unmethylated cytosine to uracil, according to the manufacturer’s instructions. Finally, PCR was carried out in a final reaction volume. The PCR products were purified using a QIAquick gel extraction kit (Qiagen). Before analysis with an Illumina Hiseq2500, the purified products were analyzed using a Bioanalyzer analysis system (Agilent, Santa Clara, CA, USA) and quantified by real-time PCR. Raw sequencing data were processed using the Illumina base-calling pipeline (Illumina Pipeline version 1.3.1). The sodium bisulfite non-conversion rate was calculated as the percentage of cytosines sequenced at cytosine reference positions in the lambda genome. Bisulfate sequence libraries were prepared for sequencing using standard Illumina protocols, and it was sequenced by Illumina HiSeq 4000.

RNA-seq library construction and sequencing

Total RNA was extracted using the Invitrogen TRIzol Reagent and then treated with RNase-free DNase I (Ambion, Guangzhou, China) for 30 min. The integrity of total RNA was checked using an Agilent 2100 Bioanalyzer. cDNA libraries were prepared according to the manufacturer’s instructions (Illumina). To avoid priming bias when synthesizing the cDNA, mRNA was fragmented before cDNA synthesis. Fragmentation was performed using divalent cations at an elevated temperature. The resulting cDNA was purified using a QIAquick PCR Purification Kit (Qiagen). Then, the cDNA was subjected to end repair and phosphorylation using T4 DNA polymerase, Klenow DNA polymerase and T4 Polynucleotide Kinase (PNK). Subsequent purifications were performed using the QIAquick PCR Purification Kit (Qiagen). These repaired cDNA fragments were 3’-adenylated using KlenowExo (Illumina) and purified using the MinElute PCR Purification Kit (Qiagen), producing cDNA fragments with a single ‘A’ base overhang at the 3’ end for subsequent ligation to the adapters. Illumina PE adapters were ligated to the ends of these 3’-adenylated cDNA fragments and then purified using the MinElute PCR Purification Kit (Qiagen). To select a size range of templates for downstream enrichment, the products of the ligation reaction were purified on 2% TAE-Certified Low-Range Ultra Agarose (Bio-Rad, Hercules, CA, USA). cDNA fragments (200 ± 20 bp) were excised from the gel and extracted using the QIAquick Gel Extraction Kit (Qiagen). Fifteen rounds of PCR amplification were performed to enrich the adapter-modified cDNA library using primers complementary to the ends of the adapters (PCR Primer PE 1.0 and PCR Primer PE 2.0; Illumina). Illumina TruSeq RNA Sample Prep Kit was used with 1 ug of total RNA for the construction of sequencing libraries. RNA libraries were prepared for sequencing using standard Illumina protocols and sequenced by Illumina HiSeq 4000 platform.

Transcriptome mapping

The raw reads were obtained by Illumina sequencing. Then we get the clean reads from the data set by filtering reads contained the adaptor sequence, or the reads which had low quality and N bases occupy more than 50% of the read’s length with soapnuke [17]. Reads were mapped using Tophat 2.0.11 [18]. The current assembly of the T. solium database were used as reference sequence for the transcriptome mapping [9]. Expression was quantified using cufflinks 2.1.1 [19]. RepeatMasker [20] were used to identify tandem repeats.

Bisulfite mapping and methylation calling

The raw reads were obtained by Illumina sequencing. Then we get the clean reads from the data set by filtering reads contained the adaptor sequence, or the reads which had low quality and N bases occupy more than 50% of the read’s length with soapnuke [17]. Reads were mapped using BSMAP 2.2.74 [21]. The current assembly of the T. solium genome were used ss a reference sequence for the bisulfite mapping [9]. Only reads mapping with the properly paired reads were used. The CpG-specificity was calculated by determining the number of cytosines called in all mapped reads at all non-CpG positions and dividing by the number of all bases in all mapped reads at all non-CpG positions. Methylation ratios were determined using a Python script ( distributed together with the BSMAP package for both the forward and reverse strands. R package “seqLogo” was used to perform sequence in the form of a position weight matrix. An important summary measure of a given position weight matrix is its information content profile based on entropy [22].

Protein network analyses

The STRING online tool [7] was used with default parameters. Peptide sequences of key genes were the input and were aligned to Caenorhabditis elegans protein sequences.

Data availability

The T. solium methylome data have been deposited at NCBI/GEO/ under the accession number GSE84086.


The presence of DNA methylation in the T. solium genome

The methylation status of DNA is related to three types of enzymes, including DNA methyltransferases, which affect maintenance methylation and de novo methylation. To understand whether T. solium possesses the ability to methylate DNA, we first conducted a reciprocal Blast alignment to identify genes that might be homologous to known DNA (cytosine-5)-methyltransferases. As a result, two genes (Scaffold00200.gene8095 and LongOrf.asmbl_16366) were identified that are homologous to DNMT3B and DNMT2, respectively, with high sequence similarity (e-value < 1e-10). Scaffold00067.gene4890 was aligned (e-value < 1e-5) with either DNMT3A or DNMT3B from multiple species. In addition, more than one gene was matched with DNMT1, among which Scaffold00068.gene4920 had the best hit (e-value < 1e-10) (Table S1). Phylogenetic analyses by MEGA7 also supported these results (Figure S1). Moreover, we searched for genes homologous to methyl-CpG binding domain protein (MBD). Two candidate genes (LongOrf.asmbl_5021 and LongOrf.asmbl_14047) were homologous to MBDs in multiple species, including Echinococcus granulosus, which is closely related to T. solium evolutionarily (Table S1 and Figure S2). A full repertoire of functionally conserved amino acid residues was identified for both the potential DNMT2 and DNMT3 and the MBDs of T. solium, indicating that these proteins are functionally active (Table S2). However, a high level of divergence between T. solium and other species was observed for DNMT1 homologs (Table S2), which was in agreement with previous studies.

Given these results, we assessed the genome-wide DNA methylation profiles in T. solium using MethylC-Seq. There were 54.21 million raw reads generated (Table S3). BSMAP [21] was used to align the sequenced reads to the T. solium reference sequence, reaching an approximately 76.41% mapping rate. The average read depth was 11.32 per strand, while on average, over 50 Mb (90.52%) of each strand of the T. solium reference sequence was covered. Because of the potential occurrence of non-conversion and thymidine-cytosine sequencing errors, the false-positive rate was estimated by calculating the methylation level of lambda DNA, which is normally unmethylated (Materials and methods). We then applied the error rate (0.0041) to correct methylated cytosine sites (mC) identification according to the method described by Lister et al. [11], which is based on a binomial test and false discovery rate constraints. As a result, approximately 76.6 thousand mCs were estimated in the T. solium genome (accounting for 0.20% of the total cytosines sequenced with depth ≥ 5X). Both symmetrical CpG methylation and asymmetrical non-CpG methylation were revealed.

Characterization of overall methylation patterns

We further characterized the global patterns of DNA methylation in the genomes of T. solium. First, we showed the percentage of methylated cytosine of each sequence context. Among the 76.6 thousand mCs across the entire genome, a majority (69.5%) were in the context of CHH. In contrast, only 15.38 and 15.12% of the mCs were located in the contexts of CHG and CpG, respectively (Figure S3A). Furthermore, most of the CpG and non-CpGs displayed a low methylation fraction (< 30%) (Figure S3B and C). These patterns are highly different from mammalian methylomes, in which most 5mCs are located in CpG contexts and the majority of the CpGs are highly methylated (> 50%) [12]. Since most (69.5%) of the mCs in the T. solium genome were in the CHH context, we further analyzed the sequence context of mCHHs across the entire genome to further examine whether there is any sequence bias in the enrichment of cytosine methylation in the CHH context. As a result, mCpA was shown to be preferentially enriched within the methylated CHH dinucleotide (Fig. 1A). There were more than 21,000 methylated CpAs in each strand, meaning that 55.73% of total CpAs were methylated in the entire genome (Fig. 1B, D). This result was consistent with reports that mCpA was predominantly found in another tapeworm, S. mansoni [23, 24]. With regard to methylation levels, we did not observe significant differences among different sequence contexts for mCs (Fig. 1C).

Fig. 1
figure 1

Cytosine DNA methylation in T. solium. A Logo plots of the sequences proximal to sites of cytosine DNA methylation in each sequence context in T. solium; B Number of mCs for each type of dinucleotide; C Distribution of mCs; D Percentage of each type of dinucleotide; E, F Prevalence of mCA/mCT sites (y-axis) as a function of the number of bases between adjacent mCA/mCT sites (x-axis) based on all non-redundant pair-wise distances up to 50 nt in all introns. The blue line represents smoothing with cubic splines

We next examined whether there was any preference for the distance between adjacent sites of DNA methylation in the T. solium genome. The relative distance between mCs in each context within 50 nucleotides in introns was then analyzed because of the steady methylation without any selective pressure by protein coding genes in intron regions. Similar to the periodicity of 8–10 bases revealed in previous studies on the Arabidopsis and human genomes [25], we also observed a strong tendency of peaked enrichment of mCpA sites, which might be explained by a single turn of the DNA helix (Fig. 1E). Moreover, we found that mCpT revealed a similar periodicity of 8–12 bases (Fig. 1F), though the numbers of cytosines in the context of CpG and CpC were too few to yield reliable results (Figure S3D and E). In summary, our results indicated that the molecular mechanisms governing de novo methylation at CpA sites may be similar among the cysticercus and the plant and animal kingdoms.

We then examined the distribution of methylation levels for the four categories of methylated cytosines across the entire genome. In general, similar mosaic distribution patterns were observed for methylation levels of all types of mCs, that is, relatively highly methylated domains were interspersed within regions with low methylation (Figure S4A). Furthermore, the distribution of mCs across the genome was also uneven; dense mCs of specific categories were occasionally enriched in specific scaffolds (Figure S4B). Such a pattern has been observed in previous studies on other invertebrates. We also examined the patterns of methylation in annotated elements, including genes, tandem repeats, and transposable elements. The methylation percentage of each cytosine context in exons was higher than that in other annotated elements, especially CpAs, which accounted for a more than twofold greater percentage than the other contexts in exons (Fig. 2A).

Fig. 2
figure 2

Average methylation levels of different genomic regions. A, B Average density of methylation; C Average density of mC methylation distributed on the genome. Two-kilobase regions upstream and downstream of each gene were divided into 100-bp (bp) intervals. Each coding sequence or intron was divided into 20 intervals (5% per interval). D Average mC methylation level on repeat elements; E Number of mCs on each repeat element

We then examined the average methylation level in each element, which showed that average CpG methylation levels were higher than those other types of methylated cytosines, similar to mammalian genomes. However, the genome-wide pattern was again divergent from mammalian genomes, as higher average methylation in exons and lower methylation in introns of CpG sites were observed (Fig. 2B, C and Figure S5). The trend for the average methylation of CpC and CpT was similar to that of CpG. However, a uniform distribution of CpA methylation levels in each annotated element was displayed (Fig. 2C and Figure S5). We also analyzed the methylation level of each cytosine context in repeat regions (Fig. 2D, E). Previous studies have indicated that transposable elements are usually unmethylated in the honey bee Apis mellifera and silkworm Bombyx mori [26, 27]. In cysticercus, we observed a similar phenomenon as the above species except that relatively highly methylated rRNAs were observed in T. solium. Notably, CpAs were methylated at a higher level or frequency than other types in LINE/L1 (Fig. 2D, E).

The relationship between methylation and gene expression

It was reported that DNA methylation plays an important role in regulating gene expression. We evaluated gene expression in T. solium using Illumina high-throughput RNA-seq technology. Most of the raw reads could be uniquely mapped to previously annotated genes (88.17%). A total of 9,718 annotated genes out of 11,903 could be aligned with at least one unique read. To characterize the relationship between DNA methylation and gene expression, we divided the expressed genes with at least one read into quartiles of expression levels. We then examined the distribution of methylation levels for different quartiles of expressed genes and genes exhibiting no expression. High CpG and CpC methylation levels were observed in upstream and exon regions of genes with the lowest expression. Moreover, a negative correlation could also be observed between CpA and CpT methylation levels of upstream and exons and expression levels of these expressed genes. However, for silent genes, mainly high CpG and CpC methylation levels of downstream regions were observed (Fig. 3). Taken together, methylation levels of mCs from both CpG or non-CpG sequence contexts were correlated with gene expression levels, though different regulation mechanisms might be involved.

Fig. 3
figure 3

Relationship between mC DNA methylation and expression levels of genes in T. solium. Percentage of methylation within genes that were classified based on expression levels. The first class includes silent genes with no sequencing reads detected, and the second to fifth classes cover expressed genes from the lowest 25% to the highest 25%. Regions of 2 kb upstream and downstream of each gene was divided into 100-bp intervals, and each gene was divided into 20 intervals (5% per interval)

Next, to infer whether methylated genes were enriched for specific molecular functions, we filtered out a total of 1,647 of the genes with the lowest expression and 1,354 of the most highly expressed genes, based on the criteria that at least one mC was present within their genic regions. Then, we applied the WEGO (Web Gene Ontology Annotation Plotting) tool [28] to functionally categorize the gene ontology (GO) terms of these genes. We found that these two sets of genes displayed similar patterns of GO enrichment, specifically, “cell” and “cell part” in Cellular Component, “binding” and “catalytic” Molecular Functions, and “cellular process” and “metabolic process” in Biological Process were relatively enriched. This result suggested that the genes heavily regulated by DNA methylation were more prone to signaling regulation or interaction with environmental factors, e.g., diet or metabolism (Figure S6). In summary, these results suggested the potential for the regulation of T. solium genes by DNA methylation, especially those that function as regulators of cell–cell or cell-environmental communication. Furthermore, different molecular mechanisms might be involved depending on different mC contexts and genes.

Regulation of DNA methylation on key parasitism genes of T. solium

To obtain further insight into the epigenetic regulation of parasite development, survival and parasite-host interactions of T. solium, we next studied conserved genes across tapeworm-species and genes encoding excretion–secretion proteins (ESPs) in T. solium. For conserved genes, we applied a gene set that was reported previously in a study by Bjorn Victor et al., in which 261 genes conserved between Taenia and Echinococcus tapeworms were obtained by comparing the transcriptomes of five important intestinal parasites, including T. multiceps, T. solium, E. granulosus, E. multilocularis and T. pisiformis [29]. Based on their results, we further retrieved 216 genes with the best blastx hit for each contig (e < 1e-10) and studied their DNA methylation status. A total of 190 of these genes contained at least one mC across their genic regions. As indicated in Fig. 4C, CpG and CpC methylation levels in upstream and exon regions were higher than other types of methylation and in other genic regions. A further examination of the 190 genes revealed that 71 genes contained CpG or CpC methylation within their upstream or exon regions. Therefore, we searched for extensively methylated genes based on the criterion that the CpG and CpC methylation levels of the examined gene were significantly higher than the average value of the 71 genes. As a result, we revealed 14 conserved genes that were extensively methylated on CpG/CpC sites within their upstream regions and exons (p < 0.05). Compared with those 26 genes without mC, we found these 14 genes were expressed at a significantly lower level (Fig. 4A), suggesting DNA methylation maybe one of the key mechanisms for the transcriptional regulation of these conserved genes. For ESPs, we also applied a dataset containing 76 ESPs for T. solium, which was identified by Bjorn Victor et al. using a proteomics strategy [30]. We applied the BlastP algorithm to align these ESPs back to the genome and revealed 111 gene sequences that might encode these ESPs (Table S4). Using the same criterion for conserved genes, we found 13 extensively methylated genes. Similarly, gene expression comparisons again revealed that these 13 genes were expressed at significantly lower levels than the 26 non-methylated genes (Fig. 4A). Using a similar strategy, we also looked into genes containing methylated CpAs and CpTs within their upstream or exon regions. However, no clear difference in gene expression levels was observed (Fig. 4B). These results indicated that CpG/CpC methylation in upstream regions and exons played a more important role in T. solium gene repression. Furthermore, we found a different distribution pattern between mCpG and mCpC for these repressed genes, in which mCpG were mostly distributed in upstream regions, while mCpC were more often in exons (Figure S5). Based on the above analyses, we revealed 27 key genes that might be repressed by DNA methylation mechanisms. Interestingly, a protein–protein interaction analysis using the STRING online tool [7] indicated strong mutual interactions among these conserved proteins and ESPs (Fig. 5) based on annotation of the model organism Caenorhabditis elegans.

Fig. 4
figure 4

Boxplots of gene expression levels of un-methylated genes, and genes with conserved methylation and methylated ESP genes based on either A CpG/CpC methylation or B CpA/CpT methylation

Fig. 5
figure 5

Protein–protein interactions of conserved genes and ESP genes


The larval stage of the pork tapeworm T. solium is responsible for cysticercosis, which represents an important public health problem that occurs mainly in developing countries. T. solium cysticerci have developed diverse mechanisms to protect themselves from host immune attack [31], among which epigenetics may play an important role in gene regulation related to parasitism [29]. Recently, Geyer et al. found that essential DNA methylation machinery components, such as DNMT2 and MBD, are well conserved throughout the Platyhelminthes [6, 32]. Invertebrate DNMT2s are believed to retain strong DNA methyltransferase activity [33], which is different from vertebrate DNMT2s, which are considered tRNA methyltransferases [34]. Our computational searches indicated that both DNMT2 and DNMT3 are found in T. solium, which implies the potential existence of a more sophisticated DNA methylation machinery. In addition, MBD2/3 homologs were also identified in the T. solium genome.

Based on these results, our present study focused on characterizing the DNA methylome and transcriptome of T. solium cysticerci, aiming for providing comprehensive omics profiles for this important parasitic stage of T. solium. We revealed a mosaic methylation pattern in T. solium that is typical of other invertebrates [5, 6, 26, 27, 35, 36]. Cytosine methylation was predominantly found in the CpA dinucleotide context, similar to other invertebrate species, including Drosophila melanogaster [37] and other platyhelminths such as S. mansoni [32], which might be mediated by MBD2/3 proteins [38, 39]. These patterns in the DNA methylome might be closely related to the activity of different DNMTs. As DNMT1 functions as a maintenance methylase by copying methylation after DNA replication with the help of Uhrf1 [40], a lack of DNMT1 might help to explain why much non-symmetrical methylation was observed in the platyhelminth genome. We also found that a periodicity for two pairs of mCpA and mCpT sites spaced with 13 bases between the pairs, corresponding to a single turn of the DNA helix, as previously observed. A structural study of the mammalian de novo methyltransferase DNMT3A and its partner protein DNMT3L found that two copies of each form a heterotetramer that contains two active sites separated by a length of 8–10 nucleotides in a DNA helix [41, 42]. Because we could not locate DNMT3L in the T. solium genome, the consistent 8–10 nucleotide spacing we observed in the T. solium genome might be due to DNMT3A alone or an unknown factor other than DNMT3L.

Gene methylation is believed to be an evolutionarily ancient means of transcriptional control. Among plants, vertebrates and some invertebrates such as T. spiralis, the notion that methylation in promoters primarily represses genes by impeding transcriptional initiation has been widely accepted [13, 23, 24], whereas intermediate levels of expression have been associated with genes experiencing the greatest extent of methylation in the gene body, indicating a bell-shaped relationship [43,44,45]. However, in invertebrates, such as the fungus Neurospora crassa [46] and the silkworm Bombyx mori, transcription initiation is unaffected. Thus, DNA methylation shows remarkable diversity in its extent and function across eukaryotic evolution. In our T. solium results, we also found that methylation levels of mCs were correlated with gene expression levels. Depending on different sequence contexts, methylation seemed to function differently in transcriptional regulation. Intriguingly, high CpG and CpC methylation levels of downstream regions, but not of promoter regions, were observed for silent genes (Fig. 4). In contrast, upstream methylation seemed to mostly affect genes with low expression. Currently, knowledge on methylation patterns and their effects on gene regulation in non-vertebrates are still limited, though species-specific diversity has been observed [47]. Therefore, more data should be collected for the DNA methylomes of each specific species to characterize their patterns and functions.

In addition to characterizing the general distribution pattern of genome-wide DNA methylation, we also focused on methylation status of important T. solium genes. Based on previous studies, we looked into 27 extensively methylated genes that are important for T. solium development, survival and parasite-host interactions. We found 13 of these 27 genes mutually interacted based on annotations in the model organism C. elegans. Specifically, ESPs formed the core of the protein–protein-interaction network, while proteins encoded by conserved genes directly interacted with specific ESPs. Among the ESPs that were potentially regulated by DNA methylation, we found that two heat shock proteins (HSPs), hsp-90 and hsp-1, were highlighted and mutually interacted. The heat shock response is a general homeostatic mechanism that protects cells and organisms from the deleterious effects of environmental stress [48]. Together with COX-2, these proteins were previously reported to be important parasitism-related proteins [49]. Furthermore, we also revealed two genes encoding diagnostic antigens that might be regulated by DNA methylation, including diagnostic antigen gp50 and an 8 kDa diagnostic protein. GP50 is a glycosylated and GPI-anchored membrane protein. In recent years, one component of the lentil lectin purified glycoprotein (LLGP) antigens has been used for antibody-based diagnosis of cysticercosis [50]. The 8 kDa family members are metacestode excretory/secretory glycoproteins, which invoke strong antibody reactions in infected individuals [51]. Importantly, our data suggest that DNA methylation might play a key role in repressing their transcription, implying a potential for drug development in the future that can target epigenetic modification machinery to control this important neglected tropical disease.

The limitation of this study is that subject to research funding, we only performed one sample at cysticercus stage, more stages and environmental stress conidiations need to be designed in the further studies. Whereas it’s the first dataset of integrated DNA methylome and transcriptome of T. solium, and it may offer some clues for further studies of T. solium.

Availability of data and materials

All data are included in the article.



Whole-genome bisulfite sequencing


DNA methyltransferase


Methyl-CpG binding proteins


  1. Schantz PM, et al. Potential eradicability of taeniasis and cysticercosis. Bull Pan Am Health Organ. 1993;27(4):397–403.

    CAS  PubMed  Google Scholar 

  2. Van Belle S, et al. Peripheral Taenia infection increases immunoglobulins in the central nervous system. Int J Parasitol. 2021;51(8): 685–92.

  3. Sciutto E, et al. Taenia solium disease in humans and pigs: an ancient parasitosis disease rooted in developing countries and emerging as a major health problem of global dimensions. Microbes Infect. 2000;2(15):1875–90.

    Article  CAS  PubMed  Google Scholar 

  4. Skrip LA, et al. Data-driven analyses of behavioral strategies to eliminate cysticercosis in sub-Saharan Africa. PLoS Negl Trop Dis. 2021;15(3):e0009234.

    Article  PubMed  PubMed Central  Google Scholar 

  5. Gao F, et al. Differential DNA methylation in discrete developmental stages of the parasitic nematode Trichinella spiralis. Genome Biol. 2012;13(10):R100.

    Article  PubMed  PubMed Central  Google Scholar 

  6. Geyer KK, et al. Cytosine methylation regulates oviposition in the pathogenic blood fluke Schistosoma mansoni. Nat Commun. 2011;2:424.

    Article  PubMed  CAS  Google Scholar 

  7. Maldonado LL, et al. The Echinococcus canadensis (G7) genome: a key knowledge of parasitic platyhelminth human diseases. BMC Genomics. 2017;18(1):204.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  8. Geyer KK, et al. Methyl-CpG-binding (SmMBD2/3) and chromobox (SmCBX) proteins are required for neoblast proliferation and oviposition in the parasitic blood fluke Schistosoma mansoni. PLoS Pathog. 2018;14(6):e1007107.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  9. DiNardo AR, et al. Schistosomiasis induces persistent DNA methylation and tuberculosis-specific immune changes. J Immunol. 2018;201(1):124–33.

    Article  CAS  PubMed  Google Scholar 

  10. Aguilar-Diaz H, et al. The genome project of Taenia solium. Parasitol Int. 2006;55(Suppl):S127–30.

    Article  CAS  PubMed  Google Scholar 

  11. Lister R, Ecker JR. Finding the fifth base: genome-wide sequencing of cytosine methylation. Genome Res. 2009;19(6):959–66.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  12. Feng S, et al. Conservation and divergence of methylation patterning in plants and animals. Proc Natl Acad Sci U S A. 2010;107(19):8689–94.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  13. Zemach A, et al. Genome-wide evolutionary analysis of eukaryotic DNA methylation. Science. 2010;328(5980):916–9.

    Article  CAS  PubMed  Google Scholar 

  14. Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987;4(4):406–25.

    CAS  PubMed  Google Scholar 

  15. Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci. 1992;8(3):275–82.

    CAS  PubMed  Google Scholar 

  16. Kumar S, Stecher G, Tamura K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 2016;33(7):1870–4.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  17. Chen Y, et al. SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data. Gigascience. 2018;7(1):1–6.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  18. Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25(9):1105–11.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  19. Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28(5):511–5.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  20. Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics. 2009;25:04.10.01-04.10.04.

    Article  Google Scholar 

  21. Xi Y, Li W. BSMAP: whole genome bisulfite sequence MAPping program. BMC Bioinformatics. 2009;10:232.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  22. Schneider TD, Stephens RM. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 1990;18(20):6097–100.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  23. Zhang X. The epigenetic landscape of plants. Science. 2008;320(5875):489–92.

    Article  CAS  PubMed  Google Scholar 

  24. Weber M, et al. Distribution, silencing potential and evolutionary impact of promoter DNA methylation in the human genome. Nat Genet. 2007;39(4):457–66.

    Article  CAS  PubMed  Google Scholar 

  25. Cokus SJ, et al. Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature. 2008;452(7184):215–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  26. Xiang H, et al. Single base-resolution methylome of the silkworm reveals a sparse epigenomic map. Nat Biotechnol. 2010;28(5):516–20.

    Article  CAS  PubMed  Google Scholar 

  27. Lyko F, et al. The honey bee epigenomes: differential methylation of brain DNA in queens and workers. PLoS Biol. 2010;8(11):e1000506.

    Article  PubMed  PubMed Central  CAS  Google Scholar 

  28. Ye J, et al. WEGO: a web tool for plotting GO annotations. Nucleic Acids Res. 2006;34(Web Server issue):W293-7.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  29. Robert McMaster W, Morrison CJ, Kobor MS. Epigenetics: a new model for intracellular parasite-host cell regulation. Trends Parasitol. 2016;32(7):515–21.

    Article  CAS  PubMed  Google Scholar 

  30. Victor B, et al. Proteomic analysis of Taenia solium metacestode excretion-secretion proteins. Proteomics. 2012;12(11):1860–9.

    Article  CAS  PubMed  Google Scholar 

  31. Hewitson JP, Grainger JR, Maizels RM. Helminth immunoregulation: the role of parasite secreted proteins in modulating host immunity. Mol Biochem Parasitol. 2009;167(1):1–11.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  32. Geyer KK, et al. Cytosine methylation is a conserved epigenetic feature found throughout the phylum Platyhelminthes. BMC Genomics. 2013;14:462.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  33. Phalke S, et al. Retrotransposon silencing and telomere integrity in somatic cells of Drosophila depends on the cytosine-5 methyltransferase DNMT2. Nat Genet. 2009;41(6):696–702.

    Article  CAS  PubMed  Google Scholar 

  34. Goll MG, et al. Methylation of tRNAAsp by the DNA methyltransferase homolog Dnmt2. Science. 2006;311(5759):395–8.

    Article  CAS  PubMed  Google Scholar 

  35. Bird AP, Taggart MH, Smith BA. Methylated and unmethylated DNA compartments in the sea urchin genome. Cell. 1979;17(4):889–901.

    Article  CAS  PubMed  Google Scholar 

  36. del Gaudio R, Di Giaimo R, Geraci G. Genome methylation of the marine annelid worm Chaetopterus variopedatus: methylation of a CpG in an expressed H1 histone gene. FEBS Lett. 1997;417(1):48–52.

    Article  PubMed  Google Scholar 

  37. Lyko F, Ramsahoye BH, Jaenisch R. DNA methylation in Drosophila melanogaster. Nature. 2000;408(6812):538–40.

    Article  CAS  PubMed  Google Scholar 

  38. Raddatz G, et al. Dnmt2-dependent methylomes lack defined DNA methylation patterns. Proc Natl Acad Sci U S A. 2013;110(21):8627–31.

    Article  PubMed  PubMed Central  Google Scholar 

  39. Marhold J, et al. The Drosophila MBD2/3 protein mediates interactions between the MI-2 chromatin complex and CpT/A-methylated DNA. Development. 2004;131(24):6033–9.

    Article  CAS  PubMed  Google Scholar 

  40. Law JA, Jacobsen SE. Establishing, maintaining and modifying DNA methylation patterns in plants and animals. Nat Rev Genet. 2010;11(3):204–20.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  41. Huff JT, Zilberman D. Dnmt1-independent CG methylation contributes to nucleosome positioning in diverse eukaryotes. Cell. 2014;156(6):1286–97.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  42. Jia D, et al. Structure of Dnmt3a bound to Dnmt3L suggests a model for de novo DNA methylation. Nature. 2007;449(7159):248–51.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  43. Zilberman D, et al. Genome-wide analysis of Arabidopsis thaliana DNA methylation uncovers an interdependence between methylation and transcription. Nat Genet. 2007;39(1):61–9.

    Article  CAS  PubMed  Google Scholar 

  44. Jjingo D, et al. On the presence and role of human gene-body DNA methylation. Oncotarget. 2012;3(4):462–74.

    Article  PubMed  PubMed Central  Google Scholar 

  45. Nanty L, et al. Comparative methylomics reveals gene-body H3K36me3 in Drosophila predicts DNA methylation and CpG landscapes in other invertebrates. Genome Res. 2011;21(11):1841–50.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  46. Rountree MR, Selker EU. DNA methylation inhibits elongation but not initiation of transcription in Neurospora crassa. Genes Dev. 1997;11(18):2383–95.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  47. Schubeler D. Function and information content of DNA methylation. Nature. 2015;517(7534):321–6.

    Article  CAS  PubMed  Google Scholar 

  48. Ferrer E, et al. Taenia solium: characterization of a small heat shock protein (Tsol-sHSP35.6) and its possible relevance to the diagnosis and pathogenesis of neurocysticercosis. Exp Parasitol. 2005;110(1):1–11.

    Article  CAS  PubMed  Google Scholar 

  49. Choi W, Chu J. The characteristics of the expression of heat shock proteins and COX-2 in the liver of hamsters infected with Clonorchis sinensis, and the change of endocrine hormones and cytokines. Folia Parasitol (Praha). 2012;59(4):255–63.

    Article  CAS  Google Scholar 

  50. Hancock K, et al. Characterization and cloning of GP50, a Taenia solium antigen diagnostic for cysticercosis. Mol Biochem Parasitol. 2004;133(1):115–24.

    Article  CAS  PubMed  Google Scholar 

  51. Ferrer E, et al. Diagnostic epitope variability within Taenia solium 8 kDa antigen family: implications for cysticercosis immunodetection. Exp Parasitol. 2012;130(1):78–85.

    Article  CAS  PubMed  Google Scholar 

Download references


Thanks to all authors for our contributions to this article.


This work was supported by the National Key Research and Development Program of China (2017YFD0501303); the National Natural Science Foundation of China (NSFC 31960707, 31460658, 31160504); the Inner Mongolia Provincial Natural Science Foundation (2017MS0321, 2021MS03037).

Author information

Authors and Affiliations



SS and LM conceived and designed the experiments. SY performed the experiments. LX and LX analyzed the data. WX, SW and JG wrote the manuscript. All the authors contributed to the article and approved the submitted version.

Corresponding authors

Correspondence to Mingyuan Liu or Shumin Sun.

Ethics declarations

Ethics approval and consent to participate

The animal study was reviewed and approved by the University of JILIN Animal Care and Use Committee (IZ-2009-08). Written informed consent was obtained from the owners for the participation of their animals in this study.

Consent for publication


Competing interests

GJ was employed by the company Shenzhen E-GENE, Co., Ltd.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1: Figure S1.

Phylogenetic tree of DNMT proteins.

Additional file 2: Figure S2.

Phylogenetic tree of mbd proteins.

Additional file 3: Figure S3.

Patterns and chromosomal distribution of DNA methylation in T. solium.

Additional file 4: Figure S4.

DNA methylation patterns and chromosomal distribution.

Additional file 5: Figure S5.

Average density of methylation levels of cytosine distributed on genome.

Additional file 6: Figure S6.

Gene Ontology (GO) analysis for the genes with the lowest expression (2nd) and the most highly expressed (5th) genes.

Additional file 7: Table S1.

Results of reciprocal BlastP searches of T.s DNMTs and MBD.

Additional file 8: Table S2.

Results of ClustW on dnmt protein sequences.

Additional file 9: Table S3.

Data summary of MethylC-seq and RNA-seq.

Additional file 10: Table S4.

Summary of key parasitism genes that are methylated.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Wang, X., Song, W., Ji, G. et al. Regulation of DNA methylation on key parasitism genes of Cysticercus cellulosae revealed by integrative epigenomic-transcriptomic analyses. Hereditas 158, 28 (2021).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI:


  • Cysticercus cellulosae
  • DNA methylation
  • Epigenetics
  • Gene regulation