- Open Access
Regulation of DNA methylation on key parasitism genes of Cysticercus cellulosae revealed by integrative epigenomic-transcriptomic analyses
Hereditas volume 158, Article number: 28 (2021)
The life cycle of Taenia solium is characterized by different stages of development, requiring various kinds of hosts that can appropriately harbor the eggs (proglottids), the oncospheres, the larvae and the adults. Similar to other metazoan pathogens, T. solium undergoes transcriptional and developmental regulation via epigenetics during its complex lifecycle and host interactions.
In the present study, we integrated whole-genome bisulfite sequencing and RNA-seq technologies to characterize the genome-wide DNA methylation and its effect on transcription of Cysticercus cellulosae of T. solium. We confirm that the T. solium genome in the cysticercus stage is epigenetically modified by DNA methylation in a pattern similar to that of other invertebrate genomes, i.e., sparsely or moderately methylated. We also observed an enrichment of non-CpG methylation in defined genetic elements of the T. solium genome. Furthermore, an integrative analysis of both the transcriptome and the DNA methylome indicated a strong correlation between these two datasets, suggesting that gene expression might be tightly regulated by DNA methylation. Importantly, our data suggested that DNA methylation might play an important role in repressing key parasitism-related genes, including genes encoding excretion-secretion proteins, thereby raising the possibility of targeting DNA methylation processes as a useful strategy in therapeutics of cysticercosis.
Cysticercus cellulosae, the larval stage of T. solium, resides in the central nervous system, skeletal muscle, and other organs of both pigs and humans [1, 2], resulting in the high prevalence of cysticercosis worldwide. As a neglected tropical disease prioritized by the World Health Organization, serious human disease burden [3, 4] and annual economic losses in livestock are caused by infection with this pork tapeworm. To better control this disease, the mechanisms of transcriptional and developmental regulation during its complex lifecycle and host interactions should be better understood.
DNA methylation, i.e., 5-methylcytosine (m5C) is an important epigenetic mechanism that is present in the genomes of Trichinella spiralis  and Platyhelminthes (Schistosoma mansoni ) parasitic nematodes. Via regulating gene transcription, DNA methylation plays an important role in parasitism. Similar to T. solium, S. mansoni belongs the phylum of Platyhelminthes. A previous study showed that S. mansoni contains conserved DNA methyltransferase 2 (DNMT2) and methyl-CpG binding proteins (MBD) [6, 7] and it is suggest that pharmacological inhibition of SmMBD2/3 and/or SmCBX biology could prove useful in the development of future schistosomiasis control strategies . Importantly, demethylation induced by 5-azacytidine can disrupt egg production and maturation, indicating an essential role for DNA methylation in the normal development of this parasitic worms in this phylum . On other side, schistosomiasis also induces the host’s persistent DNA methylation and tuberculosis specific immune changes .
In the present study, we aimed to identify functional DNA methylation machinery and detect cytosine methylation,which can be seen fromthe characterize of methylation patterns, the relationship between methylation and gene expression, and the DNA methylation on key parasitism genes, in the cysticercus cellulosae of T. solium based on a draft genome that has been sequenced and annotated previously . To achieve this aim, we applied the whole-genome bisulfite sequencing (WGBS) method to characterize the genome-wide DNA methylation pattern at single-base resolution . Based on this unbiased characterization, our results confirm that in the cysticercus stage, the T. solium genome  is epigenetically modified by DNA methylation in a pattern similar to that of other invertebrate genomes, i.e., sparsely or moderately methylated [12, 13]. We also observed an enrichment of non-CpG methylation in defined genetic elements of T. solium genome, which is a pattern different from mammalian methylomes [12, 13]. Furthermore, we applied RNA-seq technology to profile gene expression. An integrative analysis on both the transcriptome and DNA methylome indicated a strong correlation between these two datasets, suggesting that gene expression might be tightly regulated by DNA methylation. Importantly, our data suggested that DNA methylation might play an important role in repressing key parasitism-related genes, including genes encoding excretion–secretion proteins. In summary, for the first time, we provide data to characterize the DNA methylome and the transcriptome of the T. solium cysticercus cellulosae. Our data will be valuable to the community and will allow researchers to provide new insights into the mechanism of methylation in cysticercosis in future studies.
Materials and methods
Animals were treated according to the guidelines of the National Institute of Health (publication No. 85–23, revised 1996). Animal protocols have been reviewed and approved by the Ethical Committee of the Jilin University affiliated with the Provincial Animal Health Committee, Jilin Province, China (Ethical Clearance number IZ-2009-08).
Sample collection and nuclei acid extraction
Masseter part of Individual cysticerci were isolated from a single, naturally infected Landrace (Inner Mongolia, China). The cysts of Cysticercus can be identified by eyes, which have milky oval cysts and filled with liquid and milky scolex on the wall, unbroken cysts sampling in aseptic conditions environment, and rinsed thoroughly several times with phosphate-buffered saline. The cysticerci were first frozen in liquid nitrogen and then finely ground to a powder-like texture. Genomic DNA was extracted using the phenol chloroform extraction method, and total RNA was purified using Trizol reagent (Invitrogen, CA, USA) according to the manufacturer’s instructions. The quantity and quality of the DNA and RNA were tested by ultraviolet-Vis spectrophotometry with a NanoDrop 2000 (Thermo Scientific CA, USA).
BlastP searches and phylogenetic analysis of DNMTs
Reciprocal BlastP comparisons were first performed to identify DNMTs and MBD orthologs.
Briefly, raw amino acid sequence of were blasted with NR library, Significant hits were defined as those satisfying the following criteria: the result with E-value < 1e-5 and aligned segments covering at least 30% of the sequence length of the hit were preliminary selected, and then, the hits for each gene with maximum of alignment length and with higher identity were used for further analysis. For phylogenetic analysis, multiple sequence alignment was performed by Clustal W . The MEGA7 with the neighbor-joining method [15, 16] based on the JTT + G (Jones-Taylor-Thornton and Gamma Distribution) model was applied to reconstruct the phylogenetic tree with default parameters.
MethylC-seq library construction and sequencing
Prior to library construction, 5 μg of genomic DNA extracted from a cysticercosis body was spiked with 25 ng unmethylated lambda DNA (Promega, Madison, WI, USA) and fragmented using a Covarias sonication system to a mean size of approximately 200 bp. After fragmentation, libraries were constructed according to the Illumina Paired-End protocol with some modifications. Briefly, purified randomly fragmented DNA was treated with a mix of T4 DNA polymerase, Klenow fragment and T4 polynucleotide kinase to repair blunt ends and phosphorylate the ends. The blunt DNA fragments were subsequently 3’ adenylated using Klenow fragment (3’-5’ exo-), followed by ligation to adaptors synthesized with 5’-methylcytosine instead of cytosine using T4 DNA ligase. After each step, DNA was purified using a QIAquick PCR purification kit (Qiagen, Shanghai, China). Next, a ZYMO EZ DNA Methylation-Gold Kit^TM (ZYMO Research, Irvine, CA, USA) was employed to convert unmethylated cytosine to uracil, according to the manufacturer’s instructions. Finally, PCR was carried out in a final reaction volume. The PCR products were purified using a QIAquick gel extraction kit (Qiagen). Before analysis with an Illumina Hiseq2500, the purified products were analyzed using a Bioanalyzer analysis system (Agilent, Santa Clara, CA, USA) and quantified by real-time PCR. Raw sequencing data were processed using the Illumina base-calling pipeline (Illumina Pipeline version 1.3.1). The sodium bisulfite non-conversion rate was calculated as the percentage of cytosines sequenced at cytosine reference positions in the lambda genome. Bisulfate sequence libraries were prepared for sequencing using standard Illumina protocols, and it was sequenced by Illumina HiSeq 4000.
RNA-seq library construction and sequencing
Total RNA was extracted using the Invitrogen TRIzol Reagent and then treated with RNase-free DNase I (Ambion, Guangzhou, China) for 30 min. The integrity of total RNA was checked using an Agilent 2100 Bioanalyzer. cDNA libraries were prepared according to the manufacturer’s instructions (Illumina). To avoid priming bias when synthesizing the cDNA, mRNA was fragmented before cDNA synthesis. Fragmentation was performed using divalent cations at an elevated temperature. The resulting cDNA was purified using a QIAquick PCR Purification Kit (Qiagen). Then, the cDNA was subjected to end repair and phosphorylation using T4 DNA polymerase, Klenow DNA polymerase and T4 Polynucleotide Kinase (PNK). Subsequent purifications were performed using the QIAquick PCR Purification Kit (Qiagen). These repaired cDNA fragments were 3’-adenylated using KlenowExo (Illumina) and purified using the MinElute PCR Purification Kit (Qiagen), producing cDNA fragments with a single ‘A’ base overhang at the 3’ end for subsequent ligation to the adapters. Illumina PE adapters were ligated to the ends of these 3’-adenylated cDNA fragments and then purified using the MinElute PCR Purification Kit (Qiagen). To select a size range of templates for downstream enrichment, the products of the ligation reaction were purified on 2% TAE-Certified Low-Range Ultra Agarose (Bio-Rad, Hercules, CA, USA). cDNA fragments (200 ± 20 bp) were excised from the gel and extracted using the QIAquick Gel Extraction Kit (Qiagen). Fifteen rounds of PCR amplification were performed to enrich the adapter-modified cDNA library using primers complementary to the ends of the adapters (PCR Primer PE 1.0 and PCR Primer PE 2.0; Illumina). Illumina TruSeq RNA Sample Prep Kit was used with 1 ug of total RNA for the construction of sequencing libraries. RNA libraries were prepared for sequencing using standard Illumina protocols and sequenced by Illumina HiSeq 4000 platform.
The raw reads were obtained by Illumina sequencing. Then we get the clean reads from the data set by filtering reads contained the adaptor sequence, or the reads which had low quality and N bases occupy more than 50% of the read’s length with soapnuke . Reads were mapped using Tophat 2.0.11 . The current assembly of the T. solium database were used as reference sequence for the transcriptome mapping . Expression was quantified using cufflinks 2.1.1 . RepeatMasker  were used to identify tandem repeats.
Bisulfite mapping and methylation calling
The raw reads were obtained by Illumina sequencing. Then we get the clean reads from the data set by filtering reads contained the adaptor sequence, or the reads which had low quality and N bases occupy more than 50% of the read’s length with soapnuke . Reads were mapped using BSMAP 2.2.74 . The current assembly of the T. solium genome were used ss a reference sequence for the bisulfite mapping . Only reads mapping with the properly paired reads were used. The CpG-specificity was calculated by determining the number of cytosines called in all mapped reads at all non-CpG positions and dividing by the number of all bases in all mapped reads at all non-CpG positions. Methylation ratios were determined using a Python script (methratio.py) distributed together with the BSMAP package for both the forward and reverse strands. R package “seqLogo” was used to perform sequence in the form of a position weight matrix. An important summary measure of a given position weight matrix is its information content profile based on entropy .
Protein network analyses
The STRING online tool  was used with default parameters. Peptide sequences of key genes were the input and were aligned to Caenorhabditis elegans protein sequences.
The T. solium methylome data have been deposited at NCBI/GEO/ under the accession number GSE84086.
The presence of DNA methylation in the T. solium genome
The methylation status of DNA is related to three types of enzymes, including DNA methyltransferases, which affect maintenance methylation and de novo methylation. To understand whether T. solium possesses the ability to methylate DNA, we first conducted a reciprocal Blast alignment to identify genes that might be homologous to known DNA (cytosine-5)-methyltransferases. As a result, two genes (Scaffold00200.gene8095 and LongOrf.asmbl_16366) were identified that are homologous to DNMT3B and DNMT2, respectively, with high sequence similarity (e-value < 1e-10). Scaffold00067.gene4890 was aligned (e-value < 1e-5) with either DNMT3A or DNMT3B from multiple species. In addition, more than one gene was matched with DNMT1, among which Scaffold00068.gene4920 had the best hit (e-value < 1e-10) (Table S1). Phylogenetic analyses by MEGA7 also supported these results (Figure S1). Moreover, we searched for genes homologous to methyl-CpG binding domain protein (MBD). Two candidate genes (LongOrf.asmbl_5021 and LongOrf.asmbl_14047) were homologous to MBDs in multiple species, including Echinococcus granulosus, which is closely related to T. solium evolutionarily (Table S1 and Figure S2). A full repertoire of functionally conserved amino acid residues was identified for both the potential DNMT2 and DNMT3 and the MBDs of T. solium, indicating that these proteins are functionally active (Table S2). However, a high level of divergence between T. solium and other species was observed for DNMT1 homologs (Table S2), which was in agreement with previous studies.
Given these results, we assessed the genome-wide DNA methylation profiles in T. solium using MethylC-Seq. There were 54.21 million raw reads generated (Table S3). BSMAP  was used to align the sequenced reads to the T. solium reference sequence, reaching an approximately 76.41% mapping rate. The average read depth was 11.32 per strand, while on average, over 50 Mb (90.52%) of each strand of the T. solium reference sequence was covered. Because of the potential occurrence of non-conversion and thymidine-cytosine sequencing errors, the false-positive rate was estimated by calculating the methylation level of lambda DNA, which is normally unmethylated (Materials and methods). We then applied the error rate (0.0041) to correct methylated cytosine sites (mC) identification according to the method described by Lister et al. , which is based on a binomial test and false discovery rate constraints. As a result, approximately 76.6 thousand mCs were estimated in the T. solium genome (accounting for 0.20% of the total cytosines sequenced with depth ≥ 5X). Both symmetrical CpG methylation and asymmetrical non-CpG methylation were revealed.
Characterization of overall methylation patterns
We further characterized the global patterns of DNA methylation in the genomes of T. solium. First, we showed the percentage of methylated cytosine of each sequence context. Among the 76.6 thousand mCs across the entire genome, a majority (69.5%) were in the context of CHH. In contrast, only 15.38 and 15.12% of the mCs were located in the contexts of CHG and CpG, respectively (Figure S3A). Furthermore, most of the CpG and non-CpGs displayed a low methylation fraction (< 30%) (Figure S3B and C). These patterns are highly different from mammalian methylomes, in which most 5mCs are located in CpG contexts and the majority of the CpGs are highly methylated (> 50%) . Since most (69.5%) of the mCs in the T. solium genome were in the CHH context, we further analyzed the sequence context of mCHHs across the entire genome to further examine whether there is any sequence bias in the enrichment of cytosine methylation in the CHH context. As a result, mCpA was shown to be preferentially enriched within the methylated CHH dinucleotide (Fig. 1A). There were more than 21,000 methylated CpAs in each strand, meaning that 55.73% of total CpAs were methylated in the entire genome (Fig. 1B, D). This result was consistent with reports that mCpA was predominantly found in another tapeworm, S. mansoni [23, 24]. With regard to methylation levels, we did not observe significant differences among different sequence contexts for mCs (Fig. 1C).
We next examined whether there was any preference for the distance between adjacent sites of DNA methylation in the T. solium genome. The relative distance between mCs in each context within 50 nucleotides in introns was then analyzed because of the steady methylation without any selective pressure by protein coding genes in intron regions. Similar to the periodicity of 8–10 bases revealed in previous studies on the Arabidopsis and human genomes , we also observed a strong tendency of peaked enrichment of mCpA sites, which might be explained by a single turn of the DNA helix (Fig. 1E). Moreover, we found that mCpT revealed a similar periodicity of 8–12 bases (Fig. 1F), though the numbers of cytosines in the context of CpG and CpC were too few to yield reliable results (Figure S3D and E). In summary, our results indicated that the molecular mechanisms governing de novo methylation at CpA sites may be similar among the cysticercus and the plant and animal kingdoms.
We then examined the distribution of methylation levels for the four categories of methylated cytosines across the entire genome. In general, similar mosaic distribution patterns were observed for methylation levels of all types of mCs, that is, relatively highly methylated domains were interspersed within regions with low methylation (Figure S4A). Furthermore, the distribution of mCs across the genome was also uneven; dense mCs of specific categories were occasionally enriched in specific scaffolds (Figure S4B). Such a pattern has been observed in previous studies on other invertebrates. We also examined the patterns of methylation in annotated elements, including genes, tandem repeats, and transposable elements. The methylation percentage of each cytosine context in exons was higher than that in other annotated elements, especially CpAs, which accounted for a more than twofold greater percentage than the other contexts in exons (Fig. 2A).
We then examined the average methylation level in each element, which showed that average CpG methylation levels were higher than those other types of methylated cytosines, similar to mammalian genomes. However, the genome-wide pattern was again divergent from mammalian genomes, as higher average methylation in exons and lower methylation in introns of CpG sites were observed (Fig. 2B, C and Figure S5). The trend for the average methylation of CpC and CpT was similar to that of CpG. However, a uniform distribution of CpA methylation levels in each annotated element was displayed (Fig. 2C and Figure S5). We also analyzed the methylation level of each cytosine context in repeat regions (Fig. 2D, E). Previous studies have indicated that transposable elements are usually unmethylated in the honey bee Apis mellifera and silkworm Bombyx mori [26, 27]. In cysticercus, we observed a similar phenomenon as the above species except that relatively highly methylated rRNAs were observed in T. solium. Notably, CpAs were methylated at a higher level or frequency than other types in LINE/L1 (Fig. 2D, E).
The relationship between methylation and gene expression
It was reported that DNA methylation plays an important role in regulating gene expression. We evaluated gene expression in T. solium using Illumina high-throughput RNA-seq technology. Most of the raw reads could be uniquely mapped to previously annotated genes (88.17%). A total of 9,718 annotated genes out of 11,903 could be aligned with at least one unique read. To characterize the relationship between DNA methylation and gene expression, we divided the expressed genes with at least one read into quartiles of expression levels. We then examined the distribution of methylation levels for different quartiles of expressed genes and genes exhibiting no expression. High CpG and CpC methylation levels were observed in upstream and exon regions of genes with the lowest expression. Moreover, a negative correlation could also be observed between CpA and CpT methylation levels of upstream and exons and expression levels of these expressed genes. However, for silent genes, mainly high CpG and CpC methylation levels of downstream regions were observed (Fig. 3). Taken together, methylation levels of mCs from both CpG or non-CpG sequence contexts were correlated with gene expression levels, though different regulation mechanisms might be involved.
Next, to infer whether methylated genes were enriched for specific molecular functions, we filtered out a total of 1,647 of the genes with the lowest expression and 1,354 of the most highly expressed genes, based on the criteria that at least one mC was present within their genic regions. Then, we applied the WEGO (Web Gene Ontology Annotation Plotting) tool  to functionally categorize the gene ontology (GO) terms of these genes. We found that these two sets of genes displayed similar patterns of GO enrichment, specifically, “cell” and “cell part” in Cellular Component, “binding” and “catalytic” Molecular Functions, and “cellular process” and “metabolic process” in Biological Process were relatively enriched. This result suggested that the genes heavily regulated by DNA methylation were more prone to signaling regulation or interaction with environmental factors, e.g., diet or metabolism (Figure S6). In summary, these results suggested the potential for the regulation of T. solium genes by DNA methylation, especially those that function as regulators of cell–cell or cell-environmental communication. Furthermore, different molecular mechanisms might be involved depending on different mC contexts and genes.
Regulation of DNA methylation on key parasitism genes of T. solium
To obtain further insight into the epigenetic regulation of parasite development, survival and parasite-host interactions of T. solium, we next studied conserved genes across tapeworm-species and genes encoding excretion–secretion proteins (ESPs) in T. solium. For conserved genes, we applied a gene set that was reported previously in a study by Bjorn Victor et al., in which 261 genes conserved between Taenia and Echinococcus tapeworms were obtained by comparing the transcriptomes of five important intestinal parasites, including T. multiceps, T. solium, E. granulosus, E. multilocularis and T. pisiformis . Based on their results, we further retrieved 216 genes with the best blastx hit for each contig (e < 1e-10) and studied their DNA methylation status. A total of 190 of these genes contained at least one mC across their genic regions. As indicated in Fig. 4C, CpG and CpC methylation levels in upstream and exon regions were higher than other types of methylation and in other genic regions. A further examination of the 190 genes revealed that 71 genes contained CpG or CpC methylation within their upstream or exon regions. Therefore, we searched for extensively methylated genes based on the criterion that the CpG and CpC methylation levels of the examined gene were significantly higher than the average value of the 71 genes. As a result, we revealed 14 conserved genes that were extensively methylated on CpG/CpC sites within their upstream regions and exons (p < 0.05). Compared with those 26 genes without mC, we found these 14 genes were expressed at a significantly lower level (Fig. 4A), suggesting DNA methylation maybe one of the key mechanisms for the transcriptional regulation of these conserved genes. For ESPs, we also applied a dataset containing 76 ESPs for T. solium, which was identified by Bjorn Victor et al. using a proteomics strategy . We applied the BlastP algorithm to align these ESPs back to the genome and revealed 111 gene sequences that might encode these ESPs (Table S4). Using the same criterion for conserved genes, we found 13 extensively methylated genes. Similarly, gene expression comparisons again revealed that these 13 genes were expressed at significantly lower levels than the 26 non-methylated genes (Fig. 4A). Using a similar strategy, we also looked into genes containing methylated CpAs and CpTs within their upstream or exon regions. However, no clear difference in gene expression levels was observed (Fig. 4B). These results indicated that CpG/CpC methylation in upstream regions and exons played a more important role in T. solium gene repression. Furthermore, we found a different distribution pattern between mCpG and mCpC for these repressed genes, in which mCpG were mostly distributed in upstream regions, while mCpC were more often in exons (Figure S5). Based on the above analyses, we revealed 27 key genes that might be repressed by DNA methylation mechanisms. Interestingly, a protein–protein interaction analysis using the STRING online tool  indicated strong mutual interactions among these conserved proteins and ESPs (Fig. 5) based on annotation of the model organism Caenorhabditis elegans.
The larval stage of the pork tapeworm T. solium is responsible for cysticercosis, which represents an important public health problem that occurs mainly in developing countries. T. solium cysticerci have developed diverse mechanisms to protect themselves from host immune attack , among which epigenetics may play an important role in gene regulation related to parasitism . Recently, Geyer et al. found that essential DNA methylation machinery components, such as DNMT2 and MBD, are well conserved throughout the Platyhelminthes [6, 32]. Invertebrate DNMT2s are believed to retain strong DNA methyltransferase activity , which is different from vertebrate DNMT2s, which are considered tRNA methyltransferases . Our computational searches indicated that both DNMT2 and DNMT3 are found in T. solium, which implies the potential existence of a more sophisticated DNA methylation machinery. In addition, MBD2/3 homologs were also identified in the T. solium genome.
Based on these results, our present study focused on characterizing the DNA methylome and transcriptome of T. solium cysticerci, aiming for providing comprehensive omics profiles for this important parasitic stage of T. solium. We revealed a mosaic methylation pattern in T. solium that is typical of other invertebrates [5, 6, 26, 27, 35, 36]. Cytosine methylation was predominantly found in the CpA dinucleotide context, similar to other invertebrate species, including Drosophila melanogaster  and other platyhelminths such as S. mansoni , which might be mediated by MBD2/3 proteins [38, 39]. These patterns in the DNA methylome might be closely related to the activity of different DNMTs. As DNMT1 functions as a maintenance methylase by copying methylation after DNA replication with the help of Uhrf1 , a lack of DNMT1 might help to explain why much non-symmetrical methylation was observed in the platyhelminth genome. We also found that a periodicity for two pairs of mCpA and mCpT sites spaced with 13 bases between the pairs, corresponding to a single turn of the DNA helix, as previously observed. A structural study of the mammalian de novo methyltransferase DNMT3A and its partner protein DNMT3L found that two copies of each form a heterotetramer that contains two active sites separated by a length of 8–10 nucleotides in a DNA helix [41, 42]. Because we could not locate DNMT3L in the T. solium genome, the consistent 8–10 nucleotide spacing we observed in the T. solium genome might be due to DNMT3A alone or an unknown factor other than DNMT3L.
Gene methylation is believed to be an evolutionarily ancient means of transcriptional control. Among plants, vertebrates and some invertebrates such as T. spiralis, the notion that methylation in promoters primarily represses genes by impeding transcriptional initiation has been widely accepted [13, 23, 24], whereas intermediate levels of expression have been associated with genes experiencing the greatest extent of methylation in the gene body, indicating a bell-shaped relationship [43,44,45]. However, in invertebrates, such as the fungus Neurospora crassa  and the silkworm Bombyx mori, transcription initiation is unaffected. Thus, DNA methylation shows remarkable diversity in its extent and function across eukaryotic evolution. In our T. solium results, we also found that methylation levels of mCs were correlated with gene expression levels. Depending on different sequence contexts, methylation seemed to function differently in transcriptional regulation. Intriguingly, high CpG and CpC methylation levels of downstream regions, but not of promoter regions, were observed for silent genes (Fig. 4). In contrast, upstream methylation seemed to mostly affect genes with low expression. Currently, knowledge on methylation patterns and their effects on gene regulation in non-vertebrates are still limited, though species-specific diversity has been observed . Therefore, more data should be collected for the DNA methylomes of each specific species to characterize their patterns and functions.
In addition to characterizing the general distribution pattern of genome-wide DNA methylation, we also focused on methylation status of important T. solium genes. Based on previous studies, we looked into 27 extensively methylated genes that are important for T. solium development, survival and parasite-host interactions. We found 13 of these 27 genes mutually interacted based on annotations in the model organism C. elegans. Specifically, ESPs formed the core of the protein–protein-interaction network, while proteins encoded by conserved genes directly interacted with specific ESPs. Among the ESPs that were potentially regulated by DNA methylation, we found that two heat shock proteins (HSPs), hsp-90 and hsp-1, were highlighted and mutually interacted. The heat shock response is a general homeostatic mechanism that protects cells and organisms from the deleterious effects of environmental stress . Together with COX-2, these proteins were previously reported to be important parasitism-related proteins . Furthermore, we also revealed two genes encoding diagnostic antigens that might be regulated by DNA methylation, including diagnostic antigen gp50 and an 8 kDa diagnostic protein. GP50 is a glycosylated and GPI-anchored membrane protein. In recent years, one component of the lentil lectin purified glycoprotein (LLGP) antigens has been used for antibody-based diagnosis of cysticercosis . The 8 kDa family members are metacestode excretory/secretory glycoproteins, which invoke strong antibody reactions in infected individuals . Importantly, our data suggest that DNA methylation might play a key role in repressing their transcription, implying a potential for drug development in the future that can target epigenetic modification machinery to control this important neglected tropical disease.
The limitation of this study is that subject to research funding, we only performed one sample at cysticercus stage, more stages and environmental stress conidiations need to be designed in the further studies. Whereas it’s the first dataset of integrated DNA methylome and transcriptome of T. solium, and it may offer some clues for further studies of T. solium.
Availability of data and materials
All data are included in the article.
Whole-genome bisulfite sequencing
Methyl-CpG binding proteins
Schantz PM, et al. Potential eradicability of taeniasis and cysticercosis. Bull Pan Am Health Organ. 1993;27(4):397–403.
Van Belle S, et al. Peripheral Taenia infection increases immunoglobulins in the central nervous system. Int J Parasitol. 2021;51(8): 685–92.
Sciutto E, et al. Taenia solium disease in humans and pigs: an ancient parasitosis disease rooted in developing countries and emerging as a major health problem of global dimensions. Microbes Infect. 2000;2(15):1875–90.
Skrip LA, et al. Data-driven analyses of behavioral strategies to eliminate cysticercosis in sub-Saharan Africa. PLoS Negl Trop Dis. 2021;15(3):e0009234.
Gao F, et al. Differential DNA methylation in discrete developmental stages of the parasitic nematode Trichinella spiralis. Genome Biol. 2012;13(10):R100.
Geyer KK, et al. Cytosine methylation regulates oviposition in the pathogenic blood fluke Schistosoma mansoni. Nat Commun. 2011;2:424.
Maldonado LL, et al. The Echinococcus canadensis (G7) genome: a key knowledge of parasitic platyhelminth human diseases. BMC Genomics. 2017;18(1):204.
Geyer KK, et al. Methyl-CpG-binding (SmMBD2/3) and chromobox (SmCBX) proteins are required for neoblast proliferation and oviposition in the parasitic blood fluke Schistosoma mansoni. PLoS Pathog. 2018;14(6):e1007107.
DiNardo AR, et al. Schistosomiasis induces persistent DNA methylation and tuberculosis-specific immune changes. J Immunol. 2018;201(1):124–33.
Aguilar-Diaz H, et al. The genome project of Taenia solium. Parasitol Int. 2006;55(Suppl):S127–30.
Lister R, Ecker JR. Finding the fifth base: genome-wide sequencing of cytosine methylation. Genome Res. 2009;19(6):959–66.
Feng S, et al. Conservation and divergence of methylation patterning in plants and animals. Proc Natl Acad Sci U S A. 2010;107(19):8689–94.
Zemach A, et al. Genome-wide evolutionary analysis of eukaryotic DNA methylation. Science. 2010;328(5980):916–9.
Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987;4(4):406–25.
Jones DT, Taylor WR, Thornton JM. The rapid generation of mutation data matrices from protein sequences. Comput Appl Biosci. 1992;8(3):275–82.
Kumar S, Stecher G, Tamura K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 2016;33(7):1870–4.
Chen Y, et al. SOAPnuke: a MapReduce acceleration-supported software for integrated quality control and preprocessing of high-throughput sequencing data. Gigascience. 2018;7(1):1–6.
Trapnell C, Pachter L, Salzberg SL. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009;25(9):1105–11.
Trapnell C, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28(5):511–5.
Tarailo-Graovac M, Chen N. Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics. 2009;25:04.10.01-04.10.04.
Xi Y, Li W. BSMAP: whole genome bisulfite sequence MAPping program. BMC Bioinformatics. 2009;10:232.
Schneider TD, Stephens RM. Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 1990;18(20):6097–100.
Zhang X. The epigenetic landscape of plants. Science. 2008;320(5875):489–92.
Weber M, et al. Distribution, silencing potential and evolutionary impact of promoter DNA methylation in the human genome. Nat Genet. 2007;39(4):457–66.
Cokus SJ, et al. Shotgun bisulphite sequencing of the Arabidopsis genome reveals DNA methylation patterning. Nature. 2008;452(7184):215–9.
Xiang H, et al. Single base-resolution methylome of the silkworm reveals a sparse epigenomic map. Nat Biotechnol. 2010;28(5):516–20.
Lyko F, et al. The honey bee epigenomes: differential methylation of brain DNA in queens and workers. PLoS Biol. 2010;8(11):e1000506.
Ye J, et al. WEGO: a web tool for plotting GO annotations. Nucleic Acids Res. 2006;34(Web Server issue):W293-7.
Robert McMaster W, Morrison CJ, Kobor MS. Epigenetics: a new model for intracellular parasite-host cell regulation. Trends Parasitol. 2016;32(7):515–21.
Victor B, et al. Proteomic analysis of Taenia solium metacestode excretion-secretion proteins. Proteomics. 2012;12(11):1860–9.
Hewitson JP, Grainger JR, Maizels RM. Helminth immunoregulation: the role of parasite secreted proteins in modulating host immunity. Mol Biochem Parasitol. 2009;167(1):1–11.
Geyer KK, et al. Cytosine methylation is a conserved epigenetic feature found throughout the phylum Platyhelminthes. BMC Genomics. 2013;14:462.
Phalke S, et al. Retrotransposon silencing and telomere integrity in somatic cells of Drosophila depends on the cytosine-5 methyltransferase DNMT2. Nat Genet. 2009;41(6):696–702.
Goll MG, et al. Methylation of tRNAAsp by the DNA methyltransferase homolog Dnmt2. Science. 2006;311(5759):395–8.
Bird AP, Taggart MH, Smith BA. Methylated and unmethylated DNA compartments in the sea urchin genome. Cell. 1979;17(4):889–901.
del Gaudio R, Di Giaimo R, Geraci G. Genome methylation of the marine annelid worm Chaetopterus variopedatus: methylation of a CpG in an expressed H1 histone gene. FEBS Lett. 1997;417(1):48–52.
Lyko F, Ramsahoye BH, Jaenisch R. DNA methylation in Drosophila melanogaster. Nature. 2000;408(6812):538–40.
Raddatz G, et al. Dnmt2-dependent methylomes lack defined DNA methylation patterns. Proc Natl Acad Sci U S A. 2013;110(21):8627–31.
Marhold J, et al. The Drosophila MBD2/3 protein mediates interactions between the MI-2 chromatin complex and CpT/A-methylated DNA. Development. 2004;131(24):6033–9.
Law JA, Jacobsen SE. Establishing, maintaining and modifying DNA methylation patterns in plants and animals. Nat Rev Genet. 2010;11(3):204–20.
Huff JT, Zilberman D. Dnmt1-independent CG methylation contributes to nucleosome positioning in diverse eukaryotes. Cell. 2014;156(6):1286–97.
Jia D, et al. Structure of Dnmt3a bound to Dnmt3L suggests a model for de novo DNA methylation. Nature. 2007;449(7159):248–51.
Zilberman D, et al. Genome-wide analysis of Arabidopsis thaliana DNA methylation uncovers an interdependence between methylation and transcription. Nat Genet. 2007;39(1):61–9.
Jjingo D, et al. On the presence and role of human gene-body DNA methylation. Oncotarget. 2012;3(4):462–74.
Nanty L, et al. Comparative methylomics reveals gene-body H3K36me3 in Drosophila predicts DNA methylation and CpG landscapes in other invertebrates. Genome Res. 2011;21(11):1841–50.
Rountree MR, Selker EU. DNA methylation inhibits elongation but not initiation of transcription in Neurospora crassa. Genes Dev. 1997;11(18):2383–95.
Schubeler D. Function and information content of DNA methylation. Nature. 2015;517(7534):321–6.
Ferrer E, et al. Taenia solium: characterization of a small heat shock protein (Tsol-sHSP35.6) and its possible relevance to the diagnosis and pathogenesis of neurocysticercosis. Exp Parasitol. 2005;110(1):1–11.
Choi W, Chu J. The characteristics of the expression of heat shock proteins and COX-2 in the liver of hamsters infected with Clonorchis sinensis, and the change of endocrine hormones and cytokines. Folia Parasitol (Praha). 2012;59(4):255–63.
Hancock K, et al. Characterization and cloning of GP50, a Taenia solium antigen diagnostic for cysticercosis. Mol Biochem Parasitol. 2004;133(1):115–24.
Ferrer E, et al. Diagnostic epitope variability within Taenia solium 8 kDa antigen family: implications for cysticercosis immunodetection. Exp Parasitol. 2012;130(1):78–85.
Thanks to all authors for our contributions to this article.
This work was supported by the National Key Research and Development Program of China (2017YFD0501303); the National Natural Science Foundation of China (NSFC 31960707, 31460658, 31160504); the Inner Mongolia Provincial Natural Science Foundation (2017MS0321, 2021MS03037).
Ethics approval and consent to participate
The animal study was reviewed and approved by the University of JILIN Animal Care and Use Committee (IZ-2009-08). Written informed consent was obtained from the owners for the participation of their animals in this study.
Consent for publication
GJ was employed by the company Shenzhen E-GENE, Co., Ltd.
The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Phylogenetic tree of DNMT proteins.
Phylogenetic tree of mbd proteins.
Patterns and chromosomal distribution of DNA methylation in T. solium.
DNA methylation patterns and chromosomal distribution.
Average density of methylation levels of cytosine distributed on genome.
Gene Ontology (GO) analysis for the genes with the lowest expression (2nd) and the most highly expressed (5th) genes.
Results of reciprocal BlastP searches of T.s DNMTs and MBD.
Results of ClustW on dnmt protein sequences.
Data summary of MethylC-seq and RNA-seq.
Summary of key parasitism genes that are methylated.
About this article
Cite this article
Wang, X., Song, W., Ji, G. et al. Regulation of DNA methylation on key parasitism genes of Cysticercus cellulosae revealed by integrative epigenomic-transcriptomic analyses. Hereditas 158, 28 (2021). https://doi.org/10.1186/s41065-021-00195-9
- Cysticercus cellulosae
- DNA methylation
- Gene regulation