Identification of candidate biomarkers and pathways associated with psoriasis using bioinformatics analysis

Background The aim of this study was to identify the candidate biomarkers and pathways associated with psoriasis. GSE13355 and GSE14905 were extracted from the Gene Expression Omnibus (GEO) database. Then the differentially expressed genes (DEGs) with |logFC| > 2 and adjusted P < 0.05 were chosen. In addition, the Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses for DEGs were performed. Then, the GO terms with P < 0.05 and overlap coefficient greater than 0.5 were integrated by EnrichmentMap. Additionally, risk subpathways analysis for DEGs was also conducted by using the iSubpathwayMiner package to obtain more psoriasis-related DEGs and pathways. Finally, protein-protein interaction (PPI) network analysis was performed to identify the hub genes, and the DGIdb database was utilized to search for the candidate drugs for psoriasis. Results A total of 127 DEGs which were mostly associated with keratinization, keratinocyte differentiation, and epidermal cell differentiation biological processes were identified. Based on these GO terms, 3 modules (human skin, epidermis and cuticle differentiation, and enzyme activity) were constructed. Moreover, 9 risk subpathways such as steroid hormone biosynthesis, folate biosynthesis, and pyrimidine metabolism were screened. Finally, PPI network analysis demonstrated that CXCL10 was the hub gene with the highest degree, and CXCR2, CXCL10, IVL, OASL, and ISG15 were the potential gene targets of the drugs for treating psoriasis. Conclusion Psoriasis may be mostly caused by keratinization, keratinocyte differentiation, and epidermal cell differentiation; the pathogeneses were more related with pathways such as steroid hormone biosynthesis, folate biosynthesis, and pyrimidine metabolism. Besides, some psoriasis-related genes such as SPRR genes, HSD11B1, GGH, CXCR2, IVL, OASL, ISG15, and CXCL10 may be important targets in psoriatic therapy.


Background
Psoriasis is a common, chronic, relapsing, and immunemediated skin disease, prevalent in 2-4% of the total population [1]. Psoriasis is not purely a skin disorder and it may also affect many other organs and cause other diseases [2][3][4]. The cause of psoriasis in nearly one-third of the patients is hereditary, though it is often caused or influenced by environmental factors [5]. Oxidative stress, stress, and systemic corticosteroid withdrawal are thought to be the contributing factors to psoriasis [6]. Therefore, exploring the possible mechanisms of psoriasis is of great importance.
Nair et al. [10] identified psoriasis susceptibility loci and Swindell et al. [11] used genome-wide transcriptional analysis to compare gene expression patterns in human psoriatic lesions with those in 5 mouse models of psoriasis. However, they had not performed in-depth bioinformatics analyses to explore the rehabilitation mechanisms of psoriasis. Using the gene sets (GSE13355 and GSE14905) deposited by Nair et al. and Swindell et al., the gene expression profiling data related to psoriasis were used to generate the differentially expressed genes (DEGs) involved in psoriasis. Moreover, the gene ontology (GO) term enrichment analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, risk subpathway analysis, and construction of protein-protein interaction (PPI) network were all performed, and the DGIdb database was utilized to search for the candidate drugs for psoriasis. Finally, the DEGs and pathways mostly correlated with psoriasis were analyzed to investigate the pathogenesis. Therefore, our study provides a deep understanding of susceptibility genes in psoriasis and may provide appropriate drug targets for psoriatic therapies. A flow chart depicting the schematic of this study has been shown in Fig. 1.

Differentially expressed genes between normal and lesional groups
A total of 127 DEGs were screened between normal and lesional groups with |logFC| > 2 and adjusted P < 0.05, including 97 (76%) up-regulated DEGs and 30 (24%) down-regulated DEGs (Supplementary Table 1). The heatmap of the DEGs has been shown in Fig. 2, and the result shows that the two groups could be significantly separated, indicating that the results of the difference analysis were reliable. Principal component analysis of differentially expressed genes As presented in Fig. 3, lesional and normal control samples were completely separated by the selected DEGs, meaning that the expression patterns of screened DEGs were specific and could be used to completely distinguish between lesional and normal control samples.

Functional enrichment analysis of differentially expressed genes
Functional enrichment analyses of the GO terms and KEGG pathway were performed for both up-regulated and down-regulated DEGs. All the DEGs were mainly enriched in 48 GO terms, containing 30 ontology terms in biological process, and the top 10 GO terms in biological process have been shown in Table 1. Biological processes of DEGs were mostly associated with keratinization (GO: 0031424), keratinocyte differentiation (GO: 0030216), and epidermal cell differentiation (GO: 0009913). By integrating these GO terms using EnrichmentMap, 3 modules were obtained, which mostly associated with human skin (module 1), epidermis and cuticle differentiation (module 2), and enzyme activity (module 3) (Fig. 4). In the KEGG pathway analysis, the DEGs were enriched in 4 pathways and most significantly related to the chemokine signaling pathway (10 genes, hsa04062), cytokine-cytokine receptor interaction (9 genes, hsa04060), toll-like receptor signaling pathway (6 genes, hsa04620), and RIG-I-like receptor signaling pathway (4 genes, hsa04622) ( Table 2). The   CXCL1, IL8, CCL20, CXCL13, S100A9, CXCL9, CXCR2, CCL27, CCL18, CXCL10 Fig. 4 Integration of GO terms for differentially expressed genes (DEGs). Red nodes represent GO terms. Green lines indicate that overlapping DEGs exist between two GO terms. Three circles represent three modules: module 1 mostly associates with skin, module 2 mostly associates with epidermis and cuticle differentiation, and module 3 mostly connects with enzyme activity enrichment analysis showed that small proline-rich protein (SPRR, including SPRR2C, SPRR1A and SPRR2B) genes were involved in all the biological processes related to epidermis and cuticle differentiation (Table 1). Moreover, CXC chemokine ligand (CXCL, including CXCL1, CXCL9 and CXCL10) genes were mostly involved in human skin-related GO terms and 4 KEGG pathways, and the C-X-C motif chemokine receptor 2 (CXCR2) was involved in the chemokine signaling pathway and cytokinecytokine receptor interaction (Table 1 and Table 2).

Construction of protein-protein interaction network based on differentially expressed genes
To explore the interactions among psoriasis-related DEGs, the PPI network containing 71 nodes and 225 edges was constructed (Fig. 5). Of all the nodes, only 8 nodes were down-regulated DEGs, while the other 63 nodes were up-regulated DEGs. In this PPI network, the top 16 proteins with higher degrees were chosen, among which CXCL10 had the highest degree (Table 4).

Discussion
Psoriasis is a common skin disease, and it is an organspecific autoimmune disease which is a common immune-mediated skin disease in adults [12]. Numerous studies have reported that psoriasis is generally caused by aberrant keratinocyte proliferation and epidermal hyperplasia [7,13]. In this study, GO terms in biological processes (e.g. keratinization, keratinocyte differentiation, and epidermal cell differentiation) for DEGs were mostly associated with epidermis and cuticle differentiation, and the SPRR gene family were the most correlated DEGs. SPRR genes, consisting of two genes for SPRR1 (SPRR1A and SPRR1B), seven genes for SPRR2, and a single gene for SPRR3, encode a novel class of small proline-rich proteins that could be strongly induced during the differentiation of epidermal keratinocytes [14]. Kainu et al. found that three candidate gene clusters, the S100, SPRR, and PGLYRP genes, contained functionally interesting psoriasis candidate genes [15], which were consistent with our results. Besides, the  KEGG pathway enrichment analysis showed that CXCR2 was involved in the chemokine signaling pathway and cytokine-cytokine receptor interaction. It has been shown that the expression of epidermal CXCR2 is increased in psoriasis, suggesting that activation of keratinocytes mediated by CXCR2 contributes to the characteristic epidermal changes observed in psoriasis [16]. The results of this study showed that CXCR2 was the potential gene target of the drugs for treating psoriasis. Moreover, proinflammatory cytokines are found to be associated with the progression of cerebral white matter injury (WMI) in preterm infants, and cytokinereceptor interaction may be critical in determining the effects of inflammation in the development of the disease, which further suggested that CXCR2 might have been related to the development of psoriasis via the chemokine signaling pathway and cytokine-cytokine receptor interaction. Through risk subpathway enrichment analysis, subpathways such as steroid hormone biosynthesis, folate biosynthesis, and pyrimidine metabolism were identified, and HSD11B1 wass involved in steroid hormone biosynthesis, and GGH was involved in folate biosynthesis. HSD11B1 is the main regulator of tissue-specific effects of excessive circulating glucocorticoids, which may be related to inflammation [17,18]. Thus, HSD11B1 was connected with psoriasis which has been always recognized as a chronic inflammatory disease of the skin.  Additionally, Sevilla et al. showed that HSD11B1 was associated with osteoporosis, and HSD11B1/HSD11B2 were responsible for cortisol to cortisone interconversion [19]. Therefore, HSD11B1 might be recognized as the probable gene associated with psoriasis. Furthermore, Warren et al. reported that GGH was an important target gene of psoriasis which was involved in folate biosynthesis from our study [20]. Pyrimidine metabolism was another important subpathway identified in the current study. Zhou et al. constructed a ceRNA network to determine the regulatory roles of lncRNAs in psoriasis, and found that the upregulated mRNAs were mainly enriched in the pyrimidine metabolism, cell cycle, and cytokine-cytokine receptor interaction signaling pathways [21]. In summary, DEGs involved in risk subpathways were more likely related to psoriasis and might be important targets in psoriatic therapy. Also, all the subpathways identified by risk subpathway analysis were considered as the most relevant pathways with psoriasis. The PPI network analysis showed that protein with the highest degree was encoded by CXCL10, a member of the CXC chemokine family of cytokines [22]. The results of this study showed that CXCL10 was the potential gene target of the drugs for treating psoriasis. CXCL10 has been reported to be expressed in many Th1-type human inflammatory diseases, including skin diseases (e.g. psoriasis) [23]. A recent study suggested that the decrease in serum CXCL10 level over time was related to the new onset of psoriatic arthritis in patients with psoriasis [24], and our study revealed a same result that CXCL10 was up-regulated in psoriasis. In the present study, the CXC chemokine family of cytokines (containing CXCL1, CXCL9, and CXCL10) was mostly enriched in human skin-related GO terms and KEGG pathways such as the chemokine signaling pathway and cytokine-cytokine receptor interaction, implying that the gene family might be important in regulating psoriasis. Luster et al. reviewed that chemokines-chemotactic cytokines could mediate inflammation, which might be the reason for the association between CXC chemokine family and psoriasis [25]. IVL, OASL, and ISG15 were also the potential gene targets of the drugs for treating psoriasis in this study. Chen et al. found that IVL was related to the earlyonset plaque psoriasis [26]. The expression pattern of OASL may offer useful information for treating autoimmune diseases and chronic infections [27]. In addition, Raposo et al. observed significant overexpression of 16 antiviral genes in lesional psoriatic skin, with more than two-fold increase in ISG15 [28].
However, there are some limitations in this study. For instance, the microarray data was obtained from the GEO database, and not generated by the authors. Therefore, further experiments should be conducted to verify whether these target genes can be used in the clinical treatment of psoriasis.

Conclusions
In the current study, risk subpathway analysis based on DEGs was considered a useful method to identify pathways mostly related to psoriasis. Psoriasis may be mostly caused by keratinization, keratinocyte differentiation, and epidermal cell differentiation and the pathogeneses were more related to pathways such as steroid hormone biosynthesis, folate biosynthesis, and pyrimidine metabolism. Besides, some psoriasis-related genes such as SPRR genes, HSD11B1, GGH, CXCR2, IVL, OASL, ISG15, and CXCL10 may be important targets in psoriatic therapy. Therefore, our study explored the pathogeneses of psoriasis in depth and provided probable target genes for psoriatic therapy.

Microarray data
Gene  Table 2 and Supplementary Table 3.

Data preprocessing and screening of differentially expressed genes
In order to obtain the common gene expression matrix and eliminate the heterogeneity among datasets, we used the ComBat function of sva package [29] of R language to eliminate batch effect between datasets. In addition, the Affy package [30] was applied for preprocessing the obtained data by conducting normalization, background correction and expression calculation. Then the probes were annotated by matrix data combined with chip platform annotation file. The average value of diverse probes would be considered as the eventual expression level of gene if they corresponded to the identical mRNA. Finally, a total of 54,675 probes were mapped to 19,851 gene symbols without repetition. After data preprocessing, the limma package [31] of R language was utilized to screen DEGs. The false discovery rate (FDR) of Benjamini and Hochberg (BH) method was applied to adjust p-values for multiple comparisons. The significant differentially expressed cut-off was set as |logFC| > 2 and adjusted P < 0.05. Finally, a heatmap was drawn to observe the clustering of samples.

Principal component analysis of differentially expressed genes
Principal component analysis (PCA), a multivariate regression analysis, was used to distinguish samples with multiple measurements [32]. A PCA of DEGs was conducted using the ggord package (version: 1.1.4) in R language in the present study. The PCA graph was then obtained, in which DEGs were considered as variables and the differences between normal control and lesional samples were observed.

GO and KEGG enrichment analysis
The enrichment analysis of DEGs was carried out by utilizing the online software DAVID [33], in which species was set as human and the other parameters were the default values. GO terms with a cut-off of P < 0.05 and gene count ≥5 were chosen. Likewise, the KEGG pathways with P < 0.05 were selected. Then a plug-in EnrichmentMap in Cytoscape [34] was used to integrate the GO terms with the cutoff of P < 0.05 and overlap coefficient larger than 0.5.

Risk subpathway screening for differentially expressed genes
To obtain more important genes and pathways in psoriasis, the iSubpathwayMiner package (version: 3.0) [35] in R language (version: 3.4.3) was selected to identify the risk subpathways for DEGs. Moreover, the subpathways with P < 0.05 were screened, which reflected the pathways involved in psoriasis more clearly.

Protein-protein interaction network construction
To further analyze the impacts of DEGs on psoriasis, the PPI network was constructed among various DEGs. Also, the online software STRING was used to analyze the interactions of proteins encoded by DEGs [36], and the PPI score was set as 0.4 (referred as median confidence). Then the Cytoscape software [34] (https://cytoscape.org/) was utilized to visualize the PPI network. Finally, the degree of each DEG was obtained by analyzing the topological structure of the PPI network.

Drug-hub gene interaction
The hub genes with high degree (degree ≥10) of connectivity in PPI were selected as the promising targets for searching drugs through the DGIdb database (version: 3.0.3, http://www.dgidb.org/). This database contains drug-gene interaction data from 30 disparate sources including ChEMBL, DrugBank, Ensembl, NCBI Entrez, PharmGKB, PubChem, clinical trial databases, and literature in NCBI PubMed. The results of this process were arranged such that each entry was a specific drug-gene interaction associated with its source link [37]. The identified target network was visualized using the Cytoscape software.
Additional file 1: Table S1. Differentially expressed genes between normal and lesional groups.