Visualizing Pathway Enrichment Network Diagrams in R


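The input data consists of two parts: gene IDs and a vector (for a single sample). "Gene ID" is used in a general sense here and may refer to genes, transcripts, enzymes, or proteins. The vector can hold any measurable attribute, such as expression levels, fold changes, p-values, or histone modification data. Below we use the output of an RNA-seq differential expression analysis as an example to learn how to use pathview.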
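As a minimal sketch of that input format, the snippet below turns a differential-expression table into the named numeric vector that pathview matches against pathway nodes. The table, its column names (`entrez`, `log2FoldChange`, `padj`, chosen in the style of DESeq2 output), the Entrez IDs, and all values are assumptions made up for illustration; they are not the data used in the original analysis.

```r
# Hypothetical differential-expression results; IDs and numbers are illustrative only.
de_res <- data.frame(
  entrez         = c("12534", "268697", "22059", "12443", "12575", "19645", NA),
  log2FoldChange = c(1.8, 1.2, -0.9, 0.6, -1.5, 0.3, 2.1),
  padj           = c(0.001, 0.010, 0.030, 0.200, 0.004, 0.600, 0.020),
  stringsAsFactors = FALSE
)

# Drop rows that have no gene ID, since they cannot be mapped to a pathway node.
de_res <- de_res[!is.na(de_res$entrez), ]

# Build the "gene ID + vector" input: fold changes named by gene ID.
gene_fc <- setNames(de_res$log2FoldChange, de_res$entrez)

print(head(gene_fc))
```

For several samples the same idea extends to a numeric matrix with gene IDs as row names and one column per sample.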

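First look the pathway up in the KEGG PATHWAY Database, for example the mouse "Cell Cycle" pathway:

This gives the pathway ID "04110" and the species code "mmu".

By specifying gene.data and pathway.id, we can see how the genes in our data change in expression on the "Pathways in cancer" signaling pathway: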
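The call described above might look like the sketch below. It uses the pathway ID "04110" and species "mmu" found in the KEGG lookup; swap in the ID of whichever pathway you want to draw. The gene vector is a made-up placeholder (Entrez IDs assumed to be mouse cell-cycle genes), and the call only runs when pathview and the mouse annotation package org.Mm.eg.db are installed, because pathview needs to download the KEGG files and map gene symbols.

```r
# Placeholder fold changes named by Entrez gene ID (illustrative values, not real results).
gene_fc <- c("12534" = 1.8, "268697" = 1.2, "22059" = -0.9,
             "12443" = 0.6, "12575" = -1.5, "19645" = 0.3)

# Only attempt the plot when the required Bioconductor packages are available.
have_pkgs <- requireNamespace("pathview", quietly = TRUE) &&
  requireNamespace("org.Mm.eg.db", quietly = TRUE)

if (have_pkgs) {
  library(pathview)
  # Native KEGG view: colours the gene nodes of the KEGG PNG by the values in gene_fc.
  pv_out <- pathview(gene.data  = gene_fc,
                     pathway.id = "04110",
                     species    = "mmu",
                     out.suffix = "demo",
                     kegg.native = TRUE)
}
```

With these arguments the figure is written to the working directory as `mmu04110.demo.png`.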

As an alternative to the original KEGG image, we can use graphviz to generate a new layout and write the output as a PDF file:
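A sketch of the graphviz variant, under the same assumptions as before (placeholder gene vector, pathway "04110", species "mmu", packages installed): setting `kegg.native = FALSE` is what switches pathview from the KEGG PNG to a graphviz layout saved as PDF.

```r
# Placeholder fold changes named by Entrez gene ID (illustrative values only).
gene_fc <- c("12534" = 1.8, "268697" = 1.2, "22059" = -0.9,
             "12443" = 0.6, "12575" = -1.5, "19645" = 0.3)

have_pkgs <- requireNamespace("pathview", quietly = TRUE) &&
  requireNamespace("org.Mm.eg.db", quietly = TRUE)

if (have_pkgs) {
  library(pathview)
  # kegg.native = FALSE: redraw the pathway with graphviz and write a PDF.
  pv_out <- pathview(gene.data   = gene_fc,
                     pathway.id  = "04110",
                     species     = "mmu",
                     out.suffix  = "graphviz",
                     kegg.native = FALSE)
}
```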

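The output figure is shown below.

If we want the plot to run a little faster and do not mind the size of the output image, we can draw in separate layers: with same.layer = F, node colors and labels are added on a separate layer, and the original KEGG gene labels are replaced by official gene symbols:

Building on this, setting kegg.native = FALSE gives a PDF file in which the main graph and the legend are placed on two separate pages.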
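The two layered variants just described can be sketched as two calls that differ only in `kegg.native`. As in the earlier sketches, the gene vector is a made-up placeholder and the pathway and species are the ones from the KEGG lookup; the output suffixes are arbitrary names chosen here.

```r
# Placeholder fold changes named by Entrez gene ID (illustrative values only).
gene_fc <- c("12534" = 1.8, "268697" = 1.2, "22059" = -0.9,
             "12443" = 0.6, "12575" = -1.5, "19645" = 0.3)

have_pkgs <- requireNamespace("pathview", quietly = TRUE) &&
  requireNamespace("org.Mm.eg.db", quietly = TRUE)

if (have_pkgs) {
  library(pathview)

  # Native KEGG view, node colours and labels drawn on a second layer.
  pv_png <- pathview(gene.data = gene_fc, pathway.id = "04110", species = "mmu",
                     out.suffix = "two.layer", kegg.native = TRUE,
                     same.layer = FALSE)

  # Graphviz view with the same setting: main graph and legend on separate PDF pages.
  pv_pdf <- pathview(gene.data = gene_fc, pathway.id = "04110", species = "mmu",
                     out.suffix = "two.page", kegg.native = FALSE,
                     same.layer = FALSE)
}
```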

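In the original KEGG view, one gene node may stand for several genes or proteins with similar or redundant functions. We can split such multi-gene nodes into individual nodes, which makes it easier to inspect the data at the gene level rather than the node level. Conversely, gene data can be summarized to visualize the data at the node level:

For better clarity and readability, the default is not to split nodes and not to label each member gene separately.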
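A hedged sketch of the node-splitting options: in pathview these are the `split.group` and `expand.node` arguments, which apply to the graphviz view, while `node.sum` controls how several genes in one node are summarized. The gene vector is again a made-up placeholder, and choosing `"mean"` for `node.sum` is an assumption for illustration (the default is `"sum"`).

```r
# Placeholder fold changes named by Entrez gene ID (illustrative values only).
gene_fc <- c("12534" = 1.8, "268697" = 1.2, "22059" = -0.9,
             "12443" = 0.6, "12575" = -1.5, "19645" = 0.3)

have_pkgs <- requireNamespace("pathview", quietly = TRUE) &&
  requireNamespace("org.Mm.eg.db", quietly = TRUE)

if (have_pkgs) {
  library(pathview)

  # Node-level view: genes sharing a node are summarized by their mean.
  pv_node <- pathview(gene.data = gene_fc, pathway.id = "04110", species = "mmu",
                      out.suffix = "node.mean", kegg.native = FALSE,
                      node.sum = "mean")

  # Gene-level view: split node groups and expand multi-gene nodes into single genes.
  pv_gene <- pathview(gene.data = gene_fc, pathway.id = "04110", species = "mmu",
                      out.suffix = "split.expanded", kegg.native = FALSE,
                      split.group = TRUE, expand.node = TRUE)
}
```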

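Metabolic pathways contain compound nodes in addition to gene nodes, so we can try integrating gene data and compound data on a metabolic pathway (Propanoate metabolism). Compound data here covers metabolites and drugs, together with their measurements and attributes. We again use the earlier RNA-seq differential expression results as the gene data, then generate simulated compound (metabolomics) data and load the appropriate compound ID types for the demonstration: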
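A minimal sketch of this step, assuming the KEGG ID "00640" for Propanoate metabolism and mouse ("mmu") as the species. The simulated compound data comes from pathview's `sim.mol.data()` and the ID types from its `cpd.simtypes` data set; the gene vector is a made-up placeholder whose Entrez IDs are only assumed to belong to this pathway.

```r
# Placeholder fold changes named by Entrez gene ID (IDs and values are illustrative).
gene_fc <- c("16828" = 1.4, "60525" = -0.8, "110821" = 0.5,
             "66904" = -1.1, "17850" = 0.9, "110446" = -0.3)

have_pkgs <- requireNamespace("pathview", quietly = TRUE) &&
  requireNamespace("org.Mm.eg.db", quietly = TRUE)

if (have_pkgs) {
  library(pathview)

  # Simulated compound data, named by KEGG compound accession.
  cpd_sim <- sim.mol.data(mol.type = "cpd", nmol = 3000)
  # Compound ID types that pathview can simulate and convert.
  data(cpd.simtypes)

  # Gene and compound data drawn together on one metabolic pathway.
  pv_out <- pathview(gene.data = gene_fc, cpd.data = cpd_sim,
                     pathway.id = "00640", species = "mmu",
                     out.suffix = "gene.cpd", keys.align = "y",
                     kegg.native = TRUE)
}
```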

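The result is as follows.

pathview can integrate multiple samples or states and draw them in a single figure. We can simulate compound data with several replicate samples: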
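The multi-sample case might be sketched as follows, assuming three gene replicates and six simulated compound samples (both numbers are arbitrary choices here). The matrices carry IDs as row names and one column per sample; `multi.state = TRUE` asks pathview to draw all samples in one figure, and `match.data = FALSE` says the gene and compound columns are not paired. Gene IDs and values are made-up placeholders.

```r
set.seed(1)

# Placeholder gene matrix: rows are Entrez IDs, columns are replicates (random values).
gene_ids <- c("16828", "60525", "110821", "66904", "17850", "110446")
gene_mat <- matrix(rnorm(length(gene_ids) * 3), ncol = 3,
                   dimnames = list(gene_ids, paste0("rep", 1:3)))

have_pkgs <- requireNamespace("pathview", quietly = TRUE) &&
  requireNamespace("org.Mm.eg.db", quietly = TRUE)

if (have_pkgs) {
  library(pathview)

  # Resample one simulated compound vector into six pseudo-replicate columns.
  cpd_sim <- sim.mol.data(mol.type = "cpd", nmol = 3000)
  cpd_mat <- matrix(sample(cpd_sim, length(cpd_sim) * 6, replace = TRUE), ncol = 6,
                    dimnames = list(names(cpd_sim), paste0("exp", 1:6)))

  # One figure, each node divided into one slice per sample.
  pv_out <- pathview(gene.data = gene_mat, cpd.data = cpd_mat,
                     pathway.id = "00640", species = "mmu",
                     out.suffix = "multi", keys.align = "y",
                     kegg.native = TRUE, match.data = FALSE,
                     multi.state = TRUE, same.layer = TRUE)
}
```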

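The result is shown below. The gene nodes and compound nodes are each divided into several slices, one for each sample:

We can also divide the compound data into two classes, absolute value greater than 5 and absolute value less than 5, to form a set of discrete data: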
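One way to sketch the discretization: keep only the names of compounds whose absolute value exceeds a cutoff and pass that character vector to pathview with `discrete = list(gene = FALSE, cpd = TRUE)`. The helper `select_by_cutoff()` is a hypothetical name introduced here. The toy vector uses the cutoff of 5 from the text; for the simulated data the cutoff of 0.5 is an assumption, chosen because the simulated values appear to be on a roughly standard-normal scale where almost nothing exceeds 5. The `limit` and `bins` settings are likewise illustrative.

```r
# Hypothetical helper: names of entries whose absolute value exceeds the cutoff.
select_by_cutoff <- function(x, cutoff) names(x)[abs(x) > cutoff]

# Toy check with the cutoff of 5 mentioned in the text.
toy <- c(C1 = 6.2, C2 = -7.5, C3 = 2.0)
print(select_by_cutoff(toy, 5))

# Placeholder fold changes named by Entrez gene ID (IDs and values are illustrative).
gene_fc <- c("16828" = 1.4, "60525" = -0.8, "110821" = 0.5,
             "66904" = -1.1, "17850" = 0.9, "110446" = -0.3)

have_pkgs <- requireNamespace("pathview", quietly = TRUE) &&
  requireNamespace("org.Mm.eg.db", quietly = TRUE)

if (have_pkgs) {
  library(pathview)
  cpd_sim  <- sim.mol.data(mol.type = "cpd", nmol = 3000)
  sel_cpds <- select_by_cutoff(cpd_sim, 0.5)  # assumed cutoff for simulated data

  # Continuous gene data combined with discrete (selected / not selected) compound data.
  pv_out <- pathview(gene.data = gene_fc, cpd.data = sel_cpds,
                     pathway.id = "00640", species = "mmu",
                     out.suffix = "discrete.cpd", keys.align = "y",
                     kegg.native = TRUE, na.col = "gray",
                     limit = list(gene = 2, cpd = 2),
                     bins = list(gene = 10, cpd = 2),
                     discrete = list(gene = FALSE, cpd = TRUE))
}
```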

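The result is as follows:

The main function of the Pathview package is pathview(); it takes many parameters and is the function we use most. This article covered the more common uses of pathview(), including package installation, data preparation, and other useful features. There is also a web version of pathview at https://pathview.uncc.edu/ . In addition, Pathview is very capable at data integration: it covers 4800 species, handles continuous or discrete data, matrices or vectors, and single or multiple samples, and it includes powerful ID conversion functions. All of these are worth exploring further.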


Term_Name	GeneHitsInSelectedSet	AllGenesInSelectedSet	GeneHitsInBackground	AllGenesInBackground	p-value	enrichFactor	GeneListInSelectedSets	Qvalue
00941 Flavonoid biosynthesis	14	492	41	3857	3.30E-04	2.676878842	"[FvH4_2g26480, FvH4_2g05780, FvH4_4g23870, FvH4_5g35170, FvH4_5g14010, FvH4_7g01160, FvH4_3g44420, FvH4_7g20870, FvH4_4g06180, FvH4_5g01170, FvH4_6g28410, FvH4_3g40570, FvH4_5g22390, FvH4_7g25890]"	0.04909626
00360 Phenylalanine metabolism	14	492	46	3857	0.001221701	2.38591375	"[FvH4_2g05780, FvH4_4g23870, FvH4_5g35170, FvH4_6g16060, FvH4_4g06180, FvH4_4g25490, FvH4_6g16460, FvH4_6g27650, FvH4_4g09340, FvH4_7g19130, FvH4_3g40570, FvH4_6g26610, FvH4_6g27940, FvH4_6g26600]"	0.091016736
00945 Stilbenoid, diarylheptanoid and gingerol biosynthesis	9	492	31	3857	0.012547314	2.275963808	"[FvH4_2g05780, FvH4_4g23870, FvH4_5g35170, FvH4_6g28410, FvH4_3g40570, FvH4_5g22390, FvH4_6g26800, FvH4_3g44420, FvH4_4g06180]"	0.467387431
00270 Cysteine and methionine metabolism	17	492	94	3857	0.083418875	1.417769417	"[FvH4_4g21340, FvH4_1g10540, FvH4_4g01140, FvH4_2g02530, FvH4_6g27650, FvH4_1g18690, FvH4_5g05120, FvH4_3g14020, FvH4_6g26610, FvH4_4g13980, FvH4_1g18490, FvH4_6g26600, FvH4_1g21920, FvH4_1g26460, FvH4_2g05040, FvH4_2g41260, FvH4_4g13280]"	0.654179598
04120 Ubiquitin mediated proteolysis	23	492	126	3857	0.04529262	1.431007227	"[FvH4_7g29370, FvH4_6g11010, FvH4_6g38720, FvH4_5g03910, FvH4_3g09200, FvH4_6g17370, FvH4_3g39370, FvH4_4g01260, FvH4_2g39250, FvH4_5g30320, FvH4_3g00910, FvH4_5g29350, FvH4_6g35920, FvH4_5g33030, FvH4_1g05910, FvH4_5g22570, FvH4_4g14790, FvH4_1g25030, FvH4_4g17530, FvH4_7g16630, FvH4_6g09540, FvH4_6g10930, FvH4_3g18500]"	0.674860033
00260 Glycine, serine and threonine metabolism	11	492	49	3857	0.0408107	1.759872242	"[FvH4_1g08890, FvH4_7g07540, FvH4_5g38450, FvH4_2g05310, FvH4_2g22570, FvH4_1g21920, FvH4_2g16830, FvH4_2g36660, FvH4_1g19090, FvH4_4g13290, FvH4_4g25490]"	0.675643816
00670 One carbon pool by folate	5	492	18	3857	0.069014744	2.177619693	"[FvH4_7g07540, FvH4_5g38450, FvH4_1g00040, FvH4_1g19090, FvH4_4g13290]"	0.685546458
03015 mRNA surveillance pathway	20	492	114	3857	0.082844862	1.375338753	"[FvH4_7g29390, FvH4_6g17300, FvH4_5g13570, FvH4_3g29340, FvH4_4g03530, FvH4_2g38640, FvH4_1g18700, FvH4_1g18000, FvH4_2g34040, FvH4_5g33710, FvH4_6g06810, FvH4_5g25490, FvH4_5g03260, FvH4_2g15670, FvH4_4g07000, FvH4_4g36800, FvH4_5g25550, FvH4_2g06580, FvH4_5g05510, FvH4_6g09230]"	0.685771358
00603 Glycosphingolipid biosynthesis - globo and isoglobo series	3	492	9	3857	0.096237762	2.613143631	"[FvH4_7g21240, FvH4_6g11740, FvH4_3g04760]"	0.71697133
00400 Phenylalanine, tyrosine and tryptophan biosynthesis	9	492	37	3857	0.038722924	1.906888596	"[FvH4_7g11530, FvH4_6g27650, FvH4_6g26610, FvH4_4g21980, FvH4_6g26600, FvH4_2g22570, FvH4_6g47770, FvH4_5g36810, FvH4_1g20450]"	0.721214462
00071 Fatty acid degradation	8	492	35	3857	0.068800169	1.791869919	"[FvH4_1g26810, FvH4_1g08890, FvH4_5g05130, FvH4_2g14760, FvH4_4g18500, FvH4_1g25230, FvH4_2g37760, FvH4_6g40560]"	0.732230372
04712 Circadian rhythm - plant	5	492	14	3857	0.024734738	2.799796748	"[FvH4_2g29440, FvH4_7g29370, FvH4_1g17250, FvH4_7g01160, FvH4_5g22570]"	0.737095202
03410 Base excision repair	8	492	34	3857	0.05939718	1.844571975	"[FvH4_4g29150, FvH4_4g36650, FvH4_2g21980, FvH4_6g11530, FvH4_2g39710, FvH4_4g35010, FvH4_2g40160, FvH4_4g35030]"	0.737514985
00130 Ubiquinone and other terpenoid-quinone biosynthesis	8	492	34	3857	0.05939718	1.844571975	"[FvH4_4g28800, FvH4_4g09340, FvH4_3g40570, FvH4_6g26610, FvH4_6g27940, FvH4_6g26600, FvH4_4g06180, FvH4_6g16460]"	0.737514985
00460 Cyanoamino acid metabolism	7	492	32	3857	0.103957465	1.714875508	"[FvH4_4g26180, FvH4_7g07540, FvH4_5g38450, FvH4_7g05220, FvH4_1g19090, FvH4_4g13290, FvH4_3g43510]"	0.737602967
00310 Lysine degradation	8	492	30	3857	0.030124137	2.090514905	"[FvH4_1g08890, FvH4_5g05130, FvH4_3g23070, FvH4_1g16260, FvH4_1g25230, FvH4_2g36660, FvH4_6g40560, FvH4_3g25420]"	0.748082742
00785 Lipoic acid metabolism	2	492	4	3857	0.081725815	3.919715447	"[FvH4_6g44960, FvH4_4g37350]"	0.761071655
00601 Glycosphingolipid biosynthesis - lacto and neolacto series	2	492	4	3857	0.081725815	3.919715447	"[FvH4_6g11740, FvH4_3g04760]"	0.761071655
00940 Phenylpropanoid biosynthesis	26	492	149	3857	0.056260767	1.367954384	"[FvH4_2g05780, FvH4_4g23870, FvH4_5g35170, FvH4_7g32980, FvH4_2g30540, FvH4_2g26620, FvH4_7g05220, FvH4_3g44420, FvH4_6g16060, FvH4_4g06180, FvH4_6g16460, FvH4_3g43510, FvH4_7g19130, FvH4_4g26180, FvH4_6g28410, FvH4_6g27940, FvH4_4g36130, FvH4_3g46010, FvH4_1g16790, FvH4_6g30610, FvH4_4g09340, FvH4_3g15230, FvH4_3g40570, FvH4_5g22390, FvH4_6g27610, FvH4_5g21320]"	0.762077663
00450 Selenocompound metabolism	4	492	15	3857	0.113762224	2.090514905	"[FvH4_2g38710, FvH4_7g04540, FvH4_6g24170, FvH4_2g41260]"	0.770480519
00563 Glycosylphosphatidylinositol(GPI)-anchor biosynthesis	3	492	13	3857	0.224775136	1.809099437	"[FvH4_5g04770, FvH4_2g15820, FvH4_1g19740]"	0.797416555
03008 Ribosome biogenesis in eukaryotes	6	492	33	3857	0.238324783	1.425351072	"[FvH4_1g27070, FvH4_1g17250, FvH4_1g16590, FvH4_2g38700, FvH4_3g27590, FvH4_1g22910]"	0.807054378
00860 Porphyrin and chlorophyll metabolism	8	492	47	3857	0.244407142	1.334371216	"[FvH4_3g20600, FvH4_5g33760, FvH4_7g25640, FvH4_2g27000, FvH4_3g20590, FvH4_1g04700, FvH4_2g23050, FvH4_4g37020]"	0.809259204
00053 Ascorbate and aldarate metabolism	8	492	47	3857	0.244407142	1.334371216	"[FvH4_1g08890, FvH4_5g05130, FvH4_3g33910, FvH4_6g20720, FvH4_7g08190, FvH4_7g13380, FvH4_1g25230, FvH4_5g20650]"	0.809259204
00944 Flavone and flavonol biosynthesis	2	492	5	3857	0.124932679	3.135772358	"[FvH4_6g17070, FvH4_5g14010]"	0.809346486
00040 Pentose and glucuronate interconversions	15	492	96	3857	0.236599083	1.224911077	"[FvH4_2g26010, FvH4_6g41430, FvH4_6g17310, FvH4_6g17430, FvH4_3g01680, FvH4_5g27090, FvH4_6g53340, FvH4_2g19540, FvH4_5g33570, FvH4_1g00260, FvH4_2g25970, FvH4_7g08190, FvH4_1g26360, FvH4_4g21500, FvH4_1g27720]"	0.819843336
03450 Non-homologous end-joining	2	492	7	3857	0.221439565	2.239837398	"[FvH4_4g35010, FvH4_4g35030]"	0.824862379
00942 Anthocyanin biosynthesis	2	492	7	3857	0.221439565	2.239837398	"[FvH4_3g19220, FvH4_7g33840]"	0.824862379

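This was produced by following the tutorials below and reproducing them with my own data:

R ggplot2 plotting series: Pathway enrichment analysis bubble plot - 生信技能树 - Powered by Discuz! http://www.biotrainee.com/forum.php?mod=viewthread&action=printable&tid=927

R ggplot2 plotting tutorial: Pathway enrichment analysis bubble plot - CSDN博客 http://blog.csdn.net/sinat_38163598/article/details/72827851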
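For an enrichment table with enrichFactor, Qvalue, and gene-hit columns like the one in this post, a bubble plot of the kind those tutorials describe could be sketched as below. This is not the tutorials' own code: the four rows are copied from the table for brevity, the short column names (`Term`, `Hits`, `EnrichFactor`, `Qvalue`), the colour scale, and the output file name are choices made here, and the plot is only drawn when ggplot2 is installed.

```r
# Four rows taken from the enrichment table (term, gene hits, enrich factor, Q-value).
enrich <- data.frame(
  Term = c("Flavonoid biosynthesis",
           "Phenylalanine metabolism",
           "Stilbenoid, diarylheptanoid and gingerol biosynthesis",
           "Lysine degradation"),
  Hits         = c(14, 14, 9, 8),
  EnrichFactor = c(2.676878842, 2.38591375, 2.275963808, 2.090514905),
  Qvalue       = c(0.04909626, 0.091016736, 0.467387431, 0.748082742),
  stringsAsFactors = FALSE
)

# Order the terms by enrich factor so the largest ends up at the top of the y axis.
enrich$Term <- factor(enrich$Term, levels = enrich$Term[order(enrich$EnrichFactor)])

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Bubble plot: x = enrich factor, bubble size = gene hits, colour = Q-value.
  p <- ggplot(enrich, aes(x = EnrichFactor, y = Term)) +
    geom_point(aes(size = Hits, colour = Qvalue)) +
    scale_colour_gradient(low = "red", high = "blue") +
    labs(x = "Enrich factor", y = "Pathway",
         size = "Gene hits", colour = "Q-value") +
    theme_bw()
  ggsave("pathway_bubble.pdf", p, width = 8, height = 4)
}
```

To plot the whole table, read the tab-separated file with `read.delim()` and map its columns to the same aesthetics.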