Which of the three R packages for RNA-seq differential expression analysis do you choose?

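Between 2010 and 2015, RNA-seq had much the same red-hot status that single-cell sequencing has today, and countless software tools, web databases, and benchmarking papers sprang up. Many PIs believed that a single RNA-seq project was enough to publish in CNS (Cell, Nature, Science), with exactly the same conviction people have had these past two years that a single-cell transcriptome project would get them into CNS!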

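Even now (2020), RNA-Seq based on high-throughput sequencing remains an indispensable tool in transcriptomics research. By 2016 it was already widely accepted that a normalization preprocessing step can significantly improve the quality of an analysis, especially for differential gene expression analysis. At that time, however, no gold-standard normalization method had been found. In my tutorials on 生信技能树, I usually just recommend the three major R packages (limma, edgeR, DESeq2) directly, as in the collection of basic transcriptome analysis tutorials.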

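Many people have asked me for the reasoning behind this recommendation and whether there is a reference for it, but I have been busy lately and never got around to replying. Recently, while sorting through RNA-seq material I collected five years ago, I rediscovered a paper that supports the three major R packages (limma, edgeR, DESeq2) fairly well.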

Paper details: Maza E (2016) In Papyro Comparison of TMM (edgeR), RLE (DESeq2), and MRN Normalization Methods for a Simple Two-Conditions-Without-Replicates RNA-Seq Experimental Design. Front Genet 7:164. [article]

A one-figure overview of the paper:

[Figure: overview of the paper]

The paper covers the following three algorithms and compares them on test data:

  • The first method is the “Trimmed Mean of M-values” normalization (TMM), implemented in the edgeR package.
  • The second method is the “Relative Log Expression” normalization (RLE) implemented in the DESeq2 package.
  • The third method is the “Median Ratio Normalization” (MRN).
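
To make the idea of a per-sample normalization factor concrete, here is a minimal Python sketch of the median-of-ratios calculation that underlies RLE-style size factors. It is an illustration only, not the code of DESeq2 or edgeR: the function name `rle_size_factors` and the toy count matrix are assumptions made for this example, and genes with a zero count in any sample are simply skipped.

```python
import numpy as np


def rle_size_factors(counts):
    """Median-of-ratios size factors for a genes x samples count matrix.

    Hypothetical helper for illustration; not the DESeq2 implementation.
    """
    counts = np.asarray(counts, dtype=float)
    # Use only genes with a positive count in every sample,
    # so that every logarithm below is defined.
    keep = (counts > 0).all(axis=1)
    log_counts = np.log(counts[keep])
    # Log of the per-gene geometric mean across samples
    # (acts as a pseudo-reference sample).
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Per sample: median over genes of the log ratio to the reference.
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))


# Toy data (made up): sample 2 was sequenced twice as deeply as sample 1.
toy = np.array([[10, 20],
                [100, 200],
                [50, 100],
                [0, 5]])
factors = rle_size_factors(toy)
normalized = toy / factors  # divide each column by its size factor
```

In this toy matrix the second size factor comes out twice the first, so dividing each column by its factor puts the two samples on a common scale.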

The authors' test data: a matrix of counts with 34675 rows (genes) and 9 columns (samples from 3 stages and 3 biological replicates per stage). The comparison consists of in silico calculations carried out on a given real data set from the tomato fruit set.

The authors' conclusions are quite interesting:

  • For a very simple experimental design, i.e., about two conditions and no replicates, users can use any of the three studied normalization methods with no impact on results.
  • But, for a more complex experimental design, the MRN method could be adopted.

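Apprentice assignment: taking the dataset from the tutorial "Re-analyzing an RNA-seq project that provides only BAM files" as an example, compare the differential expression results of the three major R packages (limma, edgeR, DESeq2) and draw a Venn diagram or some other visualization! The database link for this RNA-seq project is https://www.ebi.ac.uk/ena/browser/view/PRJEB36947, and it provides only BAM files. If you cannot work out the expression matrix yourself, you can email me to request it and then complete the assignment!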
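
For the comparison step, once each package has given you a list of significant genes, the seven regions of a three-set Venn diagram can be counted with plain set operations. The sketch below is only a starting point under stated assumptions: the function name `venn3_regions` and the gene IDs are hypothetical, and in practice the three lists would be exported from your own limma, edgeR, and DESeq2 runs.

```python
def venn3_regions(limma, edger, deseq2):
    """Count genes in each of the seven regions of a three-set Venn diagram.

    Hypothetical helper for illustration.
    """
    a, b, c = set(limma), set(edger), set(deseq2)
    return {
        "limma only": len(a - b - c),
        "edgeR only": len(b - a - c),
        "DESeq2 only": len(c - a - b),
        "limma & edgeR only": len((a & b) - c),
        "limma & DESeq2 only": len((a & c) - b),
        "edgeR & DESeq2 only": len((b & c) - a),
        "all three": len(a & b & c),
    }


# Made-up gene IDs standing in for each package's significant genes.
limma_hits = ["g1", "g2", "g3", "g4"]
edger_hits = ["g2", "g3", "g4", "g5"]
deseq2_hits = ["g3", "g4", "g5", "g6"]

regions = venn3_regions(limma_hits, edger_hits, deseq2_hits)
for name, size in regions.items():
    print(f"{name}: {size}")
```

The resulting counts can be passed to whichever plotting tool you prefer to draw the actual diagram.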

The list of apprentice assignments from previous years is as follows:
