Visualizing chimeric RNA with the Bioconductor package chimeraviz

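High-throughput RNA sequencing has made it far more efficient to detect fusion transcripts, but fusion-detection methods and the software built on them typically produce a high false discovery rate. A visualization framework that automatically integrates the RNA-seq data with known genomic features therefore helps when checking the results. chimeraviz, a Bioconductor package released in 2017, does exactly that: it automates the creation of chimeric RNA visualizations.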

It supports input from nine different fusion-finding tools (deFuse, EricScript, InFusion, JAFFA, FusionCatcher, FusionMap, PRADA, SOAPfuse, and STAR-Fusion).

Official tutorial

The detailed documentation is available directly on Bioconductor: https://bioconductor.org/packages/release/bioc/html/chimeraviz.html

Once the package is downloaded and installed, it comes with a set of test data for fusion-gene visualization. The files are:

  1.1K Oct 16 22:36 5267readsAligned.bam
   96B Oct 16 22:36 5267readsAligned.bam.bai
   22K Oct 16 22:36 FusionMap_01_TestDataset_InputFastq.FusionReport.txt
   37K Oct 16 22:36 Homo_sapiens.GRCh37.74.sqlite
   68K Oct 16 22:36 Homo_sapiens.GRCh37.74_subset.gtf
  1.9K Oct 16 22:36 PRADA.acc.fusion.fq.TAF.tsv
   32K Oct 16 22:36 UCSC.HG19.Human.CytoBandIdeogram.txt
   32K Oct 16 22:36 UCSC.HG38.Human.CytoBandIdeogram.txt
   16K Oct 16 22:36 defuse_833ke_results.filtered.tsv
  4.6K Oct 16 22:36 ericscript_SRR1657556.results.total.tsv
  1.7M Oct 16 22:36 fusion5267and11759reads.bam
   57K Oct 16 22:36 fusion5267and11759reads.bam.bai
  4.1K Oct 16 22:36 fusioncatcher_833ke_final-list-candidate-fusion-genes.txt
  2.1K Oct 16 22:36 infusion_fusions.txt
  4.3K Oct 16 22:36 jaffa_results.csv
  2.6K Oct 16 22:36 reads.1.fq
  2.6K Oct 16 22:36 reads.2.fq
  1.0K Oct 16 22:36 reads_supporting_defuse_fusion_5267.1.fq
  1.0K Oct 16 22:36 reads_supporting_defuse_fusion_5267.2.fq
  3.3K Oct 16 22:36 soapfuse_833ke_final.Fusion.specific.for.genes
  2.0K Oct 16 22:36 star-fusion.fusion_candidates.final.abridged.txt
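
To find these files from inside R rather than from the shell, a minimal sketch could look like the following. It assumes the files sit in the package's "extdata" directory (the same directory the later snippets pass to system.file), and it does nothing if chimeraviz is not installed.

```r
# List the example data shipped with chimeraviz, if the package is installed.
if (requireNamespace("chimeraviz", quietly = TRUE)) {
  # Assumption: the test data live in the package's "extdata" directory.
  extdata_dir <- system.file("extdata", package = "chimeraviz")
  example_files <- list.files(extdata_dir)
  print(example_files)
} else {
  example_files <- character(0)
  message("chimeraviz is not installed; nothing to list.")
}
```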

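As you can see, example results for all nine supported fusion-detection tools are included. Here, for instance, is an excerpt from the output of STAR-Fusion, my favorite: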

#FusionName JunctionReadCount   SpanningFragCount   SpliceType  LeftGene    LeftBreakpoint  RightGene   RightBreakpoint
THRA--AC090627.1    27  93  ONLY_REF_SPLICE THRA^ENSG00000126351.8  chr17:38243106:+    AC090627.1^ENSG00000235300.3    chr17:46371709:+
THRA--AC090627.1    5   93  ONLY_REF_SPLICE THRA^ENSG00000126351.8  chr17:38243106:+    AC090627.1^ENSG00000235300.3    chr17:46384693:+
ACACA--STAC2    12  51  ONLY_REF_SPLICE ACACA^ENSG00000132142.15    chr17:35479453:-    STAC2^ENSG00000141750.6 chr17:37374426:-
RPS6KB1--SNF8   10  43  ONLY_REF_SPLICE RPS6KB1^ENSG00000108443.9   chr17:57970686:+    SNF8^ENSG00000159210.5  chr17:47021337:-
TOB1--SYNRG 8   30  ONLY_REF_SPLICE TOB1^ENSG00000141232.4  chr17:48943419:-    SYNRG^ENSG00000006114.11    chr17:35880751:-
VAPB--IKZF3 4   46  ONLY_REF_SPLICE VAPB^ENSG00000124164.11 chr20:56964573:+    IKZF3^ENSG00000161405.12    chr17:37934020:-
ZMYND8--CEP250  2   44  ONLY_REF_SPLICE ZMYND8^ENSG00000101040.15   chr20:45852970:-    CEP250^ENSG00000126001.11   chr20:34078463:+
AHCTF1--NAAA    3   38  ONLY_REF_SPLICE AHCTF1^ENSG00000153207.10   chr1:247094880:-    NAAA^ENSG00000138744.10 chr4:76846964:-
VAPB--IKZF3 1   46  ONLY_REF_SPLICE VAPB^ENSG00000124164.11 chr20:56964573:+    IKZF3^ENSG00000161405.12    chr17:37944627:-
VAPB--IKZF3 1   46  ONLY_REF_SPLICE VAPB^ENSG00000124164.11 chr20:56964573:+    IKZF3^ENSG00000161405.12    chr17:37922746:-
STX16--RAE1 4   33  ONLY_REF_SPLICE STX16^ENSG00000124222.17    chr20:57227143:+    RAE1^ENSG00000101146.8  chr20:55929088:+
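
The LeftBreakpoint and RightBreakpoint columns pack chromosome, position, and strand into one colon-separated string. As a rough illustration of how to read them (plain base R, not part of chimeraviz; parse_breakpoint is a hypothetical helper name), the values from the excerpt can be split like this:

```r
# Hypothetical helper: split "chr:position:strand" strings into columns.
parse_breakpoint <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  data.frame(
    chromosome = vapply(parts, `[`, character(1), 1),
    position   = as.integer(vapply(parts, `[`, character(1), 2)),
    strand     = vapply(parts, `[`, character(1), 3),
    stringsAsFactors = FALSE
  )
}

# Breakpoints copied from three rows of the excerpt above.
left  <- parse_breakpoint(c("chr17:38243106:+", "chr20:56964573:+", "chr1:247094880:-"))
right <- parse_breakpoint(c("chr17:46371709:+", "chr17:37934020:-", "chr4:76846964:-"))

print(left)
print(right)
```

In practice the import functions do this parsing for you; the sketch is only meant to make the column format easier to read.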

These result files are all loaded into R with the family of import functions, for example:

library(chimeraviz)

# Get the path to the bundled deFuse results file
defuse833ke <- system.file(
  "extdata",
  "defuse_833ke_results.filtered.tsv",
  package = "chimeraviz")

# Load the results file into a list of fusion objects
fusions <- importDefuse(defuse833ke, "hg19")

# Number of fusion events that were imported
length(fusions)

Genome-wide visualization

soapfuse833ke <- system.file(
  "extdata",
  "soapfuse_833ke_final.Fusion.specific.for.genes",
  package = "chimeraviz")
fusions <- importSoapfuse(soapfuse833ke, "hg38", 10)
# Plot!
plotCircle(fusions)

The main output is a circle plot, shown below:

[Figure: chimeraviz fusion circle plot]

Red links are intrachromosomal fusions; blue links are interchromosomal fusions.
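
The color convention is easy to express in base R. The sketch below only illustrates the red/blue rule and is not how plotCircle is implemented; link_color is a hypothetical helper, and the three rows are borrowed from the STAR-Fusion excerpt earlier in the post rather than from the SOAPfuse file plotted here.

```r
# Hypothetical helper: red if both breakpoints share a chromosome, blue otherwise.
link_color <- function(chrom_a, chrom_b) {
  ifelse(chrom_a == chrom_b, "red", "blue")
}

# Chromosome pairs taken from the STAR-Fusion excerpt, for illustration only.
fusion_links <- data.frame(
  fusion  = c("THRA--AC090627.1", "VAPB--IKZF3", "AHCTF1--NAAA"),
  chrom_a = c("chr17", "chr20", "chr1"),
  chrom_b = c("chr17", "chr17", "chr4"),
  stringsAsFactors = FALSE
)
fusion_links$color <- link_color(fusion_links$chrom_a, fusion_links$chrom_b)

print(fusion_links)
```

Here THRA--AC090627.1 has both breakpoints on chr17 and would be drawn in red, while the other two span different chromosomes and would be blue.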

Visualizing a single fusion event

if(!exists("defuse833ke"))
  defuse833ke <- system.file(
    "extdata",
    "defuse_833ke_results.filtered.tsv",
    package = "chimeraviz")
fusions <- importDefuse(defuse833ke, "hg19", 1)
# Choose a fusion object
fusion <- getFusionById(fusions, 5267)
# Load edb
if(!exists("edbSqliteFile"))
  edbSqliteFile <- system.file(
    "extdata",
    "Homo_sapiens.GRCh37.74.sqlite",
    package="chimeraviz")
edb <- ensembldb::EnsDb(edbSqliteFile)
# bamfile with reads in the regions of this fusion event
if(!exists("fusion5267and11759reads"))
  fusion5267and11759reads <- system.file(
    "extdata",
    "fusion5267and11759reads.bam",
    package = "chimeraviz")
# Plot!
plotFusion(
  fusion = fusion,
  bamfile = fusion5267and11759reads,
  edb = edb,
  nonUCSC = TRUE)

# Plot again, this time with reduceTranscripts = TRUE
plotFusion(
  fusion = fusion,
  bamfile = fusion5267and11759reads,
  edb = edb,
  nonUCSC = TRUE,
  reduceTranscripts = TRUE)

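This visualization is a little more involved. It needs the details of the fusion event, a BAM file with the reads from the regions of the two fused genes, and an annotation database for the reference genome.

There are two ways to display the result: one shows the fusion at the transcript level, the other at the gene level (the second call above, with reduceTranscripts = TRUE).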

[Figure: chimeraviz fusion plot]

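The RCC1-HENMT1 fusion as an example.

Top: the chromosomal locations of the fusion. The plot also shows the number of discordant reads supporting the breakpoint (red arc), here 10 (6 split reads and 4 spanning reads), together with the annotated transcripts and the read-count plot.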
