Lecture4_my-script-with-perl_enrichment

Jimmyjmzeng1314@outlook.com

Keywords: GOstats/enrichment/

需要自己用代码来计算

计算P值的代码也要自己写:C(k,M)*C(n-k,N-M)/C(n,N)

GOstats包的结果

我的R代码的结果:

顺便对比一下我的perl代码结果:

我的代码如下:

tmp=toTable(org.Hs.egPATH)

GeneID2Path=tapply(tmp[,2],as.factor(tmp[,1]),function(x) x)

Path2GeneID=tapply(tmp[,1],as.factor(tmp[,2]),function(x) x)

diff_gene_has_path=intersect(diff_gene,names(GeneID2Path))

n=length(diff_gene_has_path) #306

N=length(GeneID2Path) #5870

for (i in names(Path2GeneID)){

 M=length(Path2GeneID[[i]])

 exp_count=n*M/N

 k=0

 for (j in diff_gene_has_path){

        if (i %in% GeneID2Path[[j]]) k=k+1

 }

 OddsRatio=k/exp_count

 p=phyper(k-1,M, N-M, n, lower.tail=F)

 print(paste(i,p,OddsRatio,exp_count,k,M,sep="    "))

}