Lecture4_my-script-with-perl_enrichment
Jimmy(jmzeng1314@outlook.com)
Keywords: GOstats/enrichment/
需要自己用代码来计算
计算P值的代码也要自己写:C(k,M)*C(n-k,N-M)/C(n,N)
GOstats包的结果:
我的R代码的结果:
顺便对比一下我的perl代码结果:
我的代码如下:
tmp=toTable(org.Hs.egPATH)
GeneID2Path=tapply(tmp[,2],as.factor(tmp[,1]),function(x)
x)
Path2GeneID=tapply(tmp[,1],as.factor(tmp[,2]),function(x)
x)
diff_gene_has_path=intersect(diff_gene,names(GeneID2Path))
n=length(diff_gene_has_path)
#306
N=length(GeneID2Path)
#5870
for
(i in names(Path2GeneID)){
M=length(Path2GeneID[[i]])
exp_count=n*M/N
k=0
for (j in diff_gene_has_path){
if (i
%in% GeneID2Path[[j]]) k=k+1
}
OddsRatio=k/exp_count
p=phyper(k-1,M, N-M, n, lower.tail=F)
print(paste(i,p,OddsRatio,exp_count,k,M,sep=" "))
}