Lecture4_my-script-with-R_batch-download

Jimmyjmzeng1314@outlook.com

Keywords: XML//ape/read.GenBank()

The idea is simple: parse the page source, collect the links that end in .pdf, and download each one.

library(XML)

library(RCurl)

psu_edu_url='http://www.personal.psu.edu/iua1/courses/2013-BMMB-597D.html';

wp=getURL(psu_edu_url)

base='http://www.personal.psu.edu/iua1/courses/file';

#psu_edu_links=getHTMLLinks(psu_edu_url)

psu_edu_links=getHTMLLinks(wp)

psu_edu_pdf=psu_edu_links[grepl("\\.pdf$",psu_edu_links,perl=TRUE)]

for (pdf in psu_edu_pdf){

down_url=getRelativeURL(pdf,base)

filename=basename(pdf)

cat("Now downloading",filename,"\n")

download.file(down_url,filename,mode="wb")

}
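The loop above stops at the first failed download and fetches every file again on each run. Below is a minimal sketch of a more defensive variant in base R. The helpers filter_pdf_links, resolve_link and download_pdfs are hypothetical names introduced here for illustration; they are not part of XML or RCurl. The sketch assumes every link is either absolute (http/https) or relative to the directory of the page it came from.

```r
# Sketch only: filter_pdf_links, resolve_link and download_pdfs are
# hypothetical helpers for illustration, not functions from XML or RCurl.

# Keep only links that end in a literal ".pdf" (case-insensitive).
# The escaped dot matters: the pattern ".pdf$" would also match "notes/mypdf".
filter_pdf_links <- function(links) {
  links[grepl("\\.pdf$", links, ignore.case = TRUE)]
}

# Resolve a link against the directory of the page it was found on.
# Assumption: links are absolute or relative to the page directory;
# root-relative ("/x.pdf") and "../" links are not handled here.
resolve_link <- function(link, page_url) {
  if (grepl("^https?://", link)) return(link)
  page_dir <- sub("/[^/]*$", "", page_url)  # drop the last path component
  paste0(page_dir, "/", link)
}

# Download every PDF link, skipping files that already exist and
# recording failures instead of aborting the whole batch.
# `fetch` defaults to download.file and can be swapped out for testing.
download_pdfs <- function(links, page_url, dest_dir = ".", fetch = download.file) {
  status <- character(0)
  for (link in filter_pdf_links(links)) {
    url  <- resolve_link(link, page_url)
    dest <- file.path(dest_dir, basename(link))
    if (file.exists(dest)) {
      status[link] <- "skipped"
      next
    }
    ok <- tryCatch(
      # download.file returns 0 on success; mode = "wb" keeps PDFs intact on Windows
      isTRUE(fetch(url, dest, mode = "wb") == 0),
      error = function(e) FALSE
    )
    status[link] <- if (ok) "downloaded" else "failed"
  }
  status
}
```

With the variables from the script above, a call might look like download_pdfs(psu_edu_links, psu_edu_url). The returned named vector shows, for each link, whether it was downloaded, skipped or failed.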

Useful links: http://www.bio-info-trainee.com/?p=799

http://www.bio-info-trainee.com/?p=941

http://www.personal.psu.edu/iua1/lectures.html

http://www.personal.psu.edu/iua1/2015_fall_852/main_2015_fall_852.html

In fact, Xunlei (迅雷) itself now has a built-in batch download feature.