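I won't belabor the importance of the 1000 Genomes Project. Because the project ran over a long period, the final dataset covers more than a thousand people: the latest release contains 1,182 individuals whose IDs start with NA and 1,768 whose IDs start with HG! The official website provides a data portal, and this slide deck explains clearly how to download data through it: https://www.genome.gov/pages/research/der/ichg-1000genomestutorial/how_to_access_the_data.pdf I don't like graphical interfaces and prefer to go straight to the FTP site and browse for the data I need. Besides its own FTP site, the 1000 Genomes Project also has data sources available for download at NCBI, EBI, and the Sanger Institute, which makes it a very rich resource for bioinformatics beginners!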
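As a rough illustration of the NA/HG split mentioned above, the following minimal sketch tallies sample IDs by their two-letter prefix. The function name `count_by_prefix` and the short ID list are illustrative assumptions, not part of any 1000 Genomes tool; in practice the IDs would come from a sample list you downloaded yourself.

```python
from collections import Counter


def count_by_prefix(sample_ids, prefix_len=2):
    """Count sample IDs by their leading letters (e.g. 'NA' or 'HG')."""
    # Empty strings are skipped so blank lines in an input file do no harm.
    return Counter(sid[:prefix_len].upper() for sid in sample_ids if sid)


# A handful of IDs used purely as an illustration, not a full sample list.
example_ids = ["NA12878", "NA19238", "HG00096", "HG00097", "HG00099"]

counts = count_by_prefix(example_ids)
print("NA:", counts["NA"], "HG:", counts["HG"])
```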
09/08/2014 12:00AM 1,663 20131219.populations.tsv
09/09/2014 12:00AM 97 20131219.superpopulations.tsv
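If you browse the FTP site by hand, listing entries like the ones above can be turned into structured records with a few lines of Python. This is only a sketch: the helper `parse_listing` is a hypothetical name, and it assumes each entry has the form date, time, size in bytes, file name, with the date written as MM/DD/YYYY (the listing itself does not say which order is used).

```python
import re
from datetime import datetime

# One entry looks like: 09/08/2014 12:00AM 1,663 20131219.populations.tsv
LISTING_RE = re.compile(
    r"(\d{2}/\d{2}/\d{4})\s+(\d{1,2}:\d{2}[AP]M)\s+([\d,]+)\s+(\S+)"
)


def parse_listing(text):
    """Extract name, size, and modification time from listing text."""
    entries = []
    for date_s, time_s, size_s, name in LISTING_RE.findall(text):
        # Assumption: dates are MM/DD/YYYY; swap %m and %d if they are not.
        modified = datetime.strptime(f"{date_s} {time_s}", "%m/%d/%Y %I:%M%p")
        entries.append({
            "name": name,
            "size": int(size_s.replace(",", "")),  # drop thousands separator
            "modified": modified,
        })
    return entries


listing = (
    "09/08/2014 12:00AM 1,663 20131219.populations.tsv "
    "09/09/2014 12:00AM 97 20131219.superpopulations.tsv"
)
for entry in parse_listing(listing):
    print(entry["name"], entry["size"], entry["modified"].date())
```

The regular expression works whether the entries sit on one line or on separate lines, since it matches each entry independently.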
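In fact, unless you want to download the raw 1000 Genomes data to learn bioinformatics analysis pipelines, most people have no need for this FTP site. The project has a very easy-to-use browser, hosted at EBI, for exploring its variation results.
There is also a Java program for visually inspecting 1000 Genomes data,
but it doesn't seem to be very easy to use!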
- Coriell Catalog website: 1000 Genomes Project
- 1000 Genomes website: browser.1000genomes.org/index.html (by SNP ID)
- 1000 Genomes website: www.1000genomes.org/data (bulk data)
The most important available existing expression datasets involving 1000g individuals are probably the following:
RNAseq (mRNA & miRNA) on 465 individuals (CEU, TSI, GBR, FIN, YRI)
Pre-publication RNA-sequencing data from the Geuvadis project is available through http://www.geuvadis.org
http://www.ebi.ac.uk/arrayexpress/experiments/E-GEUV-1/samples.html
http://www.ebi.ac.uk/arrayexpress/experiments/E-GEUV-2/samples.html
RNAseq on 60 CEU individuals [1]
http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-197
Expression arrays on about 800 HapMap 3 individuals with a lot of overlap with 1000g data [1,2]
http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-198
http://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-264
RNAseq for 69 YRI individuals [3]