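dbSNP IDs are documented in detail on NCBI's dbSNP site, which is now at build 146. Most people cannot learn anything from an rs ID on its own, since it is just an identifier assigned by NCBI. An HGVS variant name, by contrast, carries very detailed information.
The Human Genome Variation Society (HGVS) has defined how mutations should be described: http://www.hgvs.org/mutnomen/recs.html I recommend that everyone read it carefully!
There is also a program that derives the HGVS name from chromosome coordinates: https://github.com/counsyl/hgvs It is a bit complicated, so we will skip it for now.
There is actually a video tutorial on YouTube (BioMart: Variation IDs to HGNC Symbols), but since most readers cannot get past the firewall, here is a shortcut.
The shortcut is to build the URL directly from the rs ID. Any of the following three forms works: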
http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=rs1800234
http://www.ncbi.nlm.nih.gov/snp/1800234
http://browser.1000genomes.org/Homo_sapiens/Variation/Explore?v=rs1800234
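Building these URLs is easy to script. Below is a minimal sketch that only does the string assembly for the three URL patterns listed above. The function name `build_urls` is my own invention for illustration, and the sketch assumes the rs ID is either `rs` followed by digits or the bare digits.

```python
def build_urls(rs_id):
    """Return the three lookup URLs for an rs ID such as 'rs1800234' or '1800234'."""
    rs_id = str(rs_id).strip().lower()
    # Accept the ID with or without the 'rs' prefix.
    number = rs_id[2:] if rs_id.startswith("rs") else rs_id
    if not number.isdigit():
        raise ValueError("not a valid rs ID: %r" % rs_id)
    return [
        "http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=rs" + number,
        "http://www.ncbi.nlm.nih.gov/snp/" + number,
        "http://browser.1000genomes.org/Homo_sapiens/Variation/Explore?v=rs" + number,
    ]


if __name__ == "__main__":
    for url in build_urls("rs1800234"):
        print(url)
```

Note that the second URL takes the number without the `rs` prefix, which is why the sketch strips it first.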
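The results returned by each of the three are described below:
Scrape the page that dbSNP returns and extract the relevant field:
For example: http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=rs1800234
The relevant section is easy to spot: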
HGVS Names |
---|
All you need to do is build a URL from the ID you want to look up:
http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=rs197278
and so on.
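The scraping step could be sketched roughly as follows. This is not a tested scraper for the live site: the function names `extract_hgvs` and `fetch_hgvs` are hypothetical, the regular expression is my own assumption and only covers RefSeq-style accessions (two capital letters, an underscore, digits and a version number), and the page layout may differ from what I assume or may change over time. It uses only the Python standard library.

```python
import html
import re
from urllib.request import urlopen

# Assumed pattern: RefSeq-style accession, a colon, then a g./c./n./p. description.
HGVS_PATTERN = re.compile(r"\b[A-Z]{2}_\d+\.\d+:[gcnp]\.[A-Za-z0-9_+\-*>]+")


def extract_hgvs(page_text):
    """Return the HGVS-like names found in a page, in order, without duplicates."""
    # Pages usually encode '>' as '&gt;', so decode HTML entities first.
    text = html.unescape(page_text)
    names = []
    for name in HGVS_PATTERN.findall(text):
        if name not in names:
            names.append(name)
    return names


def fetch_hgvs(rs_id):
    """Download the dbSNP page for an rs ID (e.g. 'rs1800234') and extract names."""
    url = "http://www.ncbi.nlm.nih.gov/SNP/snp_ref.cgi?rs=" + rs_id
    with urlopen(url, timeout=30) as response:
        page = response.read().decode("utf-8", errors="replace")
    return extract_hgvs(page)
```

Keeping the extraction separate from the download means `extract_hgvs` can be tried on a saved copy of the page before hitting the server for a long list of IDs.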
Alternatively, look the ID up directly on NCBI's SNP page:
http://www.ncbi.nlm.nih.gov/snp/1800234
AACATGAACAAGGTCAAAGCCCGGG[A/C/T]CATCCTCTCAGGAAAGGCCAGTAAC
- Chromosome:
- 22:46219983
- Gene:
- PPARA (GeneView)
- Functional Consequence:
- intron variant,missense
- Validated:
- by 1000G,by cluster,by frequency
- Global MAF:
- C=0.0170/85
- HGVS:
- NC_000022.10:g.46615880T>C, NC_000022.11:g.46219983T>C, NG_012204.1:g.74382T>C, NM_001001928.2:c.680T>C, NM_005036.4:c.680T>C, NP_001001928.1:p.Val227Ala, NP_005027.2:p.Val227Ala, XM_005261653.1:c.680T>C, XM_005261654.1:c.680T>C, XM_005261655.1:c.680T>C, XM_005261655.2:c.680T>C, XM_005261656.1:c.680T>C, XM_005261656.2:c.680T>C, XM_005261657.1:c.680T>C, XM_005261658.1:c.680T>C, XM_006724269.2:c.680T>C, XM_006724270.2:c.680T>C, XM_011530239.1:c.680T>C, XM_011530240.1:c.680T>C, XM_011530241.1:c.680T>C, XM_011530242.1:c.680T>C, XM_011530243.1:c.680T>C, XM_011530244.1:c.278T>C, XM_011530245.1:c.278T>C, XP_005261710.1:p.Val227Ala, XP_005261711.1:p.Val227Ala, XP_005261712.1:p.Val227Ala, XP_005261713.1:p.Val227Ala, XP_005261714.1:p.Val227Ala, XP_005261715.1:p.Val227Ala, XP_006724332.1:p.Val227Ala, XP_006724333.1:p.Val227Ala, XP_011528541.1:p.Val227Ala, XP_011528542.1:p.Val227Ala, XP_011528543.1:p.Val227Ala, XP_011528544.1:p.Val227Ala, XP_011528545.1:p.Val227Ala, XP_011528546.1:p.Val93Ala, XP_011528547.1:p.Val93Ala, XR_244379.1:n.735+1578T>C, XR_937869.1:n.827+1578T>C, XR_937870.1:n.822+1582T>C
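That HGVS field is one long comma-separated line, so it helps to split it and group the names by level (g. for genomic, c. for coding DNA, n. for non-coding transcript, p. for protein). Here is a minimal sketch, assuming every entry has the form `ACCESSION:level.change` as in the output above. The function name `group_hgvs` is hypothetical.

```python
from collections import defaultdict


def group_hgvs(field):
    """Split a comma-separated HGVS field and group the names by level."""
    groups = defaultdict(list)
    for name in field.split(","):
        name = name.strip()
        if not name:
            continue
        # 'NM_005036.4:c.680T>C' -> accession 'NM_005036.4', change 'c.680T>C'
        accession, _, change = name.partition(":")
        level = change.split(".", 1)[0]
        groups[level].append((accession, change))
    return dict(groups)


if __name__ == "__main__":
    field = ("NC_000022.10:g.46615880T>C, NM_005036.4:c.680T>C, "
             "NP_005027.2:p.Val227Ala, XR_244379.1:n.735+1578T>C")
    for level, names in group_hgvs(field).items():
        print(level, names)
```

Grouping this way makes it obvious that the same variant is described once per reference sequence, which is why the list is so long.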
Many other databases offer similar services.
For example, the 1000 Genomes Project interface provided by Ensembl:
This variation has 11 HGVS names - click the plus to show
22:g.46615880T>C
ENST00000493286.1:n.890T>C
ENST00000262735.5:c.680T>C
ENSP00000262735.5:p.Val227Ala
ENST00000396000.2:c.680T>C
ENSP00000379322.2:p.Val227Ala
ENST00000434345.2:c.508+1582T>C
ENST00000407236.1:c.680T>C
ENSP00000385523.1:p.Val227Ala
ENST00000402126.1:c.680T>C
ENSP00000385246.1:p.Val227Ala
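To show how much information an HGVS name carries compared with a bare rs ID, here is a small sketch that breaks names like the ones above into reference sequence, level, position, and the reference and alternate alleles. It is only an illustration under my own assumptions: it handles simple substitutions (including intronic offsets such as `508+1582`) and three-letter protein substitutions, nothing else in the HGVS nomenclature. The function name `parse_substitution` is hypothetical.

```python
import re

# Simple nucleotide substitution, e.g. 'c.680T>C' or 'c.508+1582T>C'.
NUCLEOTIDE_SUB = re.compile(r"^([gcn])\.(\d+(?:[+-]\d+)?)([ACGT])>([ACGT])$")
# Simple protein substitution with three-letter codes, e.g. 'p.Val227Ala'.
PROTEIN_SUB = re.compile(r"^p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_substitution(name):
    """Parse a simple HGVS substitution; return None for anything else."""
    accession, _, change = name.partition(":")
    match = NUCLEOTIDE_SUB.match(change)
    if match:
        level, position, ref, alt = match.groups()
        return {"accession": accession, "level": level,
                "position": position, "ref": ref, "alt": alt}
    match = PROTEIN_SUB.match(change)
    if match:
        ref, position, alt = match.groups()
        return {"accession": accession, "level": "p",
                "position": position, "ref": ref, "alt": alt}
    return None


if __name__ == "__main__":
    for name in ["22:g.46615880T>C", "ENST00000262735.5:c.680T>C",
                 "ENSP00000262735.5:p.Val227Ala"]:
        print(parse_substitution(name))
```

The position is kept as a string on purpose, because intronic positions such as `508+1582` are not plain integers.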