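The HUGO Gene Nomenclature Committee (HGNC), the committee that assigns names to human genes!
We already have NCBI Entrez Gene IDs, RefSeq accessions, Ensembl IDs, and the abbreviated English symbols that hint at each gene's function, and that was confusing enough. Now HGNC comes along with yet another identifier. What a headache!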
The HGNC approves both a short-form abbreviation known as a gene symbol, and also a longer and more descriptive name.
You can download the complete data set and explore it at your own pace with a script:
wget ftp://ftp.ebi.ac.uk/pub/databases/genenames/new/tsv/hgnc_complete_set.txt
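Once the file is downloaded, a few lines of Python are enough to start poking at it. This is only a minimal sketch: it assumes the file is tab-separated with a header row, and that the header uses column names such as `hgnc_id`, `symbol`, `entrez_id` and `ensembl_gene_id` (check the first line of your own download). The function name `load_hgnc` and the tiny in-memory sample are made up for illustration; the sample values come from the BRCA1 record discussed in this post.

```python
import csv
import io


def load_hgnc(handle, key_column="symbol"):
    """Read an HGNC-style TSV (with header row) and index its rows by one column."""
    reader = csv.DictReader(handle, delimiter="\t")
    # Skip rows where the key column is missing or empty.
    return {row[key_column]: row for row in reader if row.get(key_column)}


# Tiny in-memory stand-in for the real file; column names are assumptions.
SAMPLE = (
    "hgnc_id\tsymbol\tname\tlocus_group\tentrez_id\tensembl_gene_id\n"
    "HGNC:1100\tBRCA1\tbreast cancer 1, early onset\tprotein-coding gene\t672\tENSG00000012048\n"
)

genes = load_hgnc(io.StringIO(SAMPLE))
brca1 = genes["BRCA1"]
print(brca1["hgnc_id"], brca1["entrez_id"], brca1["ensembl_gene_id"])
```

For the real download you would pass an open file instead of the sample string, for example `open("hgnc_complete_set.txt", newline="", encoding="utf-8")`.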
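Let's look at the BRCA1 gene again. Its record carries a lot of information; the key item is HGNC:1100, the identifier this database assigns to the gene.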
HGNC:1100
BRCA1 (the gene symbol, which must be approved by the HGNC)
breast cancer 1, early onset protein-coding gene gene with protein product Approved
17q21.31 17q21.31
"RNF53|BRCC1|PPP1R53|FANCS" "BRCA1/BRCA2-containing complex, subunit 1|protein phosphatase 1, regulatory subunit 53|Fanconi anemia, complementation group S" "Ring finger proteins|Protein phosphatase 1 regulatory subunits" "58|694"
1991-02-20T00:00:00Z
2015-04-18T00:00:00Z
672 (the Entrez Gene ID)
ENSG00000012048 (the Ensembl gene ID)
OTTHUMG00000157426 uc002ict.3 U14680 NM_007294
"CCDS11453|CCDS11454|CCDS11455|CCDS11456|CCDS11459|CCDS11455|CCDS11456|CCDS11459|CCDS11454" P38398 1676470 MGI:104537 RGD:2218
"Breast Cancer|http://research.nhgri.nih.gov/bic/|BRCA1 database at LOVD-China|http://genomed.org/LOVD/BC/home.php?select_db=BRCA1|LOVD - Leiden Open Variation Database|http://chromium.liacs.nl/LOVD2/cancer/home.php?select_db=BRCA1|LOVD - Leiden Open Variation Database|http://proteomics.bio21.unimelb.edu.au/lovd/genes/BRCA1|LRG_292|http://www.lrg-sequence.org/LRG/LRG_292"
BRCA1 113705 119068
That is roughly what a record looks like!
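Several fields in the record above pack multiple values into one quoted, pipe-delimited string (the aliases, the CCDS IDs, and so on), and the CCDS field even repeats some entries. A small helper can turn such a field into a clean list. This is a sketch with a hypothetical function name, `split_multi`; it assumes `|` never appears inside an individual value.

```python
def split_multi(value):
    """Split a pipe-delimited field into a list, dropping duplicates but keeping order."""
    cleaned = value.strip().strip('"')
    if not cleaned:
        return []
    # dict.fromkeys keeps the first occurrence of each item in insertion order.
    return list(dict.fromkeys(cleaned.split("|")))


aliases = split_multi('"RNF53|BRCC1|PPP1R53|FANCS"')
ccds_ids = split_multi(
    '"CCDS11453|CCDS11454|CCDS11455|CCDS11456|CCDS11459|'
    'CCDS11455|CCDS11456|CCDS11459|CCDS11454"'
)
print(aliases)
print(ccds_ids)
```

Note that Python's `csv` module already removes the surrounding double quotes when it reads the file, so the `strip('"')` only matters when you paste a raw field by hand.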
- HGNC Database
- Ensembl
- Entrez Gene
- GeneCards
- Online Mendelian Inheritance in Man (OMIM)
The entries in these databases all link to one another!
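Because each HGNC record carries the other databases' identifiers, it can act as a translation table between them. The sketch below builds a lookup from any known identifier to the HGNC ID. The function name `build_xref_index` and the column names are assumptions for illustration, and the rows are plain dictionaries such as those produced by `csv.DictReader`.

```python
def build_xref_index(rows, id_columns=("symbol", "entrez_id", "ensembl_gene_id")):
    """Map (column, identifier) pairs to the HGNC ID of the row they came from."""
    index = {}
    for row in rows:
        for column in id_columns:
            value = row.get(column, "")
            if value:
                # Key on (column, value) so IDs from different databases cannot collide.
                index[(column, value)] = row["hgnc_id"]
    return index


# One row with values taken from the BRCA1 record in this post.
sample_rows = [
    {
        "hgnc_id": "HGNC:1100",
        "symbol": "BRCA1",
        "entrez_id": "672",
        "ensembl_gene_id": "ENSG00000012048",
    }
]

xref = build_xref_index(sample_rows)
print(xref[("entrez_id", "672")])
print(xref[("ensembl_gene_id", "ENSG00000012048")])
```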
Next, let's look at some statistics for the HGNC database:
http://www.genenames.org/cgi-bin/statistics
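The database holds 40392 gene records in total.
Of these, 18990 are protein-coding genes, most of which have GO annotations.
Another 5927 are non-coding RNAs, currently a hot research topic.
A further 12546 are pseudogenes, which are rather complicated.
Finally, 1188 fall into other categories: immune-related genes, positional loci, virus-related entries, and so on.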
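Counts like these can be recomputed from the downloaded file by tallying one column. The sketch below assumes the category lives in a column called `locus_group` (the BRCA1 record shows the value "protein-coding gene" there). The function name and the two placeholder rows `GENE_A` and `GENE_B` are hypothetical, so the printed numbers say nothing about the real database.

```python
from collections import Counter


def count_by_locus_group(rows, column="locus_group"):
    """Tally rows by the value of one column; empty values are counted as 'unknown'."""
    return Counter(row.get(column) or "unknown" for row in rows)


# BRCA1 is real; GENE_A and GENE_B are made-up placeholders for illustration.
sample_rows = [
    {"symbol": "BRCA1", "locus_group": "protein-coding gene"},
    {"symbol": "GENE_A", "locus_group": "protein-coding gene"},
    {"symbol": "GENE_B", "locus_group": "pseudogene"},
]

counts = count_by_locus_group(sample_rows)
for group, n in counts.most_common():
    print(group, n)
```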
Finally, a little bonus for everyone: eleven other species have their own nomenclature committees too!
One example is the Mouse Gene Nomenclature Committee (MGNC). Please see the following links:
- Mouse: http://www.informatics.jax.org/mgihome/nomen/
- Rat: http://rgd.mcw.edu/nomen/nomen.shtml
- Chicken: http://www.agnc.msstate.edu/
- Anolis lizard: http://lizardbase.org/pages/agnc.html
- Xenopus: http://www.xenbase.org/gene/static/geneNomenclature.jsp
- Zebrafish: https://wiki.zfin.org/display/general/ZFIN+Zebrafish+Nomenclature+Guidelines
- Drosophila: http://flybase.org/static_pages/docs/nomenclature/nomenclature3.html
- C. elegans: http://wiki.wormbase.org/index.php/Nomenclature
- Arabidopsis: http://www.arabidopsis.org/portals/nomenclature/guidelines.jsp
- S. cerevisiae: http://www.yeastgenome.org/gene_guidelines.shtml
- S. pombe: http://www.pombase.org/submit-data/gene-naming-guidelines
Reference:
Gray KA, Yates B, Seal RL, Wright MW, Bruford EA. genenames.org: the HGNC resources in 2015. Nucleic Acids Res. 2015 Jan;43(Database issue):D1079-85. doi: 10.1093/nar/gku1071. PMID:25361968