Table started by bio2rdf for the Public Base
This type describes the public bioinformatics databases that are accessible through Bio2RDF's rdfizer service.
1,829 Bioinformatics database from Banff Manifesto topics matching:
| name | image | Reserved namespace | Description | Provider homepage | article |
|---|---|---|---|---|---|
| BioCarta : Charting pathways of life |
|
biocarta | Observe how genes interact in dynamic graphical models. Our online maps depict molecular relationships from areas of active research. In an open source approach, this community-fed forum constantly integrates emerging proteomic information from the scientific community. It also catalogs and summarizes important resources providing information for over 120,000 genes from multiple species. Find both classical pathways as well as current suggestions for new pathways. | http://www.biocarta.com/genes/index.asp |
Observe how genes interact in dynamic graphical models. Our online
maps depict molecular relationships from areas of active research. In
an "open source" approach, this community-fed forum constantly
integrates emerging proteomic information from...
|
| Online maps of metabolic and signaling pathways | |||||
| ChEBI : Chemical Entities of Biological Interest database of small molecules |
|
chebi | http://www.ebi.ac.uk/chebi/ |
Chemical Entities of Biological Interest (ChEBI) is a freely
available dictionary of molecular entities focused on ‘small’ chemical
compounds. The term ‘molecular entity’ refers to any constitutionally
or isotopically distinct atom, molecule, ion,...
|
|
| PubChem : Information on biological activities of small molecule |
|
pubchem | Structures and biological activities of small organic molecules | http://pubchem.ncbi.nlm.nih.gov/ |
PubChem provides information on the biological activities of small
molecules. It is a component of NIH's Molecular Libraries Roadmap
Initiative. If you would like to learn more about how to use the
PubChem resources, please go to our help page. ...
|
| GO : Gene Ontology |
|
go | The objective of GO is to provide controlled vocabularies for the description of the molecular function, biological process and cellular component of gene products. These terms are to be used as attributes of gene products by collaborating databases, facilitating uniform queries across them. The controlled vocabularies of terms are structured to allow both attribution and querying to be at different levels of granularity. GO is a database independent of any other. GO itself is not populated with gene products of any organism, although tools can be built which allow GO to be displayed as if it were (e.g. http://www.fruitfly.org/annot/go). Databases external to GO collaborate with GO in three ways: first by making database cross-links between GO terms and objects in their database (typically, gene products, or their surrogates, genes), and then providing tables of these links to GO (and hence the community), second by supporting queries that use these terms in their database, and third by contributing to the development of the GO database itself, expanding the vocabularies and refining the terms. | http://www.geneontology.org/ |
Biologists currently waste a lot of time and effort in searching
for all of the available information about each small area of research.
This is hampered further by the wide variations in terminology that may
be common usage at any given time,...
|
| Gene ontology consortium database | |||||
| MeSH : Medical Subject Headings |
|
mesh | http://www.nlm.nih.gov/mesh/2005/MBrowser.html |
MeSH is the National Library of Medicine's controlled vocabulary
thesaurus. It consists of sets of terms naming descriptors in a
hierarchical structure that permits searching at various levels of
specificity. MeSH descriptors are arranged in both an...
|
|
| Taxon : NCBI Taxonomy | taxon | http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/ |
Organisms are classified in a hierarchical tree structure.
Our taxonomy database contains every node (taxon) of the tree.
UniProtKB taxonomy data is manually curated: next to manually
verified organism names,
we...
|
||
| GeneID : Database of genes from NCBI RefSeq genomes |
|
geneid | http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene |
Entrez Gene (www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene) is NCBI's database for gene-specific information. It does not include all known or predicted genes; instead Entrez Gene focuses on the genomes that have been completely sequenced, that...
|
|
| OMIM : Online Mendelian Inheritance in Man |
|
omim | The OMIM(TM) (Online Mendelian Inheritance in Man) database is a catalog of human genes and genetic disorders authored and edited by Dr. Victor A. McKusick and his colleagues at Johns Hopkins and elsewhere, and developed for the World Wide Web by NCBI, the National Center for Biotechnology Information. The database contains textual information, pictures, and reference information. | http://www.genome.ad.jp/dbget-bin/www_bfind_info?omim |
Welcome to OMIM, Online Mendelian Inheritance in Man. This database is a
catalog of human genes and genetic disorders authored and edited by Dr.
Victor A. McKusick and his colleagues at Johns Hopkins and elsewhere, and
developed for the World Wide...
|
| Online Mendelian Inheritance in Man (OMIM) is a comprehensive, authoritative and timely knowledgebase of human genes and genetic disorders compiled to support research and education in human genomics and the practice of clinical genetics. Started by Dr Victor A. McKusick as the definitive reference Mendelian Inheritance in Man, OMIM (http://www.ncbi.nlm.nih.gov/omim) is now distributed electronically by the National Center for Biotechnology Information (NCBI), where it is integrated with the Entrez suite of databases. Derived from the biomedical literature, OMIM is written and edited at Johns Hopkins University with input from scientists and physicians around the world. Each OMIM entry has a full-text summary of a genetically determined phenotype and/or gene and has numerous links to other genetic databases such as DNA and protein sequence, PubMed references, general and locus-specific mutation databases, approved gene nomenclature, and the highly detailed mapviewer, as well as patient support groups and many others. OMIM is an easy and straightforward portal to the burgeoning information in human genetics. | |||||
| RefSeq : NCBI Reference Sequences |
|
refseq | The NCBI Reference Sequence project (RefSeq) will provide reference sequence standards for the naturally occurring molecules of the central dogma, from chromosomes to mRNAs to proteins. RefSeq standards provide a foundation for the functional annotation of the human genome. They provide a stable reference point for mutation analysis, gene expression studies, and polymorphism discovery. Scope: Currently, RefSeq records are provided for the following molecule types and genomes (molecule, accession format, genome): Complete Genome, NC_######, Organelle; Complete Chromosome, NC_######, Saccharomyces cerevisiae; Genomic Contig, NT_######, Homo sapiens, Mus musculus; mRNA, NM_######, Limited Vertebrate, Homo sapiens, Mus musculus, Rattus norvegicus; Protein, NP_######, All of the above. | http://www.genome.ad.jp/dbget-bin/www_bfind_info?refseq |
The Reference Sequence (RefSeq) database
is a non-redundant collection of richly annotated DNA, RNA, and protein
sequences from diverse taxa. The collection includes sequences from
plasmids, organelles, viruses, archaea, bacteria, and eukaryotes....
|
| The National Center for Biotechnology Information Reference Sequence (RefSeq) database provides curated non-redundant sequence standards for genomic regions, transcripts (including splice variants), and proteins. Records are compiled using a combined approach of collaboration, automated methods, prediction, and curation and are extensively integrated with other NCBI resources facilitating navigation and discovery. RefSeq records represent the current best view of genomes and their transcript and/or protein products. | |||||
| GenBank : Nucleotide sequence database |
|
genbank | http://www.genome.ad.jp/dbget-bin/www_bfind_info?genbank-today |
The Entrez Nucleotide database is a collection of sequences from
several sources, including GenBank, RefSeq, and PDB. The number of bases
in these databases continues to grow at an exponential rate.
As of April 2006,...
|
|
| PubMed: NCBI bibliographic database |
|
pubmed | The PubMed database is available on the Entrez retrieval system, and was developed by the National Center for Biotechnology Information (NCBI) at the National Library of Medicine (NLM). PubMed provides free access to MEDLINE, NLM's database of more than 13 million bibliographic citations and abstracts in the fields of biomedicine, nursing, dentistry, veterinary medicine, health care systems, and preclinical sciences. PubMed also includes access to additional selected life sciences journals not in MEDLINE. PubMed's LinkOut feature (http://www.ncbi.nlm.nih.gov/entrez/linkout) provides access to a wide variety of relevant web-accessible online resources, including full-text publications, biological databases, consumer health information, and research tools. Links are also available to the molecular biology databases maintained by NCBI. New citations are typically added to PubMed Tuesday through Saturday. | http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed |
PubMed, available via the NCBI Entrez retrieval
system, was developed by the National
Center for Biotechnology Information (NCBI) at the
National Library of Medicine (NLM), located at the
U.S. National Institutes of Health ...
|
| HomoloGene : Discover Homologs |
|
homologene | http://www.ncbi.nlm.nih.gov/sites/entrez?db=homologene |
HomoloGene is a system for automated detection of homologs among
the annotated genes of several completely sequenced eukaryotic genomes.
|
|
| UniGene gene-oriented nucleotide sequence clusters | unigene | Expressed sequence tag (EST) analysis is being applied to a growing number of organisms. UniGene provides an organized view of these transcribed sequences by collapsing them into groups that correspond to genes and then connecting these entries to other classes of information that may shed light on gene expression and function. Traditionally, this grouping has been based on all-against-all sequence alignment among the transcript sequences. For the few organisms for which a genomic sequence is available, improved results have recently been obtained by using the genome as a scaffold for grouping transcripts. By taking genomic position into account, it is possible to bring together gene fragments with very short overlaps and to separate near-identical gene family members that might have been inappropriately merged. Tools provided on the UniGene web site allow researchers to browse cDNA libraries by developmental stage or tissue of origin, view surrogate expression profiles based on EST counts, and find groups of genes with similar expression patterns. | http://www.ncbi.nih.gov/UniGene | ||
| UniSTS : Integrating Markers and Maps |
|
unists | UniSTS is a comprehensive database of sequence tagged sites (STSs) derived from STS-based maps and other experiments. STSs are defined by PCR primer pairs and are associated with additional information such as genomic position, genes, and sequences. | http://www.ncbi.nlm.nih.gov/sites/entrez?db=unists |
UniSTS is a comprehensive database of sequence tagged sites (STSs) derived from STS-based maps and other experiments. STSs are defined by PCR primer pairs and are associated with additional information such as genomic position, genes, and sequences...
|
| [bm:links_directory] Bioinformatics Links Directory | links_directory |
The Bioinformatics Links Directory features curated links to molecular
resources, tools and databases. The links listed in this directory are
selected on the basis of recommendations from bioinformatics experts in
the field. We also rely on input...
|
|||
| CPD : KEGG Ligand Database for Chemical Compound |
|
cpd |
KEGG COMPOUND is a chemical structure database for metabolic compounds and other chemical substances that are relevant to biological systems.
Each entry is identified by the C number, such as C00047 for L-lysine, and contains various links to other...
|
||
| [bm:omia] Online Mendelian Inheritance in Animals |
|
omia | Catalog of animal genetic and genomic disorders | ||
| DR : KEGG Ligand Database for Drug |
|
dr |
KEGG DRUG is a chemical structure based information resource for
all approved drugs in Japan and the U.S.A. Each chemical structure is
identified by the D number, and is associated with generic names, trade
names, efficacy, target information, etc....
|
||
| GL : KEGG Ligand Database for Carbohydrate Structure |
|
gl |
Functional information of genes and proteins is organized in KEGG as ortholog groups, called KEGG Orthology (KO) groups, to cover all organisms. The KO groups for glycosyltransferases are finely classified ortholog groups distinguishing known...
|
||
| RN : KEGG Ligand Database for Chemical Reaction |
|
rn |
KEGG REACTION contains all reactions taken from KEGG ENZYME and additional reactions taken from the KEGG metabolic pathways.
Each reaction is identified by the R number, such as R00259
for the acetylation of L-glutamate.
Reactions are linked to...
|
||
| PIRSF; a whole-protein classification database |
|
pirsf | The PIRSF is a network system of protein classification that reflects evolutionary relationships of full-length proteins and domains. The PIRSF classification system accommodates a flexible number of levels that reflect varying degrees of sequence conservation from superfamily to subfamily levels, allowing improved protein annotation, more accurate extraction of conserved functional residues, and classification of distantly related orphan proteins. The primary PIRSF classification unit is the homeomorphic family, whose members are both homologous (evolved from a common ancestor) and homeomorphic (sharing full-length sequence similarity and a common domain architecture). The PIRSF database consists of two data sets, preliminary clusters and curated families. PIRSF families are curated systematically based on literature review and integrative sequence and functional analysis, including sequence and structure similarity, domain architecture, functional association, genome context, and phyletic pattern. The results of classification and expert annotation are summarized in PIRSF family reports with graphical viewers for taxonomic distribution, domain architecture, family hierarchy, and multiple alignment and phylogenetic tree. The PIRSF system provides a comprehensive resource for bioinformatics analysis and comparative studies of protein function and evolution. Domain searches allow identification of evolutionarily related protein families sharing domains or structural folds. Functional convergence and functional divergence are revealed by the relationships between protein classification and curated family functions. The taxonomic distribution allows the identification of lineage-specific or broadly conserved protein families and can reveal horizontal gene transfer. 
PIRSF is accessible from the web site at http://pir.georgetown.edu/pirsf for report retrieval and sequence classification, and the data sets can be downloaded at ftp://ftp.pir.georgetown.edu/pir_databases/pirsf/dagfiles. | http://pir.georgetown.edu/pirsf/ | |
| IProClass : Integrated Protein Knowledgebase |
|
iproclass | http://pir.georgetown.edu/iproclass/ |
The iProClass database provides value-added information reports for UniProtKB and unique NCBI Entrez protein sequences in UniParc, with links to over 90 biological databases, including databases for protein families, functions and pathways,...
|
|
| [bm:iprolink] Integrated Protein Literature, INformation and Knowledge |
|
iprolink | |||
| UniRef : UniProt Non-redundant Reference Databases |
|
uniref | The UniProt Reference Clusters are three separate datasets that compress sequence space at different resolutions, achieved by merging sequences and sub-sequences that are 100% (UniRef100), ≥90% (UniRef90), or ≥50% (UniRef50) identical, regardless of source organism. The UniRef100 database provides the most comprehensive non-redundant coverage of the known protein sequence space including not only all of UniProtKB but also splice variants that are not separated out in these databases, as well as additional active sequences from UniParc. The UniRef90 and UniRef50 databases provide a more even sampling of sequences by reducing the numbers of closely related sequences. This speeds sequence similarity searches while rendering such searches more informative. The compression of UniRef100 into UniRef90 and UniRef50 yields size reductions of approximately 40% and 65%, respectively. | http://www.uniprot.org/database/nref.shtml |
The UniProt NREF (UniProt Reference Clusters) database.
The two major objectives of UniRef are:
(i) to facilitate sequence merging in UniProt, and
(ii) to allow faster and...
|
| UniParc : UniProt Archive a non-redundant archive of protein sequences extracted from Swi |
|
uniparc | The UniProt Archive (UniParc) is a comprehensive non-redundant protein sequence archive. Its protein sequences are retrieved from predominant publicly accessible resources, including Swiss-Prot, TrEMBL, EMBL, Ensembl, RefSeq and PDB. To avoid redundancy each unique sequence is stored only once with a stable protein identifier, which can be used later to identify the same protein in all source databases. When proteins are loaded into UniParc, database cross-references are created to link them to the origins of the sequences. As a result, performing a sequence search against UniParc is equivalent to performing the same search against all databases cross-referenced by it. | http://www.ebi.ac.uk/uniparc/ |
UniProt Archive (UniParc) is part of the UniProt project. It is a non-redundant archive of protein sequences extracted from public databases UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, PIR-PSD, EMBL, EMBL WGS, Ensembl, IPI, PDB, PIR-PSD, RefSeq, FlyBase, ...
|
| The UniProt archive (UniParc), part of the UniProt databases, is an archival protein sequence collection from all major publicly accessible resources. New and revised protein sequences are added daily into UniParc while not deleting the previous versions. A UniParc sequence version is provided and incremented each time the underlying sequence changes, making it possible to observe the history of sequence changes in all source databases. To avoid redundancy, each unique sequence is assigned a unique identifier and is stored only once. The basic information stored with each UniParc entry is the identifier, the sequence, cyclic redundancy check number (CRC64), source database(s) with accession and version numbers, and a time stamp; all other information must be retrieved from the source databases. Each source database accession number is tagged with its status in that database, indicating if the sequence still exists or has been deleted at that source. | |||||
| UniProt : The Universal Protein Resource |
|
uniprot | The Swiss-Prot, TrEMBL, and PIR protein database activities have united to form the Universal Protein Resource (UniProt), which provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations, and literature-based evidence attribution enable scientists to analyze proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90), or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. The UniProt databases continue to grow in size and in availability of information. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division, and complete proteomes. A bibliography mapping service has been added, and an ID mapping service will be available soon. | http://www.uniprot.org/ |
The Swiss-Prot, TrEMBL, and PIR protein database activities have united to form the Universal Protein Resource (UniProt), which provides a central resource on protein sequences and functional annotation with three database components, each...
|
| Any keyword, e.g. 'wuschel' | keyword | ||||
| [bm:citations] | citations | ||||
| KEGG : Kyoto Encyclopedia of Genes and Genomes |
|
kegg | A large database of metabolic pathways from multiple organisms. Reference pathways are available that describe a general pathway, which is then projected onto a genome based on the presence of pathway enzymes in that genome. KEGG is actually composed of a number of databases: PATHWAY, GENES and LIGAND. Some information on molecular complexes and signaling pathways is present, although mainly in diagram form. | http://www.genome.ad.jp/kegg/ |
A grand challenge in the post-genomic era is a complete computer
representation of the cell, the organism, and the biosphere, which will
enable computational prediction of higher-level complexity of cellular
processes and organism behaviors from...
|
| KEGG (Kyoto Encyclopedia of Genes and Genomes) is the primary database resource of the Japanese GenomeNet service (http://www.genome.ad.jp) for understanding higher order functional meanings and utilities of the cell or the organism from its genome information. KEGG consists of the PATHWAY database for the computerized knowledge on molecular interaction networks such as pathways and complexes, the GENES database for the information about genes and proteins generated by genome sequencing projects, and the LIGAND database for the information about chemical compounds and chemical reactions that are relevant to cellular processes. In addition to these three main databases, limited amounts of experimental data for microarray gene expression profiles and yeast two-hybrid systems are stored in the EXPRESSION and BRITE databases, respectively. Furthermore, a new database named SSDB is made available for exploring the universe of all protein coding genes in the complete genomes and for identifying functional links and ortholog groups. The data objects in the KEGG databases are all represented as graphs and various computational methods are developed to detect graph features that can be related to biological functions. For example, the correlated clusters are graph similarities which can be used to predict a set of genes coding for a pathway or a complex, as summarized in the ortholog group tables, and the cliques in the SSDB graph are used to annotate genes. The KEGG databases are updated daily and made freely available (http://www.genome.ad.jp/kegg). | |||||
| MGI : Mouse genome database (MGD) from Mouse Genome Informatics (MGI) |
|
mgi | http://www.informatics.jax.org/ |
Gene Expression Database (GXD) Project
GXD integrates different types of gene expression information from the
mouse and provides a searchable index of published experiments on
endogenous gene expression during development. (See About GXD.)
Mouse...
|
|
| PDB : The RCSB Protein Data Bank; a repository for 3D biological macromolecular |
|
pdb | http://www.rcsb.org/pdb/ |
Welcome. The PDB is the single worldwide repository for the processing and distribution of 3-D structure data of large molecules of proteins and nucleic acids. New structures are released each Wednesday by 1:00am Pacific time. Details about the...
|
|
| Reactome : A knowledgebase of biological pathways and processes |
|
reactome | Reactome is a database of cellular level processes from simple events, such as biochemical reactions, to complex events, such as the cell cycle. It provides process-level annotation of the structure and function of the Human Genome, with dynamic links to other databases relevant to the human system, and to the relevant literature. Our ontology ensures that the various events are linked in an appropriate spatial and temporal context. The database is produced by faculty-level authors, recruited from the biological research community, who write review-style articles on a set of related pathways using a template tool provided by us. The reviews are edited by the staff at CSHL and the EBI, and entered into a relational database. They are then reviewed by other biological researchers for consistency and accuracy. This database was formerly called The Genome Knowledgebase. | http://www.reactome.org/ |
Reactome is a database of cellular level processes from simple events, such as biochemical reactions, to complex events, such as the cell cycle. It provides process-level annotation of the structure and function of the Human Genome, with dynamic...
|
| Reactome is a curated database of biological processes in humans. It covers biological pathways ranging from the basic processes of metabolism to high-level processes such as hormonal signalling. While Reactome is targeted at human pathways, it also includes many individual biochemical reactions from non-human systems such as rat, mouse, fugu fish and zebra fish. This makes the database relevant to the large number of researchers who work on model organisms. All the information in Reactome is backed up by its provenance: either a literature citation or an electronic inference based on sequence similarity. Our ontology ensures that the various events are linked in an appropriate spatial and temporal context. The basic information in Reactome is provided by bench biologists who are experts in that domain of biology. The information is then managed and edited by the Reactome staff at CSHL and the EBI, and entered into a relational database. They are then reviewed by other biological researchers for consistency and accuracy. Following peer-review, the information is published to the web. Reactome supersedes an earlier project called The Genome Knowledgebase and incorporates all the information previously available in its predecessor. Reactome sports a radically redesigned user interface in which the entire set of human pathways known to the database are represented as a series of constellations in a "starry sky." The starry sky can be used to navigate through the universe of human reactions and is invaluable to visualize connections between pathways, some of which will be surprising to biologists who are not familiar with pathways outside their domain of research. | |||||
| PROSITE : A protein domain and family database |
|
prosite | The PROSITE data bank is composed of two ASCII (text) files. The first file (PROSITE.DAT) is a computer readable file that contains all the information necessary to programs that will scan sequence(s) with patterns and/or matrices. The second file (PROSITE.DOC) contains textual information that fully documents each pattern and profile. We must point out that we strongly urge software developers to build software tools that make use of both files. A list of patterns or profiles present in a sequence is not very useful to biologists without the relevant documentation. | http://www.genome.ad.jp/dbget-bin/www_bfind_info?prosite |
PROSITE consists of documentation entries describing protein domains, families and functional sites as well as associated patterns and profiles to identify them. PROSITE is complemented...
|
| The PROSITE database (http://www.expasy.org/prosite) consists of a large collection of biologically meaningful signatures that are described as patterns or profiles. Each signature is linked to a documentation that gives useful biological information on the protein family, domain, or functional site identified by the signature. | |||||
| Ensembl : Eukaryotic genome annotation project |
|
ensembl | Ensembl is a joint project between EMBL-EBI (http://www.ebi.ac.uk) and the Sanger Institute (http://www.sanger.ac.uk) to develop a software system which produces and maintains automatic annotation on metazoan genomes. Ensembl is primarily funded by the Wellcome Trust (http://www.wellcome.ac.uk). This site provides free access to all the data and software from the Ensembl project. | http://www.ensembl.org/ | |
| HGNC : Human Gene Nomenclature Database |
|
hgnc | The HUGO Gene Nomenclature Committee (HGNC) designates approved symbols for all human genes, in accordance with the Guidelines for Human Gene Nomenclature (http://www.genenames.org/guidelines.html). It is necessary to provide a unique symbol for each gene, preferably one which maintains parallel construction in different members of a gene family and can also be used in other species, especially the mouse. The HGNC is responsible for the assignment of these symbols as well as a longer and more descriptive gene name. Considerable efforts are made to use a symbol acceptable to workers in the field, but sometimes it is not possible to use exactly what has previously appeared in the literature. However, wherever the HGNC is aware of such symbols, they are listed as aliases and information on the gene in question can be retrieved by searching with the aliases and the approved symbol in the HGNC Database (http://www.genenames.org/cgi-bin/hgnc_search.pl). Approved gene symbols are marked as such and given priority in databases including Entrez Gene, Ensembl, GeneCards, OMIM and GenAtlas. The HGNC also works closely with a number of journals to promote standardization of gene nomenclature. These include Nature Genetics, Nature, Genomics, Human Mutation and Cytogenetic and Genome Research. | http://www.genenames.org/ |
We have already approved over 24,000 symbols;
the vast majority of these are for protein-coding genes, but also
include symbols for pseudogenes, non-coding RNAs,
phenotypes and genomic features (see HGNC Search).
Our current priority is...
|
| Genew, the Human Gene Nomenclature Database, is the only resource that provides data for all human genes which have approved symbols. It is managed by the HUGO Gene Nomenclature Committee (HGNC) as a confidential database, containing over 16,000 records, 80% of which are represented on the Web by searchable text files. The data in Genew are highly curated by HGNC editors and gene records can be searched on the Web by symbol or name to directly retrieve information on gene symbol, gene name, cytogenetic location, OMIM number and PubMed ID. Data are integrated with other human gene databases, e.g. GDB, LocusLink and SWISS-PROT, and approved gene symbols are carefully co-ordinated with the Mouse Genome Database (MGD). Approved gene symbols are available for querying and browsing at: http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/searchgenes.pl. | |||||
| Path : KEGG PATHWAY Database |
|
path | http://www.genome.ad.jp/kegg/pathway.html |
KEGG PATHWAY is a collection of manually drawn pathway maps representing our knowledge on the molecular interaction and reaction networks for: 1. Metabolism Carbohydrate Energy ...
|
|
| x EC : The Enzyme Commission |
|
ec | http://www.chem.qmw.ac.uk/iubmb/enzyme/ |
ENZYME is a repository of information relative to the nomenclature of enzymes. It is primarily based on the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB) and it describes each...
|
|
| x [bm:books] | books | ||||
| x [bm:cancerchromosomes] | cancerchromosomes | ||||
| x Conserved Domain Database | cdd | The Conserved Domain Database (CDD) is a compilation of multiple sequence alignments representing protein domains conserved in molecular evolution. It has been populated with alignment data from the public collections Pfam (1) and Smart (2), as well as with contributions from colleagues at NCBI. The current version of CDD (v1.53) contains 3551 such models. CDD alignments are linked to protein sequence and structure data in Entrez (3). The molecular structure viewer Cn3D (4) serves as a tool to interactively visualize alignments and three-dimensional structure, and to annotate three-dimensional residue coordinates with evolutionarily conserved features. CDD can be accessed on the World Wide Web at http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml. Protein query sequences may be compared against databases of position-specific score matrices (PSSMs) derived from alignments in CDD, using a service named CD-Search, which can be found at http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi. CD-Search runs reverse position-specific BLAST (RPS-BLAST), a variant of the widely used PSI-BLAST algorithm (5). CD-Search is run by default for protein-protein queries submitted to NCBI's BLAST service at http://www.ncbi.nlm.nih.gov/BLAST. | http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml | ||
| x [bm:domains] | domains | ||||
| x [bm:gds] | gds | ||||
| x [bm:gene] | gene | Gene Aging Nexus (GAN, http://gan.usc.edu) is a data mining platform for the biogerontological-geriatric research community. It enables users to analyze, query, and visualize aging-related genomic data. Our goal is to facilitate the digestion and usage of public genomic data. A current focus is on integrative analysis of microarray gene expression data. We are establishing a central database for aging microarray data of six species: human (H. sapiens), rat (R. norvegicus), mouse (M. musculus), fruit fly (D. melanogaster), worm (C. elegans), and yeast (S. cerevisiae). GAN is equipped with a set of bioinformatics tools for cross-platform and cross-species analysis of the microarray data sets. | |||
| Gene expression in dental tissue | |||||
| x [bm:genome] | genome | Genome Information Broker is a comprehensive database of microbial genomes in the public domain. We started the database to distribute the complete genome data of Escherichia coli and have expanded it to capture all the microbial genome data from the databases of the International Nucleotide Sequence Database Collaboration (http://www.insdc.org). The user interface is interactive and composed of text and graphics. For the retrieval and analysis of sequences and annotation, GC plot, BLAST, and search by ORF name and keyword are available for either a single genome or multiple genomes in a session. | http://www.genome.ad.jp/dbget-bin/www_bfind_info?genome | ||
| The goal of the Genome Reviews project is to provide an up-to-date, standardised and comprehensively annotated view of the genomic sequence of organisms with completely deciphered genomes. Genome Reviews are curated versions of EMBL/GenBank/DDBJ database entries. Each Genome Review represents an enhanced version of the original sequence, with additional annotation imported from other data sources such as the UniProt knowledgebase, the GOA (GO Annotation) project, InterPro etc. In addition, annotations used inconsistently among the original submissions have been standardised, and deleted in cases where the coverage is low, making it easier to compare data across several genomes. | |||||
| Genomics of apple, cherry, peach, pear, raspberry, rose and strawberry | |||||
| x [bm:gensat] | gensat | The GENSAT (Gene Expression Nervous System ATlas) database captures information on gene expression in mouse brain at several developmental ages. The project began at Rockefeller University in New York under the direction of Nat Heintz, and has since expanded to include additional experimental protocols authored by Tom Curran at St. Jude Children's Research Hospital in Memphis, TN. The GENSAT project seeks to map the expression of all genes expressed in mouse brain at four different developmental ages: embryonic days 10.5 and 15.5, post-natal day 7, and adult. The GENSAT project has mapped 2,148 genes to date. The data are available at NCBI in the form of a searchable image database. Images from the Rockefeller University data set include both confocal and brightfield images at very high resolutions, demonstrating expression at the resolution of cell bodies and processes; images from the St. Jude Children's Research Hospital set include high-throughput in situ hybridization images. All images are browseable in their original unmodified form, as well as at several levels of resolution designed to aid interactivity. In addition, the images from Rockefeller University have a searchable set of human-curated annotations describing expression levels and patterns in a wide variety of anatomical structures. The GENSAT project's techniques are described in the paper "A gene expression atlas of the central nervous system based on bacterial artificial chromosomes", published in Nature in 2003 (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=14586460). The GENSAT database is available at NCBI through two search interfaces. The primary Entrez search interface is found at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=gensat.
Simply searching for a gene symbol or gene name will return information about that gene. An additional search interface allows finer-grained control over the returned results; it can be found at http://www.ncbi.nlm.nih.gov/projects/gensat. This second search interface permits easier subsetting of the data based on regional expression patterns. Additional information on the GENSAT project is available from Rockefeller University at http://www.gensat.org/index.html and from St. Jude Children's Research Institute at http://www.stjudebgem.org/web/mainPage/mainPage.php. | |||
| x [bm:geo] | geo | GEO serves as a public repository for a wide range of high-throughput experimental data. These data include single and dual channel microarray-based experiments, measuring the abundance of mRNA, genomic DNA and protein molecules. Data from non-array-based high-throughput functional genomics and proteomics technologies are also archived, including serial analysis of gene expression (SAGE) and protein identification technology. Several tools are provided to assist with the visualization and exploration of GEO data, including hierarchical cluster heat maps and searchable individual gene expression profiles. Suitable GEO records are assembled into biologically and statistically comparable datasets. These datasets are then indexed and loaded into Entrez GEO Profiles and Entrez GEO DataSets, which allow users to query and analyze the data using simple boolean queries, and provide links to other NCBI resources wherever possible (e.g. sequence, publication and genomic mapping). GEO is available on the World Wide Web at http://www.ncbi.nlm.nih.gov/geo. GEO DataSets may be browsed at http://www.ncbi.nlm.nih.gov/geo/gds/gds_browse.cgi or searched by keywords at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gds. GEO Profiles are searchable at http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=geo. All these tools are intralinked and accessible from the GEO home page. | |||
| x [bm:journals] | journals | ||||
| x [bm:nucleotide] | nucleotide |
The Entrez Nucleotide database is a collection of sequences from
several sources, including GenBank, RefSeq, and PDB. The number of
bases in these databases continues to grow at an exponential rate. As
of April 2006, there are over 130 billion bases...
|
|||
| x [bm:pcassay] | pcassay | ||||
| x [bm:pccompound] | pccompound | ||||
| x [bm:pcsubstance] | pcsubstance | ||||
| x [bm:pmc] | pmc | ||||
| x [bm:popset] | popset | ||||
| x Protein : Entrez Protein | protein | The Protein Kinase Resource (PKR) is a curated information source which provides an integrated view of sequence and structure data combined with biochemical and genetic function data focused on a single family of proteins, the protein kinases. In addition, the PKR provides tools for computational analysis of protein kinase sequence and structure and an email discussion forum devoted to protein kinases. | http://www.genome.ad.jp/dbget-bin/www_bfind_info?protein |
The protein entries in the Entrez search and retrieval system have been
compiled from a variety of sources, including SwissProt, PIR, PRF, PDB,
and translations from annotated coding regions in GenBank and RefSeq.
|
|
| The Protein Mutant Database (PMD) is a compilation of protein mutant data, providing information on amino acid mutations at specific positions of proteins and in some cases the structural alterations caused by them. We developed a powerful viewing and retrieving system (http://pmd.ddbj.nig.ac.jp), which is integrated with sequence and tertiary structure databases. The system has the following features: (1) Mutated sequences are displayed after being automatically generated from the information described in the entry together with the sequence data of the wild-type protein. This convenient feature allows the user to see the positions of altered amino acids (shown in a different color) in the entire sequence of the wild-type protein. (2) For those proteins whose 3D structures have been experimentally determined, the 3D structures are displayed with the mutation sites colored differently. (3) A sequence homology search against the PMD can be carried out with any query sequence. (4) A summary of mutations in homologous sequences can be displayed, showing all the mutations of a protein recorded in the PMD. | |||||
| x [bm:snp] | snp | ||||
| x [bm:structure] | structure | Pairwise superposition of TIM-barrel structures | |||
| x Taxonomy : NEWT is the taxonomy database maintained by the UniProt |
|
taxonomy | The taxonomy database of the International Sequence Database Collaboration contains the names of all organisms that are represented in the sequence databases with at least one nucleotide or protein sequence. | http://beta.uniprot.org/taxonomy/ |
NEWT is the taxonomy database maintained by the UniProt group. It integrates taxonomy data compiled in the NCBI database and data specific to the UniProt Knowledgebase. Species with protein sequences stored in...
|
| x Arabidopsis Biological Resource Center | abrc_code | ||||
| x Accession ID of the AFAWE (Automatic functional annotation in a distributed Web | afawe_id | ||||
| x Namespace for the Affymetrix identification code for each set of probe pairs on | affymetrixprobesetid | ||||