Biologists currently spend a great deal of time and effort searching
for all of the available information about each small area of research.
The search is hampered further by the wide variations in terminology
in common usage at any given time, which inhibit effective searching by
both computers and people. For example, if you were searching for new
targets for antibiotics, you might want to find all the gene products
that are involved in bacterial protein synthesis, and that have
significantly different sequences or structures from those in humans.
If one database describes these molecules as being involved in
'translation', whereas another uses the phrase 'protein synthesis', it
will be difficult for you - and even harder for a computer - to find
functionally equivalent terms.
The Gene Ontology (GO) project is a collaborative
effort to address the need for consistent descriptions of gene products
in different databases. The project began in 1998 as a collaboration
between three model organism databases: FlyBase (Drosophila), the
Saccharomyces Genome Database (SGD) and the Mouse Genome Database
(MGD). Since then, the GO Consortium has grown to include many
databases, including several of the world's major repositories for
plant, animal and microbial genomes. See the GO Consortium page for a full list of member organizations.
The GO project has developed three structured controlled vocabularies
(ontologies) that describe gene products in terms of their associated
biological processes, cellular components and molecular functions in a
species-independent manner. There are three separate aspects to this
effort: first, the development and maintenance of the ontologies
themselves; second, the annotation of gene products, which entails
making associations between the ontologies and the genes and gene
products in the collaborating databases; and third, development of
tools that facilitate the creation, maintenance and use of ontologies.
The use of GO terms by collaborating databases
facilitates uniform queries across them. The controlled vocabularies
are structured so that they can be queried at different levels: for
example, you can use GO to find all the gene products in the mouse
genome that are involved in signal transduction, or you can zoom in on
all the receptor tyrosine kinases. This structure also allows
annotators to assign properties to genes or gene products at different
levels, depending on the depth of knowledge about that entity.
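To make the idea of querying at different levels concrete, here is a minimal sketch in Python. It is not the GO Consortium's software or data: the term names, the is_a edges and the gene products (geneA, geneB, geneC) are hypothetical toy values chosen for illustration. The sketch assumes that each gene product is annotated to the most specific term known for it, and that a query for a broad term should also return gene products annotated to any of its more specific descendants.

```python
# Toy is_a edges (child term -> parent terms). Illustrative only, not real GO records.
PARENTS = {
    "signal transduction": ["biological process"],
    "receptor tyrosine kinase signaling": ["signal transduction"],
    "translation": ["biological process"],
}

# Hypothetical annotations: gene product -> most specific terms known for it.
ANNOTATIONS = {
    "geneA": ["receptor tyrosine kinase signaling"],
    "geneB": ["signal transduction"],
    "geneC": ["translation"],
}


def ancestors(term):
    """Return the term itself plus every term reachable by following is_a edges upward."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS.get(current, []))
    return seen


def gene_products_for(term):
    """Return gene products annotated to the term or to any of its descendants."""
    return sorted(
        gene
        for gene, terms in ANNOTATIONS.items()
        if any(term in ancestors(annotated) for annotated in terms)
    )


# Broad query: everything involved in signal transduction.
print(gene_products_for("signal transduction"))
# Narrow query: only the more specific term.
print(gene_products_for("receptor tyrosine kinase signaling"))
```

In this toy data the broad query returns both geneA and geneB, while the narrow query returns only geneA. A gene product about which little is known (geneB here) can be annotated at the general level and is still found by the broad query.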