UniProt Archive (UniParc) is part of the UniProt project. It is a non-redundant archive of protein sequences extracted from the public databases UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, PIR-PSD, EMBL, EMBL WGS, Ensembl, IPI, PDB, RefSeq, FlyBase, WormBase, H-Invitational Database, TROME database, European Patent Office proteins, United States Patent and Trademark Office proteins (USPTO) and Japan Patent Office proteins.
UniParc contains only protein sequences. All other information about
the protein must be retrieved from the source databases using the
database cross-references. Each unique sequence is stored only once
with a stable identifier. The format of the identifier is UPI followed
by ten hexadecimal digits, e.g. UPI000000000A.
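The identifier format described above can be checked mechanically. The following is a minimal sketch, not an official UniParc tool: the function name `is_uniparc_id` is hypothetical, and accepting only uppercase hexadecimal letters is an assumption based on the example identifier.

```python
import re

# "UPI" followed by exactly ten hexadecimal digits, as stated in the text.
# Restricting the letters to uppercase A-F is an assumption of this sketch.
UPI_PATTERN = re.compile(r"UPI[0-9A-F]{10}")


def is_uniparc_id(identifier: str) -> bool:
    """Return True if the whole string has the shape of a UniParc identifier."""
    return UPI_PATTERN.fullmatch(identifier) is not None


if __name__ == "__main__":
    for candidate in ("UPI000000000A", "UPI00000000A", "UPI000000000G"):
        print(candidate, is_uniparc_id(candidate))
```

A check like this only confirms the shape of the string; it says nothing about whether the identifier exists in UniParc.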
UniParc proteins are linked to their source databases by database
cross-references. Each cross-reference links one protein in UniParc to
an accession number in a source database. The database cross-reference
is active as long as the sequence identified by the source accession
number remains unchanged. When the sequence is modified or removed in
the source database, the cross-reference from UniParc becomes inactive.
Active cross-references can be used to directly access the source
databases, but inactive cross-references can only be used to access
sequence archives, such as the Sequence Version Archive.
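The storage and cross-reference rules above can be illustrated with a small in-memory model. This is a toy sketch under stated assumptions, not UniParc's actual implementation: the class names `CrossReference`, `Entry` and `ToyArchive`, the sequential numbering of identifiers, and the accession strings in the usage note are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class CrossReference:
    database: str
    accession: str
    active: bool = True


@dataclass
class Entry:
    upi: str
    sequence: str
    xrefs: list = field(default_factory=list)


class ToyArchive:
    """Stores each unique sequence once and tracks cross-reference status."""

    def __init__(self):
        self._by_sequence = {}  # sequence -> Entry
        self._current = {}      # (database, accession) -> (Entry, CrossReference)

    def load(self, database, accession, sequence):
        entry = self._by_sequence.get(sequence)
        if entry is None:
            # Sequential numbering is an assumption made for illustration only.
            upi = f"UPI{len(self._by_sequence) + 1:010X}"
            entry = Entry(upi, sequence)
            self._by_sequence[sequence] = entry
        key = (database, accession)
        previous = self._current.get(key)
        if previous is not None:
            previous_entry, previous_xref = previous
            if previous_entry is entry:
                return entry  # sequence unchanged, cross-reference stays active
            previous_xref.active = False  # sequence modified at the source
        xref = CrossReference(database, accession)
        entry.xrefs.append(xref)
        self._current[key] = (entry, xref)
        return entry

    def remove(self, database, accession):
        """Mark the cross-reference inactive when the source drops the sequence."""
        previous = self._current.pop((database, accession), None)
        if previous is not None:
            previous[1].active = False
```

For example, loading the same sequence under the placeholder accessions `ACC1` (Swiss-Prot) and `ACC2` (TrEMBL) yields a single entry with two active cross-references. Reloading `ACC1` with a different sequence creates a second entry and leaves the old `ACC1` cross-reference in place but inactive.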
UniParc is available for text- and sequence-based searches. Sequences
that are no longer part of any source database are excluded from
sequence-based searches, but they remain available for text-based SRS
searches. Performing a similarity search against UniParc is equivalent
to performing the same search against all databases cross-referenced in
UniParc, as UniParc contains all proteins from its source databases.
Sequence similarity searches can be done using FASTA, BLAST or Mpsrch.
UniProt Archive; a non-redundant archive of protein sequences extracted from Swiss-Prot, TrEMBL, PIR-PSD, EMBL, Ensembl, IPI, PDB, RefSeq, FlyBase, WormBase, European Patent Office, United States Patent and Trademark Office, and Japanese Patent Office
The UniProt Archive (UniParc) is a comprehensive non-redundant protein sequence archive. Its protein sequences are retrieved from predominant publicly accessible resources, including Swiss-Prot, TrEMBL, EMBL, Ensembl, RefSeq and PDB. To avoid redundancy, each unique sequence is stored only once with a stable protein identifier, which can be used later to identify the same protein in all source databases. When proteins are loaded into UniParc, database cross-references are created to link them to the origins of the sequences. As a result, performing a sequence search against UniParc is equivalent to performing the same search against all databases cross-referenced by it.
The UniProt Archive (UniParc), part of the UniProt databases, is an archival protein sequence collection drawn from all major publicly accessible resources. New and revised protein sequences are added to UniParc daily, and previous versions are not deleted. A UniParc sequence version is provided and incremented each time the underlying sequence changes, making it possible to observe the history of sequence changes in all source databases. To avoid redundancy, each unique sequence is assigned a unique identifier and is stored only once. The basic information stored with each UniParc entry is the identifier, the sequence, a cyclic redundancy check number (CRC64), the source database(s) with accession and version numbers, and a time stamp; all other information must be retrieved from the source databases. Each source database accession number is tagged with its status in that database, indicating whether the sequence still exists or has been deleted at that source.
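The sequence-version behaviour described above can be sketched as follows. This is an illustrative model only: the class name `VersionHistory` and its methods are hypothetical, and numbering versions from 1 per source accession is an assumption rather than documented UniParc behaviour.

```python
class VersionHistory:
    """Keeps every sequence seen for a source accession, never deleting old ones."""

    def __init__(self):
        self._history = {}  # (database, accession) -> list of sequences, oldest first

    def record(self, database, accession, sequence):
        """Store the sequence and return its version number (starting at 1)."""
        versions = self._history.setdefault((database, accession), [])
        # The version is incremented only when the underlying sequence changes.
        if not versions or versions[-1] != sequence:
            versions.append(sequence)
        return len(versions)

    def history(self, database, accession):
        """Return all recorded sequences for the accession, oldest first."""
        return list(self._history.get((database, accession), []))
```

Recording the same sequence twice leaves the version unchanged, while a revised sequence increments it and keeps the earlier one available through `history`.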