UniProt Archive (UniParc) is part of the UniProt project. It is a non-redundant archive of protein sequences extracted from the following public databases: UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, PIR-PSD, EMBL, EMBL WGS, Ensembl, IPI, PDB, RefSeq, FlyBase, WormBase, H-Invitational Database, TROME database, European Patent Office proteins, United States Patent and Trademark Office (USPTO) proteins, and Japan Patent Office proteins.
UniParc contains only protein sequences. All other information about
the protein must be retrieved from the source databases using the
database cross-references. Each unique sequence is stored only once
with a stable identifier. The format of the identifier is UPI followed
by ten hexadecimal digits, e.g. UPI000000000A.
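The identifier format described above can be checked mechanically. The following is a minimal sketch, assuming uppercase hexadecimal digits only; the names `is_uniparc_id` and `format_uniparc_id` are hypothetical helpers introduced for illustration and are not part of any UniProt tool.

```python
import re

# "UPI" followed by exactly ten hexadecimal digits, as described in the text.
# Assumption: digits A-F are uppercase, as in the example UPI000000000A.
UPI_PATTERN = re.compile(r"UPI[0-9A-F]{10}")


def is_uniparc_id(identifier: str) -> bool:
    """Return True if the whole string has the UPI + ten hex digits shape."""
    return UPI_PATTERN.fullmatch(identifier) is not None


def format_uniparc_id(number: int) -> str:
    """Render an integer as a UPI-style identifier (illustrative only)."""
    if not 0 <= number < 16 ** 10:
        raise ValueError("number does not fit in ten hexadecimal digits")
    return f"UPI{number:010X}"
```

For example, `is_uniparc_id("UPI000000000A")` accepts the identifier quoted in the text, while a string with nine digits or a non-hexadecimal character is rejected.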
UniParc proteins are linked to their source databases by database
cross-references. Each cross-reference links one protein in UniParc to
an accession number in a source database. The database cross-reference
is active as long as the sequence identified by the source accession
number remains unchanged. When the sequence is modified or removed in
the source database, the cross-reference from UniParc becomes inactive.
Active cross-references can be used to directly access the source
databases, but inactive cross-references can only be used to access
sequence archives, such as the Sequence Version Archive.
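The storage and cross-reference rules above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not UniParc's actual implementation: the class and method names are hypothetical, identifiers are simply numbered in order of first appearance, and the accessions used in the usage note are made up.

```python
class CrossReference:
    """Link from one archived sequence to an accession in a source database."""

    def __init__(self, database, accession):
        self.database = database
        self.accession = accession
        self.active = True  # stays True while the source sequence is unchanged


class Archive:
    """Toy non-redundant archive: each unique sequence is stored only once."""

    def __init__(self):
        self._id_by_sequence = {}  # sequence -> identifier
        self._xrefs = {}           # identifier -> list of CrossReference
        self._current = {}         # (database, accession) -> identifier

    def _get_or_create(self, sequence):
        upi = self._id_by_sequence.get(sequence)
        if upi is None:
            # Assumption: identifiers are assigned sequentially.
            upi = f"UPI{len(self._id_by_sequence) + 1:010X}"
            self._id_by_sequence[sequence] = upi
            self._xrefs[upi] = []
        return upi

    def _deactivate(self, upi, key):
        for xref in self._xrefs[upi]:
            if (xref.database, xref.accession) == key:
                xref.active = False

    def load(self, database, accession, sequence):
        """Record that a source accession currently carries this sequence."""
        key = (database, accession)
        upi = self._get_or_create(sequence)
        previous = self._current.get(key)
        if previous == upi:
            return upi  # sequence unchanged, cross-reference stays active
        if previous is not None:
            self._deactivate(previous, key)  # sequence was modified
        self._xrefs[upi].append(CrossReference(database, accession))
        self._current[key] = upi
        return upi

    def remove(self, database, accession):
        """Record that the source database dropped this accession."""
        key = (database, accession)
        previous = self._current.pop(key, None)
        if previous is not None:
            self._deactivate(previous, key)

    def xrefs(self, upi):
        return list(self._xrefs[upi])
```

Loading the same sequence from two databases yields one identifier with two active cross-references. Loading a changed sequence under an existing accession deactivates the old cross-reference and attaches a new active one to the identifier of the new sequence.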
UniParc is available for text- and sequence-based searches. Sequences
that are no longer part of any source database are excluded from
sequence-based searches, but they are available for text-based SRS
searches. Performing a similarity search against UniParc is equivalent
to performing the same search against all databases cross-referenced in
UniParc, as UniParc contains all proteins from its source databases.
Sequence similarity searches can be done using FASTA, BLAST or MPsrch.
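The exclusion rule for sequence-based searches can be sketched as a simple filter. This assumes that "no longer part of any source database" means that every cross-reference of an entry is inactive; the function name and the `(database, accession, active)` tuple layout are hypothetical.

```python
from typing import Dict, List, Tuple

# (database, accession, active) -- an illustrative layout, not a UniParc format.
XRef = Tuple[str, str, bool]


def sequence_searchable(entries: Dict[str, List[XRef]]) -> List[str]:
    """Return identifiers that still have at least one active cross-reference."""
    return sorted(
        upi
        for upi, xrefs in entries.items()
        if any(active for _, _, active in xrefs)
    )
```

Entries filtered out here would still be reachable through a text-based search, since the archive keeps them with their inactive cross-references.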