OC Clone Classifications

The ClassificationFile lists all available ORFeome entry clones. Each clone is identified by its GenBank accession and assigned an annotated category. One column in the file reports how closely the clone matches its best RefSeq hit. Please note that the comparison is based on the amino acid sequence, so "SNPs" and "PartWithSNPs" indicate at least one amino acid substitution.

Classification File Description:

  1. Column: GenBankID of the entry-clone sequence.
  2. Column: RefSeq ID of the best BLAST hit. "No hit found" is shown when no RefSeq hit meets at least the conditions of PartWithSNPs (see column 3). Such cases are either provisional genes (would be category 9) or are severely truncated compared to a known gene (e.g. DQ892041).
  3. Column: Similarity, the level of identity between the entry-clone sequence and the RefSeq hit in column 2.
  4. Column: Category. Note: Category 9 (hits found only in Ensembl) has not been implemented yet; it is planned for the next version.
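
The four-column layout described above can be read with a few lines of Python. This is only a minimal sketch: the page does not state the delimiter or whether a header row is present, so the tab separator is an assumption, and the names `ClassificationRow` and `parse_classification` as well as the sample IDs are made up for illustration.

```python
import csv
import io
from typing import Iterator, NamedTuple, Optional


class ClassificationRow(NamedTuple):
    genbank_id: str            # column 1
    refseq_id: Optional[str]   # column 2, None when the file says "No hit found"
    similarity: str            # column 3, kept as the raw label (e.g. "SNPs")
    category: Optional[int]    # column 4, None when no category is given


def parse_classification(handle) -> Iterator[ClassificationRow]:
    """Yield one row per clone. Assumes tab-separated columns (not confirmed)."""
    for fields in csv.reader(handle, delimiter="\t"):
        if len(fields) < 4:
            continue  # skip blank or incomplete lines
        genbank_id, refseq_id, similarity, category = (f.strip() for f in fields[:4])
        yield ClassificationRow(
            genbank_id,
            None if refseq_id == "No hit found" else refseq_id,
            similarity,
            int(category) if category.isdigit() else None,
        )


# Placeholder rows, not real records from the classification file.
sample = "XX000001\tNM_000000.0\tSNPs\t1\nXX000002\tNo hit found\t\t\n"
for row in parse_classification(io.StringIO(sample)):
    print(row)
```

Keeping the similarity value as a plain string avoids guessing at the full set of labels, since only "SNPs" and "PartWithSNPs" are named on this page.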

For a complete listing of OC Classification files, click here.

For a complete listing of sequence-verified ORFeome clones (with plate and well information) click here.


Key to Categories of OC Clones*

  1. Reviewed by RefSeq and in CCDS
  2. Validated by RefSeq and in CCDS
  3. Reviewed by RefSeq
  4. Validated by RefSeq
  5. Provisional by RefSeq and in CCDS
  6. Provisional by RefSeq
  7. Predicted by RefSeq and in CCDS
  8. Predicted by RefSeq
  9. In Ensembl but not RefSeq

* These nine categories were developed by the ORFeome Collaboration to provide a measure of confidence in the protein-coding sequence of OC clones. They are listed from highest to lowest level of confidence.
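
Because the categories are ordered from highest to lowest confidence, a lower number means higher confidence. The sketch below encodes the key as a lookup table and shows one way to filter by a confidence cutoff. The names `CATEGORY_LABELS`, `describe`, and `meets_threshold` are hypothetical helpers, not part of any OC tool.

```python
# Category key copied from the list above; lower number = higher confidence.
CATEGORY_LABELS = {
    1: "Reviewed by RefSeq and in CCDS",
    2: "Validated by RefSeq and in CCDS",
    3: "Reviewed by RefSeq",
    4: "Validated by RefSeq",
    5: "Provisional by RefSeq and in CCDS",
    6: "Provisional by RefSeq",
    7: "Predicted by RefSeq and in CCDS",
    8: "Predicted by RefSeq",
    9: "In Ensembl but not RefSeq",
}


def describe(category: int) -> str:
    """Return the label for a category number, or raise for unknown values."""
    if category not in CATEGORY_LABELS:
        raise ValueError(f"unknown category: {category}")
    return CATEGORY_LABELS[category]


def meets_threshold(category: int, lowest_accepted: int) -> bool:
    """True if `category` is at least as confident as `lowest_accepted`."""
    return category <= lowest_accepted


# Example: keep only clones in category 4 or better.
categories = [1, 3, 5, 8]
print([c for c in categories if meets_threshold(c, 4)])
print(describe(1))
```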


CCDS transcripts reflect a collaborative curation effort between the European Bioinformatics Institute (EBI), the National Center for Biotechnology Information (NCBI), the Wellcome Trust Sanger Institute (WTSI), and the University of California, Santa Cruz (UCSC) "to identify a core set of protein coding regions that are consistently annotated and of high quality." (For more information, see: CCDS)


RefSeq Status Definitions (For more information, see: RefSeq Status):

