URL: http://bioinformatics.vub.ac.be/databases/databases.html
Proper Citation: SABmark (RRID:SCR_011817)
Description: Downloadable data set designed to assess the performance of both multiple and pairwise protein sequence alignment algorithms, and intended to be easy to use. Currently, the database contains 2 sets, each consisting of a number of subsets with related sequences. Its main features are:
* Covers the entire known fold space (SCOP classification), with subsets provided by the ASTRAL compendium
* All structures are of high quality, with 100% resolved residues
* Structure alignments have been derived carefully, using both SOFI and CE, and Relaxed Transitive Alignment
* At most 25 sequences in each subset, to avoid overrepresentation of large folds
* Automated running, archiving and scoring of programs through a few Perl scripts

The Twilight Zone set is divided into sequence groups that each represent a SCOP fold. All sequences within a group share a pairwise Blast e-value of at least 1, for a theoretical database size of 100 million residues. Sequence similarity is thus very low, between 0 and 25% identity, and a (traceable) common evolutionary origin cannot be established between most pairs even though their structures are (distantly) similar. This set therefore represents the worst-case scenario for sequence alignment, which unfortunately is also the most frequent one, as most related sequences share less than 25% identity.

The Superfamilies set consists of groups that each represent a SCOP superfamily, and therefore contain sequences with a (putative) common evolutionary origin. However, they share at most 50% identity, which is still challenging for any sequence alignment algorithm.

Alignments are frequently performed to establish whether or not sequences are related. To benchmark this, a second version of both the Twilight Zone and the Superfamilies set is provided, in which a number of false positives, i.e. sequences not related to the original set, are added to each alignment problem.
Database specifications:
* Current version: 1.65 (concurrent with PDB, SCOP and ASTRAL)
* Twilight Zone set (with false positives): 209 groups, 1740 (3280) sequences, 10667 (44056) related pairs
* Superfamilies set (with false positives): 425 groups, 3280 (6526) sequences, 19092 (79095) related pairs
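
The identity limits quoted in the description (at most 25% for the Twilight Zone set, at most 50% for the Superfamilies set) can be illustrated with a minimal Python sketch. It is not part of SABmark or its Perl scripts: the function names `percent_identity` and `within_thresholds` are hypothetical, and the identity definition used here (identical residues divided by the number of columns in which both sequences have a residue) is an assumption, since several definitions are in common use. Identity alone does not determine set membership, which also depends on the SCOP grouping and the Blast e-value criterion.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two already-aligned sequences.

    Assumption: columns where either sequence has a gap are ignored,
    and the denominator is the number of remaining columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def within_thresholds(pid: float) -> dict:
    """Report which of the two quoted identity limits a pair satisfies."""
    return {
        "twilight_zone_max_25": pid <= 25.0,
        "superfamilies_max_50": pid <= 50.0,
    }


if __name__ == "__main__":
    # Toy aligned pair, invented purely for illustration.
    pid = percent_identity("ACDE-F", "ACGE-F")
    print(pid, within_thresholds(pid))
```

A pair that satisfies the 25% limit necessarily satisfies the 50% limit as well, which is why the sketch reports both flags instead of assigning a single label.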
Abbreviations: SABmark
Synonyms: SABmark - Sequence and structure Alignment Benchmark, Sequence Alignment Benchmark, Sequence and structure Alignment Benchmark
Resource Type: data set, data or information resource
Defining Citation: PMID:15333456
Source: SciCrunch Registry