CLOBB
Clustering sequences on the basis of BLAST
Current version 2.0 (2.0.2)
The program takes a set of DNA sequences and clusters them into groups which putatively derive from the same gene.
To run the program, the user must have BLASTALL in their path. The output is a blastable FASTA file named
<cluster_id>EST, where cluster_id is given by the user, which contains a list of sequences with identifiers
<cluster_id>00001 to <cluster_id>99999. The program BLASTs each sequence in turn against the growing database of
clusters, then examines the BLAST report for High Scoring Pairs (HSPs) which demonstrate near-identical regions of
sequence similarity (>95% identity over >30 bases; stringency can be controlled by the user). The query sequence is
then allocated to an existing or new cluster depending on the strength of the HSP(s) and the quality of the match in
the rest of the overlap.
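As a rough illustration of the default stringency and the identifier scheme described above, the following is a minimal Python sketch. It is not taken from the CLOBB source: the `HSP` class, its field names, and the functions `passes_threshold` and `cluster_name` are hypothetical, and the use of strict inequalities simply mirrors the ">95% identity over >30 bases" wording. It also ignores the second step, in which the quality of the match in the rest of the overlap is assessed.

```python
from dataclasses import dataclass


@dataclass
class HSP:
    """Hypothetical container for one High Scoring Pair from a BLAST report."""
    identities: int    # number of identical positions in the alignment
    align_length: int  # alignment length in bases


def passes_threshold(hsp, min_identity=0.95, min_length=30):
    """Return True if the HSP shows > min_identity over > min_length bases.

    The defaults follow the thresholds quoted in the text; both are
    parameters because the stringency is user-controllable.
    """
    if hsp.align_length <= min_length:
        return False
    return hsp.identities / hsp.align_length > min_identity


def cluster_name(cluster_id, n):
    """Format an identifier in the <cluster_id>00001 .. <cluster_id>99999 range."""
    if not 1 <= n <= 99999:
        raise ValueError("cluster number must be between 1 and 99999")
    return f"{cluster_id}{n:05d}"
```

For example, an HSP with 98 identities over 100 bases passes the default test, whereas a perfect match over only 30 bases does not, because the length must exceed 30.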
Version 2.0.2 includes improved annotation of the code, and better error reporting. If you use CLOBB2 within PartiGene, a version is supplied within the PartiGene distribution.
CLOBB2 parses the output of old-style megablast searches (i.e. not megablast searches performed within the BLAST+ suite). Furthermore, various changes to the megablast output from BLAST version 2.2.18 onwards 'break' CLOBB2. To use PartiGene you MUST therefore install BLAST version 2.2.17, available from the NCBI FTP site at ftp://ftp.ncbi.nlm.nih.gov/blast/executables/release/2.2.17/
For further details please see the User guide.
Website Highlight
The plant-parasitic nematode Heterodera glycines.
Plant-parasitic nematodes cause major economic losses worldwide. Heterodera glycines parasitises soy beans. See NEMBASE4 for analyses of ESTs from this parasite and many other nematodes.