[BioC] Blast analysis of two sequences in R

Martin Morgan mtmorgan at fhcrc.org
Wed Apr 23 19:12:10 CEST 2014


On 04/21/2014 03:51 AM, prabhakar ghorpade wrote:
> Hello,
>        I have the following sequences that I want to BLAST, and I want to see results only for hits showing > 95% query coverage and > 90% identity.
>
> Sequences
>> 1
> CTTTGTCTTCCCCTATGTCATCCTCAATCTCTATGAAAGCAACAC
>> 2
> CTATGTCATCCTCAATCTCTATGAAAGCAACACCGCTACCATAGA

The 'annotate' package has 'blastSequences', though I'm not sure that it's
useful enough for your purposes. In the 'devel' branch (see

   http://bioconductor.org/developers/how-to/useDevel/

for how to use it) it has been updated to be more responsive and to return
richer data, e.g.,

    library(annotate)  # provides blastSequences()
    df = blastSequences("CTTTGTCTTCCCCTATGTCATCCTCAATCTCTATGAAAGCAACAC",
                        timeout=40, as="data.frame")

 > head(df, 1)
   Hit_num                      Hit_id
1       1 gi|380719094|gb|JQ281544.1|
                                         Hit_def Hit_accession Hit_len Hsp_num
1 Expression vector pAV-UCSF, complete sequence      JQ281544   11534       1
   Hsp_bit-score Hsp_score Hsp_evalue Hsp_query-from Hsp_query-to Hsp_hit-from
1       82.4379        90 1.2063e-13              1           45         2126
   Hsp_hit-to Hsp_query-frame Hsp_hit-frame Hsp_identity Hsp_positive Hsp_gaps
1       2170               1             1           45           45        0
   Hsp_align-len                                      Hsp_qseq
1            45 CTTTGTCTTCCCCTATGTCATCCTCAATCTCTATGAAAGCAACAC
                                        Hsp_hseq
1 CTTTGTCTTCCCCTATGTCATCCTCAATCTCTATGAAAGCAACAC
                                     Hsp_midline
1 |||||||||||||||||||||||||||||||||||||||||||||
 >
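
One way to apply the thresholds from the question to a result like the one
above is to compute both percentages from the Hsp_* columns and subset the
data.frame. The following is only a sketch under some assumptions:
'filterHits' is a hypothetical helper (it is not part of 'annotate'), the
column names are taken from the head(df, 1) output above, the values are
converted explicitly because fields parsed from BLAST XML may arrive as
character, and the 'mock' data.frame (including its invented second row)
exists only so that the example runs without a network connection. Coverage
here is computed per HSP, which can differ from the merged "Query cover"
reported on the NCBI web page when a hit has several HSPs.

```r
# Hypothetical helper (not part of 'annotate'): keep HSPs above both thresholds.
filterHits <- function(df, query, minCoverage = 95, minIdentity = 90) {
    # convert possibly-character (or factor) columns to numeric
    num <- function(col) as.numeric(as.character(df[[col]]))
    qlen <- nchar(query)
    # per-HSP query coverage: aligned query span / query length
    coverage <- 100 * (num("Hsp_query-to") - num("Hsp_query-from") + 1) / qlen
    # percent identity: identical positions / alignment length
    identity <- 100 * num("Hsp_identity") / num("Hsp_align-len")
    keep <- coverage > minCoverage & identity > minIdentity
    cbind(df[keep, , drop = FALSE],
          coverage = coverage[keep], identity = identity[keep])
}

query <- "CTTTGTCTTCCCCTATGTCATCCTCAATCTCTATGAAAGCAACAC"

# Mock result: row 1 copies the hit shown above; row 2 is an invented
# partial hit, for illustration only.
mock <- data.frame(
    Hit_accession = c("JQ281544", "XX000000"),
    Hit_def = c("Expression vector pAV-UCSF, complete sequence",
                "made-up partial hit"),
    "Hsp_query-from" = c("1", "10"),
    "Hsp_query-to" = c("45", "40"),
    Hsp_identity = c("45", "25"),
    "Hsp_align-len" = c("45", "31"),
    check.names = FALSE, stringsAsFactors = FALSE
)

hits <- filterHits(mock, query)
hits$Hit_accession             # accessions passing both thresholds
length(unique(hits$Hit_def))   # rough count of distinct hit descriptions
```

With a real result, filterHits(df, query) would be called on the data.frame
returned by blastSequences(). Counting distinct Hit_def values is only a
rough stand-in for counting organisms, since the description is free text.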

Hope that helps; would be happy to hear of other R solutions.

Martin

>
>
> Can you please suggest how I can select them in R from NCBI BLAST so that I get only sequences showing > 95% query coverage and > 90% identity? Is there a program in R to select them? I want to find the number of organisms showing unique results for the given sequences.
>   Thanks.
>
>
> Dr. Ghorpade Prabhakar B.
> PhD Scholar ( Veterinary Biochemistry),
> IVRI,
> Izatnagar,
> Bareilly, U.P.,
> India


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


