[BioC] Blast analysis of two sequences in R
Martin Morgan
mtmorgan at fhcrc.org
Wed Apr 23 19:12:10 CEST 2014
On 04/21/2014 03:51 AM, prabhakar ghorpade wrote:
> Hello,
> I have the following sequences that I want to BLAST, and I want to see results only for hits showing >95% query coverage and >90% identity.
>
> Sequences
> >1
> CTTTGTCTTCCCCTATGTCATCCTCAATCTCTATGAAAGCAACAC
> >2
> CTATGTCATCCTCAATCTCTATGAAAGCAACACCGCTACCATAGA
The 'annotate' package has 'blastSequences'; I'm not sure that it's useful
enough for your purposes. In the 'devel' branch (see
http://bioconductor.org/developers/how-to/useDevel/)
it has been updated to be more responsive and to return richer data, e.g.,
library(annotate)
df = blastSequences("CTTTGTCTTCCCCTATGTCATCCTCAATCTCTATGAAAGCAACAC",
                    timeout=40, as="data.frame")
> head(df, 1)
  Hit_num                      Hit_id
1       1 gi|380719094|gb|JQ281544.1|
                                        Hit_def Hit_accession Hit_len Hsp_num
1 Expression vector pAV-UCSF, complete sequence      JQ281544   11534       1
  Hsp_bit-score Hsp_score Hsp_evalue Hsp_query-from Hsp_query-to Hsp_hit-from
1       82.4379        90 1.2063e-13              1           45         2126
  Hsp_hit-to Hsp_query-frame Hsp_hit-frame Hsp_identity Hsp_positive Hsp_gaps
1       2170               1             1           45           45        0
  Hsp_align-len                                      Hsp_qseq
1            45 CTTTGTCTTCCCCTATGTCATCCTCAATCTCTATGAAAGCAACAC
                                       Hsp_hseq
1 CTTTGTCTTCCCCTATGTCATCCTCAATCTCTATGAAAGCAACAC
                                    Hsp_midline
1 |||||||||||||||||||||||||||||||||||||||||||||
>
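For the coverage and identity cutoffs in the question, a minimal sketch of the filtering step might look like the following. It assumes the column names shown in the output above; 'filter_hits', 'pct_coverage' and 'pct_identity' are hypothetical names introduced here, and the 'hits' rows are made-up stand-ins so the example runs without a network call. Coverage is computed per HSP from the query coordinates, which may differ from the "Query cover" NCBI reports when a hit has several HSPs. No organism column appears in the output, so counting distinct 'Hit_def' values is only a rough proxy for counting organisms.

```r
# Hypothetical helper: keep HSPs that pass per-HSP coverage and identity cutoffs.
filter_hits <- function(df, query_len, min_cov = 95, min_ident = 90) {
    # Coerce via character in case the columns were parsed as text or factors
    num <- function(x) as.numeric(as.character(x))
    q_from <- num(df[["Hsp_query-from"]])
    q_to   <- num(df[["Hsp_query-to"]])
    ident  <- num(df[["Hsp_identity"]])
    aln    <- num(df[["Hsp_align-len"]])

    # Percent of the query spanned by this HSP, and percent identical positions
    df$pct_coverage <- 100 * (abs(q_to - q_from) + 1) / query_len
    df$pct_identity <- 100 * ident / aln

    df[df$pct_coverage > min_cov & df$pct_identity > min_ident, , drop = FALSE]
}

# Mock rows with the same column names as above; these are NOT real BLAST results
hits <- data.frame(
    Hit_def          = c("subject A", "subject B", "subject C"),
    `Hsp_query-from` = c(1, 1, 10),
    `Hsp_query-to`   = c(45, 45, 45),
    Hsp_identity     = c(45, 38, 36),
    `Hsp_align-len`  = c(45, 45, 36),
    check.names = FALSE, stringsAsFactors = FALSE
)

query <- "CTTTGTCTTCCCCTATGTCATCCTCAATCTCTATGAAAGCAACAC"
kept <- filter_hits(hits, nchar(query))

# Number of distinct subject descriptions among the retained HSPs
n_subjects <- length(unique(kept$Hit_def))
```

With a real result, 'df' from the blastSequences() call would take the place of 'hits'.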
Hope that helps; would be happy to hear of other R solutions.
Martin
>
>
> Can you please suggest how I can run these through NCBI BLAST from R so that I get only sequences showing >95% query coverage and >90% identity? Is there a program in R to select them? I want to determine the number of organisms showing unique results for the given sequences.
> Thanks.
>
>
> Dr. Ghorpade Prabhakar B.
> PhD Scholar ( Veterinary Biochemistry),
> IVRI,
> Izatnagar,
> Bareilly, U.P.,
> India
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793