[BioC] BLAST E-value

John Herbert j.m.herbert at bham.ac.uk
Thu Feb 18 12:08:32 CET 2010


Dear Alla,
Richard and Michael both had a similar idea to mine, so let me elaborate on it a little.

I am not sure of your exact biological question but say, for instance, that your species are rat and mouse.

1) download the command line version of NCBI BLAST and install it
http://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=Download

2) (assuming your proteins are in FASTA format) use the formatdb command to format your rat proteins into a BLAST-able database
formatdb -i rat_proteins -n rat_proteins -o T -p T

3) run the BLAST pairwise sequence alignment algorithm, searching the mouse proteins against your rat protein database, with this command:
blastall -p blastp -i mouse_proteins -d rat_proteins -e 0.01 -o mouse_vs_rat_proteins

The file mouse_vs_rat_proteins will contain the results for each mouse protein queried against the database of rat proteins, including an alignment and an E-value for each hit. You can just type blastall to see the different command line arguments, which let you change the output format, restrict the number of hits reported, and so on. Since you say you have thousands of proteins, you may want a Perl parser to process the BLAST output and make sense of all the alignments.
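
As a rough illustration of the parsing step, here is a minimal sketch in Python rather than Perl. It assumes the search in step 3 was rerun with tabular output (-m 8 in blastall), which writes twelve tab-separated columns per hit. The names parse_tabular_blast and Hit and the 0.01 cutoff are made up for this example, and the sample lines are toy data, not real results.

```python
import csv
from collections import namedtuple

# Column order of blastall's tabular (-m 8) output.
FIELDS = ["query", "subject", "pct_identity", "aln_length", "mismatches",
          "gap_opens", "q_start", "q_end", "s_start", "s_end",
          "evalue", "bit_score"]
Hit = namedtuple("Hit", FIELDS)


def parse_tabular_blast(lines, max_evalue=0.01):
    """Yield Hit records from tabular BLAST lines with E-value <= max_evalue.

    `lines` can be an open file handle or any iterable of strings.
    """
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#") or len(row) < len(FIELDS):
            continue  # skip blank lines, comment lines, and short rows
        hit = Hit(row[0], row[1], float(row[2]), int(row[3]), int(row[4]),
                  int(row[5]), int(row[6]), int(row[7]), int(row[8]),
                  int(row[9]), float(row[10]), float(row[11]))
        if hit.evalue <= max_evalue:
            yield hit


# Toy input standing in for a real tabular output file.
sample = [
    "m1\tr1\t95.00\t100\t5\t0\t1\t100\t1\t100\t1e-50\t200",
    "m1\tr2\t40.00\t50\t30\t2\t1\t50\t3\t52\t0.5\t30.1",
]
for hit in parse_tabular_blast(sample):
    print(hit.query, hit.subject, hit.evalue, sep="\t")
```

For a real run, pass an open file handle instead of the sample list. Only the first toy hit passes the cutoff here, because the second has an E-value of 0.5.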

If you are interested in ortholog relationships between species, I wrote a simple pipeline that uses a Conditional Stepped Reciprocal Best Hit approach to infer ortholog relationships between model and non-model species. 
http://www.biomedcentral.com/1471-2164/10/490
If you want to use that, email me personally and I will see if it is appropriate to your data and biological question.
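
To give an idea of what a reciprocal best hit is, below is a minimal Python sketch of the plain reciprocal-best-hit idea. It is not the conditional stepped pipeline from the paper, only the basic building block. It assumes BLAST has been run in both directions and each result reduced to (query, subject, E-value) tuples. The function names and the toy data are made up for illustration.

```python
def best_hits(hits):
    """Map each query to its lowest-E-value subject.

    `hits` is an iterable of (query, subject, evalue) tuples.
    On ties, the first hit seen is kept.
    """
    best = {}
    for query, subject, evalue in hits:
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where each is the other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy data standing in for mouse-vs-rat and rat-vs-mouse searches.
mouse_vs_rat = [("m1", "r1", 1e-80), ("m1", "r2", 1e-10), ("m2", "r1", 1e-20)]
rat_vs_mouse = [("r1", "m1", 1e-75), ("r2", "m1", 1e-12)]
pairs = reciprocal_best_hits(mouse_vs_rat, rat_vs_mouse)
print(pairs)
```

In the toy data, m2 also hits r1 best, but r1's best hit is m1, so only the m1/r1 pair is reciprocal.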

I hope that helps. 

Kind regards,

John.

-----Original Message-----
From: bioconductor-bounces at stat.math.ethz.ch [mailto:bioconductor-bounces at stat.math.ethz.ch] On Behalf Of michael watson (IAH-C)
Sent: 17 February 2010 22:56
To: michael watson (IAH-C); Alla Bulashevska; bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] BLAST E-value

http://www.ncbi.nlm.nih.gov/staff/tao/URLAPI/bl2seq.html

This does indeed provide a BLAST e-value.
________________________________________
From: bioconductor-bounces at stat.math.ethz.ch [bioconductor-bounces at stat.math.ethz.ch] On Behalf Of michael watson (IAH-C) [michael.watson at bbsrc.ac.uk]
Sent: 17 February 2010 22:44
To: Alla Bulashevska; bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] BLAST E-value

Have you tried the bl2seq program from NCBI?
________________________________________
From: bioconductor-bounces at stat.math.ethz.ch [bioconductor-bounces at stat.math.ethz.ch] On Behalf Of Alla Bulashevska [alla.bullashevska at fdm.uni-freiburg.de]
Sent: 16 February 2010 18:36
To: bioconductor at stat.math.ethz.ch
Subject: [BioC] BLAST E-value

Dear Bioconductor users,
I am doing a cross-species analysis and am looking for a way
to pairwise align thousands of sequences efficiently. I
need to get the BLAST E-value as output.
The pairwiseAlignment function in the Biostrings package can align
two protein sequences but outputs a Smith-Waterman score.
I would appreciate it greatly if somebody could tell me how
this score can be converted into an E-value.

Thank you for your help,
Alla.

_______________________________________________
Bioconductor mailing list
Bioconductor at stat.math.ethz.ch
https://stat.ethz.ch/mailman/listinfo/bioconductor
Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
