[BioC] chip-chip data

Joern Toedling Joern.Toedling at curie.fr
Mon Mar 8 10:42:31 CET 2010


Hi Som,

On Sun, 7 Mar 2010 15:56:15 -0500, somnath bandyopadhyay wrote 
> Hello Joern,
>
> I am having some trouble with the mapping of the reporters to the genomic
> positions.

There are many ways to map the reporter sequences to the genome. The one using
Exonerate that is mentioned in the tutorial is just one of them, so you could
use an alternative that you are more familiar with. For example, with the
Biostrings function "matchPDict", you can now even do this mapping in R.
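
As an illustration of the "matchPDict" route, here is a minimal sketch on toy
data. The reporter names, the sequences and the short "chromosome" are made up
for the example; a real analysis would load the reporter sequences and the
genome from files. Note that "PDict" in its default form expects all patterns
to have the same width, which may not hold for every array design.

```r
library(Biostrings)

# Toy reporter sequences (hypothetical); all have the same width,
# as required by PDict() when no trusted band is specified
reporters <- DNAStringSet(c(rep1 = "ACGTACGTAC", rep2 = "TTGACCGGTA"))

# Toy "chromosome" (hypothetical) containing each reporter exactly once
chrom <- DNAString(paste0("GGG", "ACGTACGTAC", "CCC", "TTGACCGGTA", "GG"))

# Preprocess the reporters into a dictionary, then find exact matches
dict <- PDict(reporters)
hits <- matchPDict(dict, chrom)

# One element per reporter, holding the start positions of its matches
starts <- startIndex(hits)
print(starts)
```

This only searches the strand given in "chrom"; for the opposite strand one
could repeat the search on reverseComplement(chrom) and convert the positions
back.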

>
> 1. I have downloaded the ccTutorial and have taken a look at the
> exonerateData folder. I was wondering how I could create a .fsa file like
> RenMM5TilingProbe-Sequences.fsa from my NimbleGen .ndf file for the sequences
> of all the reporters on my 385k chip?

These sequences are in one of the columns of the ndf file (the sixth, I think).
Unique identifiers for each reporter are given in another column of that file.
So you need to extract these two columns and write them out into a Fasta file.
You can read the file into R using "read.delim" and then use the function
"cat" to export just these two columns.
The package Ringo also contains a Perl script "extractProbeSequenceFasta.pl"
in its scripts directory, which performs the same task.
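
The "read.delim" plus "cat" approach could look roughly like the sketch below.
The function name "ndf_to_fasta" is hypothetical, and the column names
"PROBE_ID" and "PROBE_SEQUENCE" are assumptions about the ndf header; please
check the header line of your own file and adjust them. The demonstration at
the end uses a made-up two-reporter file so that the code runs on its own.

```r
# Hypothetical helper: write reporter IDs and sequences from an ndf file
# to a Fasta file. id_col and seq_col are assumed column names.
ndf_to_fasta <- function(ndf_file, fasta_file,
                         id_col = "PROBE_ID", seq_col = "PROBE_SEQUENCE") {
  ndf <- read.delim(ndf_file, header = TRUE, as.is = TRUE)
  stopifnot(id_col %in% names(ndf), seq_col %in% names(ndf))
  # Keep each reporter identifier only once
  ndf <- ndf[!duplicated(ndf[[id_col]]), ]
  # One ">identifier" line followed by one sequence line per reporter
  cat(paste0(">", ndf[[id_col]], "\n", ndf[[seq_col]], "\n"),
      file = fasta_file, sep = "")
  invisible(nrow(ndf))
}

# Self-contained demonstration with a made-up ndf file
demo_ndf <- tempfile(fileext = ".ndf")
writeLines(c("PROBE_ID\tPROBE_SEQUENCE",
             "CHR01P000001\tACGTACGTAC",
             "CHR01P000002\tTTGACCGGTA"), demo_ndf)
demo_fsa <- tempfile(fileext = ".fsa")
ndf_to_fasta(demo_ndf, demo_fsa)
fasta_lines <- readLines(demo_fsa)
print(fasta_lines)
```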

>
> 2. I have the .sh and .pl files in the exonerateData folder. Once I have my
> .fsa file from above, what exactly do I need to run to create the
> "allChromExonerateOut..txt" file? Do all the files ...   .fsa, .sh and the .pl
> have to be in the current working directory?

Basically, you need to modify the .sh file and change the paths in it so that
they work for your system. The Perl script in there is just for combining the
multiple output files from Exonerate into one file.
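
If you would rather stay in R for the combining step, a plain concatenation
can be sketched as below. This is only an illustration under assumptions: the
function name "combine_outputs", the file names and the "_out.txt" pattern are
hypothetical, and the sketch simply appends the files line by line. It does
not reproduce any filtering or reformatting that the Perl script in the
tutorial may perform.

```r
# Hypothetical helper: append the lines of several output files,
# in the given order, into a single file
combine_outputs <- function(in_files, out_file) {
  con <- file(out_file, open = "w")
  on.exit(close(con))
  for (f in in_files) {
    writeLines(readLines(f), con)
  }
  invisible(out_file)
}

# Self-contained demonstration with two made-up per-chromosome files
demo_dir <- tempfile("exonerate_demo")
dir.create(demo_dir)
writeLines(c("hit A", "hit B"), file.path(demo_dir, "chr1_out.txt"))
writeLines("hit C", file.path(demo_dir, "chr2_out.txt"))

in_files <- sort(list.files(demo_dir, pattern = "_out\\.txt$",
                            full.names = TRUE))
combined <- file.path(demo_dir, "combined.txt")
combine_outputs(in_files, combined)
combined_lines <- readLines(combined)
print(combined_lines)
```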

For details on the Exonerate parameters, please read the manual that comes
with Exonerate. And again, if you are not happy with Exonerate, you can also
use another tool for this step.

Regards,
Joern


--- 
 Joern Toedling 
 Institut Curie -- U900 
 26 rue d'Ulm, 75005 Paris, FRANCE 
 Tel. +33 (0)156246927


