[BioC] Reading GFF using Starr

Feseha Abebe-Akele fai4 at cisunix.unh.edu
Fri Mar 4 21:44:09 CET 2011


Dear Wolfgang;

"cat" indeed helped with reading the GFF. However, I am still unclear about the
feature="transcript" parameter. In the example that shipped with the package,
all entries are "transcript". In the GFF I downloaded from NCBI, the same
column is populated by things like CDS, gene, tRNA, etc. Am I supposed to
convert entries like CDS, gene, mRNA, tRNA, snRNA ..., which appear in the
4th column of the GFF, into a generic "transcript" entry, or would Starr take
them in as is with the feature="transcript" parameter and use them?

Thanks a lot.

Feseha
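
Before deciding whether to rename anything, it may help to see which feature
types a given GFF actually contains. The following is only a sketch in base R,
not part of Starr: count_gff_features is a hypothetical helper, the demo file
and its contents are made up, and the code assumes the standard tab-delimited
nine-column GFF layout, in which the feature type is the third column (pass a
different type_col if your file is laid out differently). It does not settle
whether read.gffAnno accepts types other than "transcript".

```r
# Hypothetical helper (not a Starr function): tabulate the feature types
# found in one column of a tab-delimited GFF file.
count_gff_features <- function(path, type_col = 3) {
  gff <- read.delim(path, header = FALSE, comment.char = "#",
                    quote = "", stringsAsFactors = FALSE)
  # Most frequent feature types first
  sort(table(gff[[type_col]]), decreasing = TRUE)
}

# Made-up demo data, written to a temporary file so the sketch is self-contained
demo <- tempfile(fileext = ".gff")
writeLines(c(
  "##gff-version 3",
  "chrI\tRefSeq\tgene\t100\t900\t.\t+\t.\tID=gene1",
  "chrI\tRefSeq\tmRNA\t100\t900\t.\t+\t.\tID=rna1;Parent=gene1",
  "chrI\tRefSeq\tCDS\t150\t800\t.\t+\t0\tID=cds1;Parent=rna1",
  "chrI\tRefSeq\ttRNA\t1200\t1272\t.\t-\t.\tID=rna2"
), demo)

feature_counts <- count_gff_features(demo)
print(feature_counts)
```

Run against the real all.gff, the resulting table would show at a glance
whether any row is labelled "transcript" at all.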



* Wolfgang Huber <whuber at embl.de> [Fri 04 Mar 2011 01:30:18 PM EST]:

> Dear Feseha
>
> I am not sure whether this will solve your question, but have you tried
>
> cat chrI.gff chrII.gff chrIII.gff chrIV.gff chrV.gff chrX.gff > all.gff
>
> (on the OS command line) and then
>
> transcriptAnno = read.gffAnno("all.gff", feature="transcript")
>
> (in R). Alternatively, if you are so unfortunate as to work with an
> operating system that does not have 'cat', you could also use, e.g.,
> R's readLines and writeLines.
>
> 	Best wishes
> 	Wolfgang
>
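
The readLines/writeLines alternative mentioned above could look roughly like
the sketch below. concat_files is a hypothetical helper name, and the two
small input files are made up for the demonstration; in practice the inputs
would be the per-chromosome GFF files. Like cat, it copies every line
unchanged, so any header lines are repeated once per input file.

```r
# Hypothetical helper: append the lines of each input file, in order,
# to a single output file (similar in effect to `cat a b > out`).
concat_files <- function(inputs, output) {
  con <- file(output, open = "w")
  on.exit(close(con))
  for (f in inputs) {
    writeLines(readLines(f), con)
  }
  invisible(output)
}

# Made-up demo inputs in the session's temporary directory
parts <- file.path(tempdir(), c("chrI.gff", "chrII.gff"))
writeLines("chrI\tdemo\ttranscript\t1\t100\t.\t+\t.\tID=t1", parts[1])
writeLines("chrII\tdemo\ttranscript\t5\t250\t.\t-\t.\tID=t2", parts[2])

combined <- concat_files(parts, file.path(tempdir(), "all.gff"))
combined_lines <- readLines(combined)
```

The path in combined could then be passed to read.gffAnno in place of
"all.gff".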
>
>
> On Mar/2/11 3:48 AM, Feseha Abebe-Akele wrote:
>> Hello everyone;
>> I am trying to analyze tiling array data using the Starr package
>> and I am stuck at reading GFF files for the 7 genomic sequences
>> of C. elegans. In the example that comes with the vignette, a
>> single primordial GFF file (20 lines?) is used, which is not
>> anywhere near the 56 MB (combined) GFF files.
>>
>> My question is: how do I read in multiple GFF files for analysis?
>> Among other things, I have tried reading them like this:
>>
>> gffs <- c(file.path(dataPath,"chrI.gff"),
>> file.path(dataPath,"chrII.gff"), file.path(dataPath,"chrIII.gff"),
>> file.path(dataPath,"chrIV.gff"), file.path(dataPath,"chrV.gff"),
>> file.path(dataPath,"chrX.gff"))
>>
>> transcriptAnno <- read.gffAnno(gffs, feature="transcript")
>>
>> But none of this worked for me.
>>
>> I would appreciate any help in getting my analysis to the next level.
>>
>> FYI:
>> I am trying to analyze TEST vs CONTROL differential expression
>> on the C. elegans Tiling Array 1.0 chips.
>>
>> Thanks
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
> -- 
>
>
> Wolfgang Huber
> EMBL
> http://www.embl.de/research/units/genome_biology/huber
>


