[BioC] Reading GFF using Starr

zacher at lmb.uni-muenchen.de zacher at lmb.uni-muenchen.de
Sat Mar 12 12:14:58 CET 2011


Dear Feseha,

Sorry for the late reply; I am currently on holiday for a few weeks.
I will make the documentation clearer about what the "feature" argument
means. I hope everything works now.
Please contact me if you have any further questions about Starr.
Best,

Benedikt

Wolfgang Huber <whuber at embl.de> wrote :

> Dear Feseha
> 
> I would suggest omitting the 'feature' argument in your call to 
> 'read.gffAnno' and then select those rows that you care about yourself.
> 
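A minimal sketch of the row selection suggested above, in base R. The data frame is a toy stand-in for whatever read.gffAnno returns when 'feature' is omitted; its column names (in particular `feature`) are assumptions made for illustration, so check colnames() on the real object before adapting this.

```r
# Toy stand-in for the annotation data frame. The column names are
# assumptions for this sketch, not taken from the Starr documentation.
anno <- data.frame(
  chr     = c("chrI", "chrI", "chrII", "chrII"),
  feature = c("gene", "CDS", "tRNA", "gene"),
  start   = c(100L, 150L, 500L, 900L),
  end     = c(400L, 350L, 570L, 1500L),
  stringsAsFactors = FALSE
)

# Keep only the rows whose feature type is in a set of interest.
wanted   <- c("gene", "tRNA")
annoKeep <- anno[anno$feature %in% wanted, ]

# Alternatively, select rows by regular expression,
# here every feature type that ends in "RNA".
annoRNA <- anno[grepl("RNA$", anno$feature), ]
```

Selecting with `%in%` covers the "vector of features" case and `grepl` covers the "regular expression" case, without needing any change to the package itself.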
> The 'Starr' maintainer might be able to provide more details in the 
> function's manual page, or to allow 'feature' to be a vector or a 
> regular expression.
> 
> 	Best wishes
> 	Wolfgang
> 
> 
> On Mar/4/11 9:44 PM, Feseha Abebe-Akele wrote:
> > Dear Wolfgang;
> >
> > "cat" indeed helped with reading the GFF. However, I am still unclear about
> > the feature="transcript" parameter. In the example that shipped with the
> > package, all entries are "transcript". In the GFF I downloaded from NCBI,
> > the same column is populated by things like CDS, gene, tRNA, etc. Am I
> > supposed to convert entries like CDS, gene, mRNA, tRNA, snRNA, ..., which
> > appear in the 4th column of the GFF, into a generic "transcript" entry, or
> > would Starr take them in as is with the feature="transcript" parameter and
> > use them?
> >
> > Thanks a lot.
> >
> > Feseha
> >
> >
> >
> > * Wolfgang Huber <whuber at embl.de> [Fri 04 Mar 2011 01:30:18 PM EST]:
> >
> >> Dear Feseha
> >>
> >> I am not sure whether this will solve your question, but have you tried
> >>
> >> cat chrI.gff chrII.gff chrIII.gff chrIV.gff chrV.gff chrX.gff > all.gff
> >>
> >> (on the OS command line) and then
> >>
> >> transcriptAnno = read.gffAnno("all.gff", feature="transcript")
> >>
> >> (in R). Alternatively, if you are so unfortunate as to work with an
> >> operating system that does not have 'cat', you could also use, e.g.,
> >> R's readLines and writeLines.
> >>
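The readLines/writeLines alternative mentioned above could look roughly like this. It is only a sketch: the two tiny GFF files are made up so that the example runs on its own, and `dataPath` points at a temporary directory rather than at real data.

```r
# Create two tiny stand-in GFF files in a temporary directory.
# File contents are invented for illustration only.
dataPath <- tempdir()
writeLines("chrI\tsrc\tgene\t100\t400\t.\t+\t.\tID=a",
           file.path(dataPath, "chrI.gff"))
writeLines("chrII\tsrc\ttRNA\t500\t570\t.\t-\t.\tID=b",
           file.path(dataPath, "chrII.gff"))

gffs <- file.path(dataPath, c("chrI.gff", "chrII.gff"))

# Read each file as a character vector of lines and concatenate
# them in the order given, which mirrors what 'cat' does.
allLines <- unlist(lapply(gffs, readLines))

# Write the combined lines to a single file.
allGff <- file.path(dataPath, "all.gff")
writeLines(allLines, allGff)
```

The resulting `allGff` path can then be passed to read.gffAnno in place of "all.gff".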
> >> Best wishes
> >> Wolfgang
> >>
> >>
> >>
> >> On Mar/2/11 3:48 AM, Feseha Abebe-Akele wrote:
> >>> Hello everyone;
> >>> I am trying to analyze tiling array data using the Starr package
> >>> and I am stuck at reading GFF files for the 7 genomic sequences
> >>> of C. elegans. In the example that comes with the vignette, a
> >>> single primordial GFF file (20 lines?) is used, which is nowhere
> >>> near the size of the 56 MB (combined) GFF files.
> >>>
> >>> My question is: how do I read in multiple GFF files for analysis?
> >>> Among other things, I have tried reading them like this:
> >>>
> >>> gffs <- c(file.path(dataPath,"chrI.gff"),
> >>>           file.path(dataPath,"chrII.gff"),
> >>>           file.path(dataPath,"chrIII.gff"),
> >>>           file.path(dataPath,"chrIV.gff"),
> >>>           file.path(dataPath,"chrV.gff"),
> >>>           file.path(dataPath,"chrX.gff"))
> >>>
> >>> transcriptAnno <- read.gffAnno(gffs, feature="transcript")
> >>>
> >>> But none worked for me.
> >>>
> >>> I would appreciate any help in getting my analysis to the next level.
> >>>
> >>> FYI:
> >>> I am trying to analyze TEST vs CONTROL differential expression
> >>> on the C. elegans Tiling Array 1.0 chips.
> >>>
> >>> Thanks
> >>>
> >>> _______________________________________________
> >>> Bioconductor mailing list
> >>> Bioconductor at r-project.org
> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>> Search the archives:
> >>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> >>
> >> --
> >>
> >>
> >> Wolfgang Huber
> >> EMBL
> >> http://www.embl.de/research/units/genome_biology/huber
> >>
> >
> >
> 


