[BioC] error with pileup function in ShortRead

Martin Morgan mtmorgan at fhcrc.org
Sun Mar 15 03:31:08 CET 2009


Hi Davide -- a third, different, response!

Since you mention ELAND and AlignedRead, it's likely that you can
accomplish your task with

  cvg <- coverage(foo, 1, chromLen[chrom], extend=fraglength-readlength)

this also loops over chromosomes. (There was an off-by-one error,
fixed in version 1.1.45, that added an extra nucleotide to reads; this
has not quite reached the biocLite repository but is in svn).

The relevant help page is ?"AlignedRead-class"; I'll add a section to
the vignette on use of coverage in the not too distant future.

Martin

Hervé Pagès <hpages at fhcrc.org> writes:

> Hi Davide,
>
> You can use coverage() from the IRanges package to achieve this.
>
> Simple example where all reads are supposed to be on the + strand and
> extend to the right starting at 'start':
>
>    start <- c(8:20, 76)
>    fraglength <- 15
>    chrlength <- 99
>
>    pileup(start, fraglength, chrlength)
>
>    as.integer(coverage(IRanges(start=start, width=fraglength), 1, chrlength))
>
> Note that coverage() returns an Rle object which is a compact representation
> of the returned vector of integers. as.integer() coerce this back to a
> standard integer vector so you get exactly the same as with pileup().
>
> However, unlike coverage(), pileup() knows how to pileup reads that are
> described by a start, a fraglength and a direction (+ or -) to which
> the fragment extends (in case of -, the end of the read is supposed to
> be at start + readlength - 1):
>
>    set.seed(23)
>    dir <- strand(sample(c("+", "-"), length(start), replace=TRUE))
>    readlength <- 10
>
>    pileup(start, fraglength, chrlength, dir=dir, readlength=readlength)
>
> Here is the IRanges + coverage() solution:
>
>    a) Make the IRanges object describing how the reads map the
>       reference sequence:
>
>         x_start <- start
>         ii <- dir == "-"  # which ranges need to be shifted
>         ii_readlength <- if (length(readlength) > 1) readlength[ii] else readlength
>         ii_fraglength <- if (length(fraglength) > 1) fraglength[ii] else fraglength
>         x_start[ii] <- x_start[ii] + ii_readlength - ii_fraglength + 1L
>         x <- IRanges(start=x_start, width=fraglength)
>
>    b) Call coverage() on it:
>
>         as.integer(coverage(x, 1, chrlength))
>
>       This will not complain if chrlength looks too small:
>
>         as.integer(coverage(x, 1, 20))
>
> Hope this helps,
>
> H.
>
>
> Davide Cittaro wrote:
>> Hi, I'm trying to build a pileup vector from eland results.
>> p<-pileup(position(foo), 185, chromLen[chrom], svector, width(foo))
>> where foo is my AlignData class (filtered on chrom). It happens that
>> I have a read on the + strand near the end of the chromosome (156 bp
>> afar) that is greater than the specified fragment size (185). pileup
>> complains that the chromosome length is too small... I can try to
>> run with a shorter fragment size, but is there a way to tell pileup
>> that the rightmost read can be smaller than the specified average
>> size?
>> thanks
>> d
>> Davide Cittaro
>> davide.cittaro at ifom-ieo-campus.it
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793



More information about the Bioconductor mailing list