[BioC] minfi - how to subset a GenomicMethylSet by chromosome/seqname

Martin Morgan mtmorgan at fhcrc.org
Thu Jan 10 19:31:32 CET 2013


On 01/10/2013 10:15 AM, Tim Triche, Jr. wrote:
> I do this all the time with SEs, so here's a couple of options.
>
> ## keepSeqlevels generic
> setMethod("keepSeqlevels", signature(x="SummarizedExperiment", value="ANY"),
>            function(x, value) { # {{{
>              y <- which(rownames(x) %in%
> names(keepSeqlevels(rowData(x),value)))
>              return(x[y, ])
>            }) # }}}
> gmset.noXY <- keepSeqlevels(gmset, paste0('chr', 1:22))

?seqlevels<- has comparable behavior (in-place restriction to a subset of 
levels); after example(SummarizedExperiment)

 > seqlevels(sset)
[1] "chr1" "chr2"
 > seqlevels(sset, force = TRUE) <- "chr2"
 > seqlevels(sset)
[1] "chr2"


>
> ## or use split; this is slower and balkier than keepSeqlevels
> gmset.noXY <- split(gmset, as.vector(seqnames(gmset)))[ paste0('chr', 1:22)
> ]
>
> ## bonus: masking, cf. MM
> load("maskedRegions.rda")
> gmset <- gmset[ countOverlaps(gmset, maskedRegions) == 0, ]
>
> ## or perhaps gmset[ !gmset %within% maskedRegions, ] ... ?
>
> there is a small pile of functions like this (combine(), combine(,
> withNAs=TRUE), keepSeqlevels(), etc.) I'd like to see as generics for all
> SummarizedExperiments in GenomicRanges.  Let the appropriate people know if
> you agree with this notion and any pitfalls you foresee
>
>
> --t
>
>
>
>
>
> On Thu, Jan 10, 2013 at 5:01 AM, Toby Hughes <toby.hughes at adelaide.edu.au>wrote:
>
>> Hi all,
>>
>> I am relatively new to both R and the Bioconductor packages, so apologies
>> if this is a naïve question.  I am using minfi to preprocess a small set
>> (60 samples) of Illumina450k microarray data.  I have got to the point
>> where I have a GenomicMethylSet, following the use of mapToGenome on a
>> MethylSet to provide annotation, having excluded those probes without a
>> defined position in the process.  I now wish to exclude probes from the sex
>> chromosomes before using dmpFinder, and so I am seeking to subset the
>> GenomicMethylSet to the autosomes only.  Despite days of searching and a
>> number of references saying this is relatively straightforward, I have not
>> yet had any success (have been exploring GenomicRanges).  Could someone
>> please point me in the right direction with a code snippet or a reference?
>>
>> Kind regards,
>>
>> Toby.
>>
>>          [[alternative HTML version deleted]]
>>
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>
>
>
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793



More information about the Bioconductor mailing list