[BioC] RMA/QuantileNormalization results difference between oligo and aroma.affymetrix for Hugene

Mon Mar 22 12:23:01 CET 2010

My copy/paste skills need improvement. Apologies. The version I meant to send drops the ORDER BY from the query and sorts by fid in R:

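setMethod("probeNames", "GeneFeatureSet",
          function(object, subset=NULL){
            ## fetch the feature id (fid) and probeset id (fsetid) of every PM probe
            res <- dbGetQuery(db(object), "SELECT fid, fsetid FROM pmfeature")
            ## sort by fid in R so the names follow ascending fid, like pmindex()
            idx <- order(res[["fid"]])
            as.character(res[idx, "fsetid"])
          })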
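
For illustration only, here is a toy sketch in base R (no oligo, no database) of why the sort matters: the PM indices come back in ascending fid order, so names taken in unsorted table order end up attached to the wrong probes. The data frame is a made-up stand-in for pmfeature; fid 1056 and the two probeset ids are borrowed from Mikhail's message, and the other fids are hypothetical.

```r
# Hypothetical stand-in for the pmfeature table, in arbitrary (unsorted) row order.
pmfeature <- data.frame(
  fid    = c(2300L, 1056L, 1801L),
  fsetid = c(7896737L, 7981328L, 7896737L)
)

# pmindex()-like vector: fids in ascending order, as observed in the thread.
pmidx <- sort(pmfeature$fid)

# Without a sort, the names follow table order and do not match pmidx.
unsorted_names <- as.character(pmfeature$fsetid)

# Ordering by fid first aligns each probeset id with its fid.
idx <- order(pmfeature$fid)
sorted_names <- as.character(pmfeature$fsetid[idx])

# Row labels of the form fsetid:fid, one per PM probe.
pid <- paste(sorted_names, pmidx, sep = ":")
print(pid)
```

With the sort in place, fid 1056 is paired with probeset 7981328, which is what the pgf file says it should be.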

On Mon, Mar 22, 2010 at 11:20 AM, Benilton Carvalho
<beniltoncarvalho at gmail.com> wrote:
> btw, the following is faster:
>
> setMethod("probeNames", "GeneFeatureSet",
>                 function(object, subset=NULL){
>                   res <- dbGetQuery(db(object), "SELECT fid, fsetid FROM pmfeature ORDER BY fid")
>                   idx <- order(res[["fid"]])
>                   as.character(res[idx, "fsetid"])
>                 })
>
> b
>
> On Mon, Mar 22, 2010 at 11:18 AM, Benilton Carvalho
> <beniltoncarvalho at gmail.com> wrote:
>> Dear Mikhail,
>>
>> I was able to reproduce the issue you reported. The probeNames()
>> method in 1.10.3 is missing a sort by fid.
>>
>> setMethod("probeNames", "GeneFeatureSet",
>>                 function(object, subset=NULL){
>>                   res <- dbGetQuery(db(object), "SELECT fsetid FROM pmfeature ORDER BY fid")[[1]]
>>                   as.character(res)
>>                 })
>>
>> I'll get this fixed now.
>>
>> b
>>
>> On Mon, Mar 22, 2010 at 11:04 AM, Benilton Carvalho
>> <beniltoncarvalho at gmail.com> wrote:
>>> what's the array you're looking at?
>>>
>>> sessionInfo()?
>>>
>>> thanks,
>>> b
>>>
>>> On Mon, Mar 22, 2010 at 10:54 AM, Mikhail Pachkov <pachkov at gmail.com> wrote:
>>>> Dear Benilton,
>>>>
>>>> I have a problem obtaining probe indices along with probe names. My script:
>>>>
>>>> library(oligo);
>>>> workingDir = getwd();
>>>> celfiles<-list.files(path=workingDir,pattern="\\.(CEL|cel)$");
>>>> rawdata=read.celfiles(celfiles);
>>>>
>>>> pms = pm(rawdata)
>>>> rmadata=rma.background.correct(pms)
>>>> qndata=normalize.quantiles(log2(rmadata))
>>>>
>>>> res <- dbGetQuery(db(rawdata), "SELECT fsetid,atom,fid FROM pmfeature")
>>>> pid=paste(res[,1],res[,2],res[,3],sep=":")
>>>> rownames(qndata)<-pid
>>>>
>>>> colnames(qndata)<-sampleNames(rawdata)
>>>>
>>>> However, during analysis of the data it looked like the probe names were
>>>> assigned incorrectly. I tried using pmindex() to extract the "fid" of the PM
>>>> probes, which seems to be a list of numbers sorted in ascending order.
>>>> I do the following:
>>>>
>>>> pnames=probeNames(rawdata)
>>>> length(pnames)
>>>> [1] 818005
>>>>
>>>> pmidx=pmindex(rawdata)
>>>> length(pmidx)
>>>> [1] 818005
>>>>
>>>> # first value in probe names
>>>> pnames[1]
>>>> [1] "7896737"
>>>>
>>>> # first value in pm indices
>>>> pmidx[1]
>>>> [1] 1056
>>>>
>>>> If I check the pgf file for the probe with index "1056", it belongs to
>>>> probeset "7981328", not "7896737" as given in pnames.
>>>>
>>>> My question: How can I obtain probeset-probe_id pairs in the correct order
>>>> for annotating expression values in the "pms" matrix?
>>>>
>>>> Best regards,
>>>>
>>>> Mikhail
>>>>
>>>
>>
>