[BioC] quantile robust and RMA in xps
cstrato at aon.at
Fri May 29 15:54:36 CEST 2009
Yes, your CEL-files were modified as is clearly stated in the help file
"?intensity<-" of the modified replacement function "intensity<-":
"Warning: Do not use replacement method intensity<- until you really
know what you are doing!
Note: If you do not want to replace your current object, create first a
copy of type DataTreeSet by simply writing newobj <- oldobj, and use
newobj for replacement. This is important since intensity<- does also
update slots rootfile, filedir and treenames when a new filename was chosen.
Warning: The CEL-files created WILL REPLACE THE ORIGINAL CEL-files, if
they have identical names to the original CEL-files and the original
CEL-files are located in the working directory. Thus the original
CEL-files should preferable be located in directory celdir of function
To my knowledge microarray facilities usually store all CEL-files
created in a common place and people either copy the CEL-files of
interest to their working directories or they create links to the
When importing CEL-files into ROOT using function "import.data()" even
this is not necessary, as the help file "?import.data" says:
"To import CEL-files from different directories, vector celfiles must
contain the full path for each CEL-file and celdir must be celdir=NULL."
At the moment I can only apologize and ask people starting to use xps to
read the corresponding help files, especially when using advanced
features. For this reason I mention often the corresponding help files
in my replies to questions, as I did in my original reply to your question.
Mayte Suarez Farinas wrote:
>> Dear Mayte,
>> Thank you for sending me your code. I have just run your code using the
>> Affymetrix breast, heart, prostate HuGene data, and everything is ok,
>> including the final boxplot. However, the results are slightly different
>> when I replace "normalize.quantiles()" from package preProcessCore with
>> "normalize.quantiles()" from xps. Furthermore I have found two potential
>> problems in your code:
>> 1, To simulate the removal of chips as output from
>> "normalize.quantiles.robust()" I have deleted one column. Thus I also
>> needed to delete the corresponding column name. Since you have set
>> "n.remove=5", up to 5 columns may be removed, so you would need to
>> remove these column names manually.
> I do not want to remove those chips from the output. the remove option on
> normalize.quantiles.robust do not remove the columns per se, it only uses the
> remaining chips to calculate the reference distribution to which all the chips
> (including the 'removed') are normalized. This step produces a matrix with the same
> number of columns as the original. I don't want to removed those chips, only not to
> include them in the estimation of the reference distribution (if I include them the
> reference distribution has utterly low range and I want to avoid to delete those
> chips, their qc is not that bad)
>> 2, Since you store your original CEL-files in your working directory,
>> replacement function "intensity<-" did replace your original CEL-files
>> with the background corrected CEL-files of the same name. There are two
>> ways to prevent this. The best way is to store the original CEL-files in
>> another directory, e.g. in "raw". The second possibility is to use
>> parameter "celnames" of function "import.data()" to rename the imported
> Do you mean that my CEL files were changed?. they are not more the raw data ?
> Oh Oh! I was not aware of that. Oh my! I hope that the facility kept a copy of my
> CEL files otherwise I am in deep trouble!
>> A third problem may be that function "normalize.quantiles.robust()"
>> re-orders the matrix, so that the (x,y)-coordinates are no longer
>> correct. Although this should not be the case I cannot exclude this
> Maybe Bolstad can answer this...
>> Here is the complete code that I used for testing.
>> - - - - - - - - - - - - - - - - - - - - - - - - -
>> ### new R session: load library xps
>> scmdir <- "/Volumes/GigaDrive/CRAN/Workspaces/Schemes"
>> scheme.hugene10stv1r4 <- root.scheme(paste(scmdir,
>> "Scheme_HuGene10stv1r4_na28.root",sep = "/"))
>> celfiles <- c("Breast_01.CEL", "Breast_02.CEL", "Breast_03.CEL",
>> "Heart_01.CEL", "Heart_02.CEL", "Heart_03.CEL", "Prostate_01.CEL",
>> "Prostate_02.CEL", "Prostate_03.CEL")
>> "Pamela_NOMID_dataxps_162021", celdir=getwd(), celfiles=celfiles,
>> ## RMA background
>> data.bg.rma <-
>> select="antigenomic", option="pmonly:epanechnikov",params=c(16384))
>> # get intensities
>> # normalize with affy functions
>> # manually remove one chip
>> data.int.norm <- data.int.norm[,-4]
>> # replace intensity slot
>> # Problem: need to remove colname of chip which was removed
>> intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- aaa
>> # Problem: does overwrite original CEL-files
>> boxplot(data.bg.rma) #boxplot is OK
>> ## summarize medianpolish
>> setName(data.bg.rma) <- "DataSet"
>> data.mp.rma <-
>> - - - - - - - - - - - - - - - - - - - - - - - - -
>> Best regards
>> Mayte Suarez-Farinas wrote:
>>>> Dear Christian,
>>> I am sorry I need to bother you gain !
>>> Everything worked fine with the background correction, the quantile
>>> normalization and the substitution
>>> using function "intensity()<-". When I do the boxplot after this, teh
>>> data is normalized. Then when I use summarize.rma,
>>> after that the data is not normalized anymore.
>>> intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- data.int.norm
>>> boxplot(data.bg.rma) ## Boxplot is perfect!
>>> setName(data.bg.rma) <- "DataSet"
>>> data.mp.rma <-
>>> #Boxplot is NOT ok
>>> any hint?
>>>> The new replacement method "intensity()<-" has an option to create a
>>>> new ROOT file (see?intensity), thus you need to do:
>>>> data.int.norm <- as.data.frame(cbind(data.int[,c(1,2)],data.int.norm))
>>>> Here you see that I added the (x,y) coordinates, but it is up to you
>>>> to make sure that the order is correct.
>>>> I am using cbind() to prevent cycling of the samples, which is what I
>>>> get when using "data.int[,-c(1,2)]".
>>>> Now I can use the replacement method:
>>>> intensity(data.bg.rma, "tmp_int2", verbose=TRUE) <- data.int.norm
>>>> boxplot(data.bg.rma) #boxplot is OK
>>>> Please note that this will take some time since the
>>>> background-corrected intensities will first be saved as CEL-files
>>>> which are then imported into the new ROOT file "tmp_int2_cel.root".
>>>> Now you can summarize the data using xps, but you need to replace the
>>>> setname first:
>>>> setName(data.bg.rma) <- "DataSet"
>>>> data.mp.rma <- summarize.rma(data.bg.rma, "tmp_sum_rma",
>>>> boxplot(data.mp.rma) #boxplot is now OK.
>>>> I hope this helps.
>>>> Best regards
> ---------End of Included Message----------
> Mayte Suarez Farinas
> Research Associate
> The Rockefeller University Hospital
> 1230 York Avenue, Box 178
> New York, NY 10021
> (212) 327-8213 - phone
> (212) 327-7422 - fax
> farinam at rockefeller.edu
More information about the Bioconductor