[BioC] Dist of exprSet object

Marco Blanchette mblanche at berkeley.edu
Thu Jul 27 02:35:45 CEST 2006


Hmm... This exemplifies the love-hate relationship I have with R: very
powerful, but very difficult to master...

One more issue. Each experiment is in duplicate (2 experiments, 2
replicates -> 4 arrays). My goal is to partition the genes into bins by
expression level: the top 10% most expressed, the 10% to 20% most expressed,
the 20% to 30% most expressed, and so on.

eset is my exprSet object containing the RMA-computed expression values for
each gene on the 4 arrays:
> eset 
Expression Set (exprSet) with
        18952 genes
        4 samples
                 phenoData object with 1 variables and 4 cases
         varLabels
                sample: arbitrary numbering

So I need to:

1) Get the average expression for each gene from the 2 replicates.
Would you do (assuming arrays 1-2 and 3-4 are the replicate pairs):
>exp1 = iter(eset[,c(1,2)], , mean)
>exp2 = iter(eset[,c(3,4)], , mean)

Or is there a better way?
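
One possible alternative, as a minimal base-R sketch rather than a definitive
answer: average the replicate columns with rowMeans() on the expression
matrix. The matrix x below is simulated and only stands in for exprs(eset);
the layout (columns 1-2 = experiment 1, columns 3-4 = experiment 2) and the
names x, grp and avg are assumptions made for illustration.

```r
# Simulated stand-in for exprs(eset): 6 genes x 4 arrays of log2 values.
set.seed(1)
x <- matrix(rnorm(24, mean = 8, sd = 2), nrow = 6,
            dimnames = list(paste0("gene", 1:6), paste0("array", 1:4)))

# Assumed layout: columns 1-2 are experiment 1, columns 3-4 are experiment 2.
exp1 <- rowMeans(x[, 1:2])
exp2 <- rowMeans(x[, 3:4])

# Same idea for any number of experiments: one group label per array,
# then one column of per-gene means per group.
grp <- c("exp1", "exp1", "exp2", "exp2")
avg <- sapply(split(seq_len(ncol(x)), grp),
              function(j) rowMeans(x[, j, drop = FALSE]))
```

With the grouping vector, adding arrays only means extending grp; the
averaging code itself does not change.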

2) Break down the distribution into 10% bins, as in
>n = length(exp1)
>r = rank(-exp1)   # rank 1 = most expressed gene
>top10    = geneNames(eset)[r > 0*(n/10) & r <= 1*(n/10)]
>top10_20 = geneNames(eset)[r > 1*(n/10) & r <= 2*(n/10)]
>top20_30 = geneNames(eset)[r > 2*(n/10) & r <= 3*(n/10)]

Or is there a better way? [I'm pretty sure there is a more elegant R way than
that...]
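
One way the binning could be done in a single pass, as a hedged sketch: rank
the genes from most to least expressed, turn each rank into a decile number,
and split the gene names by decile. The vector exp1 below is simulated (50
made-up genes) and only stands in for the per-gene averages; the names r,
decile and bins are illustrative.

```r
# Simulated per-gene averages for 50 hypothetical genes.
set.seed(2)
exp1 <- setNames(rnorm(50, mean = 8, sd = 2), paste0("gene", 1:50))

# Rank so that 1 = most expressed; ties (if any) are broken by position.
r <- rank(-exp1, ties.method = "first")

# Map ranks 1..n onto decile numbers 1..10 (1 = top 10%).
decile <- ceiling(10 * r / length(exp1))

# List of gene-name vectors, one element per decile.
bins <- split(names(exp1), decile)
top10    <- bins[["1"]]   # top 10% most expressed
top10_20 <- bins[["2"]]   # 10% to 20% most expressed
```

Every gene lands in exactly one bin, and the remaining bins are available as
bins[["3"]] through bins[["10"]] without writing out each comparison by hand.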

Many thanks folks

Cheers,

Marco


On 7/26/06 4:05 PM, "Ben Bolstad" <bmb at bmbolstad.com> wrote:

> Actually you need affyPLM loaded to boxplot an exprSet. affy only
> provides the method for AffyBatch objects. Otherwise your example is
> correct.
> 
> Best,
> 
> Ben 
> 
> 
> eg .....
> 
>> library(affy)
> Loading required package: Biobase
> Loading required package: tools
> 
> Welcome to Bioconductor
> 
> 
>     Vignettes contain introductory material.
> 
>     To view, simply type 'openVignette()' or start with 'help(Biobase)'.
> 
>     For details on reading vignettes, see the openVignette help page.
> 
> 
> Loading required package: affyio
>> library(affydata)
>> data(Dilution)
>> eset <- rma(Dilution)
> Background correcting
> Normalizing
> Calculating Expression
>> boxplot(eset) # throws error
> Error in boxplot.default(eset) : invalid first argument
>> library(affyPLM)
> Loading required package: gcrma
> Loading required package: matchprobes
>> boxplot(eset) #works fine.
> 
> On Thu, 2006-07-27 at 10:58 +1200, Marcus Davy wrote:
>> See p. 17 of vignette("affy").
>> 
>> e.g.
>> 
>> chipCols <- rainbow(ncol(exprs(affybatch.example)))
>> boxplot(affybatch.example, col=chipCols)
>> 
>> Marcus
>> 
>> 
>> On 7/27/06 10:40 AM, "Marco Blanchette" <mblanche at berkeley.edu> wrote:
>> 
>>> Thank you all,
>>> 
>>> Using biocLite to download the annotation fixed the problem.
>>> 
>>> Now I am getting into a simpler R problem. I have an exprSet object of 4
>>> arrays:
>>>> eset
>>> Expression Set (exprSet) with
>>>         18952 genes
>>>         4 samples
>>>                  phenoData object with 1 variables and 4 cases
>>>          varLabels
>>>                 sample: arbitrary numbering
>>> 
>>> My goal is to draw a boxplot of the 4 different samples. Surely I can do:
>>>> boxplot(exprs(eset)[,1], exprs(eset)[,2], exprs(eset)[,3],
>>>>         exprs(eset)[,4], col=c(2,3,4,5))
>>> 
>>> But is there an easier way to do this without having to subscript each
>>> individual column? [Right now I have only 4, but when I have 20, I'll
>>> get bored quite rapidly.]
>>> 
>>> Sorry if this sounds easy, I am still learning the basics of R
>>> 
>>> Marco
>>> ______________________________
>>> Marco Blanchette, Ph.D.
>>> 
>>> mblanche at uclink.berkeley.edu
>>> 
>>> Donald C. Rio's lab
>>> Department of Molecular and Cell Biology
>>> 16 Barker Hall
>>> University of California
>>> Berkeley, CA 94720-3204
>>> 
>>> Tel: (510) 642-1084
>>> Cell: (510) 847-0996
>>> Fax: (510) 642-6062
>> 
>> 
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
