[BioC] ExpressionSet class and problems with phenotype and metadata matrices

James W. MacDonald jmacdon at med.umich.edu
Thu Feb 28 19:40:19 CET 2008


Hi Sean,

Sean MacEachern wrote:
> Hello,
> 
> I'm new to R and Bioconductor. I am trying to analyse a simple microarray
> experiment examining two lines: Resistant (R) and susceptible (S) for
> differences in expression levels.
> 
> The data I have contains a file with expression for 4 and 3 replicates from
> the R and S lines respectively. I'm trying to create an ExpressionSet object
> to initially complete some exploratory clustering on the data set and I have
> been following the vignette "An Introduction to Bioconductor's
> ExpressionSet Class" by Falcon et al.
> 
> I have read in my data:
> 
>> summary(AffyIn)
>     lineA.1                 lineB.3
> Min.   :   2.0           Min.   :   2.0
> 1st Qu.:  18.0           1st Qu.:  18.0
> Median :  38.0           Median :  42.0
> Mean   : 139.0           Mean   : 143.4
> 3rd Qu.:  96.0           3rd Qu.: 105.0
> Max.   :6974.0  ......   Max.   :7417.0
> 
>> dim(AffyIn)
> [1] 38483   7
> 
> Following the vignette I have read in a simple phenotype txt file containing
> seven rows which relate to the 7 lines with two phenotypes R and S
> 
>> dim(AffyPheno)
> [1] 7   1
> 
>> summary(AffyPheno)
> Pheno
>  R:4  
>  S:3
> 
>> all(rownames(AffyPheno) == colnames(AffyIn))
> [1] TRUE
> 
> 
> #However, it is after this that I start having some problems; as I am using
> my own data I have modified some of the exercises in the vignette.
> 
>> AffyPheno[c(3,7),c("Pheno")]
> [1] R S
> Levels: R S
> 
> # I was expecting something like the following to be returned:
>         Pheno
> lineA.3   R
> LineB.7   S

You shouldn't expect that. You might want to peruse 'An Introduction to 
R', which I believe should cover this point. What is happening is that 
selecting a single column drops the data.frame structure, so the output 
is coerced to a vector (here a factor). This can be overridden by using

AffyPheno[c(3,7),c("Pheno"), drop=FALSE]
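
To see the difference, here is a minimal sketch using a made-up data.frame. The object `pheno` and its sample names are stand-ins for AffyPheno, not the actual data from this thread:

```r
# Toy stand-in for AffyPheno: 7 samples, one factor column (names are made up)
pheno <- data.frame(
  Pheno = factor(c("R", "R", "R", "R", "S", "S", "S")),
  row.names = c(paste0("lineA.", 1:4), paste0("lineB.", 5:7))
)

# Default behaviour: a single selected column is dropped to a bare factor
as_vector <- pheno[c(3, 7), "Pheno"]
is.data.frame(as_vector)

# drop = FALSE keeps the data.frame, including its row names
as_frame <- pheno[c(3, 7), "Pheno", drop = FALSE]
as_frame
```

The second form is the one that prints the sample names alongside the phenotype.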

> 
> #Also when I try the following command I get this error:
>> AffyPheno[AffyPheno$Pheno == "R"]
> 
> Error in `[.data.frame`(AffyPheno, AffyPheno$Pheno == "R") :
>   undefined columns selected

The error is supposed to be helpful here. You are trying to select rows 
from a data.frame, but without a comma R treats the single index as a 
column selection, and a logical vector of length 7 asks for columns that 
don't exist. The correct incantation looks like this:

AffyPheno[AffyPheno$Pheno == "R", ]

if you want all columns. This again is something that 'An Introduction 
to R' will help with.
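
A small sketch of both forms, again on a made-up stand-in for AffyPheno (the `pheno` object and sample names are assumptions for illustration). One wrinkle worth knowing: because the data.frame has only one column, row selection with a comma still drops the result to a factor unless drop = FALSE is given.

```r
# Toy stand-in for AffyPheno (sample names are made up)
pheno <- data.frame(
  Pheno = factor(c("R", "R", "R", "R", "S", "S", "S")),
  row.names = paste0("sample", 1:7)
)

# No comma: the logical vector is used to select columns, which fails.
# tryCatch captures the error message instead of stopping the script.
no_comma <- tryCatch(
  pheno[pheno$Pheno == "R"],
  error = function(e) conditionMessage(e)
)

# With the comma: rows where Pheno is "R", all columns.
# On a one-column data.frame this comes back as a factor...
dropped <- pheno[pheno$Pheno == "R", ]

# ...so add drop = FALSE to keep the data.frame and its row names
resistant <- pheno[pheno$Pheno == "R", , drop = FALSE]
nrow(resistant)
```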


> 
> #My R programming knowledge is basic at best so I assumed there was
> something wrong there and continued with the metadata and phenoData
> 
>> metadata = data.frame(labelDescrition = c("Status"),rownames=c("Phenotype"))
>> metadata
>   labelDescrition  rownames
> 1          Status Phenotype
> 
>> phenoData=new("AnnotatedDataFrame", data = AffyPheno, varMetadata = metadata)
>> phenoData
> An object of class "AnnotatedDataFrame"
>   rowNames: line6.1, line6.2, ..., line7.4  (7 total)
>   varLabels and varMetadata description:
>     Pheno: NA
>   additional varMetadata: rownames, labelDescription
> 
> 
> # As you can see no error was thrown, but I was expecting something in the
> varLabels and varMetadata descriptions...

I'd have to check to be sure, but I believe what you want for your 
metadata is to explain what the 'Pheno' column contains. So something like

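metadata = data.frame(labelDescription = c("Phenotype"), row.names = "Pheno")

is, IIRC, correct. Note the spelling of 'labelDescription' and the use of 
row.names: in your call, 'labelDescrition' and 'rownames' just became two 
ordinary columns, which is likely why the description shows up as NA. I'm 
actually surprised you didn't get an error. Martin Morgan may respond as 
well, and he knows better than, well, everybody about the ExpressionSet 
class so he will know for sure.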
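
As a rough sketch of how the pieces fit together, assuming a made-up `pheno` data.frame in place of AffyPheno and an illustrative description string: the varMetadata data.frame has one row per column of the phenotype data, with row names matching those column names. The AnnotatedDataFrame step only runs if Biobase is installed.

```r
# Toy stand-in for AffyPheno (sample names are made up)
pheno <- data.frame(
  Pheno = factor(c("R", "R", "R", "R", "S", "S", "S")),
  row.names = paste0("sample", 1:7)
)

# One row per column of pheno; the row name must match the column name.
# The description text is just an example.
metadata <- data.frame(
  labelDescription = "Phenotype: resistant (R) or susceptible (S)",
  row.names = "Pheno"
)

# Sanity check before building the object
stopifnot(identical(rownames(metadata), colnames(pheno)))

# Build the AnnotatedDataFrame only when Biobase is available
if (requireNamespace("Biobase", quietly = TRUE)) {
  library(Biobase)
  phenoData <- new("AnnotatedDataFrame", data = pheno, varMetadata = metadata)
  print(varMetadata(phenoData))
}
```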

> 
> So I thought it was best to check the list to see if anyone could point out
> any mistakes I've made before I continue.
> 
> While I was here I was also wondering if anyone knew of anything in the
> annotation package like the hgu95av2 chip for annotating chicken affy data
> in the annotation library?

Um, what? Not sure what you want here. The hgu95av2 chip is designed for 
analyzing human samples, so there is nothing in there for chickens. If 
you have chicken affy data, then you might want to look at the chicken 
annotation package, which _does_ annotate that chip.

Best,

Jim


> 
> Thanks in advance, 
> 
> Sean MacEachern
> 
> R version 2.6.0 (2007-10-03)
> i386-apple-darwin8.10.1
> Biobase_1.16.3
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
Affymetrix and cDNA Microarray Core
University of Michigan Cancer Center
1500 E. Medical Center Drive
7410 CCGC
Ann Arbor MI 48109
734-647-5623


