[BioC] Use of pData in assignments

Martin Morgan mtmorgan at fhcrc.org
Tue May 1 22:42:35 CEST 2007


Thanks Stephen for the clear and reproducible example!

The problem is that columns of pData need to have associated
descriptions in varMetadata, so after your change

> validObject(expressionSet, complete=TRUE)
Error in validObject(expressionSet, complete = TRUE) : 
	invalid class "ExpressionSet" object: In slot "phenoData" of class
	"AnnotatedDataFrame":
All AnnotatedDataFrame pData column names must be present as rows in
varMetadata, and vice versa

A safer way to do what you want is, for instance,

> expressionSet[["newCol"]] <- sub("rol", "", expressionSet[["type"]])
> phenoData(expressionSet)
  sampleNames: A, B, ..., Z (26 total)
  varLabels and varMetadata:
    sex: Female/Male
    type: Case/Control
    score: Testing Score
    newCol: NA
> validObject(expressionSet, complete=TRUE)
[1] TRUE

(You could also use '$' on expressionSet.) Providing meaningful
metadata requires an additional step:

> varMetadata(expressionSet)["newCol", "labelDescription"] <-
+ "My new column"

though we'd like to make that simpler...

Assigning with pData might still be useful in some circumstances,
e.g., when you are not creating a new column and are doing several
manipulations in sequence. A strategy might be to extract pData,
manipulate the data frame, and reassign.
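
A minimal sketch of that extract, manipulate, reassign strategy. It
assumes a current Biobase and uses the sample.ExpressionSet data set
shipped with it as a stand-in for the data in this thread; the names
eset, p and sub_eset and the particular manipulations are only
illustrative.

```r
library(Biobase)

# sample.ExpressionSet ships with Biobase; it stands in for the poster's data
data(sample.ExpressionSet)
eset <- sample.ExpressionSet

# 1. Extract the phenotype data frame
p <- pData(eset)

# 2. Several manipulations in sequence, none adding or removing a column
p$type  <- factor(sub("rol", "", as.character(p$type)))  # "Control" becomes "Cont"
p$score <- round(p$score, 1)

# 3. Reassign; the column names still match the rows of varMetadata
pData(eset) <- p

# Signals an error if pData columns and varMetadata rows disagree
validObject(phenoData(eset))

# Subsetting works because the two are still in sync
sub_eset <- eset[1:10, c(2, 4, 10)]
```

Because the set of column names is unchanged, the existing
varMetadata rows still describe every column, which is the condition
the validity check above enforces.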

Hope that helps,

Martin


Edwards.Stephen at epamail.epa.gov writes:

>  When running the following code to update the phenoData for my eset, the 
> resulting eset object works fine for downstream analysis.  However, when I 
> try to subset the eset, I get the error noted just above the sessionInfo. 
> Am I doing something wrong in the assignment, or should I not assign using 
> the pData method?  This usage is still documented, so I assumed it was 
> safe.  I know I'm getting some warning messages with the expressionSet in 
> the example, but I get the same error when trying to subset a real dataset 
> created using ReadAffy() and rma() without the corresponding warnings.
>
>> data(sample.exprSet)
>> expressionSet <- as(sample.exprSet,"ExpressionSet")
> Warning messages:
> 1: The phenoData class is deprecated, use AnnotatedDataFrame (with 
> ExpressionSet) instead 
> 2: The phenoData class is deprecated, use AnnotatedDataFrame (with 
> ExpressionSet) instead 
>> isCurrent(expressionSet)
>             R       Biobase          eSet ExpressionSet 
>          TRUE          TRUE          TRUE          TRUE 
>> 
>> expressionSet[1:10,c(2,4,10)]
> ExpressionSet (storageMode: lockedEnvironment)
> assayData: 10 features, 3 samples 
>   element names: exprs, se.exprs 
> phenoData
>   rowNames: B, D, J
>   varLabels and varMetadata:
>     sex: Female/Male
>     type: Case/Control
>     score: Testing Score
> featureData
>   rowNames: AFFX-MurIL2_at, AFFX-MurIL10_at, ..., AFFX-BioDn-5_at (10 
> total)
>   varLabels and varMetadata: none
> experimentData: use 'experimentData(object)'
> Annotation [1] "hgu95av2"
>> 
>> p <- cbind(pData(expressionSet), sub("rol", "", pData(expressionSet)$type))
>> names(p)[4] <- "type2"
>> pData(expressionSet) <- p
>> expressionSet[1:10,c(2,4,10)]
> Error in `row.names<-.data.frame`(`*tmp*`, value = c("sex", "type", 
> "score",  : 
>         invalid 'row.names' length
>> 
>> sessionInfo()
> R version 2.5.0 (2007-04-23) 
> i386-pc-mingw32 
>
> locale:
> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United 
> States.1252;LC_MONETARY=English_United 
> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
>
> attached base packages:
> [1] "tools"     "stats"     "graphics"  "grDevices" "utils"     "datasets" 
> [7] "methods"   "base" 
>
> other attached packages:
>     affy   affyio  Biobase    limma 
> "1.14.0"  "1.4.0" "1.14.0" "2.10.0" 
>>
>
> ----------------------------------------
> Stephen W. Edwards, Ph.D.
> Systems Biologist, ADHIO, NHEERL, ORD, USEPA
> U.S. Environmental Protection Agency (B305-01)
> 109 TW Alexander Drive
> Research Triangle Park, NC  27711
> Ph#: 919/541-0514       FAX#: 919/685-3221
>
>

-- 
Martin Morgan
Bioconductor / Computational Biology
http://bioconductor.org


