[BioC] Expressionset from ArrayExpress processed data

Tue Jan 20 22:21:19 CET 2009

Hi Martin,

Thank you for your suggestions. Here's an example of how to create a data.frame from an SDRF file, as explained in 'ExpressionSetIntroduction.pdf' (provided that the file is in the current working directory):

  pData <- read.table("file.sdrf", row.names = 1, header = TRUE,  sep = "\t")
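The processed expression table can be read in much the same way. This is only a minimal sketch: the file layout (probe IDs in the first column, one column per sample) and the names `processed_file`, `S1` and `S2` are assumptions for illustration, not something taken from a real ArrayExpress file.

```r
# Write a tiny tab-delimited stand-in for a processed-data file.
# Assumed layout: probe IDs in column 1, one column per sample.
processed_file <- tempfile(fileext = ".txt")
writeLines(c("ID\tS1\tS2",
             "probe1\t7.1\t6.9",
             "probe2\t5.4\t5.8"), processed_file)

# Use the first column as row names and keep the sample names unchanged,
# then convert the data.frame to the numeric matrix that exprs= expects.
exprs <- as.matrix(read.delim(processed_file, row.names = 1,
                              check.names = FALSE))
```

`check.names = FALSE` keeps the column names exactly as they appear in the file, which matters later because they have to agree with the row names of the pData.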

From here it is possible to follow your suggestion.

However, I found that my expression data contains 3 replicates per array, and these are not listed separately in the pData (the expression data has three times as many columns as the pData has rows). So, as expected, I get this error:

> eset <- new("ExpressionSet", exprs=exprs, phenoData=phenoData)
Error in validObject(.Object) :
  invalid class "ExpressionSet" object: 1: sample numbers differ between assayData and phenoData
invalid class "ExpressionSet" object: 2: sampleNames differ between assayData and phenoData
In addition: Warning message:
In sampleNames(assayData(object)) == sampleNames(phenoData(object)) :
  longer object length is not a multiple of shorter object length

Any ideas on how I can make them match?
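One direction I can imagine is to repeat each pData row once per replicate column, so that both sides end up with the same sample names. The sketch below uses toy data and assumes the replicate columns are named "<array>.<replicate>" (for example "A1.2"); that naming scheme and the names `array_of_column` and `pData_expanded` are hypothetical, and my real column names may need a different pattern.

```r
# Toy stand-ins; in practice exprs and pData come from the files.
# Assumption: replicate columns are named "<array>.<replicate>".
pData <- data.frame(genotype = c("wt", "mut"),
                    row.names = c("A1", "A2"))
exprs <- matrix(rnorm(4 * 6), nrow = 4,
                dimnames = list(paste0("probe", 1:4),
                                c("A1.1", "A1.2", "A1.3",
                                  "A2.1", "A2.2", "A2.3")))

# Recover the array name of each expression column by dropping the suffix.
array_of_column <- sub("\\.[0-9]+$", "", colnames(exprs))
stopifnot(all(array_of_column %in% rownames(pData)))

# One pData row per expression column, in the same order as the columns.
pData_expanded <- pData[array_of_column, , drop = FALSE]
rownames(pData_expanded) <- colnames(exprs)

# Record which replicate each column is.
pData_expanded$replicate <- as.integer(sub("^.*\\.", "", colnames(exprs)))
```

The alternative would be to go the other way and collapse each group of three replicate columns into one (for instance with rowMeans), so that the expression data has one column per pData row.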

Thanks
Yovanny

________________________________________
From: Martin Morgan [mtmorgan at fhcrc.org]
Sent: Tuesday, 20 January 2009 9:26
To: Yovanny Izquierdo Núñez
CC: bioconductor at stat.math.ethz.ch
Subject: Re: [BioC] Expressionset from ArrayExpress processed data

Hi Yovanny

Yovanny Izquierdo Núñez <yovanny at ibp.co.cu> writes:

> Dear BioC users,
>
> I'm working with experiments from the ArrayExpress database and some
> of them do not provide the CEL files, but instead the already
> processed data in a table format (easy to read with read.delim, for
> instance). The PhenoData of the experiment comes separately in the
> sdrf file. Is there a way to create an ExpressionSet object from these
> two?  The ArrayExpress package only provides functions for creating an

See the 'ExpressionSetIntroduction.pdf' in the Biobase package

  http://bioconductor.org/packages/2.3/bioc/html/Biobase.html

I don't know how to parse the PhenoData into a data.frame, but once
done likely you'll be able to do

  phenoData <- new("AnnotatedDataFrame", data=PhenoData)
  eset <- new("ExpressionSet", exprs=exprs, phenoData=phenoData)
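Before building the object it may help to compare the sample names on both sides, since the ExpressionSet validity check requires the column names of exprs and the row names of the pheno data to agree. A minimal base-R sketch on toy data; the names `only_in_exprs`, `only_in_pheno` and `names_match` are made up for illustration:

```r
# Toy stand-ins for the expression matrix and the parsed SDRF table.
exprs <- matrix(1:6, nrow = 2,
                dimnames = list(c("probe1", "probe2"),
                                c("S1", "S2", "S3")))
PhenoData <- data.frame(treatment = c("a", "b"),
                        row.names = c("S1", "S2"))

# Samples present on one side only; both vectors should be empty.
only_in_exprs <- setdiff(colnames(exprs), rownames(PhenoData))
only_in_pheno <- setdiff(rownames(PhenoData), colnames(exprs))

# TRUE only when the names agree in both content and order.
names_match <- identical(colnames(exprs), rownames(PhenoData))
```

Here `only_in_exprs` contains the extra sample "S3", so the two tables would need to be reconciled before calling new("ExpressionSet", ...).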

Martin

> AffyBatch object from the raw data and the sdrf, adf and idf files;
> but has nothing so far to deal with the processed data.
>
> Thanks so much,
>
> Yovanny
>
> Instituto de Biotecnología de las Plantas Universidad Central "Marta
> Abreu" de Las Villas Carretera a Camajuaní km 5½, Santa Clara, Villa
> Clara, Cuba Tel: 53 (42) 281257, 281268, 281693 Fax: 53 (42) 281329
> Web: http://www.ibp.co.cu E-Mail: info at ibp.co.cu

--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793
