[BioC] Combining expressionSets from GEO

Francois Pepin fpepin at cs.mcgill.ca
Wed Jan 30 19:24:12 CET 2008


Hi Martin,

I think it is related: after updating Biobase (now 1.16.2, see
sessionInfo() below) I get a different error message, along with a
series of warnings. In those warnings, 255 and 98 are the numbers of
samples in the two ExpressionSets, and 66 and 21 are the numbers of
unique elements in source_name_ch1 in the phenoData.

> tmp2<-combine(tmp[[1]],tmp[[2]])
Error in .local(x, y, ...) :
  data.frames contain conflicting data:
        non-conforming colname(s): title, geo_accession,
source_name_ch1, description, supplementary_file
In addition: Warning messages:
1: In alleq(levels(x[[nm]]), levels(y[[nm]])) :
  Lengths (255, 98) differ (string compare on first 98)98 string
mismatches
2: In switch(class(x[[nm]])[[1]], factor = { :
  data frame column 'title' levels not all.equal
3: In alleq(levels(x[[nm]]), levels(y[[nm]])) :
  Lengths (255, 98) differ (string compare on first 98)98 string
mismatches
4: In switch(class(x[[nm]])[[1]], factor = { :
  data frame column 'geo_accession' levels not all.equal
5: In alleq(levels(x[[nm]]), levels(y[[nm]])) :
  Lengths (66, 21) differ (string compare on first 21)21 string
mismatches
6: In switch(class(x[[nm]])[[1]], factor = { :
  data frame column 'source_name_ch1' levels not all.equal
7: In alleq(levels(x[[nm]]), levels(y[[nm]])) :
  Lengths (255, 98) differ (string compare on first 98)98 string
mismatches
8: In switch(class(x[[nm]])[[1]], factor = { :
  data frame column 'description' levels not all.equal
9: In alleq(levels(x[[nm]]), levels(y[[nm]])) :
  Lengths (255, 98) differ (string compare on first 98)98 string
mismatches
10: In switch(class(x[[nm]])[[1]], factor = { :
  data frame column 'supplementary_file' levels not all.equal

> traceback()
9: stop("data.frames contain conflicting data:", "\n\tnon-conforming
colname(s): ",
       paste(sharedCols[!ok], collapse = ", "))
8: .local(x, y, ...)
7: combine(pDataX, pDataY)
6: combine(pDataX, pDataY)
5: .local(x, y, ...)
4: combine(phenoData(x), phenoData(y))
3: combine(phenoData(x), phenoData(y))
2: combine(tmp[[1]], tmp[[2]])
1: combine(tmp[[1]], tmp[[2]])
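
The warnings above can be reproduced without any GEO data, which may help
narrow things down. The following is only a minimal sketch in base R: pd1
and pd2 are toy stand-ins for the two phenoData data frames, and
to_character() is a hypothetical helper, not a Biobase or GEOquery
function. It shows that all.equal() on factor levels of different lengths
returns a character vector rather than FALSE, which is consistent with
both the "Lengths ... differ" warnings and the earlier
"invalid 'x' type in 'x && y'" error, and that the mismatch goes away
once the factor columns are converted to character.

```r
# Toy stand-ins for pData(tmp[[1]]) and pData(tmp[[2]]); the names pd1/pd2
# and the helper to_character() are hypothetical, for illustration only.
pd1 <- data.frame(title = c("a", "b", "c"),
                  row.names = c("S1", "S2", "S3"),
                  stringsAsFactors = TRUE)
pd2 <- data.frame(title = c("d", "e"),
                  row.names = c("S4", "S5"),
                  stringsAsFactors = TRUE)

# all.equal() returns a character vector describing the mismatch, not FALSE.
cmp <- all.equal(levels(pd1$title), levels(pd2$title))
levels_match <- isTRUE(cmp)

# Using that character result directly in && raises an error;
# wrapping the comparison in isTRUE() avoids it.
and_fails <- inherits(try(cmp && TRUE, silent = TRUE), "try-error")

# Convert every factor column to character so that there are no
# levels left to disagree about.
to_character <- function(df) {
  is_factor <- vapply(df, is.factor, logical(1))
  df[is_factor] <- lapply(df[is_factor], as.character)
  df
}

# With character columns the two tables stack without level conflicts.
combined <- rbind(to_character(pd1), to_character(pd2))
```

For the real objects, the analogous step would be to apply such a
conversion to pData() of each ExpressionSet before calling combine().
Whether that is enough to get past the error with Biobase 1.16.2 is an
assumption that has not been tested here.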

> sessionInfo()
R version 2.6.0 (2007-10-03)
x86_64-unknown-linux-gnu

locale:
LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] GEOquery_2.2.0 RCurl_0.8-1    Biobase_1.16.2

loaded via a namespace (and not attached):
[1] rcompgen_0.1-15

Francois

On Wed, 2008-01-30 at 10:03 -0800, Martin Morgan wrote:
> Hi Francois -- this might be related to a bug in Biobase that has been
> fixed. Can you try to update your Biobase, either biocLite('Biobase')
> or following the directions at http://bioconductor.org/download ? If
> not, can you provide the output of traceback() after the error occurs?
> 
> Thanks,
> 
> Martin
> 
> Francois Pepin <fpepin at cs.mcgill.ca> writes:
> 
> > Hi everyone,
> >
> > I'm getting an error message when trying to combine two parts of a GSE
> > object:
> >
> >> tmp<-getGEO('GSE3526',GSEMatrix=TRUE)
> >> tmp2<-combine(tmp[[1]],tmp[[2]])
> > Error in alleq(levels(x[[nm]]), levels(y[[nm]])) && alleq(x
> > [sharedRows,  :
> >   invalid 'x' type in 'x && y'
> >
> > Checking to make sure that I should be able to combine them (from the
> > eSet documentation):
> >
> > #eSets must have identical numbers of 'featureNames'
> >> all(featureNames(tmp[[1]])==featureNames(tmp[[2]]))
> > [1] TRUE
> >
> > #must have distinct 'sampleNames'
> >> any(sampleNames(tmp[[1]])%in%sampleNames(tmp[[2]]))
> > [1] FALSE
> >
> > #and must have identical 'annotation'.
> >> annotation(tmp[[1]])==annotation(tmp[[2]])
> > [1] TRUE
> >
> >> sessionInfo()
> > R version 2.6.0 (2007-10-03)
> > x86_64-unknown-linux-gnu
> >
> > locale:
> > LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
> >
> > attached base packages:
> > [1] tools     stats     graphics  grDevices utils     datasets  methods
> > [8] base
> >
> > other attached packages:
> > [1] GEOquery_2.2.0 RCurl_0.8-1    Biobase_1.16.0
> >
> > loaded via a namespace (and not attached):
> > [1] rcompgen_0.1-15
> >
> > Does anyone know why that is happening and if there would be any way
> > around it?
> >
> > Francois
> >
>


