[BioC] In need of platform clarification for 2-color SMD arrays

Ochsner, Scott A sochsner at bcm.tmc.edu
Mon Nov 10 17:32:52 CET 2008

Dear BioC,

I've used the GEOquery package as below to download GSE7585.  This created a list of two ExpressionSets, as expected, one for each platform.

> gse<-getGEO("GSE7585",GSEMatrix=TRUE)
> names(gse)
[1] "GSE7585-GPL3417_series_matrix.txt.gz" "GSE7585-GPL5118_series_matrix.txt.gz"
> exprs(gse[[1]])[1:5,]
  GSM183597 GSM183598 GSM183599
1     1.203        NA        NA
2    -1.087     0.874    -1.236
3     0.384    -0.036     0.253
4    -2.443     1.641        NA
5    -1.518     4.661        NA
> exprs(gse[[2]])[1:5,]
  GSM183596 GSM183600 GSM183601
1    -3.237        NA        NA
2    -1.500    -1.423    -1.377
3    -0.007     0.144     0.386
4        NA    -4.547        NA
5    -0.258    -0.374    -0.492

The experiment consists of three treatment groups, each performed in duplicate.  However, the replicates have been split between two SMD platforms/prints.  For example, group A replicate 1 is on GPL3417 and group A replicate 2 is on GPL5118.  I would like to combine the two ExpressionSets prior to doing differential expression analysis.  The two platforms GPL3417 and GPL5118 correspond to SMD platforms SHGA and SHEU respectively.  Looking at each GPL entry, these two arrays appear to have identical features in the same order.  Can someone please clarify the nature of the difference between the SHGA and SHEU platforms?  I'm trying to ascertain whether I can "legally" combine the two ExpressionSets as is.
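
For illustration, the "identical features in the same order" check and the combining step might be sketched as below. This is only a sketch under stated assumptions, not a verdict on whether SHGA and SHEU are comparable: it uses base R on the five rows printed above (in a live session the matrices would come from exprs(gse[[1]]) and exprs(gse[[2]])), and the names e1, e2, combined and platform are hypothetical.

```r
# First five rows of each series matrix, copied from the output above.
# In a live session these would be exprs(gse[[1]]) and exprs(gse[[2]]).
e1 <- matrix(c( 1.203,     NA,     NA,
               -1.087,  0.874, -1.236,
                0.384, -0.036,  0.253,
               -2.443,  1.641,     NA,
               -1.518,  4.661,     NA),
             nrow = 5, byrow = TRUE,
             dimnames = list(as.character(1:5),
                             c("GSM183597", "GSM183598", "GSM183599")))
e2 <- matrix(c(-3.237,     NA,     NA,
               -1.500, -1.423, -1.377,
               -0.007,  0.144,  0.386,
                   NA, -4.547,     NA,
               -0.258, -0.374, -0.492),
             nrow = 5, byrow = TRUE,
             dimnames = list(as.character(1:5),
                             c("GSM183596", "GSM183600", "GSM183601")))

# Combine only if both matrices index the same features in the same order.
# Matching row names are necessary but not sufficient: the probe annotation
# of each platform (e.g. fData() of each ExpressionSet) should agree as well.
same_features <- identical(rownames(e1), rownames(e2))
stopifnot(same_features)

# Column-bind the two platforms into a single expression matrix.
combined <- cbind(e1, e2)

# Keep track of which print each array came from, so that platform can be
# treated as a batch (blocking) factor in the downstream analysis.
platform <- factor(rep(c("GPL3417", "GPL5118"), c(ncol(e1), ncol(e2))))
```

Even if the feature check passes, carrying the platform factor into the differential expression model would let any print-to-print effect be accounted for rather than assumed absent.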

R version 2.8.0 (2008-10-20) 

LC_COLLATE=English_United States.1252;LC_CTYPE=English_United States.1252;LC_MONETARY=English_United States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] splines   tools     stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] affycoretools_1.14.0 annaffy_1.14.0       KEGG.db_2.2.5        gcrma_2.14.0         matchprobes_1.14.0   biomaRt_1.16.0       GOstats_2.8.0       
 [8] Category_2.8.0       genefilter_1.22.0    survival_2.34-1      RBGL_1.18.0          annotate_1.20.0      xtable_1.5-4         GO.db_2.2.5         
[15] RSQLite_0.7-1        DBI_0.2-4            AnnotationDbi_1.4.0  graph_1.20.0         affy_1.20.0          limma_2.16.2         GEOquery_2.6.0      
[22] RCurl_0.91-0         Biobase_2.2.0       

loaded via a namespace (and not attached):
[1] affyio_1.10.0        cluster_1.11.11      GSEABase_1.4.0       preprocessCore_1.4.0 XML_1.94-0.1             

Scott A. Ochsner, Ph.D.
NURSA Bioinformatics
Molecular and Cellular Biology
Baylor College of Medicine
Houston, TX. 77030
phone: 713-798-6227 
