[BioC] Help with interpreting GEO derived data values

Ochsner, Scott A sochsner at bcm.edu
Fri Jul 15 21:28:48 CEST 2011


Hi,

I've imported a GSE object using:

> gse<-getGEO("GSE444")
> gse <- gse$GSE444_series_matrix.txt.gz
> class(gse)
[1] "ExpressionSet"
attr(,"package")
[1] "Biobase"

The data look like:
> head(exprs(gse))
          GSM6778 GSM6779 GSM6780 GSM6781 GSM6782 GSM6783 GSM6784 GSM6785 GSM6786 GSM6787
1000_at   10763.1 12469.0  9200.9  8988.6  8021.9 13376.4  9657.0  9835.3 10779.6  8374.2
1001_at     721.7  1538.3   827.9  1123.3   828.3  1699.9   864.9   989.2   940.2  1012.0
1002_f_at   606.4   285.9   364.1   651.4   347.2   794.6   770.5   573.1  1239.8  1324.0
1003_s_at  -687.9 -3701.3 -1040.8  -754.0  -748.6  -671.6 -1103.8 -1509.7  -769.9 -4944.8
1004_at    1224.6  -714.3   690.4  1130.4   780.9  1002.9  2096.1   594.1   773.2   813.7
1005_at    -119.7  -408.7    93.9   182.3  -209.3   705.0  -178.7   394.1  -161.2  -910.4

Now, the above values don't look normalized. From the GSE444 summary I read: "The raw data are presented. The computations in the manuscript were based on the fold-changes reported by the MAS 4.0 software, not on ratios taken from raw data."
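
To make the "not normalized" impression concrete, here is a minimal sketch in base R. It uses only the first three sample columns from the head() output above, typed in by hand; the object names (vals, value_range, n_negative, frac_negative) are made up for illustration and are not part of GEOquery or Biobase. It only checks the range and counts the negative values; it is not a normalization method.

```r
# First three samples from the head(exprs(gse)) output above, entered by hand.
# 'vals' is a hypothetical stand-in for exprs(gse)[1:6, 1:3].
vals <- matrix(
  c(10763.1, 12469.0,  9200.9,
      721.7,  1538.3,   827.9,
      606.4,   285.9,   364.1,
     -687.9, -3701.3, -1040.8,
     1224.6,  -714.3,   690.4,
     -119.7,  -408.7,    93.9),
  ncol = 3, byrow = TRUE,
  dimnames = list(
    c("1000_at", "1001_at", "1002_f_at", "1003_s_at", "1004_at", "1005_at"),
    c("GSM6778", "GSM6779", "GSM6780")))

# Smallest and largest value: a wide, linear-scale spread rather than log2 values.
value_range <- range(vals)

# Number and fraction of negative entries; log2() of these would give NaN.
n_negative    <- sum(vals < 0)
frac_negative <- mean(vals < 0)

print(value_range)
print(n_negative)
print(frac_negative)
```

On the full object the same checks would be range(exprs(gse)) and sum(exprs(gse) < 0), assuming gse is the ExpressionSet extracted above.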

It appears that the investigators have provided some sort of raw data, presumably the MAS 4.0 PM values, rather than processed, normalized expression values. I would have used the .CEL files if the authors had bothered to deposit them. I would greatly appreciate it if anyone could shed some light on the nature of the above values and on how to go about normalizing them.

Regards,     

Scott

> sessionInfo()
R version 2.13.0 (2011-04-13)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252 LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] tools     stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] GEOquery_2.19.2      hgu95av2.db_2.5.0    org.Hs.eg.db_2.5.0   RSQLite_0.9-4        DBI_0.2-5            AnnotationDbi_1.14.1 limma_3.8.2          affy_1.30.0         
[9] Biobase_2.12.2      

loaded via a namespace (and not attached):
[1] affyio_1.20.0         preprocessCore_1.14.0 RCurl_1.6-6.1         XML_3.4-0.2


Scott A. Ochsner, PhD
Baylor College of Medicine
One Baylor Plaza
Mail Stop: BCM-130
Houston, TX 77030
Voice: (713) 798-6227
Fax: (713) 790-1275

