[BioC] Reading Affy CEL files

Ranjani R [guest] guest at bioconductor.org
Fri May 31 18:53:41 CEST 2013


I am a newbie to Affy. Thanks for your help.

I am processing CEL files in R with the affy package and have run into some basic issues that I could not find satisfactory answers to by searching online.
The chip used is hugene11stv1. I am also using the hugene11stprobeset.db package to try to translate probeset IDs to gene symbols.
Essentially, I want to create a file of gene expression data, with a genes * samples matrix as the final result.
 
Code:
library(affy)
setwd(wDir)  # wDir: directory containing the CEL files
Data <- ReadAffy()
eset <- rma(Data)
write.exprs(eset, file = "geneExpData.txt", sep = "\t", quote = FALSE)
 
When I inspect the written file, the number of columns is what I expect (the number of samples), but there are 33,297 genes (rows).
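
A minimal sketch of how the written file can be read back to confirm its dimensions. It uses a made-up toy matrix in place of exprs(eset), and it assumes the file layout is the tab-separated one that write.table() produces with col.names = NA; the names expr, out_file and back are illustrative only.

```r
# Toy genes x samples matrix standing in for exprs(eset); values are made up.
expr <- matrix(c(7.1, 8.3, 6.5, 9.0, 5.2, 7.7), nrow = 3,
               dimnames = list(c("id1", "id2", "id3"), c("s1.CEL", "s2.CEL")))

# Write it as a tab-separated file with row names in the first column.
out_file <- tempfile(fileext = ".txt")
write.table(expr, file = out_file, sep = "\t", quote = FALSE, col.names = NA)

# Read it back, using the first column as row names and keeping the
# sample names unchanged.
back <- as.matrix(read.delim(out_file, row.names = 1, check.names = FALSE))

# Rows are features (IDs), columns are samples.
print(dim(back))
print(head(rownames(back)))
```

For the real data, the same read.delim() call on "geneExpData.txt" should show whether the row count matches nrow(exprs(eset)).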
Please help me understand a few fundamental aspects here:
 
1. I tried translating these Affy IDs to gene symbols to see if that would make my analysis easier.
    Here are the things I tried:
 
    Try 1:
    symbols <- getSYMBOL(as.character(expr.matrix[,1]), "hugene11stprobeset")
    This is not quite working: only ~175 of the probeset IDs get translated.
    Try 2:
    symbs <- mget(featureNames(eset), hugene11stprobesetSYMBOL, ifnotfound = NA)
    symbs <- unlist(symbs)
    mat <- eset  # make a copy
    featureNames(mat) <- ifelse(!is.na(symbs), symbs, featureNames(mat))
   
    This gives many NAs.
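
To put a number on "many NAs", the lookup pattern from Try 2 can be sketched on toy data. This is only an illustration: toy_map is a plain R environment standing in for the annotation map, and the IDs and symbols (GENE_A, GENE_B) are hypothetical. One thing worth checking with the real data, stated as an assumption rather than a diagnosis, is whether the IDs in featureNames(eset) are at the same level (probeset vs. transcript cluster) as the keys of the annotation package; a mismatch would produce mostly NAs.

```r
# Toy stand-in for an annotation map: ID -> gene symbol (hypothetical values).
toy_map <- new.env()
assign("7892501", "GENE_A", envir = toy_map)
assign("7892502", "GENE_B", envir = toy_map)

# IDs to translate; the last two have no entry in the toy map.
ids <- c("7892501", "7892502", "7892503", "7892504")

# Same lookup pattern as Try 2, using base R's mget() on an environment.
symbs <- unlist(mget(ids, envir = toy_map, ifnotfound = NA))

# How many IDs received a symbol, and what fraction of the total?
n_mapped <- sum(!is.na(symbs))
frac_mapped <- mean(!is.na(symbs))

# Fall back to the original ID where no symbol exists.
labels <- ifelse(is.na(symbs), ids, symbs)

print(n_mapped)
print(frac_mapped)
print(labels)
```

With the real objects, sum(!is.na(symbs)) and head(featureNames(eset)) would show how many IDs map and what the IDs look like.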
 
Can you please help me understand what is happening here?


 -- output of sessionInfo(): 

R version 2.15.3 (2013-03-01)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] hugene11stv1cdf_2.3.0 affy_1.36.1           Biobase_2.18.0
[4] BiocGenerics_0.4.0

loaded via a namespace (and not attached):
[1] affyio_1.26.0         BiocInstaller_1.8.3   preprocessCore_1.20.0
[4] tools_2.15.3          zlibbioc_1.4.0


--
Sent via the guest posting facility at bioconductor.org.
