[BioC] oligo package reading in cel files

Tue Oct 29 18:16:42 CET 2013

Hi Manjula,

I hate it when the obvious escapes me.

Note that the argument list for read.celfiles() starts with an ellipsis 
(...). I won't get into details here except to say that this means that 
anything you pass to the function that isn't _exactly_ matched by an 
argument name will get sucked up by that first term.
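
To make that concrete, here is a minimal sketch in base R. Note that 
toy_reader() is a made-up stand-in, not the real read.celfiles() code; 
it only mimics a signature that begins with '...' plus a check that 
every file name is character.

```r
# Hypothetical stand-in for a function whose formals begin with `...`.
# Arguments that follow `...` are matched only by their exact name.
toy_reader <- function(..., filenames = character(), phenoData = NULL) {
  extras <- list(...)                      # everything that was not matched
  all_files <- c(extras, as.list(filenames))
  files_ok <- all(vapply(all_files, is.character, logical(1)))
  list(n_extras = length(extras),
       extras_names = names(extras),
       files_ok = files_ok)
}

pheno <- data.frame(group = c("A", "B"))

# Exact name: phenoData is matched, nothing falls into `...`
right <- toy_reader(filenames = c("a.CEL", "b.CEL"), phenoData = pheno)

# Wrong case: phenodata is not a formal argument, so it lands in `...`
# and is then treated as if it were one more file name
wrong <- toy_reader(filenames = c("a.CEL", "b.CEL"), phenodata = pheno)

right$files_ok      # TRUE
wrong$extras_names  # "phenodata"
wrong$files_ok      # FALSE
```

In the second call the data frame ends up among the "file names", 
which is the same flavour of failure as the is.character() error.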

You are passing in an argument phenodata = pd. The problem is that 
there is no 'phenodata' argument; it is phenoData, and R is case 
sensitive! Since you used an incorrect argument name, your 'pd' object 
is assumed to be a filename and an attempt is made to parse it. And of 
course it isn't character, so you get the error.
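
One way to catch this sort of typo before calling the function is to 
compare the names you intend to pass against the function's formal 
arguments. A rough sketch; unmatched_args() and toy_reader() are names 
I made up for illustration, and neither is part of oligo:

```r
# Return the supplied argument names that are not formal arguments of `fun`.
unmatched_args <- function(fun, supplied) {
  setdiff(supplied, names(formals(fun)))
}

# Hypothetical function with a `...`-first signature, used only for the demo.
toy_reader <- function(..., filenames, phenoData) NULL

bad  <- unmatched_args(toy_reader, c("filenames", "phenodata"))
good <- unmatched_args(toy_reader, c("filenames", "phenoData"))

bad   # "phenodata"
good  # character(0)
```

With oligo loaded, the same helper could presumably be pointed at 
read.celfiles itself, assuming it is an ordinary R function.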

I would imagine it will work correctly if you use

exp.data = read.celfiles(filenames=celFiles,phenoData=pd)

or alternatively

exp.data <- read.celfiles(filenames = celFiles)
pData(exp.data) <- pData(pd)

Best,

Jim

On Tuesday, October 29, 2013 12:59:57 PM, Manjula Kasoji wrote:
> It seems that the problem is with attaching the phenodata.
>
> > celFiles <- list.celfiles()
> > celFiles
>  [1] "A1_MoGene-2_0-st.CEL" "A2_MoGene-2_0-st.CEL" "A3_MoGene-2_0-st.CEL" "A4_MoGene-2_0-st.CEL"
>  [5] "A5_MoGene-2_0-st.CEL" "B1_MoGene-2_0-st.CEL" "B2_MoGene-2_0-st.CEL" "B3_MoGene-2_0-st.CEL"
>  [9] "B4_MoGene-2_0-st.CEL" "B5_MoGene-2_0-st.CEL" "C1_MoGene-2_0-st.CEL" "C2_MoGene-2_0-st.CEL"
> [13] "C3_MoGene-2_0-st.CEL" "C4_MoGene-2_0-st.CEL" "C5_MoGene-2_0-st.CEL" "D1_MoGene-2_0-st.CEL"
> [17] "D2_MoGene-2_0-st.CEL" "D3_MoGene-2_0-st.CEL" "D4_MoGene-2_0-st.CEL" "D5_MoGene-2_0-st.CEL"
> [21] "E1_MoGene-2_0-st.CEL" "E2_MoGene-2_0-st.CEL" "E3_MoGene-2_0-st.CEL" "E4_MoGene-2_0-st.CEL"
> [25] "E5_MoGene-2_0-st.CEL" "F1_MoGene-2_0-st.CEL" "F2_MoGene-2_0-st.CEL" "F3_MoGene-2_0-st.CEL"
> [29] "F4_MoGene-2_0-st.CEL" "F5_MoGene-2_0-st.CEL" "G1_MoGene-2_0-st.CEL" "G2_MoGene-2_0-st.CEL"
> [33] "G3_MoGene-2_0-st.CEL" "G4_MoGene-2_0-st.CEL" "G5_MoGene-2_0-st.CEL" "H1_MoGene-2_0-st.CEL"
> [37] "H2_MoGene-2_0-st.CEL" "H3_MoGene-2_0-st.CEL" "H4_MoGene-2_0-st.CEL" "H5_MoGene-2_0-st.CEL"
> [41] "I1_MoGene-2_0-st.CEL" "I2_MoGene-2_0-st.CEL" "I3_MoGene-2_0-st.CEL" "I4_MoGene-2_0-st.CEL"
> [45] "I5_MoGene-2_0-st.CEL"
>
> > exp.data = read.celfiles(filenames=celFiles)
> Platform design info loaded.
> Reading in : A1_MoGene-2_0-st.CEL
> Reading in : A2_MoGene-2_0-st.CEL
> Reading in : A3_MoGene-2_0-st.CEL
> Reading in : A4_MoGene-2_0-st.CEL
> Reading in : A5_MoGene-2_0-st.CEL
> Reading in : B1_MoGene-2_0-st.CEL
> Reading in : B2_MoGene-2_0-st.CEL
> Reading in : B3_MoGene-2_0-st.CEL
>
> ...
>
> > exp.data
> GeneFeatureSet (storageMode: lockedEnvironment)
> assayData: 2598544 features, 45 samples
>   element names: exprs
> protocolData
>   rowNames: A1_MoGene-2_0-st.CEL A2_MoGene-2_0-st.CEL ... I5_MoGene-2_0-st.CEL (45
>     total)
>   varLabels: exprs dates
>   varMetadata: labelDescription channel
> phenoData
>   rowNames: A1_MoGene-2_0-st.CEL A2_MoGene-2_0-st.CEL ... I5_MoGene-2_0-st.CEL (45
>     total)
>   varLabels: index
>   varMetadata: labelDescription channel
> featureData: none
> experimentData: use 'experimentData(object)'
> Annotation: pd.mogene.2.0.st
>
> > exp.data = read.celfiles(filenames=celFiles,phenodata=pd)
> Error: is.character(filenames) is not TRUE
>
> Here is how I create my pd variable:
>
> pd<-read.AnnotatedDataFrame("covdesc.txt",header=TRUE,row.name="Filename" ,sep="\t")
>
> > pData(pd)
>                      celltype treatment time group
> A1_MoGene-2_0-st.CEL    MACRO  PBS-CTRL  24H     A
> A2_MoGene-2_0-st.CEL    MACRO  PBS-CTRL  24H     A
> A3_MoGene-2_0-st.CEL    MACRO  PBS-CTRL  24H     A
> A4_MoGene-2_0-st.CEL    MACRO  PBS-CTRL  24H     A
> A5_MoGene-2_0-st.CEL    MACRO  PBS-CTRL  24H     A
> B1_MoGene-2_0-st.CEL    MACRO    SILICA  24H     B
> ...
>
> Any suggestions on why/how this is incorrect?
>
> Thanks.
>
>
>
>
> On Tue, Oct 29, 2013 at 10:22 AM, Manjula Kasoji
> <manjula.kasoji at gmail.com> wrote:
>
>     Jim,
>
>     I'm still getting the same error message:
>
>     > celfiles <- gsub("\\(", "\\\\(", gsub("\\)", "\\\\)", celfiles))
>     >
>     >
>     > exp.data = read.celfiles(filenames=celfiles,phenodata=pd)
>
>     Error: is.character(filenames) is not TRUE
>
>     > celfiles
>      [1] "A1_\\(MoGene-2_0-st\\).CEL" "A2_\\(MoGene-2_0-st\\).CEL" "A3_\\(MoGene-2_0-st\\).CEL"
>      [4] "A4_\\(MoGene-2_0-st\\).CEL" "A5_\\(MoGene-2_0-st\\).CEL" "B1_\\(MoGene-2_0-st\\).CEL"
>      [7] "B2_\\(MoGene-2_0-st\\).CEL" "B3_\\(MoGene-2_0-st\\).CEL" "B4_\\(MoGene-2_0-st\\).CEL"
>     [10] "B5_\\(MoGene-2_0-st\\).CEL" "C1_\\(MoGene-2_0-st\\).CEL" "C2_\\(MoGene-2_0-st\\).CEL"
>     [13] "C3_\\(MoGene-2_0-st\\).CEL" "C4_\\(MoGene-2_0-st\\).CEL" "C5_\\(MoGene-2_0-st\\).CEL"
>     [16] "D1_\\(MoGene-2_0-st\\).CEL" "D2_\\(MoGene-2_0-st\\).CEL" "D3_\\(MoGene-2_0-st\\).CEL"
>     [19] "D4_\\(MoGene-2_0-st\\).CEL" "D5_\\(MoGene-2_0-st\\).CEL" "E1_\\(MoGene-2_0-st\\).CEL"
>     [22] "E2_\\(MoGene-2_0-st\\).CEL" "E3_\\(MoGene-2_0-st\\).CEL" "E4_\\(MoGene-2_0-st\\).CEL"
>     [25] "E5_\\(MoGene-2_0-st\\).CEL" "F1_\\(MoGene-2_0-st\\).CEL" "F2_\\(MoGene-2_0-st\\).CEL"
>     [28] "F3_\\(MoGene-2_0-st\\).CEL" "F4_\\(MoGene-2_0-st\\).CEL" "F5_\\(MoGene-2_0-st\\).CEL"
>     [31] "G1_\\(MoGene-2_0-st\\).CEL" "G2_\\(MoGene-2_0-st\\).CEL" "G3_\\(MoGene-2_0-st\\).CEL"
>     [34] "G4_\\(MoGene-2_0-st\\).CEL" "G5_\\(MoGene-2_0-st\\).CEL" "H1_\\(MoGene-2_0-st\\).CEL"
>     [37] "H2_\\(MoGene-2_0-st\\).CEL" "H3_\\(MoGene-2_0-st\\).CEL" "H4_\\(MoGene-2_0-st\\).CEL"
>     [40] "H5_\\(MoGene-2_0-st\\).CEL" "I1_\\(MoGene-2_0-st\\).CEL" "I2_\\(MoGene-2_0-st\\).CEL"
>     [43] "I3_\\(MoGene-2_0-st\\).CEL" "I4_\\(MoGene-2_0-st\\).CEL" "I5_\\(MoGene-2_0-st\\).CEL"
>     >
>     >
>
>
>
>
>     On Tue, Oct 29, 2013 at 10:17 AM, James W. MacDonald
>     <jmacdon at uw.edu> wrote:
>
>         Hi Manjula,
>
>         Try escaping the parentheses in those file names (in general
>         parentheses are not a good thing to have in a file name).
>
>         celfiles <- gsub("\\(", "\\\\(", gsub("\\)", "\\\\)", celfiles))
>
>         Best,
>
>         Jim
>
>
>
>
>         On Tuesday, October 29, 2013 10:01:12 AM, Manjula Kasoji wrote:
>
>             Hi Jim,
>
>             Thank you for the quick response.
>
>             exp.data <- read.celfiles(list.celfiles(), phenodata=pd)
>             also does not work:
>
>
>             > exp.data = read.celfiles(filenames=list.celfiles(),phenodata=pd)
>
>             Error: is.character(filenames) is not TRUE
>
>
>             Output of celfiles:
>
>
>             > celfiles
>
>              [1] "A1_(MoGene-2_0-st).CEL" "A2_(MoGene-2_0-st).CEL" "A3_(MoGene-2_0-st).CEL"
>              [4] "A4_(MoGene-2_0-st).CEL" "A5_(MoGene-2_0-st).CEL" "B1_(MoGene-2_0-st).CEL"
>              [7] "B2_(MoGene-2_0-st).CEL" "B3_(MoGene-2_0-st).CEL" "B4_(MoGene-2_0-st).CEL"
>             [10] "B5_(MoGene-2_0-st).CEL" "C1_(MoGene-2_0-st).CEL" "C2_(MoGene-2_0-st).CEL"
>             [13] "C3_(MoGene-2_0-st).CEL" "C4_(MoGene-2_0-st).CEL" "C5_(MoGene-2_0-st).CEL"
>             [16] "D1_(MoGene-2_0-st).CEL" "D2_(MoGene-2_0-st).CEL" "D3_(MoGene-2_0-st).CEL"
>             [19] "D4_(MoGene-2_0-st).CEL" "D5_(MoGene-2_0-st).CEL" "E1_(MoGene-2_0-st).CEL"
>             [22] "E2_(MoGene-2_0-st).CEL" "E3_(MoGene-2_0-st).CEL" "E4_(MoGene-2_0-st).CEL"
>             [25] "E5_(MoGene-2_0-st).CEL" "F1_(MoGene-2_0-st).CEL" "F2_(MoGene-2_0-st).CEL"
>             [28] "F3_(MoGene-2_0-st).CEL" "F4_(MoGene-2_0-st).CEL" "F5_(MoGene-2_0-st).CEL"
>             [31] "G1_(MoGene-2_0-st).CEL" "G2_(MoGene-2_0-st).CEL" "G3_(MoGene-2_0-st).CEL"
>             [34] "G4_(MoGene-2_0-st).CEL" "G5_(MoGene-2_0-st).CEL" "H1_(MoGene-2_0-st).CEL"
>             [37] "H2_(MoGene-2_0-st).CEL" "H3_(MoGene-2_0-st).CEL" "H4_(MoGene-2_0-st).CEL"
>             [40] "H5_(MoGene-2_0-st).CEL" "I1_(MoGene-2_0-st).CEL" "I2_(MoGene-2_0-st).CEL"
>             [43] "I3_(MoGene-2_0-st).CEL" "I4_(MoGene-2_0-st).CEL" "I5_(MoGene-2_0-st).CEL"
>
>
>             Output from traceback after error:
>
>
>             > traceback()
>             4: stop(sprintf(ngettext(length(r), "%s is not TRUE", "%s are not all TRUE"),
>                    ch), call. = FALSE, domain = NA)
>             3: stopifnot(is.character(filenames))
>             2: checkValidFilenames(filenames)
>             1: read.celfiles(filenames = list.celfiles(), phenodata = pd)
>             >
>
>
>             I appreciate your help!
>
>
>
>             On Mon, Oct 28, 2013 at 3:00 PM, James W. MacDonald
>             <jmacdon at uw.edu> wrote:
>
>                 Does
>
>                 exp.data <- read.celfiles(list.celfiles(), phenodata=pd)
>
>                 work?
>
>                 Otherwise, please give the output from
>
>                 celfiles
>
>                 and
>
>                 traceback()
>
>                 after the error.
>
>                 Best,
>
>                 Jim
>
>
>
>
>                 On Monday, October 28, 2013 2:50:10 PM, oligo user [guest] wrote:
>
>
>                     Hi I'm trying to read in cel file names from Affy
>                     MoGene-2_0-st arrays, however I am receiving error
>                     message indicating that my cel file names are not of
>                     the character class when it appears that they are. My
>                     phenodata is read in successfully.
>
>                     Here is my code and error:
>
>
>                         celfiles = list.files(path = ".", pattern = ".CEL$",
>                         all.files = FALSE, full.names = FALSE, recursive = FALSE,
>                         ignore.case = FALSE);
>
>                         pd<-read.AnnotatedDataFrame("covdesc.txt",header=TRUE,row.name="Filename" ,sep="\t")
>
>                         exp.data = read.celfiles(filenames=celfiles,phenodata=pd)
>
>
>                     Error: is.character(filenames) is not TRUE
>
>                         class(celfiles)
>
>                     [1] "character"
>
>                         is.character(celfiles)
>
>                     [1] TRUE
>
>
>                     Any advice will be appreciated.
>
>                     Thanks!
>
>                       -- output of sessionInfo():
>
>                         sessionInfo()
>
>                     R version 3.0.1 (2013-05-16)
>                     Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
>                     locale:
>                     [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
>                     attached base packages:
>                     [1] parallel  stats     graphics  grDevices utils     datasets  methods   base
>
>                     other attached packages:
>                      [1] affyPLM_1.36.0             preprocessCore_1.22.0      arrayQualityMetrics_3.16.0
>                      [4] affyQCReport_1.38.0        lattice_0.20-15            simpleaffy_2.36.1
>                      [7] gcrma_2.32.0               genefilter_1.42.0          affy_1.38.1
>                     [10] pd.mogene.2.0.st_2.12.0    RSQLite_0.11.4             DBI_0.2-7
>                     [13] oligo_1.24.2               Biobase_2.20.1             oligoClasses_1.22.0
>                     [16] BiocGenerics_0.6.0         BiocInstaller_1.10.4
>
>                     loaded via a namespace (and not attached):
>                      [1] affxparser_1.32.3    affyio_1.28.0        annotate_1.38.0      AnnotationDbi_1.22.6
>                      [5] beadarray_2.10.0     BeadDataPackR_1.12.0 Biostrings_2.28.0    bit_1.1-10
>                      [9] Cairo_1.5-2          cluster_1.14.4       codetools_0.2-8      colorspace_1.2-4
>                     [13] ff_2.2-12            foreach_1.4.1        GenomicRanges_1.12.5 grid_3.0.1
>                     [17] Hmisc_3.12-2         hwriter_1.3          IRanges_1.18.4       iterators_1.0.6
>                     [21] latticeExtra_0.6-26  limma_3.16.8         plyr_1.8             RColorBrewer_1.0-5
>                     [25] reshape2_1.2.2       rpart_4.1-1          setRNG_2011.11-2     splines_3.0.1
>                     [29] stats4_3.0.1         stringr_0.6.2        survival_2.37-4      SVGAnnotation_0.93-1
>                     [33] tools_3.0.1          vsn_3.28.0           XML_3.95-0.2         xtable_1.7-1
>                     [37] zlibbioc_1.6.0
>
>
>
>                     --
>                     Sent via the guest posting facility at bioconductor.org.
>
>                     ___________________________________________________
>                     Bioconductor mailing list
>                     Bioconductor at r-project.org
>                     https://stat.ethz.ch/mailman/listinfo/bioconductor
>                     Search the archives:
>                     http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
>                 --
>                 James W. MacDonald, M.S.
>                 Biostatistician
>                 University of Washington
>                 Environmental and Occupational Health Sciences
>                 4225 Roosevelt Way NE, # 100
>                 Seattle WA 98105-6099
>
>
>
>
>
>

--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099