[BioC] oligo package reading in cel files
James W. MacDonald
jmacdon at uw.edu
Tue Oct 29 18:16:42 CET 2013
Hi Manjula,
I hate it when the obvious escapes me.
Note that the argument list for read.celfiles() starts with an ellipsis
(...). I won't get into details here except to say that this means that
anything you pass to the function that isn't _exactly_ matched by an
argument name will get sucked up by that first term.
You are passing in an argument phenodata = pd. The problem is that
there is no 'phenodata' argument; the argument is phenoData, and R is
case-sensitive! Since you used an incorrect argument name, your 'pd'
object is assumed to be a filename and an attempt is made to parse it.
And of course it isn't a character vector, so you get the error.
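To see the mechanism in isolation, here is a minimal sketch using a toy function. The name read_toy and its body are made up for illustration and are not the oligo source; the only thing it shares with read.celfiles() is that '...' comes first in the argument list.

```r
# Toy function (hypothetical, not the oligo source): '...' comes first,
# so the arguments after it are only matched by their exact names.
read_toy <- function(..., filenames, phenoData) {
  dots <- list(...)
  list(
    dots_names    = names(dots),          # what was swallowed by '...'
    has_phenoData = !missing(phenoData)   # was phenoData really supplied?
  )
}

pd    <- data.frame(group = c("A", "B"))
files <- c("a.CEL", "b.CEL")

wrong <- read_toy(filenames = files, phenodata = pd)  # lowercase 'd'
right <- read_toy(filenames = files, phenoData = pd)  # exact name

wrong$dots_names     # "phenodata": the misspelled argument ended up in '...'
wrong$has_phenoData  # FALSE
right$dots_names     # NULL: nothing left over
right$has_phenoData  # TRUE
```

No warning is issued in the misspelled case, which is why this kind of mistake is easy to miss.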
I would imagine it will work correctly if you use
exp.data = read.celfiles(filenames=celFiles,phenoData=pd)
or alternatively
exp.data <- read.celfiles(filenames = celFiles)
phenoData(exp.data) <- pd
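Whichever way the phenotype data is attached, it is worth checking first that the rows of the covariate table line up with the CEL file names. The following is a base-R sketch on made-up data (the file names and the covdesc table are stand-ins, not the real files from this thread), assuming the table's row names are the file names:

```r
# Hypothetical stand-ins for list.celfiles() output and covdesc.txt
celFiles <- c("A1.CEL", "A2.CEL", "B1.CEL")
covdesc <- data.frame(
  treatment = c("SILICA", "PBS-CTRL", "PBS-CTRL"),
  row.names = c("B1.CEL", "A1.CEL", "A2.CEL")
)

# Files without a phenotype row, and phenotype rows without a file
missing_rows  <- setdiff(celFiles, rownames(covdesc))
missing_files <- setdiff(rownames(covdesc), celFiles)
stopifnot(length(missing_rows) == 0, length(missing_files) == 0)

# Reorder the table so that row i describes file i
covdesc <- covdesc[celFiles, , drop = FALSE]
identical(rownames(covdesc), celFiles)
```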
Best,
Jim
On Tuesday, October 29, 2013 12:59:57 PM, Manjula Kasoji wrote:
> It seems that the problem is with attaching the phenodata.
>
> > celFiles <- list.celfiles()
> > celFiles
>  [1] "A1_MoGene-2_0-st.CEL" "A2_MoGene-2_0-st.CEL"
> "A3_MoGene-2_0-st.CEL" "A4_MoGene-2_0-st.CEL"
>  [5] "A5_MoGene-2_0-st.CEL" "B1_MoGene-2_0-st.CEL"
> "B2_MoGene-2_0-st.CEL" "B3_MoGene-2_0-st.CEL"
>  [9] "B4_MoGene-2_0-st.CEL" "B5_MoGene-2_0-st.CEL"
> "C1_MoGene-2_0-st.CEL" "C2_MoGene-2_0-st.CEL"
> [13] "C3_MoGene-2_0-st.CEL" "C4_MoGene-2_0-st.CEL"
> "C5_MoGene-2_0-st.CEL" "D1_MoGene-2_0-st.CEL"
> [17] "D2_MoGene-2_0-st.CEL" "D3_MoGene-2_0-st.CEL"
> "D4_MoGene-2_0-st.CEL" "D5_MoGene-2_0-st.CEL"
> [21] "E1_MoGene-2_0-st.CEL" "E2_MoGene-2_0-st.CEL"
> "E3_MoGene-2_0-st.CEL" "E4_MoGene-2_0-st.CEL"
> [25] "E5_MoGene-2_0-st.CEL" "F1_MoGene-2_0-st.CEL"
> "F2_MoGene-2_0-st.CEL" "F3_MoGene-2_0-st.CEL"
> [29] "F4_MoGene-2_0-st.CEL" "F5_MoGene-2_0-st.CEL"
> "G1_MoGene-2_0-st.CEL" "G2_MoGene-2_0-st.CEL"
> [33] "G3_MoGene-2_0-st.CEL" "G4_MoGene-2_0-st.CEL"
> "G5_MoGene-2_0-st.CEL" "H1_MoGene-2_0-st.CEL"
> [37] "H2_MoGene-2_0-st.CEL" "H3_MoGene-2_0-st.CEL"
> "H4_MoGene-2_0-st.CEL" "H5_MoGene-2_0-st.CEL"
> [41] "I1_MoGene-2_0-st.CEL" "I2_MoGene-2_0-st.CEL"
> "I3_MoGene-2_0-st.CEL" "I4_MoGene-2_0-st.CEL"
> [45] "I5_MoGene-2_0-st.CEL"
>
> > exp.data = read.celfiles(filenames=celFiles)
> Platform design info loaded.
> Reading in : A1_MoGene-2_0-st.CEL
> Reading in : A2_MoGene-2_0-st.CEL
> Reading in : A3_MoGene-2_0-st.CEL
> Reading in : A4_MoGene-2_0-st.CEL
> Reading in : A5_MoGene-2_0-st.CEL
> Reading in : B1_MoGene-2_0-st.CEL
> Reading in : B2_MoGene-2_0-st.CEL
> Reading in : B3_MoGene-2_0-st.CEL
>
> ...
>
> > exp.data
> GeneFeatureSet (storageMode: lockedEnvironment)
> assayData: 2598544 features, 45 samples
>   element names: exprs
> protocolData
>   rowNames: A1_MoGene-2_0-st.CEL A2_MoGene-2_0-st.CEL ...
>     I5_MoGene-2_0-st.CEL (45 total)
>   varLabels: exprs dates
>   varMetadata: labelDescription channel
> phenoData
>   rowNames: A1_MoGene-2_0-st.CEL A2_MoGene-2_0-st.CEL ...
>     I5_MoGene-2_0-st.CEL (45 total)
>   varLabels: index
>   varMetadata: labelDescription channel
> featureData: none
> experimentData: use 'experimentData(object)'
> Annotation: pd.mogene.2.0.st
>
> > exp.data = read.celfiles(filenames=celFiles,phenodata=pd)
> Error: is.character(filenames) is not TRUE
>
> Here is how I create my pd variable:
>
> pd<-read.AnnotatedDataFrame("covdesc.txt",header=TRUE,row.name="Filename" ,sep="\t")
>
> > pData(pd)
>                      celltype treatment time group
> A1_MoGene-2_0-st.CEL    MACRO  PBS-CTRL  24H     A
> A2_MoGene-2_0-st.CEL    MACRO  PBS-CTRL  24H     A
> A3_MoGene-2_0-st.CEL    MACRO  PBS-CTRL  24H     A
> A4_MoGene-2_0-st.CEL    MACRO  PBS-CTRL  24H     A
> A5_MoGene-2_0-st.CEL    MACRO  PBS-CTRL  24H     A
> B1_MoGene-2_0-st.CEL    MACRO    SILICA  24H     B
> ...
>
> Any suggestions on why/how this is incorrect?
>
> Thanks.
>
>
>
>
> On Tue, Oct 29, 2013 at 10:22 AM, Manjula Kasoji
> <manjula.kasoji at gmail.com> wrote:
>
> Jim,
>
> I'm still getting the same error message:
>
> > celfiles <- gsub("\\(", "\\\\(", gsub("\\)", "\\\\)", celfiles))
> >
> >
> > exp.data = read.celfiles(filenames=celfiles,phenodata=pd)
>
> Error: is.character(filenames) is not TRUE
>
> > celfiles
> [1] "A1_\\(MoGene-2_0-st\\).CEL" "A2_\\(MoGene-2_0-st\\).CEL"
> "A3_\\(MoGene-2_0-st\\).CEL"
> [4] "A4_\\(MoGene-2_0-st\\).CEL" "A5_\\(MoGene-2_0-st\\).CEL"
> "B1_\\(MoGene-2_0-st\\).CEL"
> [7] "B2_\\(MoGene-2_0-st\\).CEL" "B3_\\(MoGene-2_0-st\\).CEL"
> "B4_\\(MoGene-2_0-st\\).CEL"
> [10] "B5_\\(MoGene-2_0-st\\).CEL" "C1_\\(MoGene-2_0-st\\).CEL"
> "C2_\\(MoGene-2_0-st\\).CEL"
> [13] "C3_\\(MoGene-2_0-st\\).CEL" "C4_\\(MoGene-2_0-st\\).CEL"
> "C5_\\(MoGene-2_0-st\\).CEL"
> [16] "D1_\\(MoGene-2_0-st\\).CEL" "D2_\\(MoGene-2_0-st\\).CEL"
> "D3_\\(MoGene-2_0-st\\).CEL"
> [19] "D4_\\(MoGene-2_0-st\\).CEL" "D5_\\(MoGene-2_0-st\\).CEL"
> "E1_\\(MoGene-2_0-st\\).CEL"
> [22] "E2_\\(MoGene-2_0-st\\).CEL" "E3_\\(MoGene-2_0-st\\).CEL"
> "E4_\\(MoGene-2_0-st\\).CEL"
> [25] "E5_\\(MoGene-2_0-st\\).CEL" "F1_\\(MoGene-2_0-st\\).CEL"
> "F2_\\(MoGene-2_0-st\\).CEL"
> [28] "F3_\\(MoGene-2_0-st\\).CEL" "F4_\\(MoGene-2_0-st\\).CEL"
> "F5_\\(MoGene-2_0-st\\).CEL"
> [31] "G1_\\(MoGene-2_0-st\\).CEL" "G2_\\(MoGene-2_0-st\\).CEL"
> "G3_\\(MoGene-2_0-st\\).CEL"
> [34] "G4_\\(MoGene-2_0-st\\).CEL" "G5_\\(MoGene-2_0-st\\).CEL"
> "H1_\\(MoGene-2_0-st\\).CEL"
> [37] "H2_\\(MoGene-2_0-st\\).CEL" "H3_\\(MoGene-2_0-st\\).CEL"
> "H4_\\(MoGene-2_0-st\\).CEL"
> [40] "H5_\\(MoGene-2_0-st\\).CEL" "I1_\\(MoGene-2_0-st\\).CEL"
> "I2_\\(MoGene-2_0-st\\).CEL"
> [43] "I3_\\(MoGene-2_0-st\\).CEL" "I4_\\(MoGene-2_0-st\\).CEL"
> "I5_\\(MoGene-2_0-st\\).CEL"
> >
> >
>
>
>
>
> On Tue, Oct 29, 2013 at 10:17 AM, James W. MacDonald
> <jmacdon at uw.edu> wrote:
>
> Hi Manjula,
>
> Try escaping the parentheses in those file names (in general
> parentheses are not a good thing to have in a file name).
>
> celfiles <- gsub("\\(", "\\\\(", gsub("\\)", "\\\\)", celfiles))
>
> Best,
>
> Jim
>
>
>
>
> On Tuesday, October 29, 2013 10:01:12 AM, Manjula Kasoji wrote:
>
> Hi Jim,
>
> Thank you for the quick response.
>
> exp.data <- read.celfiles(list.celfiles(), phenodata=pd)
> also does not
> work:
>
>
> > exp.data = read.celfiles(filenames=list.celfiles(),phenodata=pd)
>
> Error: is.character(filenames) is not TRUE
>
>
> Output of celfiles:
>
>
> > celfiles
>
> [1] "A1_(MoGene-2_0-st).CEL" "A2_(MoGene-2_0-st).CEL"
> "A3_(MoGene-2_0-st).CEL"
>
> [4] "A4_(MoGene-2_0-st).CEL" "A5_(MoGene-2_0-st).CEL"
> "B1_(MoGene-2_0-st).CEL"
>
> [7] "B2_(MoGene-2_0-st).CEL" "B3_(MoGene-2_0-st).CEL"
> "B4_(MoGene-2_0-st).CEL"
>
> [10] "B5_(MoGene-2_0-st).CEL" "C1_(MoGene-2_0-st).CEL"
> "C2_(MoGene-2_0-st).CEL"
>
> [13] "C3_(MoGene-2_0-st).CEL" "C4_(MoGene-2_0-st).CEL"
> "C5_(MoGene-2_0-st).CEL"
>
> [16] "D1_(MoGene-2_0-st).CEL" "D2_(MoGene-2_0-st).CEL"
> "D3_(MoGene-2_0-st).CEL"
>
> [19] "D4_(MoGene-2_0-st).CEL" "D5_(MoGene-2_0-st).CEL"
> "E1_(MoGene-2_0-st).CEL"
>
> [22] "E2_(MoGene-2_0-st).CEL" "E3_(MoGene-2_0-st).CEL"
> "E4_(MoGene-2_0-st).CEL"
>
> [25] "E5_(MoGene-2_0-st).CEL" "F1_(MoGene-2_0-st).CEL"
> "F2_(MoGene-2_0-st).CEL"
>
> [28] "F3_(MoGene-2_0-st).CEL" "F4_(MoGene-2_0-st).CEL"
> "F5_(MoGene-2_0-st).CEL"
>
> [31] "G1_(MoGene-2_0-st).CEL" "G2_(MoGene-2_0-st).CEL"
> "G3_(MoGene-2_0-st).CEL"
>
> [34] "G4_(MoGene-2_0-st).CEL" "G5_(MoGene-2_0-st).CEL"
> "H1_(MoGene-2_0-st).CEL"
>
> [37] "H2_(MoGene-2_0-st).CEL" "H3_(MoGene-2_0-st).CEL"
> "H4_(MoGene-2_0-st).CEL"
>
> [40] "H5_(MoGene-2_0-st).CEL" "I1_(MoGene-2_0-st).CEL"
> "I2_(MoGene-2_0-st).CEL"
>
> [43] "I3_(MoGene-2_0-st).CEL" "I4_(MoGene-2_0-st).CEL"
> "I5_(MoGene-2_0-st).CEL"
>
>
> Output from traceback after error:
>
>
> > traceback()
>
> 4: stop(sprintf(ngettext(length(r), "%s is not TRUE", "%s are not all TRUE"),
> ch), call. = FALSE, domain = NA)
>
> 3: stopifnot(is.character(filenames))
>
> 2: checkValidFilenames(filenames)
>
> 1: read.celfiles(filenames = list.celfiles(), phenodata = pd)
>
> >
>
>
> I appreciate your help!
>
>
>
> On Mon, Oct 28, 2013 at 3:00 PM, James W. MacDonald
> <jmacdon at uw.edu> wrote:
>
> Does
>
> exp.data <- read.celfiles(list.celfiles(), phenodata=pd)
>
> work?
>
> Otherwise, please give the output from
>
> celfiles
>
> and
>
> traceback()
>
> after the error.
>
> Best,
>
> Jim
>
>
>
>
> On Monday, October 28, 2013 2:50:10 PM, oligo user
> [guest] wrote:
>
>
> Hi I'm trying to read in cel file names from Affy
> MoGene-2_0-st arrays, however I am receiving error
> message
> indicating that my cel file names are not of the
> character
> class when it appears that they are. My phenodata
> is read in
> successfully.
>
> Here is my code and error:
>
>
> celfiles = list.files(path = ".", pattern = ".CEL$", all.files = FALSE,
> full.names = FALSE, recursive = FALSE, ignore.case = FALSE);
>
> pd<-read.AnnotatedDataFrame("covdesc.txt",header=TRUE,row.name="Filename" ,sep="\t")
>
> exp.data = read.celfiles(filenames=celfiles,phenodata=pd)
>
>
> Error: is.character(filenames) is not TRUE
>
> class(celfiles)
>
> [1] "character"
>
> is.character(celfiles)
>
> [1] TRUE
>
>
> Any advice will be appreciated.
>
> Thanks!
>
> -- output of sessionInfo():
>
> sessionInfo()
>
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-apple-darwin10.8.0 (64-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] affyPLM_1.36.0 preprocessCore_1.22.0 arrayQualityMetrics_3.16.0
> [4] affyQCReport_1.38.0 lattice_0.20-15 simpleaffy_2.36.1
> [7] gcrma_2.32.0 genefilter_1.42.0 affy_1.38.1
> [10] pd.mogene.2.0.st_2.12.0 RSQLite_0.11.4 DBI_0.2-7
> [13] oligo_1.24.2 Biobase_2.20.1 oligoClasses_1.22.0
> [16] BiocGenerics_0.6.0 BiocInstaller_1.10.4
>
> loaded via a namespace (and not attached):
> [1] affxparser_1.32.3 affyio_1.28.0 annotate_1.38.0 AnnotationDbi_1.22.6
> [5] beadarray_2.10.0 BeadDataPackR_1.12.0 Biostrings_2.28.0 bit_1.1-10
> [9] Cairo_1.5-2 cluster_1.14.4 codetools_0.2-8 colorspace_1.2-4
> [13] ff_2.2-12 foreach_1.4.1 GenomicRanges_1.12.5 grid_3.0.1
> [17] Hmisc_3.12-2 hwriter_1.3 IRanges_1.18.4 iterators_1.0.6
> [21] latticeExtra_0.6-26 limma_3.16.8 plyr_1.8 RColorBrewer_1.0-5
> [25] reshape2_1.2.2 rpart_4.1-1 setRNG_2011.11-2 splines_3.0.1
> [29] stats4_3.0.1 stringr_0.6.2 survival_2.37-4 SVGAnnotation_0.93-1
> [33] tools_3.0.1 vsn_3.28.0 XML_3.95-0.2 xtable_1.7-1
> [37] zlibbioc_1.6.0
>
>
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
> ___________________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives:
> http://news.gmane.org/gmane.science.biology.informatics.conductor
>
>
> --
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099