[BioC] Problem reading in files with marrayInput

Richard Friedman friedman at cancercenter.columbia.edu
Tue Nov 4 18:20:53 MET 2003


On Mon, 3 Nov 2003, Jean Yee Hwa Yang wrote:

> Hi Richard,
>
> Look at the examples in help(read.marrayRaw)
> Copy and paste the whole set of examples and see if it works.
> If so, it's possible something is not right with the spot files.
>
> Try simply reading the data in:
> array1.raw <- read.Spot(fnames, path=datadir)
> and see if it works.
>
> Cheers
>
> Jean
>

Dear Jean and Everybody,

	Thank you for your reply.
	I ran the test case in "Introduction to the Bioconductor
marrayInput package" first, without any error messages.
	When I ran the session in help(read.marrayRaw) I got
the following error messages:
#####################################################################
>  datadir <- system.file("data", package="marrayInput")
>
>      skip <-  grep("Row", readLines(file.path(datadir,"fish.gal"),
n=100)) - 1
Error in file(con, "r") : unable to open connection
In addition: Warning message:
cannot open file `C:/PROGRA~1/R/rw1080/library/marrayInput/data/fish.gal'
>
>      swirl.layout <- read.marrayLayout(ngr=4, ngc=4, nsr=22, nsc=24)
>
>      swirl.targets <- read.marrayInfo(file.path(datadir,
"SwirlSample.txt"))
Error in file(con, "r") : unable to open connection
In addition: Warning message:
cannot open file
`C:/PROGRA~1/R/rw1080/library/marrayInput/data/SwirlSample.txt'
>
>      swirl.gnames <- read.marrayInfo(file.path(datadir, "fish.gal"),
+                                      info.id=4:5, labels=5, skip=skip)
Error in file(con, "r") : unable to open connection
In addition: Warning message:
cannot open file `C:/PROGRA~1/R/rw1080/library/marrayInput/data/fish.gal'
>
>      x <-  maInfo(swirl.gnames)[,1]
>      y <- rep(0, maNspots(swirl.layout))
>      y[x == "control"] <- 1
>      slot(swirl.layout, "maControls") <- as.factor(y)
>
>      fnames <- dir(path=datadir,pattern=paste("*", "spot", sep="\."))
>      swirl<- read.Spot(fnames, path=datadir,
+                             layout = swirl.layout,
+                             gnames = swirl.gnames,
+                             targets = swirl.targets)
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.1.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.2.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.3.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.4.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.5.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.6.spot"
Warning messages:
1: number of items read is not a multiple of the number of columns
2: number of rows of result
        is not a multiple of vector length (arg 2) in: cbind(Gf,
as.numeric(dat[[name.Gf]]))
3: number of rows of result
        is not a multiple of vector length (arg 2) in: cbind(Gb,
as.numeric(dat[[name.Gb]]))
4: number of rows of result
        is not a multiple of vector length (arg 2) in: cbind(Rf,
as.numeric(dat[[name.Rf]]))
5: number of rows of result
        is not a multiple of vector length (arg 2) in: cbind(Rb,
as.numeric(dat[[name.Rb]]))
Error in read.marrayRaw(fnames = fnames, path = path, name.Gf = name.Gf,
:
        Object "swirl.targets" not found
>
##########################################################################

Two things (at least) are puzzling to me about the above session.

1. I seemed able to read fish.gal when I ran the exercises in
the Introduction. fish.gal is in the data directory under the
marrayInput directory, in which I am working. Since I opened
fish.gal with Notepad, it appears as a Notepad file on the
screen. Is that okay?

2. The computer started reading the array1.?.spot files,
with which the present series of commands had nothing to do.
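
Both points can be checked from within R before calling any of the
marrayInput readers. The following is only a minimal sketch in base R:
check_files is a hypothetical helper, not part of marrayInput, and the
temporary directory and the file names created in it (including
fish.gal.txt, which merely illustrates how a renamed file would show up)
are stand-ins so that the example runs on its own. In the session above,
datadir would instead be the value returned by system.file().

```r
# Hypothetical helper (not part of marrayInput): report which of the
# expected files are actually present in a directory.
check_files <- function(datadir, files) {
  present <- file.exists(file.path(datadir, files))
  names(present) <- files
  present
}

# Stand-in directory so the sketch runs without marrayInput installed.
demo_dir <- file.path(tempdir(), "marray_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir,
                      c("fish.gal.txt", "array1.1.spot", "swirl.1.spot")))

# Point 1: are the files the example needs where R looks for them?
found <- check_files(demo_dir, c("fish.gal", "SwirlSample.txt"))
print(found)
print(dir(demo_dir))   # shows the names as R sees them

# Point 2: a pattern on the ".spot" extension matches every such file in
# the directory, whichever experiment it belongs to.
spot_files <- dir(path = demo_dir, pattern = "\\.spot$")
print(spot_files)
```

If file.exists() returns FALSE for fish.gal, comparing against the
output of dir() shows whether the file is missing or only named
differently. The second part suggests why array1.?.spot files may be
read: any .spot file stored in that directory matches the pattern.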

	Then I tried reading the spot files the way that you
said and there were problems:
##########################################################################
> fnames <- dir(path=datadir,pattern=paste("*","spot",sep="\."))
> array1.raw <- read.Spot(fnames,path=datadir)
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.1.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.2.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.3.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.4.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.5.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.6.spot"
Warning messages:
1: number of items read is not a multiple of the number of columns
2: number of rows of result
        is not a multiple of vector length (arg 2) in: cbind(Gf,
as.numeric(dat[[name.Gf]]))
3: number of rows of result
        is not a multiple of vector length (arg 2) in: cbind(Gb,
as.numeric(dat[[name.Gb]]))
4: number of rows of result
        is not a multiple of vector length (arg 2) in: cbind(Rf,
as.numeric(dat[[name.Rf]]))
5: number of rows of result
        is not a multiple of vector length (arg 2) in: cbind(Rb,
as.numeric(dat[[name.Rb]]))
>

################################################################

So clearly there is a problem with the spot files. When I ran
summary statistics on array1.raw, I got the following:
#################################################################
> objects()
 [1] "array1"          "array1.gnames"   "array1.layout"   "array1.raw"
"array1.samples"
 [6] "ctl"             "datadir"         "fileIndex"       "fnames"
"last.warning"
[11] "read.marrayInfo" "swirl"           "swirl.gnames"    "swirl.layout"
"swirl.raw"
[16] "swirl.samples"   "swirl2"          "swirl2.gnames"   "swirl2.layout"
"swirl2.samples"
[21] "swirl3"          "swirl3..samples" "swirl3.gnames"   "swirl3.layout"
"swirl3.samples"
[26] "x"               "y"
> array1.raw
Pre-normalization intensity data:        Object of class marrayRaw.

Number of arrays:       6 arrays.

A) Layout of spots on the array:
Array layout:    Object of class marrayLayout.

Total number of spots:
Dimensions of grid matrix:               rows by  cols
Dimensions of spot matrices:             rows by  cols

Currently working with a subset of  spots.

Control spots:


Notes on layout:


B) Samples hybridized to the array:
Object of class marrayInfo.

NULL data frame with 1 rows

Number of labels:  0
Dimensions of maInfo matrix:  0  rows by  0  columns

Notes:


C) Summary statistics for log-ratio distribution:
                  Min. 1st Qu. Median  Mean 3rd Qu.   Max   NA
1 array1.1.spot  -2.16   -0.76  -0.52 -0.44   -0.20  3.50   NA
2 array1.2.spot  -2.15   -0.66  -0.44 -0.44   -0.21  2.01   NA
3 array1.3.spot  -2.59   -0.84  -0.58 -0.58   -0.31  1.04   NA
4 array1.4.spot  -3.46   -0.33   0.09  0.13    0.52  3.53   NA
5 array1.5.spot  -3.05   -0.43  -0.15 -0.15    0.12  3.16   NA
6 array1.6.spot -13.08   -0.93   0.29  0.28    2.41 15.43 3664

D) Notes on intensity data:
Spot Data
>
###############################################################
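
The warning "number of items read is not a multiple of the number of
columns" and the 3664 NA values reported for array1.6.spot would be
consistent with lines that have a different number of fields, or with
one file being shorter than the others; that is only a guess. A possible
way to check, sketched in base R under the assumption that the spot
files are tab-delimited, is to count the fields on each line.
field_counts is a hypothetical helper, and the small temporary file
stands in for a real spot file so that the example runs on its own.

```r
# Hypothetical helper: tabulate the number of tab-separated fields per line.
# quote = "" and comment.char = "" keep quotes and '#' from hiding fields.
field_counts <- function(path, sep = "\t") {
  table(count.fields(path, sep = sep, quote = "", comment.char = ""))
}

# Stand-in file: two complete lines and one line that is a field short.
demo_file <- tempfile(fileext = ".spot")
writeLines(c("a\tb\tc", "1\t2\t3", "4\t5"), demo_file)

counts <- field_counts(demo_file)
print(counts)   # more than one distinct field count signals ragged lines
```

Applied to each of the six array1 files, a table with a single entry
would mean every line has the same number of fields. Comparing the
number of lines across files with length(readLines(...)) could also show
whether array1.6.spot is shorter than the rest.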

Clearly something is wrong.  The files look good to me.
May I send you the files (offlist)?

Thanks and best wishes,
Rich
------------------------------------------------------------
Richard A. Friedman, PhD
Associate Research Scientist
Herbert Irving Comprehensive Cancer Center
Oncoinformatics Core
Lecturer
Department of Biomedical Informatics
Box 95, Room 130BB or P&S 1-420C
Columbia University
630 W. 168th St.
New York, NY 10032
(212)305-6901 (5-6901) (voice)
friedman at cancercenter.columbia.edu
http://cancercenter.columbia.edu/~friedman/

"Everybody is going to do their book reports on Harry Potter.
I'm going to do mine on 'Red Planet'." -Isaac Friedman, age 13
(I know I said "no more Isaac quotes", but I couldn't resist that one).


