[BioC] Problem reading in files with marrayInput
Richard Friedman
friedman at cancercenter.columbia.edu
Tue Nov 4 18:20:53 MET 2003
On Mon, 3 Nov 2003, Jean Yee Hwa Yang wrote:
> Hi Richard,
>
> Look at the examples in help(read.marrayRaw)
> Copy and paste the whole set of examples and see if it works.
> If so, it's possible something is not right with the spot files.
>
> Try simply reading the data in:
> array1.raw <- read.Spot(fnames, path=datadir)
> and see if it works.
>
> Cheers
>
> Jean
>
Dear Jean and Everybody,
Thank you for your reply.
I first ran the test case in "Introduction to the Bioconductor
marrayInput package", without any error messages.
When I ran the session in help(read.marrayRaw) I got
the following error messages:
#####################################################################
> datadir <- system.file("data", package="marrayInput")
>
> skip <- grep("Row", readLines(file.path(datadir,"fish.gal"),
n=100)) - 1
Error in file(con, "r") : unable to open connection
In addition: Warning message:
cannot open file `C:/PROGRA~1/R/rw1080/library/marrayInput/data/fish.gal'
>
> swirl.layout <- read.marrayLayout(ngr=4, ngc=4, nsr=22, nsc=24)
>
> swirl.targets <- read.marrayInfo(file.path(datadir,
"SwirlSample.txt"))
Error in file(con, "r") : unable to open connection
In addition: Warning message:
cannot open file `C:/PROGRA~1/R/rw1080/library/marrayInput/data/SwirlSample.txt'
>
> swirl.gnames <- read.marrayInfo(file.path(datadir, "fish.gal"),
+ info.id=4:5, labels=5, skip=skip)
Error in file(con, "r") : unable to open connection
In addition: Warning message:
cannot open file `C:/PROGRA~1/R/rw1080/library/marrayInput/data/fish.gal'
>
> x <- maInfo(swirl.gnames)[,1]
> y <- rep(0, maNspots(swirl.layout))
> y[x == "control"] <- 1
> slot(swirl.layout, "maControls") <- as.factor(y)
>
> fnames <- dir(path=datadir,pattern=paste("*", "spot", sep="\."))
> swirl<- read.Spot(fnames, path=datadir,
+ layout = swirl.layout,
+ gnames = swirl.gnames,
+ targets = swirl.targets)
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.1.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.2.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.3.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.4.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.5.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.6.spot"
Warning messages:
1: number of items read is not a multiple of the number of columns
2: number of rows of result
is not a multiple of vector length (arg 2) in: cbind(Gf,
as.numeric(dat[[name.Gf]]))
3: number of rows of result
is not a multiple of vector length (arg 2) in: cbind(Gb,
as.numeric(dat[[name.Gb]]))
4: number of rows of result
is not a multiple of vector length (arg 2) in: cbind(Rf,
as.numeric(dat[[name.Rf]]))
5: number of rows of result
is not a multiple of vector length (arg 2) in: cbind(Rb,
as.numeric(dat[[name.Rb]]))
Error in read.marrayRaw(fnames = fnames, path = path, name.Gf = name.Gf, :
Object "swirl.targets" not found
>
##########################################################################
Two things (at least) are puzzling to me about the above session.
1. I seemed able to read fish.gal when I ran the exercises in
the Introduction. fish.gal is in the data directory under the
marrayInput directory, in which I am working. Since I opened
fish.gal with Notepad, it appears as a Notepad file on the
screen. Is that okay?
2. The computer started reading the array1.?.spot files,
with which the present series of commands had nothing to do.
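On the first point, one way to narrow things down would be to check whether R can actually see the files under datadir. The following is only a minimal sketch using base R; the helper name check_files is hypothetical and is not part of marrayInput.

```r
# Hypothetical helper (not part of marrayInput): report which of the
# expected files are present in a directory and how large they are.
check_files <- function(dir, files) {
  paths <- file.path(dir, files)
  data.frame(file   = files,
             exists = file.exists(paths),
             bytes  = file.info(paths)$size,  # NA when the file is missing
             stringsAsFactors = FALSE)
}

# In the session above this could be called as, for example:
#   datadir <- system.file("data", package = "marrayInput")
#   check_files(datadir, c("fish.gal", "SwirlSample.txt"))
#   list.files(datadir)   # the file names as R sees them
```

If a file had been saved under a slightly different name (for instance with an extra extension appended by an editor), list.files(datadir) should show that, though this is only a guess at the cause.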
Then I tried reading the spot files the way that you
suggested, and there were problems:
##########################################################################
> fnames <- dir(path=datadir,pattern=paste("*","spot",sep="\."))
> array1.raw <- read.Spot(fnames,path=datadir)
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.1.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.2.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.3.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.4.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.5.spot"
[1] "Reading C:/PROGRA~1/R/rw1080/library/marrayInput/data/array1.6.spot"
Warning messages:
1: number of items read is not a multiple of the number of columns
2: number of rows of result
is not a multiple of vector length (arg 2) in: cbind(Gf,
as.numeric(dat[[name.Gf]]))
3: number of rows of result
is not a multiple of vector length (arg 2) in: cbind(Gb,
as.numeric(dat[[name.Gb]]))
4: number of rows of result
is not a multiple of vector length (arg 2) in: cbind(Rf,
as.numeric(dat[[name.Rf]]))
5: number of rows of result
is not a multiple of vector length (arg 2) in: cbind(Rb,
as.numeric(dat[[name.Rb]]))
>
################################################################
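The first warning suggests that at least one line in one of the files has a different number of fields from the rest. A rough sketch for locating such lines with base R's count.fields follows. It assumes the .spot files are tab-delimited, which I have not verified, and the helper name ragged_lines is hypothetical.

```r
# Hypothetical helper: return the positions of lines whose number of
# fields differs from the most common field count in the file.
# Blank lines are skipped by count.fields by default, so the positions
# count non-blank lines only.
ragged_lines <- function(path, sep = "\t") {
  n <- count.fields(path, sep = sep, quote = "", comment.char = "")
  expected <- as.integer(names(which.max(table(n))))
  which(n != expected)
}

# With the objects from the session above, each file could be checked:
#   lapply(file.path(datadir, fnames), ragged_lines)
```

A file that yields a non-empty result would be the first one to inspect by hand.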
So clearly there is a problem with the spot files. When I printed
the summary for array1.raw, I got the following:
#################################################################
> objects()
 [1] "array1"          "array1.gnames"   "array1.layout"   "array1.raw"      "array1.samples"
 [6] "ctl"             "datadir"         "fileIndex"       "fnames"          "last.warning"
[11] "read.marrayInfo" "swirl"           "swirl.gnames"    "swirl.layout"    "swirl.raw"
[16] "swirl.samples"   "swirl2"          "swirl2.gnames"   "swirl2.layout"   "swirl2.samples"
[21] "swirl3"          "swirl3..samples" "swirl3.gnames"   "swirl3.layout"   "swirl3.samples"
[26] "x"               "y"
> array1.raw
Pre-normalization intensity data: Object of class marrayRaw.
Number of arrays: 6 arrays.
A) Layout of spots on the array:
Array layout: Object of class marrayLayout.
Total number of spots:
Dimensions of grid matrix: rows by cols
Dimensions of spot matrices: rows by cols
Currently working with a subset of spots.
Control spots:
Notes on layout:
B) Samples hybridized to the array:
Object of class marrayInfo.
NULL data frame with 1 rows
Number of labels: 0
Dimensions of maInfo matrix: 0 rows by 0 columns
Notes:
C) Summary statistics for log-ratio distribution:
                  Min. 1st Qu. Median  Mean 3rd Qu.   Max   NA
1 array1.1.spot  -2.16   -0.76  -0.52 -0.44   -0.20  3.50   NA
2 array1.2.spot  -2.15   -0.66  -0.44 -0.44   -0.21  2.01   NA
3 array1.3.spot  -2.59   -0.84  -0.58 -0.58   -0.31  1.04   NA
4 array1.4.spot  -3.46   -0.33   0.09  0.13    0.52  3.53   NA
5 array1.5.spot  -3.05   -0.43  -0.15 -0.15    0.12  3.16   NA
6 array1.6.spot -13.08   -0.93   0.29  0.28    2.41 15.43 3664
D) Notes on intensity data:
Spot Data
>
###############################################################
Clearly something is wrong. The files look good to me.
May I send you the files (off-list)?
Thanks and best wishes,
Rich
------------------------------------------------------------
Richard A. Friedman, PhD
Associate Research Scientist
Herbert Irving Comprehensive Cancer Center
Oncoinformatics Core
Lecturer
Department of Biomedical Informatics
Box 95, Room 130BB or P&S 1-420C
Columbia University
630 W. 168th St.
New York, NY 10032
(212)305-6901 (5-6901) (voice)
friedman at cancercenter.columbia.edu
http://cancercenter.columbia.edu/~friedman/
"Everybody is going to do their book reports on Harry Potter.
I'm going to do mine on 'Red Planet'." -Isaac Friedman, age 13
(I know I said "no more Isaac quotes", but I couldn't resist that one).