[BioC] Make read.marrayRaw upload data faster

Marcus Davy MDavy at hortresearch.co.nz
Wed Jul 9 18:10:36 MEST 2003


Hi,
I have tweaked the read.marrayRaw code so that scan is given a "what" list with mode character for the
five or so columns of interest and NULL for the 77 or so unwanted columns in a gpr file, so scan skips the unwanted columns.
The change is to the following code in read.marrayRaw:

Comment out:
#    h <- strsplit(readLines(f, n = skip + 1), split = sep)
#    h <- as.list(unlist(h[[length(h)]]))
#    names(h) <- gsub("\"", "", unlist(h))

and replace it with:
    h <- scan(f, quiet=TRUE, what=character(1), sep=sep, skip = skip, quote=quote, nlines=1)
    names(h) <- gsub("\"","",h)
    h <- lapply(h,as.null)
    cols <- c(name.Gf, name.Gb, name.Rf, name.Rb, name.W)
    h[charmatch(cols,names(h))]  <- character(1) #Ignores columns that are spelt incorrectly

It should give the same errors as the previous code downstream if people incorrectly identify a column.
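
For anyone who wants to try the idea outside read.marrayRaw, here is a minimal standalone sketch of the same technique. The file contents, the column names and the variable "what" are made up for illustration; only scan, charmatch, lapply and gsub are used as in the snippet above.

```r
# Write a tiny tab-delimited file standing in for a gpr file (made-up data).
f <- tempfile(fileext = ".txt")
writeLines(c(
  "\"Block\"\t\"F635 Median\"\t\"B635 Median\"\t\"Flags\"",
  "1\t1200\t80\t0",
  "1\t950\t75\t-50"
), f)

sep  <- "\t"
skip <- 0                            # lines before the column-name row (assumed)
cols <- c("F635 Median", "Flags")    # hypothetical columns of interest

# Read only the column-name row.
h <- scan(f, quiet = TRUE, what = character(1), sep = sep,
          skip = skip, quote = "\"", nlines = 1)
names(h) <- gsub("\"", "", h)

# Start with NULL for every column, then mark the wanted ones as character.
what <- lapply(h, as.null)
what[charmatch(cols, names(what))] <- character(1)

# Read the data rows; fields whose "what" entry is NULL are skipped by scan.
dat <- scan(f, quiet = TRUE, what = what, sep = sep,
            skip = skip + 1, quote = "\"")

unlink(f)
```

Here dat[["F635 Median"]] and dat[["Flags"]] come back as character vectors, while the skipped columns are NULL, so they still need as.numeric or similar before use.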

I've had about a 4x speed-up when loading multiple files.
Does anyone out there want to test it for read.Spot and read.SMD?
Spot files don't have as many columns of data, so the speed increase shouldn't be as large.

marcus


