AW: Re: [R] "Large" data set: performance issue

Peter Dalgaard BSA p.dalgaard at biostat.ku.dk
Tue Apr 2 17:58:35 CEST 2002


Till Baumgaertel <till.baumgaertel at epost.de> writes:
> >What happens if you try this?:
> >
> >datfull <- read.csv("foo", colClasses=rep(c("character","numeric"),c(22,1801)))
> 
> Nope, sorry, it's not working.
> It complains about the following:
> ####
> Error in scan(file = file, what = what, sep = sep, quote = quote, dec = dec,  : 
>         "scan" expected a real, got ""+1073741824""
> ####
> 
> OK, I forgot to tell you that my numbers are quoted ("\""), just like the
> character fields. Sorry!
> 
> Therefore I tried
> ###
> datfull <- read.csv2(file.choose(), colClasses=rep(c("character","numeric"),c(22,1801)),quote="\"",sep=",")
> ###
> 
> But it's still not working.
> 
> It seems to be critical to convert character ("+1234") to numeric (1234.0)
> only AFTER the file has been read completely into some kind of data
> structure.

Hum. I wonder whether that quoting behaviour is really as intended.
You might try this:

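library(methods)
# Define a coercion from "character" to a new class "num", so that the
# quoted fields are read as strings and only then converted to numbers.
setAs("character", "num", function(from) as.numeric(from))
datfull <- read.csv(file.choose(),
    colClasses = rep(c("character", "num"), c(22, 1801)))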
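
To see the coercion idea work without the real file, here is a minimal
sketch on a made-up three-column file. The file contents and the column
names "id", "x" and "y" are assumptions for illustration only. Declaring
the class "num" with setClass() first is also an addition of mine: newer
versions of R may otherwise note that "num" has no definition.

```r
library(methods)

# Hypothetical toy file: one character column, two quoted numeric columns.
tmp <- tempfile(fileext = ".csv")
writeLines(c('"id","x","y"',
             '"a","+1073741824","-2"',
             '"b","+5","3.5"'), tmp)

# "num" is a made-up class name; declare it so that setAs() has a
# defined target class to attach the coercion method to.
setClass("num")
setAs("character", "num", function(from) as.numeric(from))

# Columns tagged "num" are read as strings, then passed through as().
dat <- read.csv(tmp, colClasses = c("character", "num", "num"))
str(dat)
unlink(tmp)
```

If this behaves as expected, str(dat) should report "id" as character
and "x" and "y" as numeric.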


Otherwise, try something like

# Read every column as character, then convert all but the first 22.
datfull <- read.csv(file.choose(), colClasses = "character")
datfull[-(1:22)] <- lapply(datfull[-(1:22)], as.numeric)
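
The same two-step approach can be tried on a small made-up file before
running it on the full data. This is only a sketch: the file contents
and column names are invented, and with a single leading character
column the index is -1 rather than -(1:22). The NA count at the end is
an extra check, since as.numeric() turns strings it cannot parse into
NA.

```r
# Hypothetical toy file: one character column, two quoted numeric columns.
tmp <- tempfile(fileext = ".csv")
writeLines(c('"id","x","y"',
             '"a","+1073741824","-2"',
             '"b","+5","3.5"'), tmp)

# Step 1: read every column as character; the quotes are stripped on input.
dat <- read.csv(tmp, colClasses = "character")

# Step 2: convert all but the first column to numeric.
dat[-1] <- lapply(dat[-1], as.numeric)

# Count values that failed to convert, per column.
na_counts <- colSums(is.na(dat[-1]))
print(na_counts)
unlink(tmp)
```

A non-zero count would point at fields that are not plain numbers and
need a closer look before conversion.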

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907