[R] ff package: reading selected columns from csv

Jan van der Laan rhelp at eoos.dds.nl
Thu Jul 26 15:32:32 CEST 2012


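You probably have a character column (which is converted to a factor) or
a factor column with a large number of distinct values. ff keeps all the
levels of a factor in memory, so such a column can exhaust RAM even
though the data itself is stored on disk. The warnings from
match(levels(x), lev) in your output point in the same direction.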
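
One way to check this is to read a small leading chunk with plain read.csv and count the distinct values per character column. The sketch below is only an illustration under stated assumptions: the temporary file, its three columns and the 50% cut-off are made up for the example and are not taken from your data.

```r
# Hypothetical stand-in for the real CSV: one near-unique string column,
# one low-cardinality string column, one numeric column.
csvfile <- tempfile(fileext = ".csv")
set.seed(1)
sample_data <- data.frame(
  id    = sprintf("id%05d", 1:2000),
  group = sample(c("a", "b", "c"), 2000, replace = TRUE),
  value = rnorm(2000)
)
write.table(sample_data, csvfile, sep = ",",
            row.names = FALSE, col.names = FALSE)

# Read only the first rows, keeping strings as character (no factors yet).
chunk <- read.csv(csvfile, header = FALSE, nrows = 1000,
                  stringsAsFactors = FALSE)

# Count distinct values in each character column.
is_chr <- vapply(chunk, is.character, logical(1))
n_distinct <- vapply(chunk[is_chr],
                     function(col) length(unique(col)), integer(1))
print(n_distinct)

# Assumed rule of thumb: a column is suspect when more than half of the
# sampled rows hold a different value.
suspects <- names(n_distinct)[n_distinct > 0.5 * nrow(chunk)]

# colClasses = "NULL" makes read.table/read.csv skip a column entirely;
# NA leaves the type to be guessed as usual.
col_classes <- rep(NA_character_, ncol(chunk))
col_classes[match(suspects, names(chunk))] <- "NULL"
reduced <- read.csv(csvfile, header = FALSE, nrows = 1000,
                    colClasses = col_classes)
str(reduced)

# Assumption: read.csv.ffdf passes colClasses on to read.table, so the
# same vector could be tried there, for example:
# x <- read.csv.ffdf(file = csvfile, header = FALSE,
#                    colClasses = col_classes,
#                    first.rows = 1000, next.rows = 1e6)
```

If a suspect column is really needed, dropping it this way is not an option, and the column would have to be recoded (for example to an integer key) before it goes into ff.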

Jan


threshold <r.kozarski at gmail.com> wrote:

> ...plus I get the following message after reading the whole set (all 7
> columns):
>
>> read.csv.ffdf(file=csvfile, header=FALSE, skip=100, first.rows=1000,
>> next.rows=1e7, VERBOSE=TRUE)
>
> read.table.ffdf 1..1000 (1000)  csv-read=0.02sec ffdf-write=0.08sec
> read.table.ffdf 1001..10001000 (10000000)  csv-read=282.16sec
> ffdf-write=65.01sec
> read.table.ffdf 10001001..20001000 (10000000)  csv-read=240.3sec
> ffdf-write=63.84sec
> read.table.ffdf 20001001..30001000 (10000000)  csv-read=213.78sec
> ffdf-write=149.2sec
> read.table.ffdf 30001001..40001000 (10000000)  csv-read=217.36sec
> ffdf-write=379.8sec
> read.table.ffdf 40001001..50001000 (10000000)  csv-read=541.28sec
> Error: cannot allocate vector of size 381.5 Mb
> In addition: There were 14 warnings (use warnings() to see them)
>> warnings()
> Warning messages:
> 1: In match(levels(x), lev) :
>   Reached total allocation of 7987Mb: see help(memory.size)
> 2: In match(levels(x), lev) :
>   Reached total allocation of 7987Mb: see help(memory.size)


