[R] questions on the ff package

"Jens Oehlschlägel" oehl_list at gmx.de
Wed Nov 25 14:04:03 CET 2009


> I need to save a matrix as a memory-mapped file and load it back later. 
> To save the matrix, I use
> mat = matrix(1:20, 4, 5)
> matFF = ff(mat, dim=dim(mat), filename="~/a.mat"
> , overwrite=TRUE, dimnames = dimnames(mat))

# This stores the data in an ff file, 
# but not the metadata held in R's ff object. 
# To keep the metadata as well, save the ff object itself:
save(matFF, file="~/matFF.RData")

# Assuming that your ff file remains in the same location, 
# in a new R session you simply do
library(ff)
load("~/matFF.RData")
# and the ff file is available automagically
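
# A minimal round-trip sketch of the steps above, run in one session.
# The temporary paths and the name `restored` are assumptions for illustration,
# and finalizer="close" is set explicitly so that the data file is kept
# when the R object is removed.

```r
library(ff)

# Hypothetical locations; the post uses "~/a.mat" and "~/matFF.RData"
ff_path   <- tempfile(fileext = ".ff")
meta_path <- tempfile(fileext = ".RData")

mat   <- matrix(1:20, 4, 5)
matFF <- ff(mat, dim = dim(mat), filename = ff_path,
            overwrite = TRUE, finalizer = "close")

close(matFF)                      # flush and release the data file
save(matFF, file = meta_path)     # store the R-side metadata
rm(matFF)                         # simulate the end of the session

load(meta_path)                   # brings back the object `matFF`
open(matFF)                       # reattach to the data file explicitly
restored <- matFF[]               # read the whole matrix into RAM
```

# Calling open() is optional here, since ff normally opens the file on first access.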

> However, I don't always know the dimension when loading the matrix back.
> If I miss the dim attributes, ff will return it as vector. 
> Is there a way to load the matrix without specifying the dimension?

# You can create an ff object using your existing ff file by
# (vmode must match the mode the file was written with; matrix(1:20, 4, 5) is integer)
matFF <- ff(filename="~/a.mat", vmode="integer", dim=c(4,5))

# You can do the same when the file size is unknown with 
matFF <- ff(filename="~/a.mat", vmode="integer")
# (again, vmode must match the mode the file was written with)
# which gives you the length of the ff object.
# If you know the number of columns, you can calculate the number of rows
# and give your ff object the interpretation of a matrix:
dim(matFF) <- c(length(matFF)/5, 5)
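
# A sketch of reopening a file of unknown length and deriving the row count.
# The path and the names `orig`, `reopened` and `n_cols` are assumptions;
# the number of columns still has to be known from somewhere else.

```r
library(ff)

ff_path <- tempfile(fileext = ".ff")     # hypothetical location
mat     <- matrix(1:20, 4, 5)

# Write the matrix once, keeping the file when the object is closed
orig <- ff(mat, dim = dim(mat), filename = ff_path,
           overwrite = TRUE, finalizer = "close")
close(orig)

# Reopen without length or dim: the length is derived from the file size,
# so vmode has to match the mode used when writing (integer here)
reopened <- ff(filename = ff_path, vmode = "integer")

n_cols <- 5
stopifnot(length(reopened) %% n_cols == 0)   # guard against a wrong column count
dim(reopened) <- c(length(reopened) / n_cols, n_cols)
```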

> the matrix may grow in terms of the number of rows. 
> Is there an efficient way to do this?

# there are two ways to grow a matrix by rows

# 1) you create the matrix in row-major order
matFF <- ff(1:20, dim=c(4,5), dimorder=c(2,1))
# then you request a larger number of rows
nrow(matFF) <- 6
# as you can see, there are new empty rows in the file
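
# A sketch of growing the row-major matrix and filling the new rows.
# The values 101:110 and the name `grown` are arbitrary choices for illustration.

```r
library(ff)

# Row-major storage: with dimorder = c(2, 1) whole rows lie contiguously,
# so new rows can be appended at the end of the file
matFF <- ff(1:20, dim = c(4, 5), dimorder = c(2, 1))

nrow(matFF) <- 6                          # extend the file by two rows
matFF[5:6, ] <- matrix(101:110, 2, 5)     # fill the new rows

grown <- matFF[]                          # read the result into RAM
```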

# 2) Instead of a matrix you create an ffdf data.frame
#    which you can also give more rows using nrow<-
#    An example of this is in read.table.ffdf
#    which reads a csv file in chunks and extends the 
#    number of rows in the ffdf
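
# A small sketch of the ffdf variant, assuming a two-column data.frame.
# The column names and the names `fdf` and `grown_df` are made up for
# illustration; read.table.ffdf does the same kind of extension chunk by chunk.

```r
library(ff)

df  <- data.frame(id = 1:4, value = c(0.5, 1.5, 2.5, 3.5))
fdf <- as.ffdf(df)                 # each column becomes an ff vector on disk

nrow(fdf) <- 6                     # extend all columns by two rows
fdf[5:6, ] <- data.frame(id = 5:6, value = c(4.5, 5.5))

grown_df <- fdf[1:6, ]             # read the rows back as a data.frame
```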

Jens Oehlschlägel
