[R] Automation of c()

Liaw, Andy andy_liaw at merck.com
Sun Apr 11 23:15:00 CEST 2004


If the structure of the files is known beforehand, things can be done in
a more efficient manner.  Assuming the .txt files all contain 3 columns of
numbers, without column headings, this is what I would do to read them
into R as one matrix:

[First, generate a test file to be read in 300 times.]

> set.seed(1)
> write(matrix(rnorm(3e3), 3, 1e3), file="xyz.txt", ncolumns=3)
> ntimes <- 300
> files <- rep("xyz.txt", ntimes)

[Now the test:]

> system.time(dat <- do.call("rbind", lapply(files, function(f)
matrix(scan(f), byrow=TRUE, ncol=3))))
[ "Read 3000 items" 300 times omitted... ]
[1] 7.59 0.27 8.29   NA   NA
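
[The scan()-based read above can be wrapped in a small helper. What follows is only a sketch: the function name read_xyz_files and the temporary demo directory are hypothetical, and it assumes every file holds three whitespace-separated numeric columns with no header. Passing quiet=TRUE to scan() suppresses the "Read ... items" messages.]

```r
# Hypothetical helper: read 3-column numeric files into one matrix.
read_xyz_files <- function(files, ncol = 3) {
  pieces <- lapply(files, function(f) {
    # scan() returns a plain numeric vector; quiet = TRUE silences "Read N items"
    matrix(scan(f, quiet = TRUE), byrow = TRUE, ncol = ncol)
  })
  do.call("rbind", pieces)
}

# Small self-contained demo in a fresh temporary directory (illustrative only).
demo_dir <- tempfile("xyz_demo")
dir.create(demo_dir)
set.seed(1)
for (i in 1:5) {
  # Each demo file gets 10 rows of 3 numbers.
  write(matrix(rnorm(30), 3, 10),
        file = file.path(demo_dir, sprintf("part%02d.txt", i)),
        ncolumns = 3)
}

demo_files <- list.files(demo_dir, pattern = "[.]txt$", full.names = TRUE)
demo_dat <- read_xyz_files(demo_files)
dim(demo_dat)
```

[With real data, demo_files would be replaced by the result of list.files() on the directory holding the .txt files.]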

[Compare this with Gabor's suggestion:]

> system.time(dat2 <- do.call("rbind", lapply(files, read.table, header=FALSE)))
[1] 37.32  0.76 40.61    NA    NA
> all(dat == dat2)
[1] TRUE

Most of the time in Gabor's approach appears to be spent in
rbind.data.frame(): even if we specify colClasses in read.table(), it
doesn't get much faster:

> system.time(dat3 <- do.call("rbind", lapply(files, read.table, header=FALSE,
colClasses=rep("numeric", 3))))
[1] 36.48  0.75 38.81    NA    NA
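
[One way to check where the time goes is to time the reading step and the combining step separately. The sketch below is an illustration under assumed sizes (a smaller file and fewer repetitions than in the test above); the variable names are hypothetical, and the timings will depend on the machine and the R version.]

```r
# Build a small test file (sizes are illustrative, smaller than in the post).
f <- tempfile(fileext = ".txt")
set.seed(1)
write(matrix(rnorm(300), 3, 100), file = f, ncolumns = 3)
test_files <- rep(f, 50)

# Step 1: only read the files into a list of data frames.
t_read <- system.time(
  pieces <- lapply(test_files, read.table, header = FALSE,
                   colClasses = rep("numeric", 3))
)

# Step 2: only combine the already-read pieces (dispatches to rbind.data.frame).
t_bind <- system.time(combined <- do.call("rbind", pieces))

# Compare the elapsed times of the two steps.
print(t_read["elapsed"])
print(t_bind["elapsed"])
```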

If you don't need the data in a data frame, stick with a matrix.  If you
must have a data frame, it only takes another 4.5 seconds to run
as.data.frame() on the matrix.
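
[A minimal sketch of that last conversion, using a small stand-in matrix. The column names x, y and z are an assumption based on the coordinates mentioned in the original question; they are not part of the timings above.]

```r
# Illustrative matrix standing in for the combined result.
set.seed(1)
m <- matrix(rnorm(30), ncol = 3)

# Assumed column names, taken from the x, y, z coordinates in the question.
colnames(m) <- c("x", "y", "z")

# as.data.frame() keeps the matrix column names as variable names.
df <- as.data.frame(m)
str(df)
```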

Cheers,
Andy


> From: Gabor Grothendieck
> 
> Assuming your files are in /*.txt:
> 
>    setwd("/")
>    names <- dir( patt="[.]txt$" )
>    do.call( "rbind", lapply( names, read.table ) )
> 
> 
> Miha STAUT <mihastaut <at> hotmail.com> writes:
> 
> : 
> : Hi,
> : 
> : I have around 300 files with the x, y and z coordinates of a DEM that I
> : would like to join in a single data frame object. I know how to automate the
> : import of the 300 data frames.
> : 
> : in Bash
> : ls > names
> : 
> : in R
> : names<-scan(names...)
> : With rep() and data.frame() construct a series of read.table() commands,
> : which you write to a file and execute them with source().
> : 
> : I do not know however how to automate the combination of the e.g. x vectors
> : from all imported data frames in a single vector avoiding the manual writing
> : of all the imported data frame names.
> : 
> : Thanks in advance, Miha
> : 
> : ______________________________________________
> : R-help <at> stat.math.ethz.ch mailing list
> : https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> : PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
> : 
> :
> 
> 



