[R] Creating a custom connection to read from multiple files

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Jan 20 11:58:11 CET 2005


On Thu, 20 Jan 2005, Tomas Kalibera wrote:

>
> Dear Prof Ripley,
>
> thanks for your suggestions, it's very nice one can create custom connections 
> directly in R and I think it is what I need just now.
>
>> However, what is wrong with reading a file at a time and combining the 
>> results in R using rbind?
>> 
> Well, the problem is performance. Concatenated, all those files total around 
> 8MB, and they may grow to tens of MB in the near future.
>
> Both concatenating and reading from a single file by scan takes 5 seconds 
> (which is almost OK).
>
> However, reading the individual files with read.table and rbinding them one 
> by one (samples = rbind(samples, newSamples)) takes minutes. The same happens 
> when I concatenate lists manually. Using scan does not help significantly. I 
> guess there is some overhead in rbind from detecting the dimensions of 
> objects (?), or from re-allocating or copying data?

rbind is vectorized: it accepts any number of objects in a single call, so 
binding one file at a time in a loop is using it (way) suboptimally.
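
As a rough illustration of the point about rbind, one way to call it once 
rather than once per file is to read all the pieces into a list and hand the 
whole list to rbind via do.call. This is only a sketch under assumptions: the 
temporary directory, the part*.txt file names and the id/value columns are 
made up for the example, and the files are assumed to share the same columns.

```r
# Hypothetical setup: write three small sample files to a temporary directory.
tmp_dir <- tempfile("samples")
dir.create(tmp_dir)
for (i in 1:3) {
  write.table(data.frame(id = i, value = c(1.5, 2.5) * i),
              file = file.path(tmp_dir, sprintf("part%d.txt", i)),
              row.names = FALSE)
}

files <- sort(list.files(tmp_dir, full.names = TRUE))

# Read every file into a list first; nothing is combined yet.
pieces <- lapply(files, read.table, header = TRUE)

# Combine all pieces with a single rbind call instead of growing
# 'samples' inside a loop (which copies the accumulated data each time).
samples <- do.call(rbind, pieces)
```

Growing the result with samples = rbind(samples, newSamples) copies everything 
read so far on each iteration, whereas the single call above combines the 
pieces in one step.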

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595



