[R] Problem with read.spss() and as.data.frame(), or: alternative to subset()?

Prof Brian Ripley ripley at stats.ox.ac.uk
Sat Sep 24 10:06:36 CEST 2005

On Fri, 23 Sep 2005, Thomas Lumley wrote:

> On Wed, 21 Sep 2005, Martin Maechler wrote:
>>>>>>> "Dirk" == Dirk Enzmann <dirk.enzmann at jura.uni-hamburg.de>
>>>>>>>     on Wed, 21 Sep 2005 13:18:32 +0200 writes:
>>    Dirk> The selection problem can be solved by
>>    Dirk> dr2000=read.spss('myfile')
>>    Dirk> d=lapply(dr2000,subset,dr2000$RBINZ99 > 0)
>>    Dirk> however, there is still the problem that R crashes when using
>>    Dirk> d = as.data.frame(dr2000)
>> which is a bug in R, or at least in your R installation.
>> However we can't do anything about it at the moment, because we
>> can't even try to reproduce it...
> I suspect this is the same stack overflow in coerce.c:substituteList that
> was reported in PR#8141

Apparently not (it had only about 1500 columns rather than 198000).  After 
taking it offline I was able to make it work on 1GB machines under Windows 
and Linux, and Dirk succeeded using --max-mem-size=640M on Windows.  So it 
looks like it was a problem with total memory usage - I have yet to find 
out what exactly.
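
If total memory is indeed the limit, one workaround is to drop the unwanted rows while the data are still a plain list, and only then build the data frame. The following is a minimal sketch under stated assumptions, not a tested fix for the original file: the small list stands in for what read.spss() returns by default, and every column name except RBINZ99 is hypothetical.

```r
# Small stand-in for the list that read.spss() returns by default.
# Column names other than RBINZ99 are made up for illustration.
dr2000 <- list(
  RBINZ99 = c(0, 3, NA, 1, 0),
  AGE     = c(21, 34, 45, 29, 52),
  SEX     = factor(c("m", "f", "f", "m", "f"))
)

# Build the row selector once; treat NA as "do not keep".
keep <- !is.na(dr2000$RBINZ99) & dr2000$RBINZ99 > 0

# Subset every component first, then assemble the smaller data frame.
d <- as.data.frame(lapply(dr2000, function(x) x[keep]))

# Rough size comparison of the two objects, in bytes.
sizes <- c(list  = as.numeric(object.size(dr2000)),
           frame = as.numeric(object.size(d)))
print(dim(d))
print(sizes)
```

Computing the selector once avoids re-evaluating dr2000$RBINZ99 > 0 for each column, and the explicit !is.na() makes the handling of missing values visible. Whether this is enough for the real file depends on how many rows survive the selection.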

Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
