[R] Efficiency question: replacing all NAs with a zero

Dimitri Liakhovitski ld7631 at gmail.com
Tue Mar 30 03:16:26 CEST 2010


Gabor, thanks a lot!
I removed everything from the workspace except the data frame - and then
DF[is.na(DF)] <- 0 worked!
Thanks a lot!
Dimitri
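
For readers hitting the same memory limit: is.na(DF) on a whole data frame
builds a logical matrix with one cell per data cell, so a column-by-column
loop may need less working memory at any one time. The following is only a
sketch under that assumption, not something tested on the data set from
this thread. The helper name replace_na_zero and the small stand-in DF are
hypothetical, and non-numeric columns are deliberately left untouched.

```r
# Hypothetical helper (not from the thread): replace NAs with 0 one column
# at a time, touching only numeric columns.
replace_na_zero <- function(df) {
  for (j in seq_along(df)) {
    col <- df[[j]]
    if (is.numeric(col) && anyNA(col)) {
      col[is.na(col)] <- 0   # only this one column is scanned at a time
      df[[j]] <- col
    }
  }
  df
}

# Small stand-in for the large data frame in the question.
set.seed(1234)
DF <- data.frame(a = rnorm(100), b = rnorm(100), c = letters[rep(1:4, 25)])
DF$a[sample(100, 60)] <- NA
DF$b[sample(100, 60)] <- NA

DF <- replace_na_zero(DF)
```

The result is still a data frame, and the character column c is unchanged.
Whether this actually stays under a given allocation limit depends on the
R version and on what else is held in the workspace.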

On Mon, Mar 29, 2010 at 8:45 PM, Gabor Grothendieck
<ggrothendieck at gmail.com> wrote:
> It's going to be pretty hard to do anything useful if you can't even do
> simple operations like that without overflowing memory, but anyway, try
> this (untested):
>
> write.table(DF, "DF.csv", sep = ",", quote = FALSE)
> rm(DF)
> DF <- read.csv(pipe("sed s/NA/0/g DF.csv"))
>
>
> On Mon, Mar 29, 2010 at 8:33 PM, Dimitri Liakhovitski <ld7631 at gmail.com> wrote:
>> Just tried it. It's definitely faster - but I get the same error:
>> " Reached total allocation of 1535Mb:"
>>
>> On Mon, Mar 29, 2010 at 8:27 PM, Gabor Grothendieck
>> <ggrothendieck at gmail.com> wrote:
>>> See if this works for you:
>>>
>>> DF[is.na(DF)] <- 0
>>>
>>> On Mon, Mar 29, 2010 at 8:21 PM, Dimitri Liakhovitski <ld7631 at gmail.com> wrote:
>>>> Dear R'ers,
>>>>
>>>> I have a very large data frame (over 4,000 rows and 2,500 columns). My
>>>> task is very simple - I have to replace all NAs with a zero. My code
>>>> works fine on smaller data frames - but I have to deal with a huge one
>>>> and there are many NAs in each column.
>>>> R runs out of memory on me ("Reached total allocation of 1535Mb: see
>>>> help(memory.size)"). Is there any other, more efficient way of doing
>>>> it?
>>>> Thanks a lot for any hints!
>>>> Dimitri
>>>>
>>>>
>>>> # Building an example frame (seed set first so the data are reproducible):
>>>> set.seed(1234)
>>>> frame<-data.frame(a=rnorm(100),b=rnorm(100),c=rnorm(100),d=rnorm(100),e=rnorm(100),f=rnorm(100),g=rnorm(100))
>>>> for(i in names(frame)){
>>>>        i.for.NA<-sample(1:100,60)   # 60 random row positions per column
>>>>        frame[[i]][i.for.NA]<-NA
>>>> }
>>>>
>>>> # Replacing all NAs in "frame" with zeros - is of course fast in this
>>>> # example, because this data frame is very small.
>>>> # Assigning into frame[] keeps the result a data frame; plain
>>>> # frame<-lapply(...) would turn it into a list.
>>>> system.time({
>>>> frame[]<-lapply(frame,function(x){
>>>>        x[is.na(x)]<-0
>>>>        return(x)
>>>> })})
>>>>
>>>>
>>>> --
>>>> Dimitri Liakhovitski
>>>> Ninah.com
>>>> Dimitri.Liakhovitski at ninah.com
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>
>>
>>
>>
>


