[R] speeding up loop and dealing with memory problems

ONKELINX, Thierry Thierry.ONKELINX at inbo.be
Mon Jul 28 15:43:40 CEST 2008


Dear Denise,

It looks like you want to replace every NA in the dataset with 0. The
code below does that without any loops, and it should be fast because
the whole replacement is a single vectorised assignment.

dat[is.na(dat)] <- 0

> dat <- matrix(rbinom(40, 1, 0.75), ncol = 4, nrow = 10)
> dat[dat == 0] <- NA
> dat
      [,1] [,2] [,3] [,4]
 [1,]    1    1    1    1
 [2,]    1    1   NA    1
 [3,]   NA    1   NA   NA
 [4,]    1    1   NA    1
 [5,]    1    1    1   NA
 [6,]    1    1    1   NA
 [7,]    1    1    1    1
 [8,]    1    1    1   NA
 [9,]   NA    1    1    1
[10,]    1    1    1    1
> 
> dat[is.na(dat)] <- 0
> dat
      [,1] [,2] [,3] [,4]
 [1,]    1    1    1    1
 [2,]    1    1    0    1
 [3,]    0    1    0    0
 [4,]    1    1    0    1
 [5,]    1    1    1    0
 [6,]    1    1    1    0
 [7,]    1    1    1    1
 [8,]    1    1    1    0
 [9,]    0    1    1    1
[10,]    1    1    1    1
>
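
Regarding the memory problem: is.na(dat) builds a logical object with
the same dimensions as dat, and R stores each logical in 4 bytes, so for
835353 x 86 values that temporary index alone is on the order of 270 Mb.
If that is too much, a compromise is to loop over the 86 columns only
and keep each column's replacement vectorised. The following is a
minimal sketch under assumptions: na_to_zero is a hypothetical helper
name (not an existing R function), the small data frame is made up for
illustration, and all columns are assumed to be numeric.

```r
# Hypothetical helper (not part of base R): replace NA with 0 one column
# at a time, so the temporary logical index is only as long as one column.
# Assumes dat is a matrix or a data frame with numeric columns.
na_to_zero <- function(dat) {
  for (j in seq_len(ncol(dat))) {
    idx <- is.na(dat[, j])   # logical vector, one entry per row
    dat[idx, j] <- 0         # vectorised assignment within column j
  }
  dat
}

# Made-up example data, only for illustration
dat <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6))
dat <- na_to_zero(dat)
```

This still loops, but only 86 times instead of 835353 x 86 times, so it
should remain far quicker than the element-wise loop. Whether it avoids
the allocation error will depend on how much memory is free on your
machine.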

HTH,

Thierry
----------------------------------------------------------------------------
ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature
and Forest
Cel biometrie, methodologie en kwaliteitszorg / Section biometrics,
methodology and quality assurance
Gaverstraat 4
9500 Geraardsbergen
Belgium 
tel. + 32 54/436 185
Thierry.Onkelinx at inbo.be
www.inbo.be 

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to
say what the experiment died of.
~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data.
~ Roger Brinner

The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of
data.
~ John Tukey

-----Original message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On behalf of Denise Xifara
Sent: Monday 28 July 2008 15:15
To: r-help at r-project.org
Subject: [R] speeding up loop and dealing with memory problems

 Dear All and Mark,

Given a dataset that I have called dat, I was hoping to speed up the
following loop:

for (i in 1:835353) {
  for (j in 1:86) {
    if (is.na(dat[i, j]) == TRUE) { dat[i, j] <- 0 }
  }
}

Actually I am also having a memory problem.  I get the following:

Error: cannot allocate vector of size 3.2 Mb
In addition: Warning messages:
1: In dat[i, j] <- 0 :
  Reached total allocation of 1535Mb: see help(memory.size)
2: In dat[i, j] <- 0 :
  Reached total allocation of 1535Mb: see help(memory.size)
3: In dat[i, j] <- 0 :
  Reached total allocation of 1535Mb: see help(memory.size)
4: In dat[i, j] <- 0 :
  Reached total allocation of 1535Mb: see help(memory.size)

If I try to apply the loop to just one particular column, rather than
the whole dataset, so that I don't have the memory problem, i.e.

for (i in 1:835353) {
  if (is.na(dat[i, 4]) == TRUE) { dat[i, 4] <- 0 }
}

it takes ridiculously long to process, so I was hoping that there would
be a quicker way to do this.

Thank you all very much for the help,
Denise
