[R] How can I avoid the for and If loops in my function?

Mramba,Lazarus K lmramba at ufl.edu
Thu Jun 19 17:04:43 CEST 2014


Hi Martin,

Thanks for the useful comments. It has greatly improved the efficiency 
of my functions. I am sorry that I forgot to paste the functions varG 
and others which I have done here for reproducibility.


#############################
## a function to calculate h2
## (heritability)
###############################
herit<-function(varG,varR=1)
{
   h<-4*varG/(varG+varR)
   h
}
###################################
# function to calculate random error
varR<-function(varG,h2)
{
   varR<- varG*(4-h2)/h2
   varR
}

##########################################
# function to calculate treatment variance
varG<-function(varR=1,h2)
{
   varG<-varR*h2/(4-h2)
   varG
}

###############################

I used the package "gmp" so that I can use the function "is.whole()".

I have managed to reduce the complexity of the function that I needed 
help. It now takes half the time it used to take before, but it is 
several hours and wish I can cut these hours down by doing away with the 
for and if loops in this code.
Any help will be appreciated.

newfunF<- function(matrix0,n,traceI)
{
   newmatdf<-matrix0
   trace<-traceI
   mat <- NULL
   Design_best <- newmatdf
   mat <- rbind(mat, c(value = trace, iterations = 0))
   Des<-list()
   newmatdf<-swapsimple(matrix0)
   for(i in 2:n){
      newmatdf<-swapsimple(newmatdf)
     Des[[i]]<-newmatdf
     if(swapmainF(newmatdf) < trace){
       Design_best <- Des[[i]]<-newmatdf
       trace<- swapmainF(newmatdf)
       mat <- rbind(mat, c(trace = trace, iterations = i))
     }
    }
  list(mat=mat,Design_best=Design_best)
}





On Thu, 19 Jun 2014 16:50:15 +0200, Martin Maechler wrote:
>>>>>> lmramba  <lmramba at ufl.edu>
>>>>>>     on Wed, 18 Jun 2014 20:00:15 +0000 writes:
>
>     > Hi Jim. If I avoid the dataframe, how can use the function
> model.matrix() to
>     > build the incident matrices X, And Z? I tried saving the design
> as matrix
>     > but ghen I got the wrong design matrix.
>
> I think you are entirely right here, Laz.
> That indeed you have data frame and a formula --> model.matrix()
> to get the matrix.
>
> I have no time currently to delve into your
> example, and I see
> - it is not reproducible {you use a function
>   varG() that is undefined}
> - you use foreach just so you can use %do% in one place which I
>   think makes no sense
> - you use package 'gmp' which I don't think you'd use, but I
>   don't know as your code is not reproducible ....
> - you use "<<-" in quite a few places in your code, which is
>   considered really bad programming style and makes it very hard
>   to understand the code by reading it ...
>
> ... *but* .. after all that ...
> ...
> as maintainer of the Matrix package
> I'm close to absolutely sure that you want to work with *sparse*
> matrices as Matrix provides.
>
> So in fact, do use
>
> ## "require":  just so you are not tempted to call a package a 
> "library"
> require(Matrix)
>
> help(sparse.model.matrix)
>
> and then do
> - use  sparse.model.matrix() instead of model.matrix().
>
> Further, do
> - use Diagonal() instead of  diag()  for *constructing* diagonal 
> matrices.
>
> Please let us know if this helps
>
> [and maybe fix your example to become reproducible: do
>
>    rm(list=ls(all=TRUE))
>
>  before  source(...) ing the reproducible example script...
> ]
>
> Martin Maechler, ETH Zurich
>
>
>     > Thanks.
>
>     > Laz
>
>
>     > Sent from my LG Optimus G™, an AT&T 4G LTE smartphone
>     > ------ Original message ------
>     > From: jim holtman
>     > Date: 6/18/2014 3:49 PM
>     > To: Laz;
>     > Cc: R mailing list;
>     > Subject:Re: [R] How can I avoid the for and If loops in my 
> function?
>
>     > First order of business, without looking in detail at the code,
> is to avoid
>     > the use of dataframes.  If all your values are numerics, then
> use a matrix.
>     > It will be faster execution.
>     > I did see the following statements:
>     > newmatdf<-Des[[i]]
>     > Des[[i]]<-newmatdf
>     > why are you just putting back what you pulled out of the list?
>
>     > Jim Holtman
>     > Data Munger Guru
>
>     > What is the problem that you are trying to solve?
>     > Tell me what you want to do, not how you want to do it.
>     > On Wed, Jun 18, 2014 at 12:41 PM, Laz <[1]lmramba at ufl.edu> 
> wrote:
>
>     > Dear R-users,
>     > I have a 3200 by 3200 matrix that was build from a data frame 
> that had
>     > 180 observations,  with variables: x, y, blocks (6 blocks) and
>     > treatments (values range from 1 to 180) I am working on. I 
> build other
>     > functions that seem to work well. However, I have one function 
> that has
>     > many If loops and a long For loop that delays my results for 
> over 10
>     > hours ! I need your help to avoid these loops.
>     > ########################################################
>     > ## I need to avoid these for loops and if loops here :
>     > ########################################################
>     > ### swapsimple() is a function that takes in a dataframe,
> randomly swaps
>     > two elements from the same block in a data frame and generates 
> a new
>     > dataframe called newmatdf
>     > ### swapmainF() is a function that calculates the trace of the 
> final N
>     > by N matrix considering the incident matrices and blocks and 
> treatments
>     > and residual errors in a linear mixed model framework using 
> Henderson
>     > approach.
>     > funF<- function(newmatdf, n, traceI)
>     > {
>     > # n = number of iterations (swaps to be made on pairs of
> elements of the
>     > dataframe, called newmatdf)
>     > # newmatdf : is the original dataframe with N rows, and 4 
> variables
>     > (x,y,blocks,genotypes)
>     > matrix0<-newmatdf
>     > trace<-traceI  ##  sum of the diagonal elements of the N by N 
> matrix
>     > (generated outside this loop) from the original newmatdf 
> dataframe
>     > res <- list(mat = NULL, Design_best = newmatdf, Original_design 
> =
>     > matrix0) # store our output of interest
>     > res$mat <- rbind(res$mat, c(value = trace, iterations = 0)) #
>     > initialized values
>     > Des<-list()
>     > for(i in seq_len(n)){
>     > ifelse(i==1,
>     > newmatdf<-swapsimple(matrix0),newmatdf<-swapsimple(newmatdf))
>     > Des[[i]]<-newmatdf
>     > if(swapmainF(newmatdf) < trace){
>     > newmatdf<-Des[[i]]
>     > Des[[i]]<-newmatdf
>     > trace<- swapmainF(newmatdf)
>     > res$mat <- rbind(res$mat, c(trace = trace, iterations = i))
>     > res$Design_best <- newmatdf
>     > }
>     > if(swapmainF(newmatdf) > trace & nrow(res$mat)<=1){
>     > newmatdf<-matrix0
>     > Des[[i]]<-matrix0
>     > res$Design_best<-matrix0
>     > }
>     > if(swapmainF(newmatdf)> trace & nrow(res$mat)>1){
>     > newmatdf<-Des[[length(Des)-1]]
>     > Des[[i]]<-newmatdf
>     > res$Design_best<-newmatdf
>     > }
>     > }
>     > res
>     > }
>     > The above function was created to:
>     > Take  an original matrix, called matrix0, calculate its trace.
>     > Generate a new matrix, called newmatdf after  swapping two
> elements of the
>     > old one and  calculate the trace. If the trace of the newmatrix 
> is
>     > smaller than
>     > that of the previous matrix, store both the current trace 
> together
>     > with the older trace and their  iteration values. If the newer
> matrix has
>     > a trace larger than the previous trace, drop this trace and 
> drop this
>     > matrix too (but count its iteration).
>     > Re-swap the old matrix that you stored previously and 
> recalculate the
>     > trace. Repeat the
>     > process many times, say 10,000. The final results should be a 
> list
>     > with the original initial matrix and its trace, the final best
>     > matrix that had the smallest trace after the 10000 simulations 
> and a
>     > dataframe  showing the values of the accepted traces that
>     > were smaller than the previous and their respective iterations.
>     > $Original_design
>     > x  y block genotypes
>     > 1    1  1     1        29
>     > 7    1  2     1         2
>     > 13   1  3     1         8
>     > 19   1  4     1        10
>     > 25   1  5     1         9
>     > 31   1  6     2        29
>     > 37   1  7     2         4
>     > 43   1  8     2        22
>     > 49   1  9     2         3
>     > 55   1 10     2        26
>     > 61   1 11     3        18
>     > 67   1 12     3        19
>     > 73   1 13     3        28
>     > 79   1 14     3        10
>     > ------truncated ----
>     > the final results after running  funF<-
>     > function(newmatdf,n,traceI)  given below looks like this:
>     > ans1
>     > $mat
>     > value iterations
>     > [1,] 1.474952          0
>     > [2,] 1.474748          1
>     > [3,] 1.474590          2
>     > [4,] 1.474473          3
>     > [5,] 1.474411          5
>     > [6,] 1.474294         10
>     > [7,] 1.474182         16
>     > [8,] 1.474058         17
>     > [9,] 1.473998         19
>     > [10,] 1.473993         22
>     > ---truncated
>     > $Design_best
>     > x  y block genotypes
>     > 1    1  1     1        29
>     > 7    1  2     1         2
>     > 13   1  3     1        18
>     > 19   1  4     1        10
>     > 25   1  5     1         9
>     > 31   1  6     2        29
>     > 37   1  7     2        21
>     > 43   1  8     2         6
>     > 49   1  9     2         3
>     > 55   1 10     2        26
>     > ---- truncated
>     > $Original_design
>     > x  y block genotypes
>     > 1    1  1     1        29
>     > 7    1  2     1         2
>     > 13   1  3     1         8
>     > 19   1  4     1        10
>     > 25   1  5     1         9
>     > 31   1  6     2        29
>     > 37   1  7     2         4
>     > 43   1  8     2        22
>     > 49   1  9     2         3
>     > 55   1 10     2        26
>     > 61   1 11     3        18
>     > 67   1 12     3        19
>     > 73   1 13     3        28
>     > 79   1 14     3        10
>     > ------truncated
>     > Regards,
>     > Laz
>     > [[alternative HTML version deleted]]
>     > ______________________________________________
>     > [2]R-help at r-project.org mailing list
>     > [3]https://stat.ethz.ch/mailman/listinfo/r-help
>     > PLEASE do read the posting
>     > guide [4]http://www.R-project.org/posting-guide.html
>     > and provide commented, minimal, self-contained, reproducible 
> code.
>
>     > References
>
>     > 1. mailto:lmramba at ufl.edu
>     > 2. mailto:R-help at r-project.org
>     > 3. https://stat.ethz.ch/mailman/listinfo/r-help
>     > 4. http://www.R-project.org/posting-guide.html
>     > ______________________________________________
>     > R-help at r-project.org mailing list
>     > https://stat.ethz.ch/mailman/listinfo/r-help
>     > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
>     > and provide commented, minimal, self-contained, reproducible 
> code.



More information about the R-help mailing list