[R] Discrepancies in run times

rex.dwyer at syngenta.com rex.dwyer at syngenta.com
Tue Feb 22 19:20:24 CET 2011


My surmise would be that you have not analyzed the situation correctly, and you are making a false assumption about your code.  Since you can't show the code, it's pretty hard to figure out what that is.  I think you're going to have to produce a simple example that you can share that has the same behavior.  My guess is that you will answer your own question as you try to do that.

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Sébastien Bihorel
Sent: Tuesday, February 22, 2011 12:16 PM
To: r-help at r-project.org
Subject: [R] Discrepancies in run times

Dear R-users,

I am in the process of creating new custom functions and am quite puzzled by
some discrepancies in execution time when I run some R scripts that call
those new functions. So here is the situation:
- let's assume I have created two custom functions, called myg and myf;
- myg is mostly a plotting function, which makes a heavy use of grid and
lattice functions;
- myf is a function that massages data, opens and closes graph devices, and
pass the data to myg:
  * myf contains loops and sub-loops which subset the data in little pieces
necessary for plotting purposes;
  * the most inner loop in myf contains two calls to myg, one in section A
of the code, one in section B of the code;
  * Both sections could be turn on and off based upon an input of the myf
function;
  * Both sections passes the same data to myg, except for some graph
settings;
  * All graph devices open in section A are closed before section B starts;
and all graph devices open in section B are closed before the next iteration
of the inner loop.

Running a script passing a particular set of data to myf and turning on both
section A and B takes around 9 minutes (~3 combined minutes for Section A,
~6 combined minutes for Section B). The results of R CMD Rprof indicates
that most of the execution time is used by print (see extract below).
   %       total       %        self
 total    seconds     self    seconds    name
  99.4    545.84       0.0      0.00     "myf"
  95.2    522.70       0.0      0.06     "myg"
  90.7    498.20       0.0      0.02     "standardGeneric"
  90.6    497.70       0.0      0.00     "print"
  90.5    497.32       5.3     29.06     "printFunction"
  90.5    497.32       0.0      0.00     "print.trellis"
  62.7    344.58      62.6    343.96     "lattice.setStatus"
...
   %        self       %      total
  self    seconds    total   seconds    name
  62.6    343.96      62.7    344.58     "lattice.setStatus"
   5.6     30.86       7.6     41.60     ".Call.graphics"
   5.3     29.06      90.5    497.32     "printFunction"
   3.5     19.12       3.7     20.22     ".Call"
   2.3     12.52       2.3     12.52     "$"
   1.3      7.18       1.9     10.42     "match"
   1.3      6.98       1.3      6.98     "dev.off"
...

Running another script passing the same set of data to myf and turning
section A on and section B off takes around 3 minutes. The results of R CMD
Rprof also indicates that most of the execution time is used by print (see
extract below).
   %       total       %        self
 total    seconds     self    seconds    name
  98.1    177.16       0.0      0.00     "myf"
  93.3    168.40       0.0      0.00     "myg"
  85.0    153.50       0.1      0.10     "standardGeneric"
  84.7    152.94       0.0      0.02     "print"
  84.6    152.74       0.0      0.00     "print.trellis"
  84.6    152.72       4.5      8.04     "printFunction"
  51.3     92.66      51.3     92.58     "lattice.setStatus"
...
   %        self       %      total
  self    seconds    total   seconds    name
  51.3     92.58      51.3     92.66     "lattice.setStatus"
   8.5     15.34      10.7     19.36     ".Call.graphics"
   6.6     11.96       6.9     12.50     ".Call"
   4.5      8.04      84.6    152.72     "printFunction"
   3.4      6.14       3.4      6.14     "$"
   2.1      3.72       3.0      5.40     "match"
   0.8      1.46       0.8      1.46     "dev.off"
...
Running another script passing the same set of data to myf and turning
section A off and section B on takes around 3 minutes. The results of R CMD
Rprof also indicates that most of the execution time is used by print (see
extract below).
   %       total       %        self
 total    seconds     self    seconds    name
  98.1    175.00       0.0      0.00     "myf"
  90.7    161.82       0.0      0.00     "myg"
  86.8    154.90       0.0      0.06     "standardGeneric"
  86.5    154.32       0.0      0.02     "print"
  86.4    154.16       4.0      7.18     "printFunction"
  86.4    154.16       0.0      0.00     "print.trellis"
  52.6     93.76      52.5     93.62     "lattice.setStatus"
...
   %        self       %      total
  self    seconds    total   seconds    name
  52.5     93.62      52.6     93.76     "lattice.setStatus"
   8.6     15.28      10.9     19.40     ".Call.graphics"
   4.2      7.58       4.5      7.98     ".Call"
   4.0      7.18      86.4    154.16     "printFunction"
   3.1      5.56       3.1      5.56     "dev.off"
...

I cannot quite explain why the execution of print in section B would take
more time if section A is turn off. I sincerely apologize but, due to the
confidentiality of the actual script, I cannot provide the content of myf or
myg in my email. I realize this is not ideal for identifying the source of
this discrepancy, therefore any suggestion to approach this problem would be
very much welcome.

Sebastien

        [[alternative HTML version deleted]]

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




message may contain confidential information. If you are not the designated recipient, please notify the sender immediately, and delete the original and any copies. Any use of the message by you is prohibited. 


More information about the R-help mailing list