[R] 2^k*r (with replications) experimental design question

Giovanni Azua bravegag at gmail.com
Mon Nov 14 01:33:29 CET 2011


Hello,

I have one replication (r=1 in 2^k*r) of a 2^k experimental design in the context of performance analysis, i.e. my response variables are Throughput and Response Time. I use the "aov" function and the results look OK:

> str(throughput)
'data.frame':	286 obs. of  7 variables:
 $ Time          : int  6 7 8 9 10 11 12 13 14 15 ...
 $ Throughput    : int  42 44 33 41 43 40 37 40 42 37 ...
 $ No_databases  : Factor w/ 2 levels "1","4": 1 1 1 1 1 1 1 1 1 1 ...
 $ Partitioning  : Factor w/ 2 levels "sharding","replication": 1 1 1 1 1 1 1 1 1 1 ...
 $ No_middlewares: Factor w/ 2 levels "2","4": 1 1 1 1 1 1 1 1 1 1 ...
 $ Queue_size    : Factor w/ 2 levels "40","100": 1 1 1 1 1 1 1 1 1 1 ...
 $ No_clients    : Factor w/ 1 level "128": 1 1 1 1 1 1 1 1 1 1 ...
> head(throughput)
  Time Throughput No_databases Partitioning No_middlewares Queue_size 
1    6         42            1     sharding              2         40 
2    7         44            1     sharding              2         40
3    8         33            1     sharding              2         40
4    9         41            1     sharding              2         40
5   10         43            1     sharding              2         40
6   11         40            1     sharding              2         40
> 
> throughput.aov <- aov(Throughput~No_databases+Partitioning+No_middlewares+Queue_size,data=throughput)
> summary(throughput.aov)
                Df    Sum Sq  Mean Sq F value    Pr(>F)    
No_databases     1  28488651 28488651 53.4981 2.713e-12 ***
Partitioning     1     71687    71687  0.1346  0.713966    
No_middlewares   1   5624454  5624454 10.5620  0.001295 ** 
Queue_size       1     50892    50892  0.0956  0.757443    
Residuals      281 149637226   532517                      
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 
> 

This is more or less what I expected and I am happy with it: it says that Throughput is significantly affected first by the number of database instances and second by the number of middleware instances.

The problem is that I need to integrate multiple replications of this same 2^k design so that I can also account for experimental error, i.e. the _r_ of 2^k*r, but I can't see how to integrate the _r_ term into the data and into the arguments of the aov function. Can anyone advise?
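
To make the question concrete, here is a rough sketch of what I have in mind. Everything in it is an assumption for illustration only: the data are simulated rather than my real measurements, and the `Replication` column, the `make_run` helper and the `throughput_all` name are made up. The idea is to stack one data frame per replication, tag each with a replication id, and then either fit the same formula (so the replicated runs feed the residual term) or add the replication as a blocking factor. I am not sure this is the right approach.

```r
# Hypothetical stand-in data: two replications (r = 2) of the 2^4 design.
set.seed(1)
design <- expand.grid(
  No_databases   = factor(c(1, 4)),
  Partitioning   = factor(c("sharding", "replication")),
  No_middlewares = factor(c(2, 4)),
  Queue_size     = factor(c(40, 100))
)

# Build one data frame per replication, tagged with a Replication id.
# The response is simulated: a database effect plus random noise.
make_run <- function(rep_id) {
  run <- design
  run$Replication <- rep_id
  run$Throughput <- 40 + 10 * (run$No_databases == "4") +
    rnorm(nrow(run), sd = 2)
  run
}
throughput_all <- do.call(rbind, lapply(1:2, make_run))
throughput_all$Replication <- factor(throughput_all$Replication)

# Option 1: same formula as before; the extra replication only
# enlarges the residual degrees of freedom.
fit <- aov(Throughput ~ No_databases + Partitioning + No_middlewares +
             Queue_size, data = throughput_all)
summary(fit)

# Option 2: treat each replication as a block.
fit_block <- aov(Throughput ~ Replication + No_databases + Partitioning +
                   No_middlewares + Queue_size, data = throughput_all)
summary(fit_block)
```

With my real data I would replace `make_run` by reading each replication's measurements and adding the `Replication` column before calling `rbind`.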

TIA,
Best regards,
Giovanni

