[R] evaluating a function on an array of values

Faheem Mitha faheem at email.unc.edu
Wed May 3 01:11:23 CEST 2000


Dear R people,

I recently spent some time trying to do the following. I wrote a function
called meanvar. This function has four arguments and returns a vector
of two values. 

It looks like meanvar(r,t,type=c("interval","nominal"),n = 10000).

I wanted to write a routine to return to me the values of this function
for a range of values of r and t, specifically r from 1 to 5, t from 2 to
6, for fixed values of type (say "int"), and n the default. However, even

for(r in 1:5) meanvar(r,2,"int")

does not work, and I have no idea why.

I include the function below.

Some words of explanation may be in order. Suppose there are t people in a
room. Each gives one of the values 0,1...r, with equal probability. Call
these votes. Call these rvs C_{i}, where i= 1,...t. We write the function
DR (Disagreement Rate) = \sum_{i < j} \abs(C_i - C_j)/(r{t \choose 2}).
Ie. scaled sum over all possible pairwise differences of the votes. Call
AR (Agreement Rate) = 1 - DR. This is the "interval" case (not my name).
The nominal case is similar. In this case AR = \sum_{i < j}
\delta(C_i,C_j)/(r{t \choose 2}), where \delta(x,y) is 1 if x =y, 0
othewise.

What the function meanvar does is estimate E(AR), Var(AR) using
simulation.

In the unlikely event you have actually waded through all this turgid
nonsense, I would also like to ask whether the function below can be
improved. It appears to work and agrees with the special cases I have
managed to work out. Case 1: t=2, Case 2: r=1. However, the sub-functions
delta and sumdiff use loops, and these loops are evaluated n times, so
this can greatly affect the speed. These sub-functions could probably be
better. Any other suggestions for general improvement would be greatly
appreciated.
                            Sincerely, Faheem Mitha.

***************************************************************************

#Functions for estimating mean and variance of AR using simulation:

meanvar <- function(r,t,type=c("interval","nominal"), n = 10000)
{
# possible score values is 0,1,...r. (r should be at least 1). 
# number of coders is t (Note this must be at least 2, else fn will not work). 
# n is number of trials. The higher n, the better the precision of mean and var
# Taking n less than 1 of course makes no sense. At least 10000 recommended.

type <- match.arg(type)
  
if(is.integer(r) || (r<1))   #check for sensible values of r
            stop(message = "r must be a positive integer")

if(is.integer(t) || (t<2))   #check for sensible values of t 
            stop(message = "t must be a positive integer and at least 2")

  if(missing(n))   #setting default for n. If n not given, n=10000 is used
    {              
      warning("value of n missing, using default n=1000")
    }

if(is.integer(n) || (n<1))   #check for sensible values of n 
            stop(message = "n must be a positive integer")  
  
  x <- 0:r                              #x is possible values of votes
  votes <- sample(x,n*t,replace=TRUE)   #creating random vector of votes
  dim(votes) <- c(t,n)         #making the votes vector into a matrix

# Need function to compute sum of all pairwise absolute differences of
# the elements of a vector. Note: I am scaling sum appropriately to
# give sample values of DR (Disagreement Rate) before returning
# result. This function is used for the interval scale case.

  sumdiff <- function(v)
    {llen <- length(v) - 1
     summer <- 0                           #initialising sum
     for(i in 1:llen)                      #using diff to make sum (using loop)
       {summer <- sum(abs(diff(v,i))) + summer}
     return(summer/(r*choose(length(v),2)))#scaling sum before returning result
    }

# Need function to compute sum of all pairwise deltas of the elements
# of a vector. Recall that delta(x,y) is defined as 1 if x=y and 0 otherwise 

 delta <- function(v)
    {llen <- length(v) - 1
     summer <- 0                           #initialising sum
     for(i in 1:llen)                      #using diff to make sum (using loop)
       {summer <- sum(ifelse(diff(v,i)==0,0,1)) + summer}
     return(summer/(r*choose(length(v),2)))#scaling sum before returning result
    }

  result <- switch(type,
interval = c(1 - mean(apply(votes,2,sumdiff)),var(apply(votes,2,sumdiff))),
nominal = c(mean(apply(votes,2,delta)),var(apply(votes,2,delta))) )

  result      # Returning values of mean and variance as vector
}


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._



More information about the R-help mailing list