[R] the function doesn´t work

Joshua Wiley jwiley.psych at gmail.com
Mon Sep 27 04:10:28 CEST 2010


Hi Jethi,

Please look at this code, and the graph that is created in the code.
This all leads me to suspect you are trying to use the wrong formula
at the end.  Since there was some confusion with it, I just removed
the stuff with apply().

Josh

##################################
#### I moved stuff around to assign all global variables at the beginning

## Assign global variables; this works
N = 10
n = 100
m = 2
k = n/m
l = matrix(0, nrow=m, ncol=N)
p = matrix(0, nrow=m, ncol=N)
H = matrix(0, nrow=N, ncol=1)
alpha = 0.05
q_1 <- qnorm(alpha, 0, 0.05)
q_2 <- qnorm(1 - alpha, 0, 0.05)

## Calculate correlations; this works
for(i in 1:N){
 x = rnorm(n,0,0.5)
 y = rnorm(n,0,0.8)
 for(j in 1:m){
   l[j,i] = cor(
      (x[(((j-1)*k)+1):(((j-1)*k)+k)]),
      (y[(((j-1)*k)+1):(((j-1)*k)+k)]))
 }
}

## Calculate probabilities; this works
for(i in 1:N){
  for (j in 1:m){
    p[j,i] = (l[j,i]^2) / sum(l[,i]^2)
  }
}

## Calculate this other thing; this works, but is not what you want
for(i in 1:N){
  H[i]=log(m)-sum(p[,i]*log(p[,i]))
}
1 - mean(q_1 <= H & H <= q_2)

## lets break this down then, and see what each part does
## first we are going to redefine H.  H = log(m) - zz
## where zz = sum(p[,i] * log(p[,i]))

## create the matrix zz
zz <- matrix(0, nrow=N, ncol=1)
for(i in 1:N) {
  zz[i] <- sum(p[,i] * log(p[,i]))
}

## You have a logical test (q_1 <= H <= q_2)
## that is true when H is between q_1 and q_2

## Lets plot our data to see what is going on

plot(x = seq_along(zz), y = zz, ylim = c(-1, 2), col = "black", pch = 16)
points(x = seq_along(zz), y = H, col = "blue", pch = 16)
abline(h = q_1, col = "red")
abline(h = q_2, col = "red")
abline(h = log(m), col = "green")

## The black dots are the values of zz (sum(p[,i] * log(p[,i])))
## the blue dots are your actual values of H
## the red lines are the region selected by q_1 and q_2 (based on alpha)
## for your logical test to evaluate TRUE, the blue dots MUST be
within the red lines
## otherwise it will be FALSE
## finally, the green line is log(m), you can see that log(m) is so far outside
## of the selection region (between red lines), that it pushes
## every sum(p[,i] * log(p[,i])) outside

## To change this, you need to lower log(m) (~ m = .8 would work)
## or you need to increase sum(p[,i] * log(p[,i])), which is tricky
## because log(1) = 0, and the log of anything in the interval (0, 1)
## is a negative number.  If p is a probability, it must be between [0, 1]
## which means you are guaranteed to be multiplying a positive number by
## a negative number, or 0.  This in turn means the sum of all of these
## will itself be negative, or 0.  So, since H = log(m) - zz, and we have
## established that zz must be 0 or negative, the minimum bound of H is log(m)
## and you can see from the graph (log(m) is the green line), that log(m)
## is far above your selection region

#############################


On Sun, Sep 26, 2010 at 5:13 PM, jethi <kartija at hotmail.com> wrote:
>
> hi josh, and really thnx again.
>
> i have now 2 problems . the first one ist if i take ur idea and programm
> like u:
>
> N = 10
> n = 100
> m = 2
> k = n/m
> l = matrix(0,nrow=m,ncol=N)
> p=matrix(0,nrow=m,ncol=N)
>
> alpha = 0.05
> q_1 <- qnorm(alpha, 0, 0.05)
> q_2 <- qnorm(1 -alpha, 0, 0.05)
> for(i in 1:N){
>  x=rnorm(n,0,0.5)
>  y=rnorm(n,0,0.8)
>  for(j in 1:m){
>    l[j,i] = cor(
>       (x[(((j-1)*k)+1):(((j-1)*k)+k)]),
>       (y[(((j-1)*k)+1):(((j-1)*k)+k)]))
>
>
>  }
> }
>
>
> for(i in 1:N){
> for (j in 1:m){
> p[j,i]=l[j,i]^2/sum(l[,i]^2)
> }
> }
>
> p2 <- apply(l, 2, function(z) {(z^2)/sum(z^2)})
> identical(p, p2)
>
> H=matrix(0,nrow=N,ncol=1)
> for(i in 1:N){
> H[i]=log(m)-sum(p[,i]*log(p[,i]))
> }
> H1<-apply(p2,2,function(t){(log(m)-sum(t*log(t)))})
> identical(H1, H)
>
>
> it gives me, that identical(H1, H) is False. is it because of H? H is a
> matrix and H1 is a list, isn´t it?
> ok if i leave H1 out and programm further to get my powerfunction, it gives
> me always the value 1. even i change my alpha or the number of n, m or N.:
>
>
> N = 10
> n = 100
> m = 2
> k = n/m
> l = matrix(0,nrow=m,ncol=N)
> p=matrix(0,nrow=m,ncol=N)
>
> alpha = 0.05
> q_1 <- qnorm(alpha, 0, 0.05)
> q_2 <- qnorm(1 -alpha, 0, 0.05)
> for(i in 1:N){
>  x=rnorm(n,0,0.5)
>  y=rnorm(n,0,0.8)
>  for(j in 1:m){
>    l[j,i] = cor(
>       (x[(((j-1)*k)+1):(((j-1)*k)+k)]),
>       (y[(((j-1)*k)+1):(((j-1)*k)+k)]))
>
>
>  }
> }
> for(i in 1:N){
> for (j in 1:m){
> p[j,i]=l[j,i]^2/sum(l[,i]^2)
> }
> }
>
> p2 <- apply(l, 2, function(z) {(z^2)/sum(z^2)})
> identical(p, p2)
>
> H=matrix(0,nrow=N,ncol=1)
> for(i in 1:N){
> H[i]=log(m)-sum(p[,i]*log(p[,i]))
> }
> 1-mean(q_1<=H & H <=q_2)
>
> regards
> jethi
>
>
> --
> View this message in context: http://r.789695.n4.nabble.com/the-function-doesn-t-work-tp2714105p2714815.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/



More information about the R-help mailing list