[R] estimation problem

David Winsemius dwinsemius at comcast.net
Sat May 5 16:53:57 CEST 2012


On May 4, 2012, at 4:22 PM, Petr Savicky wrote:

> On Fri, May 04, 2012 at 07:43:32PM +0200, Kehl Dániel wrote:
>> Dear Petr,
>>
>> thank you for your input.
>> I tried to experiment with (probably somewhat biased) truncated means
>> like in the following code.
>> How I got the 225 as a truncation limit is a good question. :)
>>
>> REPS1 <- REPS2 <- 1000
>> N1 <- 100000
>> N2 <- 30000
>> N <- N1+N2
>> x1 <- rep(0,N1)
>> x2 <- rnorm(N2,300,100)
>> x <- c(x1,x2)
>>
>> n <- 1000
>>
>> for (i in 1:REPS1){
>>  x_sample <- sort(sample(x,n,replace=FALSE),TRUE)
>>  x_trunc <- x_sample[1:225]
>>  REPS1[i] <- mean(x_sample)*N
>>  REPS2[i] <- sum(x_trunc)/n*N
>>  }
>>
>> sum(x2)
>> mean(REPS1)
>> mean(REPS2)
>> sd(REPS1)
>> sd(REPS2)
>> sd(REPS2)/sd(REPS1)
>
> Dear Daniel.
>
> Thank you for your reply.
>
> In the original question, you used the parameters
>
>  N1 <- 100000
>  N2 <- 3000
>
> and now the parameters
>
>  N1 <- 100000
>  N2 <- 30000
>
> My remark was that with the original parameters, there are only 29.1
> nonzero elements on average. Now, there are 230.8 nonzero elements on
> average, which is significantly better.
>
> Discussion of the use of the truncated mean is probably a question for
> other members of the list. I do not consider myself an expert on this.
>
> Best, Petr.
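
For anyone checking the two averages Petr quotes: they match the mean of a
hypergeometric distribution, n * N2 / (N1 + N2), assuming a simple random
sample of size n drawn without replacement, as in the quoted code. A minimal
sketch follows; the helper name expected_nonzero is my own and is not from
the thread.

```r
# Values taken from the quoted code: sample size and number of zeros.
n  <- 1000
N1 <- 100000

# Hypothetical helper (not from the thread): expected number of nonzero
# values in a sample of size n drawn without replacement from a population
# of N1 zeros and N2 nonzero values (hypergeometric mean).
expected_nonzero <- function(n, N1, N2) n * N2 / (N1 + N2)

orig_count    <- expected_nonzero(n, N1, 3000)   # original N2
revised_count <- expected_nonzero(n, N1, 30000)  # revised N2

# Rounded to one decimal place for comparison with the figures in the thread.
print(round(c(original = orig_count, revised = revised_count), 1))
```

This treats every element drawn from x2 as nonzero, which holds with
probability one for draws from rnorm().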

My experience is that Petr is better than I am at much of R, but so far
in this thread I have not seen any mention of methods that are designed
for data with large numbers of zeros. There is a very informative review
of R techniques and packages for such data by Achim Zeileis and others.
The same material was published in the Journal of Statistical Software
and as a vignette in one of the contributed packages:

www.jstatsoft.org/v27/i08/paper
cran.r-project.org/web/packages/pscl/vignettes/countreg.pdf

I don't have this information memorized, but I generally find a Google
search for "count r zeileis" to be highly effective. I've just noticed
that the second author, Kleiber, has also put up useful material on that
topic for web searchers to use.

David Winsemius, MD
West Hartford, CT


