[R] histogram frequency weighing

jim holtman jholtman at gmail.com
Mon Sep 18 00:05:15 CEST 2006


I think this should do it:

> lenh <- hist(iris$Sepal.Length, br=seq(4, 8, 0.05))$counts
> lenh  # original data
 [1]  0  0  0  0  0  1  0  3  0  1  0  4  0  2  0  5  0  6  0 10  0  9  0  4  0
[26]  1  0  6  0  7  0  6  0  8  0  7  0  3  0  6  0  6  0  4  0  9  0  7  0  5
[51]  0  2  0  8  0  3  0  4  0  1  0  1  0  3  0  1  0  1  0  0  0  1  0  4  0
[76]  0  0  1  0  0
> l.rle <- rle(lenh)
> # determine where '0's are
> Zero <- which(l.rle$values == 0)
> # if the last run is zeros, drop it: no element follows it to adjust
> if (tail(l.rle$values,1) == 0) Zero <- Zero[-length(Zero)]
> l.offsets <- cumsum(l.rle$lengths)  # end position of each run in the original vector
> # modify original input: divide the element right after each zero run
> lenh[l.offsets[Zero] + 1] <- lenh[l.offsets[Zero] + 1] / (l.rle$lengths[Zero] + 1)
> lenh  # modified data
 [1] 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.1666667 0.0000000 1.5000000 0.0000000 0.5000000
[11] 0.0000000 2.0000000 0.0000000 1.0000000 0.0000000 2.5000000 0.0000000 3.0000000 0.0000000 5.0000000
[21] 0.0000000 4.5000000 0.0000000 2.0000000 0.0000000 0.5000000 0.0000000 3.0000000 0.0000000 3.5000000
[31] 0.0000000 3.0000000 0.0000000 4.0000000 0.0000000 3.5000000 0.0000000 1.5000000 0.0000000 3.0000000
[41] 0.0000000 3.0000000 0.0000000 2.0000000 0.0000000 4.5000000 0.0000000 3.5000000 0.0000000 2.5000000
[51] 0.0000000 1.0000000 0.0000000 4.0000000 0.0000000 1.5000000 0.0000000 2.0000000 0.0000000 0.5000000
[61] 0.0000000 0.5000000 0.0000000 1.5000000 0.0000000 0.5000000 0.0000000 0.5000000 0.0000000 0.0000000
[71] 0.0000000 0.2500000 0.0000000 2.0000000 0.0000000 0.0000000 0.0000000 0.2500000 0.0000000 0.0000000
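
For reuse, the same rle() idea can be wrapped in a function. This is only a sketch: the name adjust_after_zeros and the sample vector are made up for illustration and do not come from the thread. It targets the position right after each run of zeros, so equal nonzero counts that sit next to each other (the two 3s below) are handled as well.

```r
# Hypothetical helper (name and signature are illustrative, not from the thread).
# Divides each element that immediately follows a run of zeros by
# (length of that run + 1); all other elements are returned unchanged.
adjust_after_zeros <- function(x) {
  r <- rle(x)
  zero_runs <- which(r$values == 0)
  # A trailing run of zeros has no following element to adjust.
  zero_runs <- zero_runs[zero_runs < length(r$values)]
  run_ends <- cumsum(r$lengths)        # last position of each run in x
  target <- run_ends[zero_runs] + 1    # first position after each zero run
  x[target] <- x[target] / (r$lengths[zero_runs] + 1)
  x
}

# Made-up sample: five zeros, then 1, one zero, then 3 twice, then trailing zeros.
counts <- c(0, 0, 0, 0, 0, 1, 0, 3, 3, 0, 0)
adjusted <- adjust_after_zeros(counts)
print(adjusted)
```

With this sample, element 6 becomes 1/6 and element 8 becomes 3/2, while the second 3 (element 9) and the trailing zeros stay as they are.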

On 9/17/06, Sebastian P. Luque <spluque at gmail.com> wrote:
> Fellow R-helpers,
>
> Suppose we create a histogram as follows (although it could be any vector
> with zeroes in it):
>
>
> R> lenh <- hist(iris$Sepal.Length, br=seq(4, 8, 0.05))
> R> lenh$counts
>  [1]  0  0  0  0  0  1  0  3  0  1  0  4  0  2  0  5  0  6  0 10  0  9  0  4  0
> [26]  1  0  6  0  7  0  6  0  8  0  7  0  3  0  6  0  6  0  4  0  9  0  7  0  5
> [51]  0  2  0  8  0  3  0  4  0  1  0  1  0  3  0  1  0  1  0  0  0  1  0  4  0
> [76]  0  0  1  0  0
>
>
> and we wanted to apply a weighting scheme in which only the frequencies
> immediately following empty class intervals (0) are adjusted, by dividing
> each one by the number of consecutive empty intervals preceding it + 1.
> For example, the first frequency that would need to be adjusted in
> 'lenh$counts' is element 6 (1), which has 5 preceding empty intervals, so
> its adjusted count would be 1/6.  Similarly, the second one would be
> element 8 (3), which has 1 preceding empty interval, so its adjusted
> count would be 3/2.  Can somebody please provide a hint to implement such
> a weighting scheme?
>
> I thought about some very contrived ways to accomplish this, involving
> 'which' and 'diff', but I sense a function might already be available to
> do this efficiently.  I couldn't find relevant info in the usual channels.
> Thanks in advance for any pointers.
>
>
> Cheers,
>
> --
> Seb


-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


