[R] Histogram to KDE

Tim Hesterberg timhesterberg at gmail.com
Fri Sep 7 06:04:45 CEST 2012


To bootstrap from a histogram, use
  sample(bins, size = sum(counts), replace = TRUE, prob = counts)
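
As a minimal sketch of the call above, assuming the data frame x (columns
bins and counts) from David Carlson's message quoted below. The names n,
boot and boot_counts are only illustrative. Passing size = sum(counts)
makes each resample as large as the original data set (sample() would
otherwise return length(bins) values), and rmultinom() gives resampled bin
counts directly, without expanding to individual observations:

```r
# Histogram from the quoted message: one bin per level of X.
x <- data.frame(
  bins   = c(3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5,
             11.5, 12.5, 13.5, 14.5, 15.5),
  counts = c(1, 1, 2, 3, 6, 18, 19, 23, 8, 10, 6, 2, 1)
)
n <- sum(x$counts)   # original sample size

set.seed(1)          # arbitrary seed, only for reproducibility

# One bootstrap resample of n observations drawn from the bin values,
# with probabilities proportional to the counts.
boot <- sample(x$bins, size = n, replace = TRUE, prob = x$counts)

# Equivalent in distribution: resample the vector of bin counts directly.
boot_counts <- drop(rmultinom(1, size = n, prob = x$counts))
```

Either form can be repeated (for example inside replicate()) to obtain
as many bootstrap resamples as needed.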

Note that a kernel density estimate is biased, so some bootstrap
confidence intervals have poor coverage properties.
Furthermore, if the kernel bandwidth is data-driven, then the estimate
is not a functional statistic (it typically depends on the sample size,
not only on the empirical distribution), so some bootstrap and jackknife
methods won't work right.
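
A rough sketch of what a bootstrap of the KDE from the histogram could
look like, again assuming the data frame x from the quoted message. The
helper kde_from_counts, the grid limits 0 to 19, the choice of bw.nrd0,
and B = 500 are all assumptions made for illustration. The bandwidth is
selected once and held fixed across resamples, which is one way to avoid
re-selecting it on every replicate. The resulting band describes the
variability of the smoothed estimate only. Because of the bias noted
above, it should not be read as having nominal coverage for the true
density.

```r
x <- data.frame(
  bins   = c(3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5,
             11.5, 12.5, 13.5, 14.5, 15.5),
  counts = c(1, 1, 2, 3, 6, 18, 19, 23, 8, 10, 6, 2, 1)
)
n <- sum(x$counts)

# Bandwidth chosen once from the original data, then held fixed.
# bw.nrd0 is used here for simplicity; another selector could be substituted.
bw0 <- bw.nrd0(rep(x$bins, x$counts))

# Weighted KDE computed from bin values and counts on a common grid,
# so the individual observations never need to be recreated.
kde_from_counts <- function(counts) {
  density(x$bins, weights = counts / sum(counts),
          bw = bw0, from = 0, to = 19, n = 256)$y
}

set.seed(1)                                   # arbitrary seed
B <- 500                                      # number of bootstrap resamples
boot_counts <- rmultinom(B, size = n, prob = x$counts)  # 13 x B matrix
boot_y <- apply(boot_counts, 2, kde_from_counts)        # 256 x B matrix

est  <- kde_from_counts(x$counts)             # estimate from the original counts
band <- apply(boot_y, 1, quantile, probs = c(0.025, 0.975))  # 2 x 256 matrix
```

The rows of band are pointwise 2.5% and 97.5% bootstrap percentiles on
the same grid as est, so they can be drawn over the estimate with lines().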

Tim Hesterberg
http://www.timhesterberg.net
New:  Mathematical Statistics with Resampling and R, Chihara & Hesterberg

>On Fri, Aug 31, 2012 at 12:15 PM, David L Carlson <dcarlson at tamu.edu> wrote:
>
>> Using a data.frame x with columns bins and counts:
>>
>> x <- structure(list(bins = c(3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5, 10.5,
>>     11.5, 12.5, 13.5, 14.5, 15.5), counts = c(1, 1, 2, 3, 6, 18,
>>     19, 23, 8, 10, 6, 2, 1)), .Names = c("bins", "counts"),
>>     row.names = 4:16, class = "data.frame")
>>
>> This will give you a plot of the kde estimate:
>>
>
>Thanks.
>
>>
>> xkde <- density(rep(x$bins, x$counts), bw="SJ")
>> plot(xkde)
>>
>> As for the standard error or the confidence interval, you would probably
>> need to use bootstrapping.
>>
>>
>>
> On a similar note, is there a way in R to resample or cross-validate
>directly from a histogram of a data set without recreating the
>original data set?
>
>
>>  > -----Original Message-----
>> >
>> > Hello,
>> > I wanted to know if there is a way to convert a histogram of a data set
>> > directly to a kernel density estimate in R.
>> >
>> > Specifically, I have a histogram [bins, counts] of samples {X1 ... XN}
>> > of a quantized variable X, with one bin for each level of X, and I'd
>> > like to get a KDE of the pdf of X directly from the histogram. The
>> > histogram therefore adds no further quantization of X. Most KDE methods
>> > in R seem to require the original sample set, and I would like to avoid
>> > re-creating the samples from the histogram. Is there a quick way of
>> > doing this with one of the standard KDE methods in R?
>> >
>> > Also, a general statistical question: is there some measure of the
>> > standard error, confidence interval, or similar for a KDE of a data set?
>> >
>> > Thanks,
>> > -fj
>> >
>>
>