[R] Re-binning histogram data

Justin Ashmall ja at space.mit.edu
Thu Jun 8 19:19:58 CEST 2006


> histograms [...] should be abandoned in favor of [...] density plots.

I take your point Bert, but I think there is value in data that is simple 
and can be intuitively understood.

For my application, 10 minutes is a good characteristic chunk of time, and 
I have an intuitive feeling for how many events I would expect to see in a 
10-minute period. By looking at a histogram with 10-minute bins I can 
tell immediately if something looks amiss. I could not do this as easily 
with a pdf. Similarly, histograms have the nice feature of 
compartmentalising bad data. Perhaps this is practical use vs 
mathematical idealism?
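
To make the 10-minute view concrete, here is a minimal sketch of re-binning 
per-second counts in base R. The input vector counts_1s is simulated, and 
names such as t_start, bin_id and the irregular break points are assumptions 
for illustration only, not my real data.

```r
# Hypothetical input: one hour of event counts in one-second bins.
# (Simulated here; the name counts_1s and the Poisson rate are illustrative.)
set.seed(1)
counts_1s <- rpois(3600, lambda = 0.05)
t_start   <- seq_along(counts_1s) - 1     # bin start times: 0, 1, ..., 3599 s

# Regular re-binning: map each one-second bin to its 10-minute (600 s) bin
# and sum the counts within each group.
bin_id     <- t_start %/% 600
counts_10m <- tapply(counts_1s, bin_id, sum)

# Irregular bins: cut() assigns each start time to arbitrary break points
# (here [0,300), [300,1200), [1200,3600) seconds).
breaks    <- c(0, 300, 1200, 3600)
irregular <- tapply(counts_1s, cut(t_start, breaks, right = FALSE), sum)

# Plot the coarse histogram directly from the summed counts.
barplot(counts_10m,
        names.arg = paste0(0:5 * 10, "-", 1:6 * 10, " min"),
        ylab = "events per 10 minutes")
```

Summing the counts this way avoids expanding the data back into one entry 
per event, which is what hist() would otherwise need.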

Also it's a case of simple tools for simple jobs. If the handle is loose 
on my kitchen cabinet, I tighten the screw with a screwdriver or even the 
tip of a knife from the drawer. I don't need my power drill with its 
torque-controlled screwdriver attachment, much as I love it.

Justin


On Thu, 8 Jun 2006, Berton Gunter wrote:

> I would argue that histograms are outdated relics and that density plots
> (whatever your favorite flavor is) should **always** be used instead these
> days.
>
> In this vein, I would appreciate critical rejoinders (public or private) to
> the following proposition: Given modern computer power and software like R
> on multi-GHz machines, statistical and graphical relics of the pre-computer
> era (like histograms, low resolution printer-type plots, and perhaps even
> method of moments EMS calculations) should be abandoned in favor of superior
> but perhaps computation-intensive alternatives (like density plots, high
> resolution plots, and likelihood or resampling or Bayes based methods).
>
> NB: Please -- no pleadings that new methods would be mystifying to the
> non-cognoscenti. Following that to its logical conclusion would mean that
> we'd all have to give up our TV remotes and cell phones, and what kind of
> world would that be?! :-)
>
> -- Bert Gunter
>
>
>
>> -----Original Message-----
>> From: r-help-bounces at stat.math.ethz.ch
>> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Petr Pikal
>> Sent: Thursday, June 08, 2006 6:17 AM
>> To: Justin Ashmall; r-help at stat.math.ethz.ch
>> Subject: Re: [R] Re-binning histogram data
>>
>>
>>
>> On 8 Jun 2006 at 11:35, Justin Ashmall wrote:
>>
>> Date sent:      	Thu, 8 Jun 2006 11:35:46 +0100 (BST)
>> From:           	Justin Ashmall <ja at space.mit.edu>
>> To:             	Petr Pikal <petr.pikal at precheza.cz>
>> Copies to:      	r-help at stat.math.ethz.ch
>> Subject:        	Re: [R] Re-binning histogram data
>>
>>>
>>> Thanks for the reply Petr,
>>>
>>> It looks to me that truehist() needs a vector of data just like
>>> hist()? Whereas I have histogram-style input data? Am I missing
>>> something?
>>
>> Well, maybe you could use barplot. Or as you suggested recreate the
>> original vector and call hist or truehist with other bins.
>>
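>> > hhh <- hist(rnorm(1000))
>> > # Merge adjacent pairs of bins. This grouping vector assumes hist()
>> > # returned 15 bins, which depends on the random sample.
>> > grp <- c(rep(1:7, each = 2), 7)
>> > barplot(tapply(hhh$counts, grp, sum))
>> > tapply(hhh$mids, grp, mean)
>>     1     2     3     4     5     6     7
>> -3.00 -2.00 -1.00  0.00  1.00  2.00  3.25
>> > # Recreate an approximate data vector from bin midpoints and counts
>> > hhh1 <- rep(hhh$mids, hhh$counts)
>> > plot(hhh, freq = FALSE)
>> > lines(density(hhh1))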
>>>
>>
>> HTH
>> Petr
>>
>>
>>
>>
>>
>>
>>>
>>> Cheers,
>>>
>>> Justin
>>>
>>>
>>>
>>> On Thu, 8 Jun 2006, Petr Pikal wrote:
>>>
>>>> Hi
>>>>
>>>> try truehist from MASS package and look for argument breaks or h.
>>>>
>>>> HTH
>>>> Petr
>>>>
>>>>
>>>>
>>>>
>>>> On 8 Jun 2006 at 10:46, Justin Ashmall wrote:
>>>>
>>>> Date sent:      	Thu, 8 Jun 2006 10:46:19 +0100 (BST)
>>>> From:           	Justin Ashmall <ja at space.mit.edu>
>>>> To:             	r-help at stat.math.ethz.ch
>>>> Subject:        	[R] Re-binning histogram data
>>>>
>>>>> Hi,
>>>>>
>>>>> Short Version:
>>>>> Is there a function to re-bin a histogram to new, broader bins?
>>>>>
>>>>> Long version: I'm trying to create a histogram, however my
>>>>> input-data is itself in the form of a fine-grained histogram, i.e.
>>>>> numbers of counts in regular one-second bins. I want to produce a
>>>>> histogram of, say, 10-minute bins (though possibly irregular bins
>>>>> also).
>>>>>
>>>>> I suppose I could re-create a data set as expected by the hist()
>>>>> function (i.e. if time t=3600 has 6 counts, add six entries of 3600
>>>>> to a list) however this seems neither elegant nor efficient (though
>>>>> I'd be pleased to be mistaken!). I could then re-create a histogram
>>>>> as normal.
>>>>>
>>>>> I'm guessing there's a better solution however! Apologies if this is
>>>>> a basic question - I'm rather new to R and trying to get up to
>>>>> speed.
>>>>>
>>>>> Regards,
>>>>>
>>>>> Justin
>>>>>
>>>>> ______________________________________________
>>>>> R-help at stat.math.ethz.ch mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide!
>>>>> http://www.R-project.org/posting-guide.html
>>>>
>>>> Petr Pikal
>>>> petr.pikal at precheza.cz
>>>>
>>>>
>>>
>>
>> Petr Pikal
>> petr.pikal at precheza.cz
>>
>>
>
>


