[R] Possible to "import" histograms in R?

Vladimir Eremeev wl2776 at gmail.com
Wed Aug 15 11:50:51 CEST 2007


Hello Nick,

Wednesday, August 15, 2007, 1:18:34 PM, you wrote:

NC> On 15/08/07, Vladimir Eremeev <wl2776 at gmail.com> wrote:
NC> Nick Chorley-3 wrote:
>>
>> I have a large amount of data that I would like to create a histogram of
>> and
>> plot and do things with in R. It is pretty much impossible to read the
>> data
>> into R, so I have written a program to bin the data and now have a list of
>> counts in each bin. Is it possible to somehow import this into R and use
>> hist(), so I can, for instance, plot the probability density? I have
>> looked
>> at the help page for hist(), but couldn't find anything related to this
>> there.
>>

NC> Hi! And why do you think it's impossible to import the data into R?
NC> It can handle rather large data volumes, especially in Linux. Just study
NC> help("Memory-limits").
NC> My data file is 4.8 GB!

NC> You can plot something looking like a histogram using barplot() or plot(...
NC> type="h").

NC> The problem with those is that I can't plot the probability density.

NC> You can create the "histogram" class object manually.

NC> For example,
NC> [ import bin counts... probably, it is a table of 2 columns, defining bin
NC> borders and counts.
NC>   let's  store it in ncounts. ]

NC> Yes, that's what I have. 

>> hst<-hist(rnorm(nrow(ncounts)),plot=FALSE)
>> str(hst)  # actually I called hist(rnorm(100))
>> List of 7
>>  $ breaks     : num [1:10] -2.5 -2 -1.5 -1 -0.5 0 0.5 1 1.5 2
>>  $ counts     : int [1:9] 3 6 12 9 24 19 14 9 4
>>  $ intensities: num [1:9] 0.06 0.12 0.24 0.18 0.48 ...
>>  $ density    : num [1:9] 0.06 0.12 0.24 0.18 0.48 ...
>>  $ mids       : num [1:9] -2.25 -1.75 -1.25 -0.75 -0.25 0.25 0.75 1.25 1.75
>>  $ xname      : chr "rnorm(100)"
>>  $ equidist   : logi TRUE
>>  - attr(*, "class")= chr "histogram"
>> hst$breaks <-  [ your breaks ]
>> hst$counts <-  [ your counts ]
>> hst$intensities <-

NC> My data isn't normally distributed, so I tried rexp() rather
NC> than rnorm(), but it's not looking like it should

The choice of random generator doesn't matter: it is used only to
produce a numeric vector to pass to hist(). The call to hist() just
creates a dummy structure, which you must then fill with your own data.

You replace the components of the returned result with yours.
Calling hist(1:100) works just as well, as does any other numeric
vector.
If the result doesn't look as it should, you have probably altered
the list returned by hist() incorrectly or incompletely.
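
To make "incompletely altered" concrete, here is a rough sketch of a
consistency check. The helper name check_histogram() is hypothetical
(it is not part of R), and the numbers are made up for illustration. It
only tests the relationships visible in the str() output above: breaks
has one more element than counts, mids and density have as many
elements as counts, and the density integrates to 1 over the bins.

```r
# Hypothetical helper (not part of R): returns a character vector
# describing inconsistencies, or character(0) if none were found.
check_histogram <- function(h) {
  problems <- character(0)
  n <- length(h$counts)
  if (length(h$breaks) != n + 1)
    problems <- c(problems, "breaks must have one more element than counts")
  if (length(h$mids) != n)
    problems <- c(problems, "mids must have the same length as counts")
  if (length(h$density) != n) {
    problems <- c(problems, "density must have the same length as counts")
  } else if (length(h$breaks) == n + 1 &&
             !isTRUE(all.equal(sum(h$density * diff(h$breaks)), 1))) {
    problems <- c(problems, "density does not integrate to 1 over the bins")
  }
  problems
}

# Start from a dummy and replace only breaks and counts (an incomplete edit).
hst <- hist(1:100, plot = FALSE)
hst$breaks <- c(0, 1, 2, 3)     # made-up bin borders
hst$counts <- c(5L, 3L, 2L)     # made-up counts
check_histogram(hst)            # reports the stale mids and density
```

If the check reports anything, the leftover components from the dummy
call are the likely reason the plot looks wrong.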

Actually, you can create this structure from scratch:

hst<-list(breaks= [your breaks], counts= [your counts],
          intensities = [your intensities], density=[your density],
          mids= [your mids], xname= "hist(of  your data)",
          equidist=TRUE [or FALSE] )
attr(hst,"class")<-"histogram"
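
As a minimal sketch of the template above, assuming your bin borders
and counts are already in two vectors (the values here are invented),
the remaining components can be derived from them. The density is each
count divided by the total count times the bin width, which is what
you need for a probability density plot. I have left out intensities,
which in the str() output above just duplicates density; add
intensities = density to the list if your R version expects it.

```r
# Made-up pre-binned data: 5 bins with borders 0..5 and their counts.
breaks <- c(0, 1, 2, 3, 4, 5)
counts <- c(40L, 25L, 15L, 12L, 8L)

widths  <- diff(breaks)                        # width of each bin
density <- counts / (sum(counts) * widths)     # integrates to 1 over the bins
mids    <- breaks[-length(breaks)] + widths / 2

hst <- structure(
  list(breaks   = breaks,
       counts   = counts,
       density  = density,
       mids     = mids,
       xname    = "binned data",
       # TRUE only if all bins have (numerically) the same width
       equidist = isTRUE(all.equal(widths, rep(widths[1], length(widths))))),
  class = "histogram")

plot(hst, freq = FALSE)   # freq = FALSE plots density instead of counts
```

With real data, breaks and counts would come from your two-column
table instead of being typed in.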
     
>> Studying the hist.default() source will help you understand how every
>> list element is created.

Type hist.default (without parentheses) at the R prompt, and it will
display the source of this function.
You can also use dump("hist.default",file="hist_default.R") to save it
to a text file (dump() expects the object's name as a character string).

-- 
Best regards,
 Vladimir                            mailto:wl2776 at gmail.com

