[R] Frequency table

Peter Dalgaard p.dalgaard at biostat.ku.dk
Wed Mar 17 16:27:35 CET 2004


Kai Hendry <hendry at cs.helsinki.fi> writes:

> This must be a FAQ, but I can't find it in the archives or with a site search.
> 
> I am trying to construct a frequency table. I guess this should be done with
> table. Or perhaps factor and split. Or prop.table. cut? findInterval? Argh!
> 
> Please correct me if what I am looking for is not called a "frequency table".
> Perhaps it's called grouped data.
> 
> > zz$x9
>  [1] 65 70 85 65 65 65 62 55 82 59 55 66 74 55 65 56 80 73 45 64 75 58 60 56 60
> [26] 65 53 63 72 80 90 95 55 70 79 62 57 65 60 47 61 53 80 75 72 87 52 72 80 85
> [51] 75 70 84 60 72 70 76 70 79 72 69 80 62 74 54 58 58 69 81 84
> 
> I think I want it to look like:
> 
> 40-49   2
> 50-59   15
> 60-69   20
> 70-79   19
> 80-89   12
> 90-99   2
> 
> Or the other way around with transpose.
> 
> classes = c("40-49", "50-59", "60-69", "70-79", "80-89", "90-99")
> For the rownames
> 
> sum(zz$x9 > 40 & zz$x9 < 50)
> For getting frequency counts is very laborious...
> 
> I got this far:
> > table(cut(zz$x9, brk))
> 
>  (40,50]  (50,60]  (60,70]  (70,80]  (80,90] (90,100]
>        2       19       21       19        8        1
> > brk
> [1]  40  50  60  70  80  90 100
> > 
> > t(table(cut(zz$x9, brk)))
>      (40,50] (50,60] (60,70] (70,80] (80,90] (90,100]
> [1,]  2      19      21      19       8       1
> 
> Still feels a million miles off.
> 
> Now I could do with a little help please after spending a couple of hours
> working this out.

Hmm, interesting complication of the convention that tables are 1D
arrays there...
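
To see what I mean by that, here is a small sketch on made-up values
(the vector x below is a toy example, not your data). A table built
from a single factor has just one dimension, and t() treats it as a
column, so one transpose yields a single row and a second one yields a
single column:

```r
# Toy data for illustration only
x <- c(41, 47, 52, 58, 63)
tab <- table(cut(x, breaks = c(40, 50, 60, 70), right = FALSE))

dim(tab)        # one dimension of length 3
dim(t(tab))     # 1 row, 3 columns
dim(t(t(tab)))  # 3 rows, 1 column
```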

You got this far:


classes <- c("40-49", "50-59", "60-69", "70-79", "80-89", "90-99")
brk <- seq(40,100,10)

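However, cut() closes its intervals on the right by default, so a value
of 50 lands in (40,50] rather than in your 50-59 class, and the default
labels are ugly. Use right=FALSE to get [40,50), [50,60), ... and supply
your own labels:

table(cut(zz$x9,breaks=brk,right=FALSE,labels=classes))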

This at least gives you the right counts and labels:

40-49 50-59 60-69 70-79 80-89 90-99
    2    15    20    19    12     2
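
For anyone who wants to run this without your data frame, here is a
self-contained sketch. The values are typed in from your printout of
zz$x9, and the name x9 for the plain vector is just my choice here:

```r
# Values copied from the zz$x9 printout quoted above
x9 <- c(65, 70, 85, 65, 65, 65, 62, 55, 82, 59, 55, 66, 74, 55, 65, 56,
        80, 73, 45, 64, 75, 58, 60, 56, 60, 65, 53, 63, 72, 80, 90, 95,
        55, 70, 79, 62, 57, 65, 60, 47, 61, 53, 80, 75, 72, 87, 52, 72,
        80, 85, 75, 70, 84, 60, 72, 70, 76, 70, 79, 72, 69, 80, 62, 74,
        54, 58, 58, 69, 81, 84)

brk <- seq(40, 100, 10)
classes <- c("40-49", "50-59", "60-69", "70-79", "80-89", "90-99")

# Default (right = TRUE): intervals (40,50], (50,60], ...
tab_right <- table(cut(x9, breaks = brk))

# right = FALSE: intervals [40,50), [50,60), ... matching the class labels
tab_left <- table(cut(x9, breaks = brk, right = FALSE, labels = classes))

tab_right
tab_left
```

Comparing the two tables shows where the boundary values (60, 70, 80,
90) end up under each convention.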

For a column display, you need to convert to a matrix somehow.
Transposing twice will actually do it, but I think I prefer

matrix(table(cut(zz$x9,breaks=brk,right=FALSE)),dimnames=list(age=classes,""))

which gives this:

age
  40-49  2
  50-59 15
  60-69 20
  70-79 19
  80-89 12
  90-99  2
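
If you want to compare the options for a column layout, the following
is a rough sketch. To keep it short it rebuilds the table from the
counts shown above rather than from the raw data, and as.data.frame()
is an alternative I am adding here, not something you need:

```r
classes <- c("40-49", "50-59", "60-69", "70-79", "80-89", "90-99")
counts  <- c(2, 15, 20, 19, 12, 2)

# A 1D table equivalent to the result of table(cut(...)) above
tab <- as.table(setNames(counts, classes))

# Option 1: transpose twice to get a one-column matrix
col1 <- t(t(tab))

# Option 2: build the one-column matrix directly
col2 <- matrix(tab, dimnames = list(age = classes, ""))

# Option 3: a data frame with one row per class (columns Var1 and Freq)
df <- as.data.frame(tab)

col1
col2
df
```

The data frame is probably the most convenient form if you intend to
compute further on the counts; the matrix is only about the display.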



-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907



