# [R] automatic chi-square grouping in R

Jan_Svatos@eurotel.cz Jan_Svatos at eurotel.cz
Mon Oct 14 13:28:08 CEST 2002

```
Hi,
there are possibilities to group, regroup, and transform data in R easily.
For example,

then
survdiff(formula = Surv(days, status) ~ adl2, data = nu)

To do this automagically, I would use something along these simple lines:

#then order it (descending)
t2<-t1[rev(order(t1))]
t3<-t2[t2>=5] #or another threshold
remainder<-sum(t2[t2<5])
names(remainder)<-"Blahblah"
# c() appends the remainder as one more named count; rbind() would instead
# recycle it into a second matrix row
mynewadlgrouping<-c(t3,remainder)

But joining groups in order to build bigger groups is sometimes
statistically doubtful.
The same goes for age:

age[age>30 & age<=35]<-32.5
age[age>35 & age<=40]<-37.5
etc.

Or you can build a factor, for example

age2<-as.factor(floor(age/5))

JS

- - - Original message: - -
From: owner-r-help at stat.math.ethz.ch
Sent: 11.10.2002 12:03:08
To: <r-help at stat.math.ethz.ch>
Subject: [R] automatic chi-square grouping in R

I'm doing some chi-square tests, and I recall some arbitrary rule that says
each band must have at least 5 events in order for the test to be
meaningful. Is there some way to do the banding automagically in R? For
instance, in the following survdiff, I'm trying to see if ADL affects
survival. But when
and 6, the number observed is too small. Any way for me to tell R how to
group them? Like "R, combine ADL=5 and ADL=6, and redo the test"?

-----------

Call:
survdiff(formula = Surv(days, status) ~ adl, data = nu)

       N Observed Expected (O-E)^2/E (O-E)^2/V
adl=0 92        6     8.74   0.86134   1.17556
adl=1 38        5     3.41   0.74346   0.83435
adl=2 60        9    10.56   0.22975   0.39159
adl=3 44        4     5.22   0.28487   0.33978
adl=4 27        6     2.32   5.83153   6.30818
adl=5 31        3     3.12   0.00456   0.00506
adl=6 16        2     1.63   0.08385   0.08835

Chisq= 8.2  on 6 degrees of freedom, p= 0.226

-------

On a related note, is it possible to tell R to group together values? For
instance, if I have age in my data ranging from 30-60, is it possible to
tell R to convert all ages 30-35 into 32.5, all ages from 36-40 into 37.5,
etc.? I mean I can always do this in Excel before I feed the data into R,
but it seems R must be able to do something like this. I just don't know
where to begin looking in the manual for something like this ...

Thanks so much guys,

Roger

```
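
The two ideas in the reply (pooling sparse groups and banding age) can be sketched in base R roughly as follows. This is a minimal sketch, not the poster's code: the helper `collapse_sparse`, its arguments `min_count` and `other`, and the small `adl` and `age` vectors are all made up for illustration, and the threshold is applied to raw group counts, which may or may not be the right quantity for a given test.

```r
# Hypothetical helper: pool every level with fewer than `min_count`
# observations into a single catch-all level named `other`.
collapse_sparse <- function(x, min_count = 5, other = "other") {
  x <- as.factor(x)
  counts <- table(x)
  sparse <- names(counts)[counts < min_count]
  # Giving several levels the same label merges them into one level.
  levels(x)[levels(x) %in% sparse] <- other
  x
}

# Made-up ADL scores: groups 3, 5 and 6 each have fewer than 5 observations.
adl <- rep(c(0, 1, 2, 3, 5, 6), times = c(6, 5, 6, 4, 3, 2))
adl2 <- collapse_sparse(adl)
table(adl2)  # counts for "0", "1", "2" and the pooled "other" level

# Age banding with cut(): 5-year bands, each replaced by its midpoint.
# include.lowest = TRUE puts age 30 itself into the first band.
age <- c(31, 35, 36, 40, 44, 58)   # made-up ages
breaks <- seq(30, 60, by = 5)
age_band <- cut(age, breaks = breaks, include.lowest = TRUE)
midpoints <- (breaks[-1] + breaks[-length(breaks)]) / 2
age_mid <- midpoints[as.integer(age_band)]
```

The pooled factor could then be used on the right-hand side of a model formula in place of the original grouping variable. As the reply notes, whether pooling groups is statistically sensible has to be judged case by case.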