[R] Strange t-test error: "grouping factor must have exactly 2 levels" while it does...

Marc Schwartz marc_schwartz at me.com
Fri Jul 10 02:11:53 CEST 2009


On Jul 9, 2009, at 5:04 PM, Tymek W wrote:

> Hi,
>
> Could anyone tell me what is wrong:
>
>> length(unique(mydata$myvariable))
> [1] 2
>>
>
> and in t-test:
>
> (...)
> Error in t.test.formula(othervariable ~ myvariable, mydata) :
>  grouping factor must have exactly 2 levels
>>
>
> I re-checked the code and still don't get what is wrong.
>
> Moreover, there is some strange behavior:
>
> /1 It seems that the error is sensitive to NA's, because it affects
> some variables in data set with NA's and doesn't affect same ones in
> dataset with NA's removed.
>
> /2 It seems it works differently with different ways of using
> variables in t.test:
>
> e.g. it happens here: t.test(x~y, dataset) and does not here:
> t.test(dataset[['x']]~dataset[['y']])
>
> Does anyone have any ideas?
>
> Greetz,
> Timo


Check the output of:

   na.omit(cbind(mydata$othervariable, mydata$myvariable))

which shows the data that is actually available to the t test, since
na.omit() drops every row that has a missing value in either column.
Your first check above, counting the unique values, is done before the
missing data is removed.

Most likely, once the missing values have been removed, you are left
with only one unique grouping value in mydata$myvariable.
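
As a rough sketch of that check: the data frame below is made up and
only stands in for your mydata (the values and the "a"/"b" group labels
are my assumptions, not your data). It counts the grouping values
before and after the incomplete rows are dropped. Note also that
unique() counts NA itself as a value, so the first check can overstate
the number of usable groups.

```r
# Hypothetical stand-in for mydata: every "b" row lacks othervariable
mydata <- data.frame(
  othervariable = c(1.2, 2.3, 3.1, NA, NA, NA),
  myvariable    = c("a", "a", "a", "b", "b", "b")
)

# The original check: counts grouping values BEFORE rows are dropped
n_before <- length(unique(mydata$myvariable))

# Rows that are complete in both variables used by the test
cc <- complete.cases(mydata[, c("othervariable", "myvariable")])

# Grouping values that survive once incomplete rows are removed
table(mydata$myvariable[cc], useNA = "ifany")
n_after <- length(unique(mydata$myvariable[cc]))

n_before
n_after
```

If n_after is smaller than 2, t.test() has only one group to work with
and stops with the error you are seeing.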

As for your point 2, both forms should behave the same way, since the
same formula method is used in both cases. For example:

DF <- data.frame(x = c(1:3, NA, NA, NA), y = rep(1:2, each = 3))

 > DF
    x y
1  1 1
2  2 1
3  3 1
4 NA 2
5 NA 2
6 NA 2

# Remove missing data
 > na.omit(DF)
   x y
1 1 1
2 2 1
3 3 1

 > t.test(x ~ y, data = DF)
Error in t.test.formula(x ~ y, data = DF) :
   grouping factor must have exactly 2 levels

 > t.test(DF$x ~ DF$y)
Error in t.test.formula(DF$x ~ DF$y) :
   grouping factor must have exactly 2 levels
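
To see why the two forms agree, here is a small sketch that builds the
model frame directly for each formula. As far as I know the formula
method does something equivalent internally, but treat that as my
reading and not as a description of the exact code; na.action is passed
explicitly here so the result does not depend on your options()
settings.

```r
DF <- data.frame(x = c(1:3, NA, NA, NA), y = rep(1:2, each = 3))

# Model frame for the formula-plus-data form
mf1 <- model.frame(x ~ y, data = DF, na.action = na.omit)

# Model frame for the form that refers to the columns directly
mf2 <- model.frame(DF$x ~ DF$y, na.action = na.omit)

# Both keep only the complete rows
nrow(mf1)
nrow(mf2)

# Number of grouping levels left in each case
nlevels(factor(mf1[[2]]))
nlevels(factor(mf2[[2]]))
```

In both cases only the rows with y == 1 remain, so the grouping factor
has a single level either way.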


If you have a small reproducible example where the two function calls  
behave differently, please post back with it.

HTH,

Marc Schwartz



