[R] Selecting groups with R

David Winsemius dwinsemius at comcast.net
Sat Aug 22 00:33:48 CEST 2009


On Aug 21, 2009, at 6:16 PM, Don McKenzie wrote:

> dataset[dataset$Color != "BLUE",]

That will return a data.frame in which Color is still a factor with three levels; the now-empty BLUE level is kept.
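
A minimal sketch of that behaviour and one way around it, assuming a toy data frame shaped like the one in the question (the values below are just the nine rows quoted there, not the poster's real data). Re-applying factor() to the subsetted column rebuilds it with only the levels that actually occur:

```r
# Toy data mirroring the example in the question (an assumption for illustration)
dataset <- data.frame(
  Color = factor(rep(c("RED", "WHITE", "BLUE"), each = 3),
                 levels = c("RED", "WHITE", "BLUE")),
  Score = c(10, 13, 12, 22, 27, 25, 18, 17, 16)
)

# Row subsetting removes the BLUE rows but not the BLUE level
myComp1 <- dataset[dataset$Color != "BLUE", ]
levels(myComp1$Color)    # BLUE is still listed as a level

# Re-applying factor() keeps only the levels that are present
myComp1$Color <- factor(myComp1$Color)
levels(myComp1$Color)
table(myComp1$Color)
```

More recent versions of R also provide droplevels(), which should do the same for every factor in a data frame at once.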

>
> On 21-Aug-09, at 3:08 PM, jlwoodard wrote:
>
>>
>> I have a data set similar to the following:
>>
>> Color  Score
>> RED      10
>> RED      13
>> RED      12
>> WHITE   22
>> WHITE   27
>> WHITE   25
>> BLUE     18
>> BLUE     17
>> BLUE     16
>>
>> and I am trying to select just the values of Color that are equal to
>> RED or WHITE, excluding the BLUE.
>>
>> I've tried the following:
>> myComp1<-subset(dataset, Color =="RED" | Color == "WHITE")
>> myComp1<-subset(dataset, Color != "BLUE")
>> myComp1<-dataset[which(dataset$Color != "BLUE"),]
>>
>> Each of the above lines successfully excludes the BLUE subjects, but
>> the "BLUE" category is still present in my data set; that is, if I try
>> table(Color) I get
>>
>> RED  WHITE  BLUE
>> 82     151      0
>>
>> If I try to do a t-test (since I've presumably gone from three groups
>> to two groups), I get:
>> Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) stop("data are essentially constant") :
>>  missing value where TRUE/FALSE needed
>> In addition: Warning message:
>> In mean.default(y) : argument is not numeric or logical: returning NA
>>
>> and describe.by(score,Color) gives me descriptives for RED and WHITE,
>> and BLUE also shows up as NULL.
>>
>> How can I eliminate the BLUE category completely so I can do a t-test
>> using Color (with just the RED and WHITE subjects)?
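
For the t-test itself, a hedged sketch using the formula interface of t.test() on the same kind of toy data (again an assumption, not the poster's real data set). The warning from mean.default(y) in the quoted output suggests that the factor was passed where a numeric second sample was expected; that is only my reading of the error, since the actual call was not shown.

```r
# Toy data shaped like the example in the question (an assumption for illustration)
dataset <- data.frame(
  Color = factor(rep(c("RED", "WHITE", "BLUE"), each = 3)),
  Score = c(10, 13, 12, 22, 27, 25, 18, 17, 16)
)

myComp1 <- subset(dataset, Color != "BLUE")
myComp1$Color <- factor(myComp1$Color)   # drop the empty BLUE level

# Formula interface: numeric response on the left, two-level factor on the right
res <- t.test(Score ~ Color, data = myComp1)
res
```

Written this way the grouping variable is never treated as a sample of numbers, and Color has exactly two levels when the test is run.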

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



