[R] truncating values into separate categories

David Winsemius dwinsemius at comcast.net
Sat Aug 1 14:43:57 CEST 2009


On Jul 31, 2009, at 2:55 PM, PDXRugger wrote:

>
> I must apologize, as I wasn't clear about what I wanted to occur.  I don't
> want to count the occurrences but rather recode them.  I am trying to
> replace all of the values with the new coded values in Person_CAT.  So:
>
> NP <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5,
>         3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
>         4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2,
>         2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)
>
> and Person_CAT: 1, 1, 2, 1, 1, 2, 2, 1, 4, 1, NA, 4 ... and so on.
> This task would easily be done in SPSS, but I am trying to automate it
> using R.  I hope this is clearer.

Perhaps:

?cut      # with special attention to the "right" parameter, which is set to TRUE by default.
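
As a small sketch of what "right" changes (the vector x below is made up for illustration and is not the poster's data): with the default right = TRUE the intervals are closed on the right, so a value sitting exactly on the lowest break is dropped to NA, while right = FALSE closes them on the left.

```r
x <- c(0, 1, 2, 3, 4, 5)   # illustrative values only

# Default right = TRUE: intervals are (a, b], so 1 is outside (1,2] and becomes NA
closed_right <- cut(x, breaks = c(1:4, Inf))

# right = FALSE: intervals are [a, b), so 1 lands in [1,2) and 4 in [4,Inf)
closed_left <- cut(x, breaks = c(1:4, Inf), right = FALSE)

# Side-by-side comparison of the two codings
data.frame(x, closed_right, closed_left)
```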

 > per_Cat <- cut(NP, breaks= c(1:4, Inf), right= FALSE)
 > per_Cat
  [1] [1,2)   [1,2)   [2,3)   [1,2)   [1,2)   [2,3)   [2,3)   [1,2)   [4,Inf) [1,2)   <NA>    [4,Inf)
[13] [3,4)   [3,4)   [1,2)   [4,Inf) [3,4)   [4,Inf) [1,2)   [4,Inf) [1,2)   [2,3)   [2,3)   [2,3)
[25] [4,Inf) [4,Inf) [1,2)   [2,3)   [1,2)   [3,4)   [3,4)   [1,2)   [2,3)   [2,3)   [1,2)   [2,3)
[37] [1,2)   [2,3)   [2,3)   [3,4)   [1,2)   [1,2)   [4,Inf) [4,Inf) [1,2)   [1,2)   [1,2)   [2,3)
[49] [2,3)   [2,3)
Levels: [1,2) [2,3) [3,4) [4,Inf)
 > Per <- c( "1", "2", "3","4")
 > levels(per_Cat) <- Per
 > per_Cat
  [1] 1    1    2    1    1    2    2    1    4    1    <NA> 4    3    3    1    4    3    4    1    4
[21] 1    2    2    2    4    4    1    2    1    3    3    1    2    2    1    2    1    2    2    3
[41] 1    1    4    4    1    1    1    2    2    2
Levels: 1 2 3 4
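
If plain numbers are wanted rather than a factor, one possible follow-up is sketched below. It uses only the first 12 values of NP from the post, passes the labels directly to cut() instead of assigning levels afterwards, and also contrasts a genuine missing value with a level that is merely labelled "NA". The names per_num and labelled are made up for this illustration.

```r
NP <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5)   # first 12 values from the post

per_Cat <- cut(NP, breaks = c(1:4, Inf), right = FALSE,
               labels = c("1", "2", "3", "4"))

# Go through as.character() so the labels, not the internal level codes,
# are what gets converted to integers
per_num <- as.integer(as.character(per_Cat))
per_num               # 1 1 2 1 1 2 2 1 4 1 NA 4
sum(is.na(per_Cat))   # 1: the 0 really is missing here

# Contrast: a level that is only *labelled* "NA" is an ordinary category,
# so is.na() finds nothing missing
labelled <- cut(NP, breaks = c(0:4, Inf) - 0.5,
                labels = c("NA", "1", "2", "3", "4"))
sum(is.na(labelled))  # 0
```
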
>
>
>
>
> Bill.Venables wrote:
>>
>> Here is a suggestion:
>>
>>> Per <- c("NA", "1", "2", "3","4")
>>> NP <- c(1,  1,  2,  1, 1,  2,  2,  1,  4,  1,  0,  5,
>> + 3,  3,  1,  5,  3, 5, 1, 6, 1, 2, 2, 2,
>> + 4, 4, 1, 2, 1, 3, 3, 1,  2,  2,  1,  2, 1, 2,
>> + 2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)
>>> Person_CAT <- cut(NP, breaks = c(0:4, Inf)-0.5, labels = Per)
>>> table(Person_CAT)
>> Person_CAT
>> NA  1  2  3  4
>> 1 19 15  6  9
>>>
>>
>> You should be aware, though, that items corresponding to the level "NA"
>> will NOT be treated as missing.
>>
>>
>> Bill Venables
>> http://www.cmis.csiro.au/bill.venables/
>>
>>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
>> On Behalf Of PDXRugger
>> Sent: Friday, 31 July 2009 9:54 AM
>> To: r-help at r-project.org
>> Subject: [R] truncating values into separate categories
>>
>>
>> Hi all,
>>  A simple question which I thought I had the answer to, but it isn't so
>> simple for some reason.  I am sure someone can easily help.  I would like
>> to categorize the values in NP into one of the five values in "Per", with
>> the last category ("4") representing values >= 4 (hence 4:max(NP)).  The
>> problem is that R is reading max(NP) as multiple values instead of a
>> range, so the lengths of the labels and the breaks are not matching.
>> Suggestions?
>>
>> Per <- c("NA", "1", "2", "3","4")
>>
>> NP <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5, 3, 3, 1, 5, 3, 5, 1, 6, 1, 2, 2, 2,
>>         4, 4, 1, 2, 1, 3, 3, 1, 2, 2, 1, 2, 1, 2,
>>         2, 3, 1, 1, 4, 4, 1, 1, 1, 2, 2, 2)
>>
>> Person_CAT <- cut(NP, breaks=c(0,1,2,3,4:max(NP)), labels=Per)
>>
>> -- 
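
For what it is worth, the length mismatch in the original call can be reproduced as sketched below, using only the first 20 values of NP. The names bad_breaks, good_breaks and msg are made up here, and tryCatch() is used only so that the error can be inspected without stopping the script.

```r
NP  <- c(1, 1, 2, 1, 1, 2, 2, 1, 4, 1, 0, 5, 3, 3, 1, 5, 3, 5, 1, 6)  # first 20 values
Per <- c("NA", "1", "2", "3", "4")

# 4:max(NP) is a sequence, not an open-ended range: it expands to 4, 5, 6
bad_breaks <- c(0, 1, 2, 3, 4:max(NP))
bad_breaks
length(bad_breaks) - 1   # 6 intervals, but Per holds only 5 labels

# cut() refuses the mismatch; capture the message instead of halting
msg <- tryCatch(cut(NP, breaks = bad_breaks, labels = Per),
                error = function(e) conditionMessage(e))
msg

# Using Inf as the final break gives 6 break points, hence 5 intervals
good_breaks <- c(0:4, Inf) - 0.5
table(cut(NP, breaks = good_breaks, labels = Per))
```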

David Winsemius, MD
Heritage Laboratories
West Hartford, CT



