[R] recoding large number of categories (select in SAS)

John Fox jfox at mcmaster.ca
Wed Jan 19 17:09:51 CET 2005


Dear Peter et al.,

The recode() function in the car package will also do this kind of thing,
will work even when the ranges include non-integers, and supports an else=
construction.
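
As a rough sketch of what that might look like for a few of the groups in the SAS code quoted below (the prey vector here is made up for illustration, and the call is guarded in case car is not installed):

```r
# Illustrative prey codes only; 186.5 shows a non-integer falling inside a range
prey <- c(149, 150, 186.5, 187, 438, 999, 125, 994, 200)

if (requireNamespace("car", quietly = TRUE)) {
  # Ranges use lo:hi, sets use c(...), and else= catches everything left over
  preyGr <- car::recode(prey,
    "149:150=150; 186:187=187; 438=438; c(999,125,994)=9994; else=1")
  print(data.frame(prey, preyGr))
}
```

Only a handful of the groups are shown; the remaining ones would be added to the recode string in the same way.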

Regards,
 John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox 
-------------------------------- 

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Peter Dalgaard
> Sent: Wednesday, January 19, 2005 10:52 AM
> To: Philippe Grosjean
> Cc: r-help at stat.math.ethz.ch; Denis Chabot
> Subject: Re: [R] recoding large number of categories (select in SAS)
> 
> Philippe Grosjean <phgrosjean at sciviews.org> writes:
> 
> > Does
> > 
> >  > ?cut
> > 
> > answer your question?
> 
> That's one way, but it tends to get messy to get the names right.
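
To illustrate the point about names, here is a minimal sketch of the cut() route for just two of the ranges, with made-up data. The break points and the lookup vector group_of_bin are assumptions chosen for this example; every gap between groups needs its own bin that maps back to the "otherwise" code.

```r
# Illustrative prey codes only
prey <- c(149, 150, 186, 187, 200, 438)

# Bins are (lo, hi], so breaks sit just below each range and at its top
breaks <- c(-Inf, 148, 150, 185, 187, Inf)

# One group code per bin; the gaps and tails all map to the "otherwise" code 1
group_of_bin <- c(1, 150, 1, 187, 1)

# labels = FALSE returns the bin index, which is then looked up
preyGr <- group_of_bin[cut(prey, breaks = breaks, labels = FALSE)]
```

With many groups, and with groups made of several separate ranges, keeping the breaks and the lookup vector in step is where this gets awkward.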
> 
> You might consider using the rather little-known variant of levels
> assignment:
> 
> preyGR <- prey # or factor(prey) if it wasn't one already
> levels(preyGR) <- list("150"=149:150,
>                        "187"=186:187,
>                        "438"=438, 
> 
>                         [...]
>                     
>                        "9994"=c(999,125,994), "1"=NA) 
> 
> preyGR[is.na(preyGR) & !is.na(prey)] <- "1"
> 
> This would be roughly as clean as the SAS way, only the "otherwise"
> case got a bit tricky.
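
A self-contained sketch of the levels-assignment idea above, run on made-up prey codes. Only a few groups are included (the elided ones are left out rather than guessed), and "187" is taken to cover 186:187 as in the SAS code.

```r
# Illustrative prey codes only; 200 stands in for an "otherwise" prey
prey <- c(149, 150, 186, 187, 438, 999, 125, 994, 200, NA)
preyGR <- factor(prey)

# Each list element maps a set of old levels to one new level;
# "1" = NA creates the level "1" without assigning any old level to it
levels(preyGR) <- list("150"  = 149:150,
                       "187"  = 186:187,
                       "438"  = 438,
                       "9994" = c(999, 125, 994),
                       "1"    = NA)

# Old levels not listed became NA; send those to "1",
# leaving values that were missing in prey as NA
preyGR[is.na(preyGR) & !is.na(prey)] <- "1"
```

One thing to watch for: an unlisted prey code that happens to equal a new group name would keep that name and so land in that group.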
> 
> > Best,
> > 
> > Philippe Grosjean
> > 
> > ..............................................<°}))><........
> >   ) ) ) ) )
> > ( ( ( ( (    Prof. Philippe Grosjean
> >   ) ) ) ) )
> > ( ( ( ( (    Numerical Ecology of Aquatic Systems
> >   ) ) ) ) )   Mons-Hainaut University, Pentagone (3D08)
> > ( ( ( ( (    Academie Universitaire Wallonie-Bruxelles
> >   ) ) ) ) )   8, av du Champ de Mars, 7000 Mons, Belgium
> > ( ( ( ( (
> >   ) ) ) ) )   phone: + 32.65.37.34.97, fax: + 32.65.37.30.54
> > ( ( ( ( (    email: Philippe.Grosjean at umh.ac.be
> >   ) ) ) ) )
> > ( ( ( ( (    web:   http://www.umh.ac.be/~econum
> >   ) ) ) ) )          http://www.sciviews.org
> > ( ( ( ( (
> > ..............................................................
> > 
> > Denis Chabot wrote:
> > > Hi,
> > > I have data on stomach contents. Possible prey species are in the 
> > > hundreds, so a list of prey codes has been in use in many labs 
> > > doing this kind of work.
> > > When it comes time to analyse these data, one often wants to 
> > > regroup prey into broader categories, especially for rare prey.
> > > In SAS you can nest a large number of "if-else" statements, or do this 
> > > more cleanly with "select", like this:
> > > select;
> > >   when (149 <= prey <= 150)                        preyGr= 150;
> > >   when (186 <= prey <= 187)                        preyGr= 187;
> > >   when (prey= 438)                                 preyGr= 438;
> > >   when (prey= 430)                                 preyGr= 430;
> > >   when (prey= 436)                                 preyGr= 436;
> > >   when (prey= 431)                                 preyGr= 431;
> > >   when (prey= 451)                                 preyGr= 451;
> > >   when (prey= 461)                                 preyGr= 461;
> > >   when (prey= 478)                                 preyGr= 478;
> > >   when (prey= 572)                                 preyGr= 572;
> > >   when (692 <= prey <= 695)                        preyGr= 692;
> > >   when (808 <= prey <= 826, 830 <= prey <= 832)    preyGr= 808;
> > >   when (997 <= prey <= 998, 792 <= prey <= 796)    preyGr= 792;
> > >   when (882 <= prey <= 909)                        preyGr= 882;
> > >   when (prey in (999, 125, 994))                   preyGr= 9994;
> > >   otherwise                                        preyGr= 1;
> > > end; *select;
> > > The number of transformations is usually much larger than this short 
> > > example.
> > > What is the best way of doing this in R?
> > > Sincerely,
> > > Denis Chabot
> > > ______________________________________________
> > > R-help at stat.math.ethz.ch mailing list 
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide!
> > > http://www.R-project.org/posting-guide.html
> > >
> > 
> 
> -- 
>    O__  ---- Peter Dalgaard             Blegdamsvej 3  
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
> 



