[R] recoding large number of categories (select in SAS)

Peter Dalgaard p.dalgaard at biostat.ku.dk
Wed Jan 19 16:51:52 CET 2005


Philippe Grosjean <phgrosjean at sciviews.org> writes:

> Does
> 
>  > ?cut
> 
> answer your question?

That's one way, but getting the level names right tends to become messy.
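
To illustrate, here is a minimal sketch of the cut() route. The prey codes, the break points and the "gap" placeholder labels are all made up for the example and do not come from the original data. Because cut() works on contiguous intervals, every gap between two groups becomes an interval of its own and needs a label that has to be merged away afterwards.

```r
# Hypothetical prey codes, for illustration only
prey <- c(149, 150, 186, 187, 438, 200)

# cut() uses intervals of the form (lower, upper], so the gaps between
# the groups of interest need break points and placeholder labels too.
breaks <- c(-Inf, 148, 150, 185, 187, 437, 438, Inf)
labels <- c("gap1", "150", "gap2", "187", "gap3", "438", "gap4")
preyGR <- cut(prey, breaks = breaks, labels = labels)

# Collapse all placeholder intervals into the "otherwise" group "1"
levels(preyGR)[grepl("^gap", levels(preyGR))] <- "1"
```

With hundreds of codes the breaks and labels vectors have to be kept in step by hand, which is where the mess comes from.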

You might consider using the rather little-known variant of levels
assignment:

preyGR <- prey # or factor(prey) if it wasn't one already
levels(preyGR) <- list("150"=149:150,
                       "187"=186:187,
                       "438"=438,

                       [...]

                       "9994"=c(999,125,994), "1"=NA)

preyGR[is.na(preyGR) & !is.na(prey)] <- "1"

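This is roughly as clean as the SAS way; only the "otherwise" case gets
a bit tricky. The "1"=NA entry merely creates the level "1", and the
last line then assigns it to every non-missing prey code that matched
none of the listed groups.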
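
As a self-contained sketch of the same idea, the following runs on a small made-up vector of prey codes (the data are hypothetical, and only a few of the groups from the question are included):

```r
# Hypothetical prey codes, for illustration only
prey <- factor(c(149, 150, 186, 438, 999, 125, 200, NA))

preyGR <- prey
# Each list element is named after the new level and holds the old levels
# it absorbs; "1" = NA only reserves the "otherwise" level.
levels(preyGR) <- list("150"  = 149:150,
                       "187"  = 186:187,
                       "438"  = 438,
                       "9994" = c(999, 125, 994),
                       "1"    = NA)

# Old levels that were not listed (here 200) have become NA; send them
# to "1" while keeping values that were missing in the original as NA.
preyGR[is.na(preyGR) & !is.na(prey)] <- "1"
```

Codes that appear in the list but not in the data (such as 187 and 994 here) are simply ignored, so the same list can be reused across data sets.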

> Best,
> 
> Philippe Grosjean
> 
> Denis Chabot wrote:
> > Hi,
> > I have data on stomach contents. Possible prey species number in the
> > hundreds, so a list of prey codes has been in use in many labs
> > doing this kind of work.
> > When the time comes to analyse these data, one often wants to
> > regroup prey into broader categories, especially for rare prey.
> > In SAS you can nest a large number of "if-else" statements, or do
> > this more cleanly with "select", like this:
> > select;
> >   when (149 <= prey <= 150)                        preyGr = 150;
> >   when (186 <= prey <= 187)                        preyGr = 187;
> >   when (prey = 438)                                preyGr = 438;
> >   when (prey = 430)                                preyGr = 430;
> >   when (prey = 436)                                preyGr = 436;
> >   when (prey = 431)                                preyGr = 431;
> >   when (prey = 451)                                preyGr = 451;
> >   when (prey = 461)                                preyGr = 461;
> >   when (prey = 478)                                preyGr = 478;
> >   when (prey = 572)                                preyGr = 572;
> >   when (692 <= prey <= 695)                        preyGr = 692;
> >   when (808 <= prey <= 826, 830 <= prey <= 832)    preyGr = 808;
> >   when (997 <= prey <= 998, 792 <= prey <= 796)    preyGr = 792;
> >   when (882 <= prey <= 909)                        preyGr = 882;
> >   when (prey in (999, 125, 994))                   preyGr = 9994;
> >   otherwise                                        preyGr = 1;
> > end; *select;
> > The number of transformations is usually much larger than this short
> > example.
> > What is the best way of doing this in R?
> > Sincerely,
> > Denis Chabot
> > ______________________________________________
> > R-help at stat.math.ethz.ch mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html
> >
> 

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907



