[R] categorizing data

Tom Woolman
Sun May 29 21:42:38 CEST 2022


Some ideas:

You could fit a cluster model with k=3 to each of the 3 variables 
separately, to determine the low/medium/high centroid values for each 
of the 3 plant types. The centroid values could then be used to set the 
upper/lower boundaries of the high/med/low ranges.
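
A rough sketch of the clustering idea, in case it helps. The data here 
are simulated stand-ins (not your data), and kmeans_levels() is just a 
name made up for this example. It runs stats::kmeans() on one column at 
a time and labels the three clusters by the order of their centroids:

```r
set.seed(1)  # kmeans() uses random starts, so fix the seed

# Simulated stand-in data (NOT from the original post):
# 30 rows, each rescaled so that tree + shrub + grass = 90
raw <- matrix(runif(90), ncol = 3,
              dimnames = list(NULL, c("tree", "shrub", "grass")))
dat <- as.data.frame(90 * raw / rowSums(raw))

# Hypothetical helper: cluster one variable into 3 groups and
# label each group by the rank of its centroid
kmeans_levels <- function(x) {
  km  <- kmeans(x, centers = 3, nstart = 10)
  ord <- order(km$centers[, 1])        # cluster ids, smallest centroid first
  lev <- c("low", "medium", "high")
  factor(lev[match(km$cluster, ord)], levels = lev)
}

cats <- as.data.frame(lapply(dat, kmeans_levels))
head(cbind(dat, cats))
```

In one dimension each point ends up with its nearest centroid, so the 
effective boundaries sit halfway between neighbouring centroids. Note 
that kmeans() needs more rows than clusters, so this will not run on a 
3-row example.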

Or plot a histogram of each variable, and use quantiles, densities, 
etc. to determine the natural breaks between the high/med/low ranges 
for each of the independent variables (IVs).
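
The quantile version could look something like the sketch below. Again 
the data are simulated stand-ins, tercile_levels() is a made-up name, 
and using terciles as the breaks is my assumption; any other set of 
breaks read off a histogram could be passed to cut() the same way:

```r
set.seed(1)

# Simulated stand-in data (NOT from the original post):
# 30 rows, each rescaled so that tree + shrub + grass = 90
raw <- matrix(runif(90), ncol = 3,
              dimnames = list(NULL, c("tree", "shrub", "grass")))
dat <- as.data.frame(90 * raw / rowSums(raw))

# hist(dat$tree)  # uncomment to eyeball the distribution first

# Hypothetical helper: split one variable at its terciles
tercile_levels <- function(x) {
  brks <- quantile(x, probs = c(0, 1/3, 2/3, 1))
  cut(x, breaks = brks, labels = c("low", "medium", "high"),
      include.lowest = TRUE)
}

cats <- as.data.frame(lapply(dat, tercile_levels))
table(cats$tree)
```

cut() will complain if the breaks are not unique, which can happen when 
a variable has many tied values.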




On 2022-05-29 15:28, Janet Choate wrote:
> Hi R community,
> I have a data frame with three variables, where each row adds up to 90.
> I want to assign a category of low, medium, or high to the values in
> each row - where the lowest value per row will be set to 10, the medium
> value set to 30, and the high value set to 50 - so each row still adds
> up to 90.
> 
> For example:
> Data: Orig
> tree  shrub  grass
> 32    11     47
> 23    41     26
> 49    23     18
> 
> Data: New
> tree  shrub  grass
> 30    10     50
> 10    50     30
> 50    30     10
> 
> I am not attaching any code here as I have not been able to write
> anything effective! appreciate help with this!
> thank you,
> JC
> 

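If the goal is literally the row-wise recoding described in the quoted 
message (smallest value in a row becomes 10, middle 30, largest 50), 
ranking within each row may be enough. A minimal sketch using the 
example rows from the post; the name targets and the tie handling 
(ties.method = "first") are my assumptions:

```r
# Example rows from the quoted message
dat <- data.frame(tree  = c(32, 23, 49),
                  shrub = c(11, 41, 23),
                  grass = c(47, 26, 18))

targets <- c(10, 30, 50)   # values for the low / medium / high position

# Rank the values within each row, then look up the target for each rank.
# ties.method = "first" keeps one 10, one 30 and one 50 per row even if
# two values in a row are equal (an assumption about the desired behaviour).
res <- t(apply(as.matrix(dat), 1,
               function(r) targets[rank(r, ties.method = "first")]))
colnames(res) <- names(dat)
new <- as.data.frame(res)

new
rowSums(new)   # every row should still add up to 90
```

On these three rows this gives the same values as the "Data: New" table 
in the quoted message.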

