[R] How to represent tree-structured values

Richard O'Keefe
Mon May 30 06:54:44 CEST 2022


There is a kind of data I run into fairly often
which I have never known how to represent in R,
and nothing I've tried really satisfies me.

Consider for example
 ...
 - injuries
   ...
   - injuries to limbs
     ...
     - injuries to extremities
       ...
       - injuries to hands
         - injuries to dominant hand
         - injuries to non-dominant hand
       ...
     ...
   ...

This isn't ordinal data, because there is no
"left to right" order on the values.  But there
IS a "part/whole" order, which an analysis should
respect, so it's not pure nominal data either.

As one particular example, if I want to
tabulate data like this, an occurrence of one
value should be counted as an occurrence of
*every* superordinate value.
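
A minimal sketch of that counting rule, assuming the hierarchy is
stored as a named character vector mapping each value to its parent.
The names `parent`, `with_ancestors` and `obs`, and the sample
observations, are made up for illustration; this is one possible
encoding, not a recommended one.

```r
# Hypothetical hierarchy: each value names its parent (NA marks the root).
parent <- c(
  "injuries"                      = NA,
  "injuries to limbs"             = "injuries",
  "injuries to extremities"       = "injuries to limbs",
  "injuries to hands"             = "injuries to extremities",
  "injuries to dominant hand"     = "injuries to hands",
  "injuries to non-dominant hand" = "injuries to hands"
)

# Return a value together with all of its superordinate values.
with_ancestors <- function(value) {
  out <- character(0)
  while (!is.na(value)) {
    out <- c(out, value)
    value <- parent[[value]]
  }
  out
}

# Made-up observations, one per case.
obs <- c("injuries to dominant hand",
         "injuries to non-dominant hand",
         "injuries to dominant hand",
         "injuries to limbs")

# Count each observation once for itself and once for every ancestor.
counts <- table(factor(unlist(lapply(obs, with_ancestors)),
                       levels = names(parent)))
print(counts)
```

With these four observations, "injuries" and "injuries to limbs" are
each counted 4 times, "injuries to extremities" and "injuries to hands"
3 times, "injuries to dominant hand" twice, and "injuries to
non-dominant hand" once.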

Examples of such data include "why is this patient
being treated", "what drug is this patient being
treated with", "what geographic region is this
school from", "what biological group does this
insect belong to".

So what is the recommended way to represent such
data in R, and the recommended way to analyse it?



