[R] How to represent tree-structured values

Mon May 30 13:13:54 CEST 2022

For visualising hierarchical data a treemap can also work well. For 
example, using the treemap package:

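library(data.table)
library(treemap)

# Simulate n observations classified in a three-level hierarchy,
# each with a binary event indicator
n <- 1000

dta <- data.table(
   level1 = sample(LETTERS[1:5], n, replace = TRUE),
   level2 = sample(letters[1:5], n, replace = TRUE),
   level3 = sample(1:9, n, replace = TRUE),
   event = sample(0:1, n, replace = TRUE)
   )

# Number of observations and event rate for each leaf of the hierarchy
tab <- dta[, .(n = .N, rate = sum(event)/.N),
   by = .(level1, level2, level3)]

# Box area shows the count, colour shows the event rate; a font size
# of 0 for the third level is meant to hide its labels
treemap(tab, index = names(tab)[1:3], vSize = "n", vColor = "rate",
   type = "value", fontsize.labels = 20*c(1, 0.7, 0))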
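
For the tabulation itself, where an occurrence of one value should also
count as an occurrence of every superordinate value, one option is to
store each value as its full path from the root and count every prefix
of that path. A minimal base R sketch, assuming a "/"-separated path
representation and made-up example data (the ancestors() helper is
hypothetical, not taken from any package):

```r
# Hypothetical example data: each observation is a path from the root,
# with levels separated by "/" (the separator is an assumption)
obs <- c(
   "injuries/limbs/extremities/hands/dominant",
   "injuries/limbs/extremities/hands/non-dominant",
   "injuries/limbs/extremities/hands",
   "injuries/limbs",
   "injuries/limbs/extremities/hands/dominant"
   )

# Return a node together with all of its superordinate nodes
ancestors <- function(path, sep = "/") {
   parts <- strsplit(path, sep, fixed = TRUE)[[1]]
   vapply(seq_along(parts),
      function(i) paste(parts[seq_len(i)], collapse = sep),
      character(1))
}

# Each observation counts once for itself and once for each ancestor
counts <- table(unlist(lapply(obs, ancestors)))
counts
```

Observations recorded at a coarser level (such as "injuries/limbs"
above) need no special treatment, since they simply have a shorter
path.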

--

Jan

On 30-05-2022 11:40, Jim Lemon wrote:
> Hi Richard,
> Thinking about this, you might also find intersectDiagram, also in
> plotrix, to be useful.
>
> Jim
>
> On Mon, May 30, 2022 at 4:37 PM Jim Lemon <drjimlemon using gmail.com> wrote:
>> Hi Richard,
>> Some years ago I had a try at illustrating Multiple Causes of Death
>> (MCoD) data. I settled on what is sometimes called a "sizetree". You
>> can see some examples in the sizetree function help page in "plotrix".
>> Unfortunately I can't use the original data as it was confidential.
>>
>> Jim
>>
>> On Mon, May 30, 2022 at 2:55 PM Richard O'Keefe <raoknz using gmail.com> wrote:
>>> There is a kind of data I run into fairly often
>>> which I have never known how to represent in R,
>>> and nothing I've tried really satisfies me.
>>>
>>> Consider for example
>>>   ...
>>>   - injuries
>>>     ...
>>>     - injuries to limbs
>>>       ...
>>>       - injuries to extremities
>>>         ...
>>>         - injuries to hands
>>>           - injuries to dominant hand
>>>           - injuries to non-dominant hand
>>>         ...
>>>       ...
>>>     ...
>>>
>>> This isn't ordinal data, because there is no
>>> "left to right" order on the values.  But there
>>> IS a "part/whole" order, which an analysis should
>>> respect, so it's not pure nominal data either.
>>>
>>> As one particular example, if I want to
>>> tabulate data like this, an occurrence of one
>>> value should be counted as an occurrence of
>>> *every* superordinate value.
>>>
>>> Examples of such data include "why is this patient
>>> being treated", "what drug is this patient being
>>> treated with", "what geographic region is this
>>> school from", "what biological group does this
>>> insect belong to".
>>>
>>> So what is the recommended way to represent
>>> and the recommended way to analyse such data in R?
>>>
>>> ______________________________________________
>>> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.