This does not appear to be a legitimate topic for r-help: it is
not a consulting service. Please see the posting guide.

Of course, others may disagree and reply. Wouldn't be the first time I'm wrong.

> today I have a more general question concerning the approach of storing
> different values from the analysis of multiple variables.
>
> My task is to compare distributions in a universe with distributions from
> the respondents using a whole bunch of variables. Comparison shall be done
> on relative frequencies (proportions).
>
> I was thinking about the structure I should store the results in and came
> up with the following:
>
> -- cut --
> library(stringi)
> library(ggplot2)
> # Result data frame
> # Some sort of tidy data set where
> # each value is stored in its own row.
> # This way all values for all variables could be stored in
> # one unique data structure.
> # If an additional variable were added for the name of the
> # study, one could also build a result data set across
> # surveys.
> # Values for measure could be "number" for 'raw' values or
> # "freq" for frequencies/counts.
> # Values for unit could be "n" for 'numbers' and
> # "%" for percentages.
> d_test <- data.frame(
>     group = rep(c("Universe", "Respondents"), each = 16),
>     variable = rep("State", 32),
>     value = rep(c(11.3,
>                     12.7,
>                     3.3,
>                     5,
>                     0.6,
>                     8.1,
>                     6.2,
>                     5.8,
>                     6.4,
>                     14.5,
>                     8.3,
>                     0.3,
>                     3.8,
>                     2.5,
>                     8.1,
>                     3), 2),
>     label = rep(c("Baden-Wuerttemberg",
>                 "Bayern",
>                 "Berlin",
>                 "Brandenburg",
>                 "Bremen",
>                 "Hamburg",
>                 "Hessen",
>                 "Mecklenburg-Vorpommern",
>                 "Niedersachsen",
>                 "Nordrhein-Westfalen",
>                 "Rheinland-Pfalz",
>                 "Saarland",
>                 "Sachsen",
>                 "Sachsen-Anhalt",
>                 "Schleswig-Holstein",
>                 "Thueringen"),2),
>     measure = rep("freq", 32),
>     unit = rep("%", 32),
>     stringsAsFactors = FALSE
> )
> # This way the variables can be selected using simple
> # value selection from Base R functionality.
> data <- d_test[d_test$variable == "State" ,]
> # And plot results for every variable.
> ggplot(
>   data = data,
>   aes(
>     x = label,
>     y = value,
>     fill = group)) +
>   geom_bar(stat = "identity", position = "dodge") +
>   theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
>   scale_fill_discrete(name = stringi::stri_trans_totitle(names(data)[1])) +
>   scale_x_discrete(name = data$variable[1]) +
>   scale_y_continuous(name = data$unit[1])
> -- cut --
> The reporting / presentation is done in R Markdown. I would load the
> result data set once at the beginning and run the comparisons as plots
> for each variable named in the results data set under "variable".
> If I follow this approach for my customer relationship survey, do you think I
> would face drawbacks or run into serious trouble?
> I am interested in your opinion and open to other approaches and
> suggestions.
> Kind regards
> Georg
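
For what it is worth, a long layout like the one described above can be
processed per variable with base R alone. The following is only a minimal
sketch under assumptions: the data frame `res`, its invented values, and the
helper `compare_one()` are hypothetical stand-ins with the same column names
as `d_test`, not anything taken from the actual survey.

```r
# Toy long-format result set with the same column names as d_test.
# All values below are invented for illustration only.
res <- data.frame(
  group    = rep(rep(c("Universe", "Respondents"), each = 2), times = 2),
  variable = rep(c("State", "Gender"), each = 4),
  value    = c(60, 40, 55, 45, 50, 50, 40, 60),
  label    = c("A", "B", "A", "B", "m", "f", "m", "f"),
  measure  = rep("freq", 8),
  unit     = rep("%", 8),
  stringsAsFactors = FALSE
)

# Hypothetical helper: put both groups side by side for one variable
# and compute the difference in percentage points.
compare_one <- function(d) {
  wide <- reshape(
    d[, c("label", "group", "value")],
    idvar     = "label",
    timevar   = "group",
    direction = "wide"
  )
  wide$diff <- wide$value.Respondents - wide$value.Universe
  wide
}

# One comparison table per entry in the "variable" column.
by_variable <- lapply(split(res, res$variable), compare_one)
```

In an R Markdown report, the same `split()` result could be looped over to
draw one plot per variable. One thing to watch with this layout is that every
group/variable/label combination should occur exactly once; otherwise the
reshape step will silently keep only the first match.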
