[R] Avoid duplication in dplyr::summarise
Eric Berger
ericjberger at gmail.com
Sat Sep 9 15:02:40 CEST 2017
Hi Lars,
Two comments:
1. You can achieve what you want with a slight modification of your
definition of s(), using the hint from the error message that you need an
argument '.':
s <- function(.) {
  dplyr::summarise(., x1m = mean(X1),
                      x2m = mean(X2),
                      x3m = mean(X3),
                      x4m = mean(X4))
}
2. Your test case is not ideal: the way you define the two factors makes
both group_by() calls produce identical groupings. An alternative that
confirms that the function s() does what you want might be:
df <- data.frame(matrix(rnorm(40), 10, 4),
f1 = base::sample(letters[1:3],30,replace=TRUE),
f2 = base::sample(letters[4:6],30,replace=TRUE))
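A minimal sketch of how the two points above might be checked together, assuming dplyr is attached. The argument name `data`, the seed, and the 30-row matrix (so that no recycling of the numeric columns is relied on) are illustrative choices, not something from the original posts:

```r
library(dplyr)

set.seed(1)  # arbitrary seed, only so the sketch is repeatable

# 30 rows with independently sampled factors, so that grouping by f1
# and grouping by f1 and f2 are no longer the same partition
df <- data.frame(matrix(rnorm(120), 30, 4),
                 f1 = sample(letters[1:3], 30, replace = TRUE),
                 f2 = sample(letters[4:6], 30, replace = TRUE))

# The pipe passes its left-hand side as the first argument, so the
# argument does not have to be called `.`; `data` is a hypothetical name
s <- function(data) {
  dplyr::summarise(data,
                   x1m = mean(X1),
                   x2m = mean(X2),
                   x3m = mean(X3),
                   x4m = mean(X4))
}

# The same summary applied to two different groupings
by_f1    <- df %>% group_by(f1) %>% s()
by_f1_f2 <- df %>% group_by(f1, f2) %>% s()
```

Because s() only receives the (grouped) data frame, the grouping decision stays at the call site and the summary is written once.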
HTH,
Eric
On Sat, Sep 9, 2017 at 1:52 PM, Edjabou Vincent <maklawe at gmail.com> wrote:
> Hi Lars
>
> I am not quite sure what you want. However, I suggest the
> following code, which lets you (1) obtain the full summary of your data and
> (2) retrieve only the means of the X values by factors f1 and f2.
>
> library(tidyverse)
> library(psych)
> df <- data.frame(matrix(rnorm(40), 10, 4),
> f1 = gl(3, 10, labels = letters[1:3]),
> f2 = gl(3, 10, labels = letters[4:6]))
>
> ## To get the full summary of your data
> df%>% gather(X_name,X_value,X1:X4)%>%
> group_by(f1,f2,X_name)%>%
> do(describe(.$X_value))
>
> ## To obtain only the means of your data
> df %>% gather(X_name, X_value, X1:X4) %>%
> group_by(f1, f2, X_name) %>%
> do(describe(.$X_value)) %>%
> select(mean) %>% # keep only the mean (grouping columns are retained)
> spread(X_name, mean)
>
> Vincent
>
> Med venlig hilsen/ Best regards
>
> Edjabou Maklawe Essonanawe Vincent
> Mobile: +45 31 95 99 33
>
> On Sat, Sep 9, 2017 at 12:30 PM, Lars Bishop <lars52r at gmail.com> wrote:
>
> > Dear group,
> >
> > Is there a way I could avoid the sort of duplication illustrated below?
> > i.e., I have the same dplyr::summarise function on different group_by
> > arguments. So I'd like to create a single summarise function that could be
> > applied to both. My attempt below fails.
> >
> > df <- data.frame(matrix(rnorm(40), 10, 4),
> > f1 = gl(3, 10, labels = letters[1:3]),
> > f2 = gl(3, 10, labels = letters[4:6]))
> >
> >
> > df %>%
> > group_by(f1, f2) %>%
> > summarise(x1m = mean(X1),
> > x2m = mean(X2),
> > x3m = mean(X3),
> > x4m = mean(X4))
> >
> > df %>%
> > group_by(f1) %>%
> > summarise(x1m = mean(X1),
> > x2m = mean(X2),
> > x3m = mean(X3),
> > x4m = mean(X4))
> >
> > # My failed attempt
> >
> > s <- function() {
> > dplyr::summarise(x1m = mean(X1),
> > x2m = mean(X2),
> > x3m = mean(X3),
> > x4m = mean(X4))
> > }
> >
> > df %>%
> > group_by(f1) %>% s
> > Error in s(.) : unused argument (.)
> >
> > Regards,
> > Lars.
> >
> > [[alternative HTML version deleted]]
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/
> > posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >