Computing means of multiple variables based on a condition
William Dunlap
wdunlap at tibco.com
Thu May 26 03:06:38 CEST 2016
Just to be clear, do you really want your 'condition' groups to be subsets
of one another? Most (all?) of the *ply functions assume you want
non-overlapping groups, so they do a split-summarize-combine sequence.
You would have to replace the split part of that.
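
If nested groups are what you want, here is a minimal sketch of replacing
the split step, in base R only. It assumes the thresholds 2, 4 and 6 from
your table, and the names 'thresholds', 'pieces', 'result', 'mean_b' and
'mean_c' are made up for illustration. Instead of splitting the rows into
disjoint pieces, it filters the full data once per threshold, summarizes
each filtered copy by 'a', and stacks the pieces:

```r
# Sample data from the original post, typed with plain ASCII quotes.
df <- data.frame(
  a = rep(c("A", "B"), times = 5),
  b = c(15, 35, 20, 99, 75, 64, 33, 78, 45, 20),
  c = c(111, 234, 456, 876, 246, 662, 345, 480, 512, 179),
  d = c(1.1, 3.2, 14.2, 8.7, 12.5, 5.9, 8.3, 6.0, 2.9, 9.3)
)

# Assumed thresholds, taken from the desired output table.
thresholds <- c(2, 4, 6)

# "Split" replacement: one overlapping subset per threshold, each
# summarized by group 'a' with aggregate().
pieces <- lapply(thresholds, function(th) {
  keep <- df[df$d >= th, ]
  out <- aggregate(cbind(b, c) ~ a, data = keep, FUN = mean)
  data.frame(
    a = out$a,
    condition = paste0("d>=", th),
    mean_b = out$b,
    mean_c = out$c
  )
})

# Combine step: stack the per-threshold summaries into one data frame.
result <- do.call(rbind, pieces)
print(result)
```

The result has one row per combination of 'a' and condition, which is the
long layout that plotting functions such as those in ggplot2 generally
expect. A row that satisfies d>=6 also contributes to the d>=4 and d>=2
means, which is exactly the overlap that the *ply functions do not handle.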
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Wed, May 25, 2016 at 3:37 PM, KMNanus <kmnanus at gmail.com> wrote:
> I have a large dataset, a sample of which is:
>
> a <- c("A", "B", "A", "B", "A", "B", "A", "B", "A", "B")
> b <-c(15, 35, 20, 99, 75, 64, 33, 78, 45, 20)
> c<- c( 111, 234, 456, 876, 246, 662, 345, 480, 512, 179)
> d<- c(1.1, 3.2, 14.2, 8.7, 12.5, 5.9, 8.3, 6.0, 2.9, 9.3)
>
> df <- data.frame(a,b,c,d)
>
> I’m trying to construct a data frame that shows the means of c & b based
> on the condition of d and grouped by a.
>
> I want to create the data frame below, then use ggplot2 to create a line
> plot of b at various conditions of d.
>
> I can compute the grouped means (d>=2, d>=4, etc.) one at a time using
> dplyr but haven’t figured out how to put them all together or put them in
> one data frame.
>
> I’d rather not use a loop and am relatively new to R. Is there a way I
> can use tapply and set it to the conditions above so that I can create the
> df below?
>
>
>      condition   mean(b)   mean(c)
> A    d>=2        ____      _____
> B    d>=2        ____      _____
> A    d>=4        ____      _____
> B    d>=4        ____      _____
> A    d>=6        ____      _____
> B    d>=6        ____      _____
>
>
>
> Ken
> kmnanus at gmail.com
> 914-450-0816 (tel)
> 347-730-4813 (fax)
