[R] getting data into correct format for summarizing ... reshape, aggregate, or...

John Kane jrkrideau at yahoo.ca
Mon Sep 15 18:48:00 CEST 2008


I think your problem is coming from the cbind.  You are forcing the data into a matrix, not a data.frame, and neither aggregate nor cast will work on that matrix.

Run str(df) or class(df) on the result of your cbind call and you will see what is happening.
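
To make the coercion concrete, here is a minimal sketch. The toy vectors rm_demo, con_demo and val_demo are hypothetical stand-ins, not the data from the original post.

```r
# Hypothetical toy vectors standing in for the poster's data
rm_demo  <- c(215L, 215L, 202L)
con_demo <- factor(c("a", "b", "a"))
val_demo <- c(0.5, -1.2, 0.3)

# cbind() on plain vectors builds a matrix, so every column shares one type;
# the factor is reduced to its integer codes and the labels "a"/"b" are lost
m <- cbind(rm_demo, con_demo, val_demo)
class(m)
str(m)

# data.frame() keeps each column's own class
d <- data.frame(RiverMile = rm_demo, constituent = con_demo, value = val_demo)
str(d)
```

Comparing the two str() outputs should show a single numeric matrix in the first case and three separately typed columns in the second.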

Try this using the reshape package.  Note that the code runs, but I have not verified the results. The function approach comes from Hadley's vignette at had.co.nz/reshape/introduction.pdf.
===================================================================== 

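library(reshape)  # provides cast()
df1 <- data.frame(RiverMile, constituent, value)
# Mean and SD of value for each RiverMile/constituent combination
cast(df1, RiverMile + constituent ~ ., function(x) c(means = mean(x), SD = sd(x)))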
=====================================================================
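
If you would rather stay in base R, something along these lines might also work. This is only a sketch: it assumes a version of R whose aggregate() has the formula interface, it rebuilds your example vectors with rep(), and the five randomly placed NA values are my own assumption to mimic the missing values you mention in the real data.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable
value       <- rnorm(30)
RiverMile   <- rep(c(215, 202, 198), each = 10)
constituent <- rep(rep(c("a", "b"), each = 5), times = 3)

# Assumption: blank out 5 randomly chosen values to mimic NA in the real data
value[sample(length(value), 5)] <- NA

df1 <- data.frame(RiverMile, constituent, value)

# Mean and SD per RiverMile/constituent group. na.action = na.pass keeps the
# rows with NA, and na.rm = TRUE makes mean() and sd() skip them
res <- aggregate(value ~ RiverMile + constituent, data = df1,
                 FUN = function(x) c(mean = mean(x, na.rm = TRUE),
                                     SD   = sd(x, na.rm = TRUE)),
                 na.action = na.pass)
res
```

The result has one row per RiverMile/constituent pair, with the mean and SD held together in a two-column value entry.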


--- On Mon, 9/15/08, stephen sefick <ssefick at gmail.com> wrote:

> From: stephen sefick <ssefick at gmail.com>
> Subject: [R] getting data into correct format for summarizing ... reshape, aggregate, or...
> To: "R-help Mailing List" <r-help at r-project.org>
> Received: Monday, September 15, 2008, 12:14 PM
> I would like to reformat this data frame into something that I can
> produce some descriptive statistics.  I have been playing around with
> the reshape package and maybe this is not the best way to proceed.  I
> would like to use RiverMile and constituent as the grouping variables
> to get the summary statistics:
> 
> 198a    198b
> mean   mean
> sd       sd
> ...        ...
> 
> etc. for all of these.
> I have tried reshape and aggregate and I am sure that I am missing something...
> 
> below is a naive attempt at making a data frame with the columns in
> the correct class-  This can be improved also.  There are NA in the
> real data set, but I didn't know how to randomly intersperse NA in a
> created matrix.  I hope this makes sense.  If it doesn't I will go
> back to the drawing board and try and clarify this.
> 
> value <- rnorm(30)
> RiverMile <- c(rep(215, length.out=10), rep(202, length.out=10),
>     rep(198, length.out=10))
> constituent <- c(rep("a", length.out=5), rep("b", length.out=5),
>     rep("a", length.out=5), rep("b", length.out=5),
>     rep("a", length.out=5), rep("b", length.out=5))
> df <- cbind(as.integer(RiverMile), as.factor(constituent), as.numeric(value))
> df.1 <- as.data.frame(df)
> df.1[,"V1"] <- as.integer(df.1[,"V1"])
> df.1[,"V2"] <- as.factor(df.1[,"V2"])
> df.1[,"V3"] <- as.numeric(df.1[,"V3"])
> colnames(df.1) <- c("RiverMile", "constituent", "value")
> 
> 
> -- 
> Stephen Sefick
> Research Scientist
> Southeastern Natural Sciences Academy
> 
> Let's not spend our time and resources thinking about things that are
> so little or so large that all they really do for us is puff us up and
> make us feel like gods. We are mammals, and have not exhausted the
> annoying little problems of being mammals.
> 
> 	-K. Mullis
> 

