[R] summary statistics into table/data base, many factors to analyse

Gabor Grothendieck ggrothendieck at gmail.com
Sat Nov 22 09:57:20 CET 2008


On Fri, Nov 21, 2008 at 5:50 AM, Gerit Offermann <gerit.offermann at gmx.de> wrote:
> Dear list,
>
> thanks to your help I managed to find means of analysing my data.
>
> However, the whole data set contains 264 variables, of which some are
> factors and others are not. The factors tend to be grouped, e.g.
> data$f1304 to data$f1484 and data$f3204 to data$f5408.
>
> But there are other types of variables in the data set as well,
> e.g. data$f1504.
>
> Not every spot is taken, i.e. data$f1345 to data$f1399 might not exist
> in the data set.

We can compute on the names like this (using the built-in anscombe
data set and selecting just columns y1, x1, x2, x3, x4).  Try this:

# display anscombe data set
anscombe

# names.x are names that start with x
names.x <- grep("^x", names(anscombe), value = TRUE)
anscombe[, c("y1", names.x)]
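
The same idea can be carried over to numbered names with gaps. What
follows is only a minimal sketch under assumptions: my.data is a made-up
toy data frame (not the real data), y is a hypothetical name for the
dependent variable, and the column names are assumed to have the form
"f" followed by a number.

```r
# hypothetical toy data standing in for the real 264-variable data set
my.data <- data.frame(
  y     = c(1.2, 3.4, 2.2, 5.1),
  f1304 = factor(c("a", "b", "a", "b")),
  f1306 = factor(c("u", "u", "v", "v")),
  f1504 = c(10, 20, 30, 40),
  f3204 = factor(c("p", "q", "p", "q"))
)

# candidate names for the block f1304 to f1484; most do not exist here
wanted <- paste("f", 1304:1484, sep = "")

# keep only the candidates that are actually present in the data
present <- intersect(wanted, names(my.data))

# alternative: compute on the numeric part of the names
# (names without a number, such as y, become NA and are dropped)
num <- suppressWarnings(as.numeric(sub("^f", "", names(my.data))))
in.range <- names(my.data)[!is.na(num) & num >= 1304 & num <= 1484]

# of the columns found, keep only those that are factors
is.fac <- sapply(my.data[present], is.factor)
my.reduced.data <- my.data[, c("y", present[is.fac])]
names(my.reduced.data)
```

Using intersect means that missing spots in the numbering are skipped
without an error, and the is.factor check leaves out non-factor columns
that happen to fall inside the same range.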

>
> The solution "summaryBy" works for cross analysis, of which there is
> a handful. So I am not worried here.
>
> The solution from Jorge is fine.
> However, I am trying to get my head around how to efficiently
> reduce my data set to the dependent variable and the factors such that
> the solution is applicable.
>
> Having to type each variable into
> my.reduced.data <- cbind(my.data$f1001, my.data$f1002, my.data$f1003...
> is an obvious option, but does not seem to be the most efficient one.
>
> Are there better ways to go about it?
>
> Thanks,
> Gerit