[R] Data frame manipulation - newbie question

Rense Nieuwenhuis rense.nieuwenhuis at gmail.com
Sun Jan 6 16:50:17 CET 2008


Hi,

you may want to use the apply / tapply functions. Some find them a bit
hard to grasp at first, but they will help you in many situations once
you get the hang of them.
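
As a rough sketch of what that could look like for the pt.knn data quoted
below: the data frame is rebuilt here by hand from the values in the
question, and the result names (err.by.model.k, pt.means) are only
illustrative. tapply() summarises one numeric column per group; for several
numeric columns at once, base R's aggregate() is assumed here as a
convenient alternative, which is not necessarily the only or best way.

```r
# Rebuild pt.knn from the table in the question (values copied by hand).
pt.knn <- data.frame(
  k.idx      = factor(rep(c(200, 200, 201, 201, 202, 202), times = 2)),
  step.forwd = factor(rep(c(0, 0, 1, 1, 2, 2), times = 2)),
  pt.num     = factor(rep(1:2, times = 6)),
  model      = factor(rep(c("lm", "rlm"), each = 6)),
  prev       = c(9, 11, 10, 12, 12, 12, 10.1, 10.3, 11.6, 11.4, 11.8, 11.9),
  value      = rep(c(10.5, 10.5, 12, 12, 12.1, 12.1), times = 2),
  abs.error  = c(1.5, 1.5, 2.0, 2.0, 0.1, 0.1, 0.4, 0.2, 0.4, 0.6, 0.1, 0.2)
)

# tapply: mean of ONE numeric column, split by one or more factors.
# The result is a model x k.idx matrix of mean abs.error.
err.by.model.k <- with(pt.knn, tapply(abs.error, list(model, k.idx), mean))

# aggregate: mean of SEVERAL numeric columns for each combination of
# k.idx, step.forwd and model (this averages over pt.num).
pt.means <- aggregate(cbind(prev, value, abs.error) ~ k.idx + step.forwd + model,
                      data = pt.knn, FUN = mean)

print(err.by.model.k)
print(pt.means)
```

pt.means has one row per (k.idx, step.forwd, model) combination, so rows
1 and 2, 3 and 4, and so on from the original table end up averaged together.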

Maybe you can get some information on my site:
http://www.rensenieuwenhuis.nl/r-project/manual/basics/tables/
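
For the plotting part of the question quoted below, something along these
lines might serve as a starting point. It is only a sketch: the three
columns it needs are retyped by hand from the question, the names err.means
and overall are made up for illustration, and it reads "boxplot of the
mean(abs.error) of each model for each k.idx" as one box per model built
from the per-k.idx means, which is an assumption about what was meant.

```r
# Only the columns needed here, copied by hand from the question's table.
pt.knn <- data.frame(
  k.idx     = factor(rep(c(200, 200, 201, 201, 202, 202), times = 2)),
  model     = factor(rep(c("lm", "rlm"), each = 6)),
  abs.error = c(1.5, 1.5, 2.0, 2.0, 0.1, 0.1, 0.4, 0.2, 0.4, 0.6, 0.1, 0.2)
)

# Mean abs.error for every model / k.idx combination.
err.means <- aggregate(abs.error ~ model + k.idx, data = pt.knn, FUN = mean)

# One box per model, each drawn from that model's per-k.idx means.
boxplot(abs.error ~ model, data = err.means,
        xlab = "model", ylab = "mean abs.error per k.idx")

# Overall mean abs.error of each model, across all rows.
overall <- tapply(pt.knn$abs.error, pt.knn$model, mean)
print(overall)
```

Swapping the formula to abs.error ~ model + k.idx on the raw pt.knn data
would instead draw one box per model and k.idx combination.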


Hope this helps,

Rense Nieuwenhuis



On Jan 3, 2008, at 11:53 , José Augusto M. de Andrade Junior wrote:

> Hi all,
>
> Could someone please explain how I can efficiently query a data frame
> with several factors, as shown below:
>
> ---------------------------------------------------------------------
> Data frame: pt.knn
> ---------------------------------------------------------------------
> row | k.idx | step.forwd | pt.num | model | prev | value | abs.error
>   1 |   200 |          0 |      1 | lm    | 09   | 10.5  | 1.5
>   2 |   200 |          0 |      2 | lm    | 11   | 10.5  | 1.5
>   3 |   201 |          1 |      1 | lm    | 10   | 12    | 2.0
>   4 |   201 |          1 |      2 | lm    | 12   | 12    | 2.0
>   5 |   202 |          2 |      1 | lm    | 12   | 12.1  | 0.1
>   6 |   202 |          2 |      2 | lm    | 12   | 12.1  | 0.1
>   7 |   200 |          0 |      1 | rlm   | 10.1 | 10.5  | 0.4
>   8 |   200 |          0 |      2 | rlm   | 10.3 | 10.5  | 0.2
>   9 |   201 |          1 |      1 | rlm   | 11.6 | 12    | 0.4
>  10 |   201 |          1 |      2 | rlm   | 11.4 | 12    | 0.6
>  11 |   202 |          2 |      1 | rlm   | 11.8 | 12.1  | 0.1
>  12 |   202 |          2 |      2 | rlm   | 11.9 | 12.1  | 0.2
> ---------------------------------------------------------------------
>
> k.idx, step.forwd, pt.num and model columns are FACTORS.
> prev, value, abs.error are numeric
>
> I need to take the mean of the numeric columns (prev, value and
> abs.error) for each k.idx, step.forwd and model. So rows 1 and 2,
> 3 and 4, 5 and 6, 7 and 8, 9 and 10, 11 and 12 must be grouped
> together.
>
> Next, I need to draw a boxplot of the mean(abs.error) of each model
> for each k.idx.
> I need to compare the abs.error of the two models for each step and
> the overall mean abs.error of each model. And so on.
>
> I read the manuals, but the examples there are too simple. I know how
> to do this manipulation in a "brute force" manner, but I wish to learn
> how to work the right way with R.
>
> Could someone help me?
> Thanks in advance.
>
> José Augusto
> Undergraduate student
> University of São Paulo
> Business Administration Faculty
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



