[R] Applying a function to categorized data?

Robert Latest boblatest at gmail.com
Fri Apr 13 16:52:38 CEST 2012


Hello Steve,

thank you for your reply. You're right, just before I read your post
I'd found aggregate() and indeed it brought me a long way towards my
goal.
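
A minimal sketch of how aggregate() can produce such a month/mean/sd
table. The toy data frame and its values are invented for illustration;
only the column names ARI_MIT, TIMESTAMP and MONTH come from the
original question, and MEAN, SD and monthly are hypothetical names.

```r
# Toy stand-in for v1; the values are invented for illustration.
v1 <- data.frame(
  TIMESTAMP = as.POSIXct(c("2012-01-05 08:00:00", "2012-01-20 09:30:00",
                           "2012-02-03 10:15:00", "2012-02-17 11:45:00",
                           "2012-02-28 14:00:00"), tz = "UTC"),
  ARI_MIT   = c(10, 12, 20, 22, 24)
)

# Month label per row, as in the original question (tz fixed for repeatability).
v1$MONTH <- strftime(v1$TIMESTAMP, "%y%m", tz = "UTC")

# One row per month; each statistic is computed separately ...
m <- aggregate(ARI_MIT ~ MONTH, data = v1, FUN = mean)
s <- aggregate(ARI_MIT ~ MONTH, data = v1, FUN = sd)
names(m)[2] <- "MEAN"
names(s)[2] <- "SD"

# ... and then joined on the month label.
monthly <- merge(m, s, by = "MONTH")
print(monthly)
```

Computing the two statistics separately and merging them keeps MEAN and
SD as ordinary columns. Returning both from a single FUN is also
possible, but aggregate() then stores them as a matrix inside one
column.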

I've been a C programmer for 20+ years and am fairly proficient in SQL,
so to understand R I need to drop my scalar- and row (record)-oriented
thinking and get my head around vectors and columns.
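
A small sketch of that shift in thinking, with made-up vectors x and g
(hypothetical names, not from the data discussed here): the same sum
written as an element-by-element loop and as a single vector
operation, followed by a grouped mean that plays the role of SQL's
GROUP BY.

```r
x <- c(2, 4, 6, 8)          # made-up measurements
g <- c("a", "a", "b", "b")  # made-up group labels

# Scalar, C-style: walk the elements one at a time.
total <- 0
for (i in seq_along(x)) total <- total + x[i]

# Vector style: operate on the whole column at once.
total_vec <- sum(x)

# Grouped, SQL-like: mean of x within each distinct value of g.
group_means <- tapply(x, g, mean)
print(group_means)
```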

I'm still nowhere near where I think I need to be in order to work with
my data. I'll get back to the list when I have pinpointed my problem a
bit better, and I'll also supply some sample data.

Have a nice weekend,
robert

On Thu, Apr 12, 2012 at 8:52 PM, steven mosher <moshersteven at gmail.com> wrote:
> Welcome to R and the list.
>
> Others may suggest books (Nutshell was my first), but first there are
> some things that will help you both in programming and getting help on
> the list.
>
> You should post executable code in your question. So, build a toy
> example of the data.frame you have and show what you tried. Folks here
> should be able to run your toy example and show you how to get the
> answer you want.
>
> For your problem I'm guessing that aggregate() would be one path
>
> ?aggregate
>
> You will need to specify "by" to aggregate by month.
>
> Steve
>
> On Thu, Apr 12, 2012 at 7:10 AM, Robert Latest <boblatest at gmail.com> wrote:
>>
>> Hi all,
>>
>> I'm just getting started in R. My problem is the following:
>>
>> I have a data frame (v1) with lots of production data measurements.
>> Each row contains a single measurement ('ARI_MIT') with a timestamp. I
>> want to "lump" the data by months with their mean and standard
>> deviation.
>>
>> I have already successfully managed to do the lumping by adding
>> another column to my data frame:
>>
>> v1$MONTH = strftime(v1$TIMESTAMP, "%y%m")
>>
>> This makes a nice month-wise boxplot of my data, although I have no
>> idea why:
>> boxplot(v1$ARI_MIT ~ v1$MONTH)
>>
>> I don't need this plotted, though, but in the form of a new data frame
>> with three columns: the month, the mean, and the standard deviation of
>> all values from that month.
>>
>> I tried un-stacking v1 into a list of vectors and then looping over
>> its elements, calculating the mean of each group:
>>
>> for (i in unstack(v1, v1$ARI_MIT ~ v1$MONTH)) { write(mean(i), "") }
>>
>> This works, but how do I get the data into a data frame? With the
>> month labels in a column? They are not available inside the loop body.
>>
>> I know I need to get a book on R.
>>
>> Thanks,
>> robert
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>


