[R] Data Frame housekeeping

Jonathan Daily biomathjdaily at gmail.com
Tue May 24 21:24:26 CEST 2011


My suggestion, since bold doesn't show up in a text only mailing list,
would be to look into the function ?aggregate.

It looks like something like (assuming your data is in a mydat):

mydat.new <- aggregate(cbind(STN_ID, YEAR, MM, DAY) ~ ELEM + ?, mydat,
FUN = ?) #this is up to you

Alternatively, the plyr package is great at transforming data.frames.

On Tue, May 24, 2011 at 3:03 PM, Scott Hatcher
<scott.v.hatcher at gmail.com> wrote:
> Hello,
>
> I have a large data frame that is organized by date in a peculiar way. I
> am seeking advice on how to transform the data into a format that is of
> more use to me.
>
> The data is organized as follows:
>
>     STN_ID YEAR MM ELEM      X1         X2       X3         X4
> X5        X6         X7
> 1  2402594 1997   9   1 *-00233* *-00204* *-00119*  -00190  -00251
> -00243  -00249
> 2  2402594 1997  10  1              -00003  -00005  -00001  -00039
> -00031 -00036  -00033
> 3  2402594 1997  11  1              000025  000065  000070  000069
> 000115  000072  000093
> 4  2402594 1997  12  1              000160  000114  000143  000140
> 000093  000068  000157
> 5  2402594 1998  1    1              000067  000095  000139  000113
> 000066  000081  000070
> 6  2402594 1998  2    1              000098  000102  000140  000124
> 000082  000111  000047
> 7  2402594 1998  3    1              -00039  -00006  000015  000015
> 000016  000035  000013
> 8  2402594 1998  4    1              -00035  -00046  -00046  -00062
> -00018  -00025  -00012
> 9  2402594 1998  5    1              000031  000011  -00005  -00061
> -00061  -00080  -00217
> 10 2402594 1997  9    2 *-00339  -00339  -00343*  -00346  -00285
> -00253  -00253
> 11 2402594 1997  10  2              -00207  -00289  -00278  -00271
> -00258  -00315  -00341
> 12 2402594 1997  11  2              -00242  -00230  -00206  -00180
> -00256  -00227  -00241
> 13 2402594 1997  12  2              -00155  -00153  -00118  -00066
> -00088  -00073  -00032
> 14 2402594 1998  1    2              000003  -00021  -00033  -00022
> -00014  000001  000008
> 15 2402594 1998  2    2              000050  000077  000106  000073
> 000060  000060  000083
> 16 2402594 1998  3    2              000095  000063  000030  000057
> 000073  000144  000090
> 17 2402594 1998  4    2              000128  000178  000195  000157
> 000160  000160  000117
> 18 2402594 1998  5    2              000074  000064  000051  000027
> 000053  000063  000067
>
> Where "MM" is the month of the year, and ELEM is the variable to which
> the values in the X* columns describe (in the actual data there are 31 X
> columns, one for each day of the month). The values in bold are the
> values that are transferred into the small chart below (which is the
> result I hope to get). This is to give a sense of how the data is picked
> out of the original data frame.
>
> I would like to organize the data so it looks like this:
>
>       STN_ID YEAR MM DAY    ELEM1 ELEM2
> 1     2402594 1997   9  X1       -00233 -00339
> 2     2402594 1997   9  X2       -00204 000077
> 3     2402594 1997   9  X3       -00119 000030
>
> Such that I create a new column named "DAY" that is made up of the
> numbers following "X" in the original data.frame columns. Also, the ELEM
> values are converted to columns and parsed with the ELEM code (in this
> case 1 and 2).
>
> I have tried to split apart the columns, transform them, and bind them
> back together, but my ability to do so just isn't there yet. I am still
> fairly new to R, and would really appreciate some help in working
> towards organizing this data frame.
>
> Thanks in advance,
> Scott Hatcher
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
===============================================
Jon Daily
Technician
===============================================
#!/usr/bin/env outside
# It's great, trust me.



More information about the R-help mailing list