[R] Restructuring Hadley CET data

(Ted Harding) ted.harding at nessie.mcc.ac.uk
Fri Apr 27 23:25:00 CEST 2007


Hi Folks,
I have a nasty data restructuring problem!

I can think of one or two really clumsy ways of doing it
with 'for' loops and the like, but I can't think of a
*neat* way of doing it in R.

The data are the Hadley Centre "Central England Temperature"
series, daily from 01/01/1772 to 31/03/2007, and can be
viewed/downloaded at

  http://hadobs.metoffice.com/hadcet/cetdl1772on.dat

and the structure is as follows:

Year DoM Jan  Feb Mar  Apr May  Jun Jul Aug  Sep Oct  Nov Dec
-------------------------------------------------------------
1772  1   32  -15  18   25  87  128 187 177  105 111   78 112
1772  2   20    7  28   38  77  138 154 158  143 150   85  62
1772  3   27   15  36   33  84  170 139 153  113 124   83  60
1772  4   27  -25  61   58  96   90 151 160  173 114   60  47
1772  5   15   -5  68   69 133  146 179 170  173 116   83  50
1772  6   22  -45  51   77 113  105 175 198  160 134  134  42
.................................
.................................
1772 27    0   46  66   74  77  198 156 144   76 104   45   5
1772 28   15   77  86   64 116  167 151 155   66  84   60  10
1772 29  -33   56  83   50 113  131 170 182  135 140   63  12
1772 30  -10 -999  66   77 121  122 179 163  143 143   55  15
1772 31   -8 -999  46 -999 108 -999 168 144 -999 145 -999  22
1773  1   20    0  79   13  93  174 104 151  171 131   68  55
1773  2   10   17  71   25  65  109 128 184  164  91   34  75
1773  3    5  -28  94   70  41   79 135 192  149 101   78  85
1773  4    5  -23  99  107  49  107 144 173  144  98   86  83
1773  5  -28  -30  76   65  83  128 144 182  116  98   66  38
.................................

"DoM" is Day of Month, 1-31 for each month ("short" months
get entries -999 on missing days).

So each year is a block of 31 lines and 14 columns, pf
which the last 12 are Temperature (in 10ths of a degreeC),
each column a month, running down each column for the
31 days of the month in that year.

What I want to do is convert this into a 4-column format:

  Year, Month, DoM, Temp

with a separate row for each consecutive day from 01/01/1772
to 31/02/2007, and omitting days which have a "-999" entry
(OK I still have to check that "-999" is only used for DoMs
which don't exist, and don't also indicate that a Temperature
may be missing for some other reason; but I believe the series
to be complete).

What it boils down to is stacking the 12 31-day Temperature
columns on top of each other in each year, filling in the
Year, Month, DoM, and stacking the results for consecutive
years on top of each other (after which one can strike out
the "-999"s). Hence, really clunky for-loops!

Any really *neat* ideas for this?

With thanks,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 27-Apr-07                                       Time: 22:24:51
------------------------------ XFMail ------------------------------



More information about the R-help mailing list