[R] Aggregate rainfall data

boB Rudis bob at rudis.net
Wed Jul 13 13:04:44 CEST 2016


use `gsub()` after the `as.character()` conversion to remove
everything but valid numeric components from the strings.

On Wed, Jul 13, 2016 at 6:21 AM, roslinazairimah zakaria
<roslinaump at gmail.com> wrote:
> Dear David,
>
> I got your point.  How do I remove the data that contain "0.0?".
>
> I tried : balok <- cbind(balok3[,-5], balok3$Rain.mm[balok3$Rain.mm==0.0?]
> <- NA)
>
> However all the Rain.mm column all become NA.
>
>    day month year     Time balok3$Rain.mm[balok3$Rain.mm == "0.0?"] <- NA
> 1   30     7 2008  9:00:00                                             NA
> 2   30     7 2008 10:00:00                                             NA
> 3   30     7 2008 11:00:00                                             NA
> 4   30     7 2008 12:00:00                                             NA
> 5   30     7 2008 13:00:00                                             NA
> 6   30     7 2008 14:00:00                                             NA
> 7   30     7 2008 15:00:00                                             NA
> 8   30     7 2008 16:00:00                                             NA
> 9   30     7 2008 17:00:00                                             NA
> 10  30     7 2008 18:00:00                                             NA
>
> Thank you so much.
>
>
> On Wed, Jul 13, 2016 at 9:42 AM, David Winsemius <dwinsemius at comcast.net>
> wrote:
>
>>
>> > On Jul 12, 2016, at 3:45 PM, roslinazairimah zakaria <
>> roslinaump at gmail.com> wrote:
>> >
>> > Dear R-users,
>> >
>> > I have these data:
>> >
>> > head(balok, 10); tail(balok, 10)
>> >        Date     Time Rain.mm
>> > 1  30/7/2008  9:00:00       0
>> > 2  30/7/2008 10:00:00       0
>> > 3  30/7/2008 11:00:00       0
>> > 4  30/7/2008 12:00:00       0
>> > 5  30/7/2008 13:00:00       0
>> > 6  30/7/2008 14:00:00       0
>> > 7  30/7/2008 15:00:00       0
>> > 8  30/7/2008 16:00:00       0
>> > 9  30/7/2008 17:00:00       0
>> > 10 30/7/2008 18:00:00       0
>> >           Date     Time Rain.mm
>> > 63667 4/11/2015  3:00:00       0
>> > 63668 4/11/2015  4:00:00       0
>> > 63669 4/11/2015  5:00:00       0
>> > 63670 4/11/2015  6:00:00       0
>> > 63671 4/11/2015  7:00:00       0
>> > 63672 4/11/2015  8:00:00       0
>> > 63673 4/11/2015  9:00:00     0.1
>> > 63674 4/11/2015 10:00:00     0.1
>> > 63675 4/11/2015 11:00:00     0.1
>> > 63676 4/11/2015 12:00:00    0.1?
>> >
>> >> str(balok)
>> > 'data.frame':   63676 obs. of  3 variables:
>> > $ Date   : Factor w/ 2654 levels "1/1/2009","1/1/2010",..: 2056 2056 2056
>> > 2056 2056 2056 2056 2056 2056 2056 ...
>> > $ Time   : Factor w/ 24 levels "1:00:00","10:00:00",..: 24 2 3 4 5 6 7 8
>> 9
>> > 10 ...
>> > $ Rain.mm: Factor w/ 352 levels "0","0.0?","0.1",..: 1 1 1 1 1 1 1 1 1 1
>>
>> Thar's your problem:
>>
>>   Rain.mm: Factor w/ 352 levels "0","0.0?","0.1"
>>
>> Need to use the standard fix for the screwed-up-factor-on-input-problem
>>
>>   balok$Rain.mm2 <- as.numeric( as.character(balok$Rain.mm) )
>>
>> Cannot just do as.numeric because factors are actually already numeric.
>>
>> --
>> David.
>>
>>
>> > ...
>> >
>> > and I have change the data as follows:
>> >
>> > realdate <- as.Date(balok$Date,format="%d/%m/%Y")
>> > dfdate <- data.frame(date=realdate)
>> > year=as.numeric (format(realdate,"%Y"))
>> > month=as.numeric (format(realdate,"%m"))
>> > day=as.numeric (format(realdate,"%d"))
>> >
>> > balok2 <-cbind(dfdate,day,month,year,balok[,2:3])
>> > colnames(balok2)
>> > head(balok2)
>> >        date day month year     Time Rain.mm
>> > 1 2008-07-30  30     7 2008  9:00:00       0
>> > 2 2008-07-30  30     7 2008 10:00:00       0
>> > 3 2008-07-30  30     7 2008 11:00:00       0
>> > 4 2008-07-30  30     7 2008 12:00:00       0
>> > 5 2008-07-30  30     7 2008 13:00:00       0
>> > 6 2008-07-30  30     7 2008 14:00:00       0
>> > ...
>> >
>> >> balok3 <- balok2[,-1]; head(balok3, n=100)
>> >    day month year     Time Rain.mm
>> > 1    30     7 2008  9:00:00       0
>> > 2    30     7 2008 10:00:00       0
>> > 3    30     7 2008 11:00:00       0
>> > 4    30     7 2008 12:00:00       0
>> > 5    30     7 2008 13:00:00       0
>> > 6    30     7 2008 14:00:00       0
>> > 7    30     7 2008 15:00:00       0
>> > 8    30     7 2008 16:00:00       0
>> > 9    30     7 2008 17:00:00       0
>> > 10   30     7 2008 18:00:00       0
>> > 11   30     7 2008 19:00:00       0
>> > 12   30     7 2008 20:00:00       0
>> > 13   30     7 2008 21:00:00       0
>> > 14   30     7 2008 22:00:00       0
>> > 15   30     7 2008 23:00:00       0
>> > 16   30     7 2008 24:00:00       0
>> > 17   31     7 2008  1:00:00       0
>> > 18   31     7 2008  2:00:00       0
>> > 19   31     7 2008  3:00:00       0
>> > 20   31     7 2008  4:00:00       0
>> > 21   31     7 2008  5:00:00       0
>> > 22   31     7 2008  6:00:00       0
>> > 23   31     7 2008  7:00:00       0
>> > 24   31     7 2008  8:00:00       0
>> > 25   31     7 2008  9:00:00       0
>> > 26   31     7 2008 10:00:00       0
>> > 27   31     7 2008 11:00:00       0
>> > 28   31     7 2008 12:00:00       0
>> > 29   31     7 2008 13:00:00       0
>> > 30   31     7 2008 14:00:00       0
>> > 31   31     7 2008 15:00:00       0
>> > 32   31     7 2008 16:00:00       0
>> > 33   31     7 2008 17:00:00       0
>> > 34   31     7 2008 18:00:00       0
>> > 35   31     7 2008 19:00:00       0
>> > 36   31     7 2008 20:00:00       0
>> > 37   31     7 2008 21:00:00       0
>> > 38   31     7 2008 22:00:00       0
>> > 39   31     7 2008 23:00:00       0
>> > 40   31     7 2008 24:00:00       0
>> > 41    1     8 2008  1:00:00       0
>> > 42    1     8 2008  2:00:00       0
>> > 43    1     8 2008  3:00:00       0
>> > 44    1     8 2008  4:00:00       0
>> > 45    1     8 2008  5:00:00       0
>> > 46    1     8 2008  6:00:00       0
>> > 47    1     8 2008  7:00:00       0
>> > 48    1     8 2008  8:00:00       0
>> > 49    1     8 2008  9:00:00       0
>> > 50    1     8 2008 10:00:00       0
>> > 51    1     8 2008 11:00:00       0
>> > 52    1     8 2008 12:00:00       0
>> > 53    1     8 2008 13:00:00       0
>> > 54    1     8 2008 14:00:00       0
>> > 55    1     8 2008 15:00:00       0
>> > 56    1     8 2008 16:00:00       0
>> > 57    1     8 2008 17:00:00       0
>> > 58    1     8 2008 18:00:00       0
>> > 59    1     8 2008 19:00:00       0
>> > 60    1     8 2008 20:00:00       0
>> > 61    1     8 2008 21:00:00       0
>> > 62    1     8 2008 22:00:00       0
>> > 63    1     8 2008 23:00:00       0
>> > 64    1     8 2008 24:00:00       0
>> > 65    2     8 2008  1:00:00       0
>> > 66    2     8 2008  2:00:00       0
>> > 67    2     8 2008  3:00:00       0
>> > 68    2     8 2008  4:00:00       0
>> > 69    2     8 2008  5:00:00       0
>> > 70    2     8 2008  6:00:00       0
>> > 71    2     8 2008  7:00:00       0
>> > 72    2     8 2008  8:00:00       0
>> > 73    2     8 2008  9:00:00       0
>> > 74    2     8 2008 10:00:00       0
>> > 75    2     8 2008 11:00:00       0
>> > 76    2     8 2008 12:00:00       0
>> > 77    2     8 2008 13:00:00       0
>> > 78    2     8 2008 14:00:00       0
>> > 79    2     8 2008 15:00:00       0
>> > 80    2     8 2008 16:00:00       0
>> > 81    2     8 2008 17:00:00       0
>> > 82    2     8 2008 18:00:00       0
>> > 83    2     8 2008 19:00:00       0
>> > 84    2     8 2008 20:00:00       0
>> > 85    2     8 2008 21:00:00       0
>> > 86    2     8 2008 22:00:00       0
>> > 87    2     8 2008 23:00:00       0
>> > 88    2     8 2008 24:00:00    11.1
>> > 89    3     8 2008  1:00:00     0.4
>> > 90    3     8 2008  2:00:00       0
>> > 91    3     8 2008  3:00:00       0
>> > 92    3     8 2008  4:00:00       0
>> > 93    3     8 2008  5:00:00       0
>> > 94    3     8 2008  6:00:00       0
>> > 95    3     8 2008  7:00:00       0
>> > 96    3     8 2008  8:00:00       0
>> > 97    3     8 2008  9:00:00       0
>> > 98    3     8 2008 10:00:00       0
>> > 99    3     8 2008 11:00:00       0
>> > 100   3     8 2008 12:00:00       0
>> >
>> > The rainfall data is in hourly unit, and I would like to sum the Rain.mm
>> > according to month.  I tried to use aggregate(), but I got this message:
>> >
>> > dt <- balok4
>> > str(dt)
>> > aggbalok <- aggregate(dt[,5], by=dt[,c(1,4)],FUN=sum, na.rm=TRUE)
>> > aggbalok
>> >
>> > Error in Summary.factor(1L, na.rm = TRUE) :
>> >  sum not meaningful for factors
>> >
>> >
>> > Thank you so much for any help given.
>> >
>> > Roslina
>> >
>> >       [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius
>> Alameda, CA, USA
>>
>>
>
>
> --
> *Dr. Roslinazairimah Binti Zakaria*
> *Tel: +609-5492370; Fax. No.+609-5492766*
>
> *Email: roslinazairimah at ump.edu.my <roslinazairimah at ump.edu.my>;
> roslinaump at gmail.com <roslinaump at gmail.com>*
> Deputy Dean (Academic & Student Affairs)
> Faculty of Industrial Sciences & Technology
> University Malaysia Pahang
> Lebuhraya Tun Razak, 26300 Gambang, Pahang, Malaysia
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list