[R] data load from excel files

Rui Barradas ru|pb@rr@d@@ @end|ng |rom @@po@pt
Wed Nov 13 16:10:28 CET 2019


Hello,

Try which.max?

Hope this helps,

Rui Barradas

Às 14:10 de 13/11/19, ani jaya escreveu:
> Thank you very much Mr. Rui, but for delete the duplicated row I use:
> 
> ...
> library(tidyverse)
> alldata<-data.frame(Reduce(rbind, pon1))
> c<-(which(duplicated(alldata$Tanggal))) #duplicate
> alldata<-alldata[-c,]
> attach(alldata)
> ....
> 
> because not every last row from every df is bad one.
> 
> Another problem is I want to know when the max value is occurred. So 
> basically I have maximum value every month (maxi, n=360, from 1986 to 
> 2015) and I want to find annual_maxima.
> 
> 
> ...
> maxi<-lapply(pon1, function(x) max(x$RR,na.rm=T))
> maxi<-data.frame(Reduce(rbind, maxi))
> names(maxi)<-"maxi"
> annual_maxima <- rep(NA,30)
> date <- rep(NA,30)
> for(i in 1:30){
> 
>    annual_maxima[i] <- max(maxi$maxi[(i*12-11):(i*12)])
>    date[i]<-Tanggal[which(RR==annual_maxima[i])]
> }
> ....
> 
> Here "alldata" contain "Tanggal" in this case is date and rainfall 
> ("RR"). What I get is error stated that:
> 
> In date[i] <- Tanggal[which(RR == annual_maxima[i])] : number of items 
> to replace is not a multiple of replacement length
> 
> 
> Maybe you have some idea where the problem is, I would be thankful.
> 
> Best,
> Ani
> 
> 
> 
> 
> 
> On Wed, Nov 13, 2019 at 5:49 PM Rui Barradas <ruipbarradas using sapo.pt 
> <mailto:ruipbarradas using sapo.pt>> wrote:
> 
>     Hello,
> 
>     Maybe the following will get you close to what you want.
> 
> 
>     # remove the last row from every df
>     pon1 <- lapply(pon1, function(DF){
>         DF[[1]] <- as.Date(DF[["Tanggal"]], "%d-%m-%Y")
>         DF[-nrow(DF), ]
>     })
> 
> 
>     # order the list by year-month
>     inx_ym <- sapply(pon1, function(DF){
>         format(DF[["Tanggal"]][1], "%Y-%m")
>     })
>     pon1 <- pon1[order(inx_ym)]
> 
> 
>     # get the minimum and maximum of every "RR"
>     min.RR <- sapply(pon1, function(DF) min(DF[["RR"]], na.rm = TRUE))
>     max.RR <- sapply(pon1, function(DF) max(DF[["RR"]], na.rm = TRUE))
> 
> 
>     Hope this helps,
> 
>     Rui Barradas
> 
> 
> 
>     Às 07:50 de 13/11/19, ani jaya escreveu:
>      > Dear R-Help,
>      >
>      > I have 30 of year-based excel files and each file contain month
>     sheets. I
>      > have some problem here. My data is daily rainfall but there is
>     extra 1 day
>      > (first date of next month) for several sheets. My main goal is to
>     get the
>      > minimum value for every month.
>      >
>      > First, how to extract those data to list of data frame based on
>     year and
>      > delete every overlapping date?
>      > Second, how to sort it based on date with ascending order (old to
>     new)?
>      > Third, how to get the maximum together with the date?
>      >
>      > I did this one,
>      >
>      > ...
>      > file.list <- list.files(pattern='*.xlsx')
>      > file.list<-mixedsort(file.list)
>      >
>      > #
>      >
>     https://stackoverflow.com/questions/12945687/read-all-worksheets-in-an-excel-workbook-into-an-r-list-with-data-frames
>      >
>      > read_excel_allsheets <- function(filename, tibble = FALSE) {
>      >    sheets <- readxl::excel_sheets(filename)
>      >    x <- lapply(sheets, function(X) read.xlsx(filename, sheet=X,
>     rows=9:40,
>      > cols=1:2))
>      >    if(!tibble) x <- lapply(x, as.data.frame)
>      >    names(x) <- sheets
>      >    x
>      > }
>      >
>      > pon<-lapply(file.list, function(i) read_excel_allsheets(i))
>      > pon1<-do.call("rbind",pon)
>      > names(pon1) <- paste0("M.", 1:360)
>      > pon1 <-lapply(pon1,function(x){x$RR[x$RR==8888] <- NA; x})
>      > pon1 <-lapply(pon1,function(x){x$RR[x$RR==""] <- NA; x})
>      > maxi<-lapply(pon1, function(x) max(x$RR,na.rm=T))
>      > maxi<-data.frame(Reduce(rbind, maxi))
>      > names(maxi)<-"maxi"
>      > ....
>      >
>      > but the list start from January for every year, and move to
>     February and so
>      > on. And there is no date in "maxi". Here some sample what I get
>     from my
>      > simple code.
>      >
>      >> pon1[256:258]$M.256
>      >        Tanggal   RR
>      > 1  01-09-2001  5.2
>      > 2  02-09-2001  0.3
>      > 3  03-09-2001 29.0
>      > 4  04-09-2001  0.7
>      > 5  05-09-2001  9.6
>      > 6  06-09-2001  0.7
>      > 7  07-09-2001   NA
>      > 8  08-09-2001 13.2
>      > 9  09-09-2001   NA
>      > 10 10-09-2001   NA
>      > 11 11-09-2001  0.0
>      > 12 12-09-2001 66.0
>      > 13 13-09-2001  0.0
>      > 14 14-09-2001 57.6
>      > 15 15-09-2001 18.0
>      > 16 16-09-2001 29.2
>      > 17 17-09-2001 52.2
>      > 18 18-09-2001  7.0
>      > 19 19-09-2001   NA
>      > 20 20-09-2001 74.5
>      > 21 21-09-2001 20.3
>      > 22 22-09-2001 49.6
>      > 23 23-09-2001  0.0
>      > 24 24-09-2001  1.3
>      > 25 25-09-2001  0.0
>      > 26 26-09-2001  1.0
>      > 27 27-09-2001  0.1
>      > 28 28-09-2001  1.9
>      > 29 29-09-2001  9.5
>      > 30 30-09-2001  3.3
>      > 31 01-10-2001  0.0
>      >
>      > $M.257
>      >        Tanggal   RR
>      > 1  01-09-2002  0.0
>      > 2  02-09-2002  0.0
>      > 3  03-09-2002  0.0
>      > 4  04-09-2002 12.8
>      > 5  05-09-2002  1.0
>      > 6  06-09-2002  0.0
>      > 7  07-09-2002   NA
>      > 8  08-09-2002 22.2
>      > 9  09-09-2002   NA
>      > 10 10-09-2002   NA
>      > 11 11-09-2002  0.0
>      > 12 12-09-2002  0.0
>      > 13 13-09-2002  0.0
>      > 14 14-09-2002   NA
>      > 15 15-09-2002  0.0
>      > 16 16-09-2002  0.0
>      > 17 17-09-2002  0.0
>      > 18 18-09-2002 13.3
>      > 19 19-09-2002  0.0
>      > 20 20-09-2002  0.0
>      > 21 21-09-2002  0.0
>      > 22 22-09-2002  0.0
>      > 23 23-09-2002  0.0
>      > 24 24-09-2002  0.0
>      > 25 25-09-2002  0.0
>      > 26 26-09-2002  0.5
>      > 27 27-09-2002  2.1
>      > 28 28-09-2002   NA
>      > 29 29-09-2002 18.5
>      > 30 30-09-2002  0.0
>      > 31 01-10-2002   NA
>      >
>      > $M.258
>      >        Tanggal   RR
>      > 1  01-09-2003  0.0
>      > 2  02-09-2003  0.0
>      > 3  03-09-2003  0.0
>      > 4  04-09-2003  4.0
>      > 5  05-09-2003  0.3
>      > 6  06-09-2003  0.0
>      > 7  07-09-2003   NA
>      > 8  08-09-2003  0.0
>      > 9  09-09-2003  0.0
>      > 10 10-09-2003  0.0
>      > 11 11-09-2003   NA
>      > 12 12-09-2003  1.0
>      > 13 13-09-2003  0.0
>      > 14 14-09-2003 60.0
>      > 15 15-09-2003  4.5
>      > 16 16-09-2003  0.1
>      > 17 17-09-2003  2.1
>      > 18 18-09-2003   NA
>      > 19 19-09-2003  0.0
>      > 20 20-09-2003   NA
>      > 21 21-09-2003   NA
>      > 22 22-09-2003 31.5
>      > 23 23-09-2003 42.0
>      > 24 24-09-2003 43.3
>      > 25 25-09-2003  2.8
>      > 26 26-09-2003 21.4
>      > 27 27-09-2003  0.8
>      > 28 28-09-2003 42.3
>      > 29 29-09-2003  5.3
>      > 30 30-09-2003 17.3
>      > 31 01-10-2003  0.0
>      >
>      >
>      > Any lead or help is very appreciate.
>      >
>      > Best,
>      >
>      > Ani
>      >
>      >       [[alternative HTML version deleted]]
>      >
>      > ______________________________________________
>      > R-help using r-project.org <mailto:R-help using r-project.org> mailing list
>     -- To UNSUBSCRIBE and more, see
>      > https://stat.ethz.ch/mailman/listinfo/r-help
>      > PLEASE do read the posting guide
>     http://www.R-project.org/posting-guide.html
>      > and provide commented, minimal, self-contained, reproducible code.
>      >
>



More information about the R-help mailing list