[R] data load from excel files

Rui Barradas ru|pb@rr@d@@ @end|ng |rom @@po@pt
Wed Nov 13 09:49:36 CET 2019


Hello,

Maybe the following will get you close to what you want.


# remove the last row from every df
pon1 <- lapply(pon1, function(DF){
   DF[[1]] <- as.Date(DF[["Tanggal"]], "%d-%m-%Y")
   DF[-nrow(DF), ]
})


# order the list by year-month
inx_ym <- sapply(pon1, function(DF){
   format(DF[["Tanggal"]][1], "%Y-%m")
})
pon1 <- pon1[order(inx_ym)]


# get the minimum and maximum of every "RR"
min.RR <- sapply(pon1, function(DF) min(DF[["RR"]], na.rm = TRUE))
max.RR <- sapply(pon1, function(DF) max(DF[["RR"]], na.rm = TRUE))


Hope this helps,

Rui Barradas



Às 07:50 de 13/11/19, ani jaya escreveu:
> Dear R-Help,
> 
> I have 30 of year-based excel files and each file contain month sheets. I
> have some problem here. My data is daily rainfall but there is extra 1 day
> (first date of next month) for several sheets. My main goal is to get the
> minimum value for every month.
> 
> First, how to extract those data to list of data frame based on year and
> delete every overlapping date?
> Second, how to sort it based on date with ascending order (old to new)?
> Third, how to get the maximum together with the date?
> 
> I did this one,
> 
> ...
> file.list <- list.files(pattern='*.xlsx')
> file.list<-mixedsort(file.list)
> 
> #
> https://stackoverflow.com/questions/12945687/read-all-worksheets-in-an-excel-workbook-into-an-r-list-with-data-frames
> 
> read_excel_allsheets <- function(filename, tibble = FALSE) {
>    sheets <- readxl::excel_sheets(filename)
>    x <- lapply(sheets, function(X) read.xlsx(filename, sheet=X, rows=9:40,
> cols=1:2))
>    if(!tibble) x <- lapply(x, as.data.frame)
>    names(x) <- sheets
>    x
> }
> 
> pon<-lapply(file.list, function(i) read_excel_allsheets(i))
> pon1<-do.call("rbind",pon)
> names(pon1) <- paste0("M.", 1:360)
> pon1 <-lapply(pon1,function(x){x$RR[x$RR==8888] <- NA; x})
> pon1 <-lapply(pon1,function(x){x$RR[x$RR==""] <- NA; x})
> maxi<-lapply(pon1, function(x) max(x$RR,na.rm=T))
> maxi<-data.frame(Reduce(rbind, maxi))
> names(maxi)<-"maxi"
> ....
> 
> but the list start from January for every year, and move to February and so
> on. And there is no date in "maxi". Here some sample what I get from my
> simple code.
> 
>> pon1[256:258]$M.256
>        Tanggal   RR
> 1  01-09-2001  5.2
> 2  02-09-2001  0.3
> 3  03-09-2001 29.0
> 4  04-09-2001  0.7
> 5  05-09-2001  9.6
> 6  06-09-2001  0.7
> 7  07-09-2001   NA
> 8  08-09-2001 13.2
> 9  09-09-2001   NA
> 10 10-09-2001   NA
> 11 11-09-2001  0.0
> 12 12-09-2001 66.0
> 13 13-09-2001  0.0
> 14 14-09-2001 57.6
> 15 15-09-2001 18.0
> 16 16-09-2001 29.2
> 17 17-09-2001 52.2
> 18 18-09-2001  7.0
> 19 19-09-2001   NA
> 20 20-09-2001 74.5
> 21 21-09-2001 20.3
> 22 22-09-2001 49.6
> 23 23-09-2001  0.0
> 24 24-09-2001  1.3
> 25 25-09-2001  0.0
> 26 26-09-2001  1.0
> 27 27-09-2001  0.1
> 28 28-09-2001  1.9
> 29 29-09-2001  9.5
> 30 30-09-2001  3.3
> 31 01-10-2001  0.0
> 
> $M.257
>        Tanggal   RR
> 1  01-09-2002  0.0
> 2  02-09-2002  0.0
> 3  03-09-2002  0.0
> 4  04-09-2002 12.8
> 5  05-09-2002  1.0
> 6  06-09-2002  0.0
> 7  07-09-2002   NA
> 8  08-09-2002 22.2
> 9  09-09-2002   NA
> 10 10-09-2002   NA
> 11 11-09-2002  0.0
> 12 12-09-2002  0.0
> 13 13-09-2002  0.0
> 14 14-09-2002   NA
> 15 15-09-2002  0.0
> 16 16-09-2002  0.0
> 17 17-09-2002  0.0
> 18 18-09-2002 13.3
> 19 19-09-2002  0.0
> 20 20-09-2002  0.0
> 21 21-09-2002  0.0
> 22 22-09-2002  0.0
> 23 23-09-2002  0.0
> 24 24-09-2002  0.0
> 25 25-09-2002  0.0
> 26 26-09-2002  0.5
> 27 27-09-2002  2.1
> 28 28-09-2002   NA
> 29 29-09-2002 18.5
> 30 30-09-2002  0.0
> 31 01-10-2002   NA
> 
> $M.258
>        Tanggal   RR
> 1  01-09-2003  0.0
> 2  02-09-2003  0.0
> 3  03-09-2003  0.0
> 4  04-09-2003  4.0
> 5  05-09-2003  0.3
> 6  06-09-2003  0.0
> 7  07-09-2003   NA
> 8  08-09-2003  0.0
> 9  09-09-2003  0.0
> 10 10-09-2003  0.0
> 11 11-09-2003   NA
> 12 12-09-2003  1.0
> 13 13-09-2003  0.0
> 14 14-09-2003 60.0
> 15 15-09-2003  4.5
> 16 16-09-2003  0.1
> 17 17-09-2003  2.1
> 18 18-09-2003   NA
> 19 19-09-2003  0.0
> 20 20-09-2003   NA
> 21 21-09-2003   NA
> 22 22-09-2003 31.5
> 23 23-09-2003 42.0
> 24 24-09-2003 43.3
> 25 25-09-2003  2.8
> 26 26-09-2003 21.4
> 27 27-09-2003  0.8
> 28 28-09-2003 42.3
> 29 29-09-2003  5.3
> 30 30-09-2003 17.3
> 31 01-10-2003  0.0
> 
> 
> Any lead or help is very appreciate.
> 
> Best,
> 
> Ani
> 
> 	[[alternative HTML version deleted]]
> 
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



More information about the R-help mailing list