[R] how to make the code more efficient using lapply

Stephen HonKit Wong  stephen66 at gmail dot com
Fri May 25 08:24:12 CEST 2018


Dear All,

I have the following for-loop code, which reads many Excel files (each
with many columns and rows) from a directory, extracts some rows and
columns from each file, and combines the pieces into a single data frame.
The for loop does the job but is quite slow. How can I make it faster
using lapply? Thanks in advance!



library(readxl)   # provides read_xlsx() and cell_cols()
library(writexl)  # provides write_xlsx()

temp.df <- list()  # empty list to collect the extracted result from each Excel file

files <- list.files()  # compute the file list once instead of on every iteration

for (i in files) {  # loop over each Excel file in the directory

  temp <- read_xlsx(i, sheet = 1, range = cell_cols(c(1, 30, 38:42)))  # read the file

  temp <- temp[grep("^geneA$|^geneB$|^geneC$", temp$Id), ]  # keep only the rows of interest, matched on temp$Id

  names(temp) <- gsub("^.*] ", "", names(temp))  # clean up the column names

  temp.df <- append(temp.df, list(as.data.frame(temp)))  # store this file's result in the list
}

# All files share the same column names, so after the loop the pieces
# can be row-bound into one data frame and written out.
temp.df.all <- do.call("rbind", temp.df)
write_xlsx(temp.df.all, path = "output.xlsx")
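For reference, the same pipeline can be written with lapply(), which avoids growing a list one element at a time inside the loop. This is a minimal sketch under two assumptions of mine: the Id matches are exact (so %in% replaces the anchored grep() pattern), and only .xlsx files should be picked up from the directory. The helper names extract_genes and read_one are hypothetical, not from the original post.

```r
library(readxl)   # provides read_xlsx() and cell_cols()
library(writexl)  # provides write_xlsx()

# Keep the rows whose Id is exactly one of the target genes and strip
# the "[...] " prefix from the column names -- the same steps as the
# body of the original for loop.
extract_genes <- function(df) {
  df <- df[df$Id %in% c("geneA", "geneB", "geneC"), ]
  names(df) <- gsub("^.*] ", "", names(df))
  as.data.frame(df)
}

# Read one workbook and extract the rows of interest.
read_one <- function(f) {
  extract_genes(read_xlsx(f, sheet = 1, range = cell_cols(c(1, 30, 38:42))))
}

files  <- list.files(pattern = "\\.xlsx$")  # compute the file list once
pieces <- lapply(files, read_one)           # one data frame per file
if (length(pieces) > 0) {
  # All pieces share the same column names, so they row-bind cleanly.
  write_xlsx(do.call(rbind, pieces), path = "output.xlsx")
}
```

Most of the time here is still spent in read_xlsx() itself, so the speedup from lapply() over the loop is modest; the bigger wins are calling list.files() once and not rebuilding the result list on every iteration.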




*Stephen*



