[R] textual analysis - transforming several pdf to txt - naming the files

Cecília Carmo
Wed Jul 5 12:12:12 CEST 2023


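library(magrittr)  # provides the %>% pipe used below

convertpdf2txt <- function(dirpath){

   # full paths of all PDFs whose file name contains "Consoli"
   files <- list.files(dirpath, pattern = "Consoli.*\\.pdf$", full.names = TRUE)
   files <- chartr("\\", "/", files)

   # extract the text of each PDF and collapse its pages into one string
   x <- lapply(files, function(x){
     pdftools::pdf_text(x) %>%
       paste0(collapse = " ") %>%
       stringr::str_squish()
   })
   # name each element after its source file, with a .txt extension
   new_names <- tools::file_path_sans_ext(files)
   new_names <- paste(new_names, "txt", sep = ".")
   setNames(x, new_names)
}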

# apply function
# note that my test files are in "~/Temp"
# (here::here() builds paths relative to the project root, so "~" is expanded explicitly instead)
txts <- convertpdf2txt(path.expand("~/Temp"))
names(txts)
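
The function above returns the extracted text as a named list; it does not write any files itself. A minimal sketch of saving each element under its new name, assuming the names are writable file paths. write_txts is a hypothetical helper introduced here for illustration, not part of pdftools or stringr:

```r
# Hypothetical helper: write each element of a named list of strings
# to the file path stored in its name.
write_txts <- function(txts) {
  stopifnot(is.list(txts), !is.null(names(txts)))
  for (i in seq_along(txts)) {
    # one text per file; the element name is used as the output path
    writeLines(txts[[i]], con = names(txts)[i])
  }
  invisible(names(txts))
}

# Small demonstration with made-up texts written to the session temp directory
demo_paths <- file.path(tempdir(), c("a.txt", "b.txt"))
demo <- setNames(list("first text", "second text"), demo_paths)
write_txts(demo)
```

With the list from above, the call would be write_txts(txts), which places each .txt file next to its source PDF.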


Thank you very much, but the following error appeared:

Error: unexpected '}' in "}"
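
One way to find where a parse error like this comes from is to parse the pasted code without running it; the message then includes a line and column. A minimal sketch, assuming the code is available as a string. The code value below is a made-up example with one extra closing brace, not the code from this thread:

```r
# Made-up example: a function definition followed by a stray "}"
code <- "f <- function(x) {\n  x + 1\n}\n}"

# parse() only checks syntax; tryCatch() captures the error message
msg <- tryCatch({
  parse(text = code)
  "parsed without error"
}, error = function(e) conditionMessage(e))

cat(msg, "\n")
```

For code saved in a file, parse(file = "script.R") reports the position in the same way ("script.R" is a placeholder name).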




Cecília Carmo

Universidade de Aveiro
