[R] textual analysis - transforming several pdf to txt - naming the files

Rui Barradas
Wed Jul 5 17:43:19 CEST 2023


At 11:12 on 05/07/2023, Cecília Carmo wrote:
> library(magrittr)  # provides the %>% pipe used below
> convertpdf2txt <- function(dirpath){
> 
>     files <- list.files(dirpath, pattern = "Consoli.*\\.pdf$", full.names = TRUE)
>     files <- chartr("\\", "/", files)
> 
>     x <- lapply(files, function(x){
>       pdftools::pdf_text(x) %>%
>         paste0(collapse = " ") %>%
>         stringr::str_squish()
>     })
>     new_names <- tools::file_path_sans_ext(files)
>     new_names <- paste(new_names, "txt", sep = ".")
>     setNames(x, new_names)
> }
> 
> # apply function
> # note that my test files are in "~/Temp"
> txts <- convertpdf2txt(here::here("~", "Temp"))
> names(txts)
> 
> 
> Thank you very much, but the following error appeared:
> 
> Error: unexpected '}' in "}"
> 
> 
> 
> 
> Cecília Carmo
> 
> Universidade de Aveiro
> 
Hello,

I tested the code with a couple of PDFs and it ran with no errors or 
warnings.
That error means the parser found a "}" with no matching "{". In my code 
all the braces are balanced (RStudio checks this automatically), so 
something may have changed when the code was copied.

Can you check the code in an editor with syntax highlighting?
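
If an editor does not make the problem obvious, R's own parser can also report where the unbalanced brace is. The following is only a minimal sketch, not part of the code posted earlier in the thread: `parses_ok` is a hypothetical helper name, and the two strings are made-up examples standing in for the pasted code.

```r
# Hypothetical helper: returns TRUE if `code` parses, FALSE otherwise.
# On failure it prints the parser's message, which includes line:column.
parses_ok <- function(code) {
  tryCatch({
    parse(text = code)
    TRUE
  }, error = function(e) {
    message("Syntax error: ", conditionMessage(e))
    FALSE
  })
}

# Made-up examples: one balanced function, one with an extra closing brace
balanced   <- "f <- function(x) {\n  x + 1\n}"
unbalanced <- "f <- function(x) {\n  x + 1\n}\n}"

parses_ok(balanced)
parses_ok(unbalanced)
```

For code saved in a script, `parse(file = ...)` performs the same check on the file, assuming the path points to the script in question.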


Hope this helps,

Rui Barradas


