[R] Extracting the first currency value from PDF files

Manish Mukherjee
Wed May 13 15:33:03 CEST 2020


Hi All,

I need some help with the following code. I have a number of PDF files, and the first page of each file gives a currency value of the form $xxx,xxx,xxx. How can I extract this value from each PDF file and put the results in a data frame? I am able to do it for a single file with the code below, where opinions is the text data and 1 selects the first currency value:
```
d = str_nth_currency(opinions, 1)
df = subset(d, select = c(amount))
df
```

I want this to loop over multiple PDF files.

I have tried something like this, but it is not working:

```
for (i in 1:length(files)){
  print(i)
  pdf_text(paste("filepath ", files[i],sep = ""))
  str_nth_currency(files[i], 1)
}
```


Please help.
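
For reference, here is a minimal sketch of one way the loop could be written. It is not a definitive answer. The loop above appears to discard the text returned by pdf_text(), pass the file name rather than the page text to str_nth_currency(), and never collect the results. The sketch assumes that pdf_text() comes from the pdftools package, that str_nth_currency() comes from the strex package, and that its result has an amount column (as the single-file code suggests). The folder name pdf_dir and the helper first_currency() are hypothetical names introduced only for illustration.

```r
# Hypothetical folder holding the PDFs; replace with the real location.
pdf_dir <- "path/to/pdfs"

# Full paths of all PDF files in that folder.
files <- list.files(pdf_dir, pattern = "\\.pdf$",
                    full.names = TRUE, ignore.case = TRUE)

# Hypothetical helper: returns a one-row data frame for a single PDF.
first_currency <- function(path) {
  # pdf_text() (assumed: pdftools) returns one string per page.
  pages <- pdftools::pdf_text(path)
  # str_nth_currency() (assumed: strex) is applied to the text of
  # page 1, not to the file name.
  cur <- strex::str_nth_currency(pages[1], 1)
  data.frame(file = basename(path), amount = cur$amount[1])
}

# Apply the helper to every file and stack the rows.
# With no matching files this yields NULL rather than a data frame.
df <- do.call(rbind, lapply(files, first_currency))
df
```

Building one small data frame per file and binding them at the end avoids growing an object inside a for loop, and the file column keeps track of which amount came from which PDF.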
