[R] Extracting the first currency value from PDF files

John Kane
Wed May 13 16:04:11 CEST 2020


It looks like you are using the str_nth_currency() function from the strex
package, but we have no idea what the PDF files are or how you are
importing them into R.
We need a lot more information on what you are doing before you call the
function.

Have a look at
http://adv-r.had.co.nz/Reproducibility.html
or
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
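
For illustration only, here is a rough sketch of the kind of self-contained
example that would help. It assumes the files are read with pdf_text() from the
pdftools package (the quoted loop calls pdf_text(), but the package is not
named), that the value of interest is on the first page, and that the PDFs sit
in one folder. The folder "path/to/pdfs" and the helper first_amount() are
hypothetical names. Note that the quoted loop discards the result of pdf_text()
and passes the file name, not the extracted text, to str_nth_currency().

```r
library(pdftools)  # pdf_text(): assumed to be the reader used in the question
library(strex)     # str_nth_currency()

# Hypothetical folder; replace with the real location of the PDF files.
pdf_dir <- "path/to/pdfs"
files <- list.files(pdf_dir, pattern = "\\.pdf$", full.names = TRUE)

# Hypothetical helper: read one PDF and return the first currency
# amount found on its first page.
first_amount <- function(path) {
  pages <- pdf_text(path)                # one character string per page
  str_nth_currency(pages[1], 1)$amount   # 'amount' column, as in the question
}

# One row per file: the file name and the extracted amount.
df <- data.frame(
  file = basename(files),
  amount = vapply(files, first_amount, numeric(1)),
  row.names = NULL
)
df
```

It is worth checking the result by hand on one or two files, in particular that
amounts written with thousands separators are parsed the way you expect.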



On Wed, 13 May 2020 at 09:33, Manish Mukherjee <manishmukherjee using hotmail.com>
wrote:

> Hi All,
>
> I need some help with the following code. I have a number of PDF files,
> and the first page of each file gives a currency value $xxx,xxx,xxx. How
> can I extract this value from each of the PDF files and put the values in
> a data frame? I am able to do it for a single file with the code below,
> where opinions is the text data and 1 selects the first currency value:
> ```
> d = str_nth_currency(opinions, 1)
> df = subset(d, select = c(amount))
> df
> ```
>
> I want this to loop over multiple PDF files.
>
> I have tried something like this, but it is not working:
> for (i in 1:length(files)){
>   print(i)
>   pdf_text(paste("filepath ", files[i],sep = ""))
>   str_nth_currency(files[i], 1)
> }
>
>
> Please help.


-- 
John Kane
Kingston ON Canada