daiR: Interface with Google Cloud Document AI API

R interface for the Google Cloud Services 'Document AI API' <https://cloud.google.com/document-ai/>, with additional tools for output file parsing and text reconstruction. 'Document AI' is a server-based OCR service that extracts text and tables from images and PDF files with high accuracy. 'daiR' gives R users programmatic access to this service, along with functions to handle and visualize the output. See the package website <https://dair.info/> for more information and examples.

Version: 1.0.0
Depends: R (≥ 4.2.0)
Imports: base64enc, beepr, cli, data.table, fs, gargle, glue, googleCloudStorageR, graphics, grDevices, httr, jsonlite, lifecycle, magick, pdftools, purrr, readtext, stats, stringr, utils, xml2
Suggests: knitr, ngram, rmarkdown, testthat (≥ 3.1.10)
Published: 2024-02-12
DOI: 10.32614/CRAN.package.daiR
Author: Thomas Hegghammer [aut, cre]
Maintainer: Thomas Hegghammer <hegghammer at gmail.com>
BugReports: https://github.com/Hegghammer/daiR/issues
License: MIT + file LICENSE
URL: https://github.com/Hegghammer/daiR, https://dair.info
NeedsCompilation: no
Materials: README, NEWS
CRAN checks: daiR results

Documentation:

Reference manual: daiR.pdf
Vignettes: Complex file and folder management, Configuration, Working with Google Cloud Storage, Quickstart, Correcting text output, Extracting tables, Basic usage

Downloads:

Package source: daiR_1.0.0.tar.gz
Windows binaries: r-devel: daiR_1.0.0.zip, r-release: daiR_1.0.0.zip, r-oldrel: daiR_1.0.0.zip
macOS binaries: r-release (arm64): daiR_1.0.0.tgz, r-oldrel (arm64): daiR_1.0.0.tgz, r-release (x86_64): daiR_1.0.0.tgz, r-oldrel (x86_64): daiR_1.0.0.tgz
Old sources: daiR archive

Linking:

Please use the canonical form https://CRAN.R-project.org/package=daiR to link to this page.