RAGFlowChainR is an R package that brings Retrieval-Augmented Generation (RAG) capabilities to R, inspired by LangChain. It enables intelligent retrieval of documents from a local vector store (DuckDB), optional web search, and seamless integration with Large Language Models (LLMs).
Features include:
Python version: RAGFlowChain (PyPI)
GitHub (Python): RAGFlowChain
install.packages("RAGFlowChainR")
To get the latest features or bug fixes, you can install the development version of RAGFlowChainR from GitHub:
# If needed
install.packages("remotes")
remotes::install_github("knowusuboaky/RAGFlowChainR")
See the full function reference or the package website for more details.
Sys.setenv(TAVILY_API_KEY = "your-tavily-api-key")
Sys.setenv(OPENAI_API_KEY = "your-openai-api-key")
Sys.setenv(GROQ_API_KEY = "your-groq-api-key")
Sys.setenv(ANTHROPIC_API_KEY = "your-anthropic-api-key")
To persist these keys across sessions, add them to your ~/.Renviron file, one KEY=value entry per line.
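Before running the examples, it can help to confirm which keys are visible to the current session. The sketch below uses only base R; the helper name `missing_keys` is illustrative and is not part of RAGFlowChainR.

```r
# Return the names of environment variables that are unset or empty.
# `missing_keys` is a hypothetical helper, not a package function.
missing_keys <- function(keys) {
  values <- Sys.getenv(keys, unset = "")
  keys[!nzchar(values)]
}

required <- c("TAVILY_API_KEY", "OPENAI_API_KEY",
              "GROQ_API_KEY", "ANTHROPIC_API_KEY")
absent <- missing_keys(required)

if (length(absent) > 0) {
  message("Not set: ", paste(absent, collapse = ", "))
} else {
  message("All keys are set.")
}
```

Only the keys for the providers you actually call need to be present.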
library(RAGFlowChainR)
local_files <- c(
  "tests/testthat/test-data/sprint.pdf",
  "tests/testthat/test-data/introduction.pptx",
  "tests/testthat/test-data/overview.txt"
)
website_urls <- c("https://www.r-project.org")
crawl_depth <- 1

response <- fetch_data(
  local_paths = local_files,
  website_urls = website_urls,
  crawl_depth = crawl_depth
)
response
#> source title ...
#> 1 documents/sprint.pdf <NA> ...
#> 2 documents/introduction.pptx <NA> ...
#> 3 documents/overview.txt <NA> ...
#> 4 https://www.r-project.org R: The R Project for Statistical Computing ...
#> ...
cat(response$content[1])
#> Getting Started with Scrum\nCodeWithPraveen.com ...
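Since the result is an ordinary data frame, it can be inspected with base R before anything is embedded. The sketch below uses a small stand-in data frame with made-up values; the column names `source`, `title`, and `content` are taken from the output above, and the rest is an assumption for illustration.

```r
# Stand-in for the data frame returned by fetch_data(); values are made up.
response_demo <- data.frame(
  source = c("documents/sprint.pdf", "https://www.r-project.org"),
  title = c(NA, "R: The R Project for Statistical Computing"),
  content = c("Getting Started with Scrum", "[Home] Download CRAN"),
  stringsAsFactors = FALSE
)

# Separate web pages from local files by looking at the source prefix.
is_web <- grepl("^https?://", response_demo$source)
table(ifelse(is_web, "web", "local"))

# Keep only rows that actually carry text.
response_demo[nchar(response_demo$content) > 0, c("source", "title")]
```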
con <- create_vectorstore("tests/testthat/test-data/my_vectors.duckdb", overwrite = TRUE)
docs <- data.frame(head(response)) # reuse from fetch_data()
insert_vectors(
con = con,
df = docs,
embed_fun = embed_openai(),
chunk_chars = 12000
)
build_vector_index(con, type = c("vss", "fts"))
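The `chunk_chars` argument above suggests that long documents are split into pieces before embedding. As a rough illustration of the idea, here is a minimal fixed-width chunker in base R. This is a sketch under that assumption, not the package's implementation, which may split on different boundaries; `chunk_text` is a hypothetical name.

```r
# Split one string into consecutive chunks of at most `chunk_chars` characters.
# Hypothetical helper; RAGFlowChainR may chunk differently.
chunk_text <- function(text, chunk_chars = 12000) {
  stopifnot(is.character(text), length(text) == 1, chunk_chars >= 1)
  n <- nchar(text)
  if (n == 0) return(character(0))
  starts <- seq(1, n, by = chunk_chars)
  substring(text, starts, pmin(starts + chunk_chars - 1, n))
}

text <- paste(rep("abcdefghij", 5), collapse = "")  # 50 characters
chunks <- chunk_text(text, chunk_chars = 20)
length(chunks)  # 3
nchar(chunks)   # 20 20 10
```

A smaller `chunk_chars` yields more, shorter chunks, which usually means more embedding calls but more focused search hits.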
response <- search_vectors(con, query_text = "Tell me about R?", top_k = 5)
response
#> id page_content dist
#> 1 5 [Home]\nDownload\nCRAN\nR Project...\n... 0.2183
#> 2 6 [Home]\nDownload\nCRAN\nR Project...\n... 0.2183
#> ...
cat(response$page_content[1])
#> [Home]\nDownload\nCRAN\nR Project\nAbout R\nLogo\n...
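The `dist` column reads most naturally as a distance between the query embedding and each stored chunk, with smaller values meaning a closer match. The sketch below shows that kind of ranking with cosine distance on toy vectors. The metric, the helper names, and the two-dimensional embeddings are all assumptions for illustration; the source does not say which metric `search_vectors()` uses.

```r
# Cosine distance between two numeric vectors (0 = same direction).
cosine_distance <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

# Rank the rows of an embedding matrix by distance to a query vector.
# Hypothetical helper; not part of RAGFlowChainR.
rank_by_distance <- function(query, embeddings, top_k = 5) {
  dist <- apply(embeddings, 1, cosine_distance, b = query)
  ord <- order(dist)[seq_len(min(top_k, length(dist)))]
  data.frame(id = ord, dist = dist[ord])
}

# Toy embeddings: one row per stored chunk.
embeddings <- rbind(c(1, 0), c(0, 1), c(1, 1))
ranked <- rank_by_distance(c(1, 0), embeddings, top_k = 2)
ranked
```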
rag_chain <- create_rag_chain(
  llm = call_llm,
  vector_database_directory = "tests/testthat/test-data/my_vectors.duckdb",
  method = "DuckDB",
  embedding_function = embed_openai(),
  use_web_search = FALSE
)
response <- rag_chain$invoke("Tell me about R")
response
#> $input
#> [1] "Tell me about R"
#>
#> $chat_history
#> [[1]] $role: "human", $content: "Tell me about R"
#> [[2]] $role: "assistant", $content: "R is a programming language..."
#>
#> $answer
#> [1] "R is a programming language and software environment commonly used for statistical computing and graphics..."
cat(response$answer)
#> R is a programming language and software environment commonly used for statistical computing and graphics...
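Conceptually, a RAG chain combines retrieved chunks, prior chat turns, and the new question into one prompt for the LLM. The sketch below shows one plausible way to assemble such a prompt in base R. The function `build_prompt` and the prompt wording are hypothetical; the only detail borrowed from the output above is the `role`/`content` shape of `chat_history`. How `create_rag_chain()` actually builds its prompt is not stated in the source.

```r
# Assemble a single prompt string from context, history, and a question.
# Hypothetical helper for illustration only.
build_prompt <- function(question, context_chunks, chat_history = list()) {
  history <- vapply(
    chat_history,
    function(m) paste0(m$role, ": ", m$content),
    character(1)
  )
  paste(c(
    "Answer the question using only the context below.",
    "Context:", context_chunks,
    if (length(history) > 0) c("Conversation so far:", history),
    paste0("Question: ", question)
  ), collapse = "\n")
}

history <- list(
  list(role = "human", content = "Tell me about R"),
  list(role = "assistant", content = "R is a programming language...")
)
prompt <- build_prompt(
  "Who maintains it?",
  context_chunks = c("[Home] Download CRAN", "About R"),
  chat_history = history
)
cat(prompt)
```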
call_llm(
prompt = "Summarize the capital of France.",
provider = "groq",
model = "llama3-8b",
temperature = 0.7,
max_tokens = 200
)
chatLLM
The chatLLM package (now available on CRAN 🎉) offers a modular interface for interacting with LLM providers including OpenAI, Groq, and Anthropic.
install.packages("chatLLM")
Features:
Multiple providers (openai, groq, anthropic)
Works with RAGFlowChainR
.Renviron-based key management