hellmer

Process Lots of LLM Chats

https://github.com/dylanpieper/chatalot

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.4%) to scientific vocabulary

Keywords

batch batch-processing ellmer llm package r
Last synced: 6 months ago

Repository

Process Lots of LLM Chats

Basic Info
Statistics
  • Stars: 24
  • Watchers: 2
  • Forks: 2
  • Open Issues: 0
  • Releases: 3
Topics
batch batch-processing ellmer llm package r
Created about 1 year ago · Last pushed 7 months ago
Metadata Files
Readme Changelog License

README.md

chatalot


chatalot processes lots of large language model chats in R and is an extension of ellmer.

Easily set up sequential and parallel chat processors with support for tool calling, structured data extraction, uploaded content, persistent caching, and sound notifications.

chatalot or ellmer?

| Priority | Function | Description |
|----|----|----|
| 🛡️ Slow and safe | chatalot::seq_chat() | Process chats in sequence with persistent caching |
| ⚖️ Fast and safe | chatalot::future_chat() | Process chats in parallel with persistent caching |
| 🚀 Maximum speed | ellmer::parallel_chat() | Process chats in parallel very quickly with no caching |
| 💰 Cost savings | ellmer::batch_chat() | Batch APIs; ~50% cheaper with up to 24hr delays |

Installation

From CRAN:

``` r
install.packages("pak")
pak::pak("chatalot")
```

Development version:

``` r
pak::pak("dylanpieper/chatalot")
```

Setup API Keys

API keys allow access to chat models and are stored as environment variables. I recommend usethis to set up API keys in your .Renviron, such as OPENAI_API_KEY=your-key:

``` r
usethis::edit_r_environ(scope = c("user", "project"))
```
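After restarting R, you can confirm the key is visible to the session. A minimal sketch, assuming the OPENAI_API_KEY variable named above:

``` r
# Check that the key from .Renviron is available to the R session;
# Sys.getenv() returns "" when the variable is unset
if (!nzchar(Sys.getenv("OPENAI_API_KEY"))) {
  stop("OPENAI_API_KEY is not set; edit your .Renviron and restart R.")
}
```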

Basic Usage

Sequential Processing

Process chats in sequence, or one at a time. Use this function to process prompts slowly, such as when providers don't allow parallel processing or have strict rate limits, or when you want to periodically check the responses.

``` r
library(chatalot)

chat <- seq_chat("openai/gpt-4.1", system_prompt = "Reply concisely, one sentence")

prompts <- c(
  "What roles do people have in a castle?",
  "Why are castles needed?",
  "When was the first castle built?",
  "Where are most castles located?"
)

response <- chat$process(prompts)
```

Access the responses:

``` r
response$texts()
#> [1] "In a castle, people served as rulers, warriors, administrators,
#> craftsmen, and servants who managed its defense, governance, and daily upkeep."
#> [2] "Castles have historically been built for defense and power consolidation,
#> and today they serve as cultural landmarks that preserve our heritage
#> and attract tourism."
#> [3] "There isn’t a definitive \"first castle,\" but the earliest structures
#> resembling castles emerged in medieval Europe around the 9th century."
#> [4] "Most castles are located in Europe, particularly in historically
#> turbulent regions like the United Kingdom, France, and Germany."
```

Parallel Processing

Parallel processing requests multiple chats at a time across multiple R processes using future workers:

``` r
chat <- future_chat("openai/gpt-4.1", system_prompt = "Reply concisely, one sentence")
```

Use this function to process lots of chat prompts simultaneously and quickly. You may want to limit the number of simultaneous requests to meet a provider's rate limits by decreasing the number of workers (default is parallel::detectCores(), which is 10 on my Mac Mini M4):

``` r
response <- chat$process(prompts, workers = 5)
```

Features

Tool Calling

Register and use tool calling to let the LLM use R functions:

``` r
weather <- data.frame(
  city = c("Chicago", "New York", "Lisbon"),
  raining = c("Heavy", "None", "Overcast"),
  temperature = c("Cool", "Hot", "Warm"),
  wind = c("Strong", "Weak", "Strong")
)

get_weather <- tool(
  function(cities) weather[weather$city %in% cities, ],
  description = "Report on weather conditions.",
  arguments = list(
    cities = type_array(type_string(), "City names")
  )
)

chat$register_tool(get_weather)

response <- chat$process(interpolate("Brief weather update for {{weather$city}}?"))

response$texts()
#> [1] "Chicago is experiencing heavy rain, cool temperatures, and strong winds."
#> [2] "New York is experiencing hot conditions with no rain and light winds."
#> [3] "In Lisbon, the weather is overcast with warm temperatures and strong winds."
```

Structured Data Extraction

Extract structured data using type specifications:

``` r
prompts <- c(
  "I go by Alex. 42 years on this planet and counting.",
  "Pleased to meet you! I'm Jamal, age 27.",
  "They call me Li Wei. Nineteen years young.",
  "Fatima here. Just celebrated my 35th birthday last week.",
  "The name's Robert - 51 years old and proud of it.",
  "Kwame here - just hit the big 5-0 this year."
)

response <- chat$process(
  prompts,
  type = type_object(
    name = type_string(),
    age = type_number()
  )
)

response$texts()
#>     name age
#> 1   Alex  42
#> 2  Jamal  27
#> 3 Li Wei  19
#> 4 Fatima  35
#> 5 Robert  51
#> 6  Kwame  50
```

Uploaded Content

Process prompts with uploaded content (e.g., images and PDFs):

``` r
base_prompt <- "What do you see in the image?"

img_prompts <- list(
  c(base_prompt, content_image_url("https://www.r-project.org/Rlogo.png")),
  c(base_prompt, content_image_file(system.file("httr2.png", package = "ellmer")))
)

response <- chat$process(img_prompts)

response$texts()
#> [[1]]
#> [1] "The image shows the logo for R, a programming language and software environment
#> used for statistical computing and graphics, featuring a stylized blue \"R\"
#> inside a gray oval or ring."
#>
#> [[2]]
#> [1] "The image shows a logo for \"httr2\" featuring a stylized red baseball batter
#> silhouette on a dark blue hexagonal background."
```

Persistent Caching

If you interrupt chat processing or experience an error, you can call process() again to resume from the last saved chat, which is cached in an .rds file:

``` r
response <- chat$process(prompts, file = "chat.rds")
```

If file is not defined, a temporary .rds file will be created by default.
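For example, rerunning the same call after an interruption resumes from the cache instead of reprocessing completed chats; a minimal sketch based on the resume behavior described above:

``` r
# First run: interrupted partway through; completed chats are saved to chat.rds
response <- chat$process(prompts, file = "chat.rds")

# Second run with the same file: resumes from the last saved chat
response <- chat$process(prompts, file = "chat.rds")
```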

Sound Notifications

Toggle sound notifications on completion, interruption, and error:

``` r
response <- chat$process(prompts, beep = TRUE)
```

Verbosity Options

By default, echo is set to FALSE so that a progress bar is shown. To see echoed responses instead, first set progress to FALSE and then configure echo:

``` r
prompts <- c(
  "What is R?",
  "Explain base R versus tidyverse"
)

response <- chat$process(prompts, progress = FALSE, echo = TRUE)
#> R is a programming language and software environment used for
#> statistical computing and graphics.
#> Base R consists of the core functionalities built into R,
#> while tidyverse is a collection of packages that offer a more
#> consistent, readable, and streamlined approach to data manipulation,
#> visualization, and analysis.
```

Methods

  • texts(): Returns response texts in the same format as the input prompts (a list if prompts were provided as a list, or a vector if provided as a vector). When a type is provided, returns a list with one element per prompt; when the type is consistent across prompts, returns a data frame with one row per prompt and one column per property.
  • chats(): Returns a list of chat objects
  • progress(): Returns processing status (all three methods are sketched below)
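A minimal sketch of the three methods, assuming response comes from an earlier chat$process() call:

``` r
response$texts()    # response texts, in the same shape as the input prompts
response$chats()    # list of underlying chat objects
response$progress() # processing status
```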

Rate Limits and Retry Methods

The following functions handle API rate limits differently:

  • chatalot::seq_chat() and chatalot::future_chat(): Rate limits are not actively managed; they are governed by choosing sequential processing or by the number of parallel connections (workers). Exceeding rate limits falls back on ellmer's retry strategy
  • ellmer::parallel_chat(): Rate limits are managed by throttling the requests per minute (rpm) and configuring the number of parallel connections (max_active). Exceeding rate limits falls back on ellmer's retry strategy
  • ellmer::batch_chat(): Rate limits are managed by the provider

ellmer's retry strategy includes the following options:

  • options(ellmer_max_tries): Retries requests up to 3 times by default, and will retry if the connection fails, not just if the request returns a transient error
  • options(ellmer_timeout_s): Sets the default timeout in seconds, which also applies to the initial connection phase (both options are sketched below)
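A minimal sketch of tuning these options before processing; the values shown are illustrative, not the defaults:

``` r
# Allow more retries and a longer timeout than ellmer's defaults
options(ellmer_max_tries = 5)
options(ellmer_timeout_s = 120)

response <- chat$process(prompts)
```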

You can also manage rate limits, specifically token usage limits, by capping the maximum number of tokens per chat. The chat() interface includes a params parameter to configure max_tokens, which also works in chatalot's chat functions.
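For example, capping each chat's output might look like the following sketch, assuming params() and max_tokens behave as in ellmer's chat() API:

``` r
# Cap each chat's output at 500 tokens to help stay under token-rate limits
chat <- future_chat(
  "openai/gpt-4.1",
  params = params(max_tokens = 500)
)

response <- chat$process(prompts)
```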

Owner

  • Login: dylanpieper
  • Kind: user

GitHub Events

Total
  • Issues event: 4
  • Watch event: 4
  • Push event: 66
  • Pull request event: 1
  • Fork event: 1
  • Create event: 1
Last Year
  • Issues event: 4
  • Watch event: 4
  • Push event: 66
  • Pull request event: 1
  • Fork event: 1
  • Create event: 1

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 377
  • Total Committers: 2
  • Avg Commits per committer: 188.5
  • Development Distribution Score (DDS): 0.019
Past Year
  • Commits: 377
  • Committers: 2
  • Avg Commits per committer: 188.5
  • Development Distribution Score (DDS): 0.019
Top Committers
Name Email Commits
Dylan Pieper d****r@g****m 370
Dylan Pieper d****r@D****l 7

Issues and Pull Requests

Last synced: 6 months ago


Dependencies

DESCRIPTION cran
  • S7 >= 0.1.0 imports
  • beepr * imports
  • cli * imports
  • ellmer * imports
  • purrr * imports
  • rlang * imports
  • testthat >= 3.0.0 suggests
.github/workflows/pkgdown.yaml actions
  • JamesIves/github-pages-deploy-action v4.5.0 composite
  • actions/checkout v4 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/testthat.yml actions
  • actions/checkout v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite