manubot-ai-editor
A tool for performing automatic, AI-assisted revisions of Manubot manuscripts.
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 8 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (15.1%) to scientific vocabulary
Repository
Statistics
- Stars: 39
- Watchers: 4
- Forks: 15
- Open Issues: 30
- Releases: 11
Metadata Files
README.md
Manubot AI Editor
A tool for performing automatic, AI-assisted revisions of Manubot manuscripts. Check out the manuscript about this tool for more background information.
Supported Large Language Models (LLMs)
We internally use LangChain to invoke models, which allows our tool to theoretically support whichever model providers LangChain supports. That said, we currently support OpenAI and Anthropic models only, and are working to add support for other model providers.
When using OpenAI models, our evaluations show that gpt-4-turbo
is in general the best model for revising academic manuscripts. Therefore, this is the default option for OpenAI.
We are still evaluating the models for other providers as we add them, and will update this section accordingly as we complete our evaluations.
Using in a Manubot manuscript
Many of these instructions rely on the specific details of GitHub's website interface, which can change over time. See GitHub's official docs for more info on configuring GitHub Actions, managing secrets, and running workflows.
Setup
First, you should decide which model provider you'll use. You can find details on how to set up each provider below:
- OpenAI: you'll want to make an OpenAI account and create an API key.
- Anthropic: you'll want to make an Anthropic account and create an API key.
Start with a manuscript repo forked from Manubot rootstock, then follow these steps:
- In your fork's "▶️ Actions" tab, enable GitHub Actions.
- In your fork's "⚙️ Settings" tab, give GitHub Actions workflows read/write permissions and allow them to create pull requests.
- If you haven't already, follow the directions above to create an account and get an API key for your chosen model provider.
- In your fork's "⚙️ Settings" tab, make a new Actions repository secret with the name PROVIDER_API_KEY and paste in your API key as the secret.
If you prefer to select fewer options when running the workflow, you can optionally set up default values for the model provider and model at either the repo or organization level.
In your fork's "⚙️ Settings" tab, you can optionally create the following Actions repository variables:
- AI_EDITOR_MODEL_PROVIDER: Either "openai" or "anthropic"; sets this as the default if "(repo default)" was selected in the workflow parameters.
If this is unspecified and "(repo default)" is selected, the workflow will throw an error.
- AI_EDITOR_LANGUAGE_MODEL: For the given provider, what model to use if the "model" field in the workflow parameters was left empty.
If this is unspecified, Manubot AI Editor will select the default model for your chosen provider.
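The fallback rules for these two variables can be summarized in a short sketch. The function names (`resolve_provider`, `resolve_model`) and the constants are hypothetical and only restate the behavior described above; this is not the workflow's actual code.

```python
from typing import Mapping, Optional

# Hypothetical constants mirroring the workflow option and supported providers.
REPO_DEFAULT = "(repo default)"
SUPPORTED_PROVIDERS = ("openai", "anthropic")


def resolve_provider(selected: str, repo_vars: Mapping[str, str]) -> str:
    """Return the provider to use, following the documented fallback rule."""
    if selected != REPO_DEFAULT:
        return selected
    provider = repo_vars.get("AI_EDITOR_MODEL_PROVIDER")
    if not provider:
        # Documented behavior: the workflow errors out in this case.
        raise ValueError(
            "'(repo default)' was selected but AI_EDITOR_MODEL_PROVIDER is not set"
        )
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider!r}")
    return provider


def resolve_model(model_field: str, repo_vars: Mapping[str, str]) -> Optional[str]:
    """Return the model name, or None to let the editor pick the provider default."""
    return model_field.strip() or repo_vars.get("AI_EDITOR_LANGUAGE_MODEL") or None
```

In this sketch, a `None` result from `resolve_model` stands for "let Manubot AI Editor select the default model for the chosen provider".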
Multiple Providers
In case you want to use several providers in the same repo, you'll have to register an API key for each provider you intend to use.
Like PROVIDER_API_KEY, these keys are also registered as GitHub secrets, and can be specified at either the repository or organizational level.
We currently support the following secrets, with more to follow as we integrate more providers:
- OPENAI_API_KEY: the API key for the "openai" provider
- ANTHROPIC_API_KEY: the API key for the "anthropic" provider
See the API key variables docs for more information.
Configuring prompts
In order to revise your manuscript, prompts must be provided to the AI model. Manubot rootstock comes with several default, general-purpose prompts so that you can immediately use the AI editor without having to write and configure your own prompts.
But you can also define your own prompts, apply them to specific content, and control other behavior using YAML configuration files that you include with your manuscript. See docs/custom-prompts.md for more information.
Running the editor
- In your fork's "▶️ Actions" tab, go to the ai-revision workflow.
- Manually run the workflow. You should see several options you can specify, such as the branch to revise and the AI model to use. See these docs for an explanation of each option.
- Within a few minutes, the workflow should run, the editor should generate revisions, and a pull request should be created in your fork!
Caveats
In the current implementation, the AI editor can only process, independently, one paragraph at a time. This limits the contextual information the LLM receives and thus the specificity of what it can check and fix. For instance, the revision process does not use information in other places of the manuscript to revise the current paragraph. In addition, we provide section-specific prompts to revise text from different sections of the manuscript, such as the Abstract, Introduction, Results, etc. However, some paragraphs from the same section need different revision strategies. For example, in the Discussion section of a manuscript, the first paragraph should typically summarize the findings from the Results section, while the rest of the paragraphs should follow a different structure. The AI editor, however, can only judge each paragraph with the same section-specific prompt.
Finally, in addition to revising the paragraph using an LLM, the AI Editor will also perform some postprocessing of the revised text such as using one line per sentence to simplify diffs. This might not work as expected in some cases.
We plan to reduce or remove these limitations in the future.
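To make these caveats concrete, here is a minimal sketch of paragraph-by-paragraph revision followed by a naive one-sentence-per-line postprocessing step. All function names are hypothetical, the `revise` callable stands in for the LLM call, and the regular expression is a deliberately simple assumption rather than the editor's actual implementation.

```python
import re
from typing import Callable, List


def split_paragraphs(text: str) -> List[str]:
    """Split Markdown text into paragraphs separated by blank lines."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def one_sentence_per_line(paragraph: str) -> str:
    """Naively put each sentence on its own line to simplify diffs."""
    flat = " ".join(paragraph.split())
    # Break after ., ! or ? when the next word starts with a capital letter.
    return re.sub(r"(?<=[.!?])\s+(?=[A-Z])", "\n", flat)


def revise_text(text: str, revise: Callable[[str], str]) -> str:
    """Revise each paragraph independently; no paragraph sees its neighbors."""
    revised = [one_sentence_per_line(revise(p)) for p in split_paragraphs(text)]
    return "\n\n".join(revised)
```

Because `revise` receives one paragraph at a time, it cannot use context from elsewhere in the manuscript. The naive splitter also breaks after abbreviations such as "e.g." when a capitalized word follows, which is one way a postprocessing step like this can go wrong.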
Using from the command line
First, install Manubot in a Python environment, e.g.:
```bash
pip install --upgrade manubot[ai-rev]
```
You also need to export an environment variable with your model provider's API key, e.g.:
```bash
export OPENAI_API_KEY=ABCD1234
export ANTHROPIC_API_KEY=ABCD1234  # if you were using anthropic
```
If you only ever use one model provider (e.g., just OpenAI or just Anthropic), you can alternatively provide just
PROVIDER_API_KEY and it will be used with any model provider the tool invokes.
To select a specific provider, set the environment variable AI_EDITOR_MODEL_PROVIDER to one of the following values:
- openai for OpenAI
- anthropic for Anthropic
If AI_EDITOR_MODEL_PROVIDER is unset, it will default to "openai".
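The provider and API key lookup described above can be sketched as follows. The helper `resolve_provider_and_key` is hypothetical, and the lookup order (provider-specific key first, then `PROVIDER_API_KEY`) is an assumption made for illustration, not confirmed behavior of the tool.

```python
from typing import Mapping, Optional, Tuple

# Environment variable names taken from the documentation above.
PROVIDER_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
}


def resolve_provider_and_key(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Pick the provider (default "openai") and find an API key for it."""
    provider = env.get("AI_EDITOR_MODEL_PROVIDER") or "openai"
    if provider not in PROVIDER_KEY_VARS:
        raise ValueError(f"unsupported provider: {provider!r}")
    # Assumption: the provider-specific key is preferred, with the generic
    # PROVIDER_API_KEY used as a fallback for any provider.
    api_key = env.get(PROVIDER_KEY_VARS[provider]) or env.get("PROVIDER_API_KEY")
    return provider, api_key
```

Calling `resolve_provider_and_key(os.environ)` (after `import os`) before a run is a quick way to check that a key is visible for the provider you intend to use.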
You can also provide other environment variables that will change the behavior of the editor (such as revising certain files only).
For example, to set the temperature parameter of OpenAI models, you can run export AI_EDITOR_TEMPERATURE=0.50.
See the complete list of supported variables for
more information.
Then, from the root directory of your Manubot manuscript, run the following:
```bash
# ⚠ THIS WILL OVERWRITE YOUR LOCAL MANUSCRIPT
manubot ai-revision --content-directory content/ --config-directory ci/
```
The editor will revise each paragraph of your manuscript and write back the revised files in the same directory. Finally, (assuming you are tracking changes to your manuscript with git) you can review each change and either keep it (commit it) or reject it (revert it).
Using model providers' APIs can sometimes incur costs. If you're worried about this or otherwise want to test things out before hitting the real API, you can run a local "dry run" with a "fake" model:
```bash
manubot ai-revision \
  --content-directory content/ \
  --config-directory ci/ \
  --model-type DummyManuscriptRevisionModel \
  --model-kwargs add_paragraph_marks=True
```
When it finishes, check out your manuscript files. This will allow you to detect whether the editor is identifying paragraphs correctly. If you find a problem, please report the issue.
Text Encodings
By default, Manubot AI Editor will assume that your input and output files are
encoded in the utf-8 encoding.
If you'd prefer for the tool to make a best effort to guess the input encoding
and write the output in the same encoding, set the env var
AI_EDITOR_SRC_ENCODING to _auto_; the detected encoding will also be used to
write the output files.
Alternatively, if you prefer to have your files interpreted or written using
specific encodings, you can specify the input encoding with the
AI_EDITOR_SRC_ENCODING and the output encoding with the
AI_EDITOR_DST_ENCODING environment variables.
See these variables' help docs for more information.
Also, see Python 3 Docs: Standard Encodings for a list of possible encodings.
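The encoding rules above can be sketched with the standard library alone. The helper names are hypothetical, and the `detected` argument stands in for whatever detection the tool performs when `_auto_` is set. Two details are assumptions of this sketch: an explicit `AI_EDITOR_DST_ENCODING` takes precedence over the detected encoding, and the output falls back to utf-8 when only the source encoding is given.

```python
import codecs
from typing import Mapping, Optional, Tuple

AUTO = "_auto_"


def resolve_encodings(
    env: Mapping[str, str], detected: Optional[str] = None
) -> Tuple[str, str]:
    """Return normalized (source, destination) encoding names."""
    src = env.get("AI_EDITOR_SRC_ENCODING", "utf-8")
    dst = env.get("AI_EDITOR_DST_ENCODING")
    if src == AUTO:
        if detected is None:
            raise ValueError("source encoding is _auto_ but nothing was detected")
        src = detected
        # Assumption: reuse the detected encoding unless a destination is set.
        dst = dst or detected
    dst = dst or "utf-8"
    # codecs.lookup raises LookupError for names Python does not recognize.
    return codecs.lookup(src).name, codecs.lookup(dst).name


def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Decode bytes with the source encoding and re-encode for output."""
    return data.decode(src).encode(dst)
```

Validating names with `codecs.lookup` up front means a typo in either variable fails before any file is read or written.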
Using the Python API
You can also use the functions of the editor directly from Python.
Since these functions are low-level and not tied to a particular manuscript, you don't have to install Manubot and can just install this package:
bash
pip install -U manubot-ai-editor
Example usage:
```python
import shutil
from pathlib import Path

from manubot_ai_editor.editor import ManuscriptEditor
from manubot_ai_editor.models import GPT3CompletionModel

# create a manuscript editor object.
me = ManuscriptEditor(
    # where your Markdown files (*.md) are
    content_dir="content",
    # where CI-related configuration, including the AI editor's, is stored.
    # optional, will fallback to defaults if omitted.
    config_dir="ci",
)

# create a model to revise the manuscript
# (if using another provider, e.g. anthropic, replace model_provider="openai" with model_provider="anthropic")
model = GPT3CompletionModel(
    title=me.title,
    keywords=me.keywords,
    model_provider="openai",
)

# create a temporary directory to store the revised manuscript
output_folder = (Path("tmp") / "manubot-ai-editor-output").resolve()
shutil.rmtree(output_folder, ignore_errors=True)
output_folder.mkdir(parents=True, exist_ok=True)

# revise the manuscript
me.revise_manuscript(output_folder, model)

# the revised manuscript is now in the output_folder

# uncomment the following code if you want to OVERWRITE the original manuscript in the content folder with the revised manuscript
# for f in output_folder.glob("*"):
#     f.rename(me.content_dir / f.name)
# # remove output folder
# output_folder.rmdir()
```
The cli_process function in this file provides another example of how to use the API.
Development and Contributions
Please see our CONTRIBUTING.md guide for more information on developing this project or making a contribution.
Owner
- Name: Manubot
- Login: manubot
- Kind: organization
- Website: https://manubot.org
- Repositories: 7
- Profile: https://github.com/manubot
Next generation of scholarly publishing: open, collaborative, reproducible, free.
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
---
cff-version: 1.2.0
title: Manubot AI Editor
message: >-
  If you use this work in some way, please cite both the article from
  preferred-citation and the software itself. These details can be
  found within the CITATION.cff file.
type: software
authors:
  - given-names: Milton
    family-names: Pividori
    orcid: "https://orcid.org/0000-0002-3035-4403"
  - given-names: Faisal
    family-names: Alquaddoomi
    orcid: "https://orcid.org/0000-0003-4297-8747"
  - given-names: Vincent
    family-names: Rubinetti
    orcid: "https://orcid.org/0000-0002-4655-3773"
  - given-names: Dave
    family-names: Bunten
    orcid: "https://orcid.org/0000-0001-6041-3665"
  - given-names: Casey
    family-names: Greene
    orcid: "https://orcid.org/0000-0001-8713-9213"
repository-code: "https://github.com/manubot/manubot-ai-editor"
abstract: |
  A tool for performing automatic, AI-assisted revisions of Manubot manuscripts.
keywords:
  - manubot
  - AI
  - editor
  - manuscript
  - revision
  - research
  - large-language-models
license: BSD-3-Clause
identifiers:
  - description: Manuscript
    type: doi
    value: "10.1093/jamia/ocae139"
  - description: Software
    type: doi
    value: "10.5281/zenodo.14911573"
preferred-citation:
  title: >-
    A publishing infrastructure for Artificial Intelligence (AI)-assisted academic authoring
  type: article
  url: https://academic.oup.com/jamia/article/31/9/2103/7693927
  authors:
    - given-names: Milton
      family-names: Pividori
      orcid: "https://orcid.org/0000-0002-3035-4403"
    - given-names: Casey S.
      family-names: Greene
      orcid: "https://orcid.org/0000-0001-8713-9213"
  date-published: 2024-09-01
  identifiers:
    - type: doi
      value: 10.1093/jamia/ocae139
GitHub Events
Total
- Create event: 12
- Release event: 2
- Issues event: 20
- Watch event: 4
- Delete event: 4
- Issue comment event: 28
- Push event: 65
- Pull request event: 36
- Pull request review event: 76
- Pull request review comment event: 60
- Fork event: 10
Last Year
- Create event: 12
- Release event: 2
- Issues event: 20
- Watch event: 4
- Delete event: 4
- Issue comment event: 28
- Push event: 65
- Pull request event: 36
- Pull request review event: 76
- Pull request review comment event: 60
- Fork event: 10
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 50
- Total pull requests: 60
- Average time to close issues: 2 months
- Average time to close pull requests: 22 days
- Total issue authors: 11
- Total pull request authors: 6
- Average comments per issue: 1.44
- Average comments per pull request: 1.13
- Merged pull requests: 55
- Bot issues: 0
- Bot pull requests: 2
Past Year
- Issues: 17
- Pull requests: 43
- Average time to close issues: about 2 months
- Average time to close pull requests: 25 days
- Issue authors: 6
- Pull request authors: 3
- Average comments per issue: 0.76
- Average comments per pull request: 1.02
- Merged pull requests: 39
- Bot issues: 0
- Bot pull requests: 2
Top Authors
Issue Authors
- miltondp (19)
- d33bs (12)
- falquaddoomi (7)
- vincerubinetti (3)
- danich1 (2)
- dhimmel (1)
- castedo (1)
- shanshen123654789 (1)
- SilasK (1)
- cgreene (1)
- agitter (1)
Pull Request Authors
- d33bs (34)
- falquaddoomi (24)
- miltondp (9)
- dependabot[bot] (2)
- cgreene (2)
- vincerubinetti (2)
Packages
- Total packages: 1
- Total downloads: pypi 39 last-month
- Total dependent packages: 1
- Total dependent repositories: 0
- Total versions: 35
- Total maintainers: 3
pypi.org: manubot-ai-editor
A Manubot plugin to revise a manuscript using GPT-3
- Homepage: https://github.com/manubot/manubot-ai-editor
- Documentation: https://manubot-ai-editor.readthedocs.io/
- License: BSD-3-Clause
- Latest release: 0.5.5 (published about 1 year ago)
Dependencies
- openai >=0.25
- pyyaml *
- openai 0.28
- pip
- pytest 7.*
- python 3.10.*
- pyyaml 6.*