mallm
Framework: Multi-Agent LLMs For Conversational Task-Solving (MALLM)
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.8%) to scientific vocabulary
Repository
Statistics
- Stars: 10
- Watchers: 1
- Forks: 0
- Open Issues: 6
- Releases: 7
README.md
MALLM
Multi-Agent LLMs For Conversational Task-Solving: Framework
What does MALLM do?
Take a look at our demo to understand how MALLM structures multi-agent debates and what customization options it has.
Install
Create an environment with:
conda create --name mallm python=3.12
Package
Install as a package:
pip install -e .
Create Data
Download and create the test data:
python data/data_downloader.py --datasets=[SQuAD2,ETPC]
You can use any dataset for this project as long as it follows this basic format. These datasets are supported by our automated formatting pipeline: AquaRat, BBQGenderIdentity, BTVote, ETHICS, ETPC, Europarl, GPQA, GSM8K, IFEval, MMLU, MMLUPro, MUSR, MathLvl5, MoCaMoral, MoralExceptQA, MultiNews, SQuAD2, SimpleEthicalQuestions, StrategyQA, WMT19DeEn, WinoGrande, XSum
Run from Terminal
MALLM relies on an external API such as OpenAI or Text Generation Inference (TGI) by Hugging Face.
Once the endpoint is available, you can initiate all discussions with a single script. Example with TGI:
python mallm/scheduler.py --input_json_file_path=data/datasets/etpc_debugging.json --output_json_file_path=test_out.json --task_instruction_prompt="Paraphrase the input text." --endpoint_url="http://127.0.0.1:8080/v1"
Or with OpenAI:
python mallm/scheduler.py --input_json_file_path=data/datasets/etpc_debugging.json --output_json_file_path=test_out.json --task_instruction_prompt="Paraphrase the input text." --endpoint_url="https://api.openai.com/v1" --api_key="<your-key>"
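The two commands above differ only in their flag values. As a minimal sketch, the same --key=value arguments can be assembled from a dictionary, which is convenient when launching runs from a script. The helper name build_scheduler_command is hypothetical and not part of MALLM; only the script path and flag names are taken from the examples above.

```python
import shlex
import sys


def build_scheduler_command(options: dict[str, str]) -> list[str]:
    """Turn a dict of options into the --key=value form shown above.

    Hypothetical helper for illustration; it is not part of MALLM.
    """
    command = [sys.executable, "mallm/scheduler.py"]
    for key, value in options.items():
        command.append(f"--{key}={value}")
    return command


tgi_command = build_scheduler_command(
    {
        "input_json_file_path": "data/datasets/etpc_debugging.json",
        "output_json_file_path": "test_out.json",
        "task_instruction_prompt": "Paraphrase the input text.",
        "endpoint_url": "http://127.0.0.1:8080/v1",
    }
)

# Print a shell-quoted version of the command for inspection.
print(shlex.join(tgi_command))
# To actually launch it: subprocess.run(tgi_command, check=True)
```

Passing the arguments as a list avoids shell-quoting problems with values that contain spaces, such as the task instruction prompt.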
Run command line scripts
You can run the command line scripts from the terminal. The following command will run the scheduler with the given parameters:
mallm-run --input_json_file_path=data/datasets/etpc_debugging.json --output_json_file_path=test_out.json --task_instruction_prompt="Paraphrase the input text." --endpoint_url="http://127.0.0.1:8080/v1" --model_name="tgi"
or use the evaluation script:
mallm-evaluate --input_json_file_path=test_out.json --output_json_file_path=test_out_evaluated.json --metrics=[bleu,rouge]
Run as Module
If installed, you can use MALLM in code:
```py
from mallm import scheduler
from mallm.utils.config import Config

mallm_scheduler = scheduler.Scheduler(
    Config(
        input_json_file_path="data/datasets/etpc_debugging.json",
        output_json_file_path="test_out.json",
        task_instruction_prompt="Paraphrase the input text.",
        endpoint_url="http://127.0.0.1:8080/v1"
    )
)
mallm_scheduler.run()
```
Code Structure
MALLM is composed of three parts, each in its own subdirectory of the mallm directory:
1) Agents (subdirectory: mallm/agents/)
2) Discourse Policy (subdirectory: mallm/discourse_policy/)
3) Decision Protocol (subdirectory: mallm/decision_protocol/)
Experiments can be implemented as a separate repository, loading MALLM as a package.
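To make this division of labour concrete, here is a toy sketch of how three such components could interact. It is not MALLM's implementation: the Agent class, the round-robin policy, and the majority vote below are simplified stand-ins invented for illustration, and the agents return canned answers instead of calling an LLM.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Agent:
    """Stand-in for an LLM-backed agent; returns a fixed answer."""

    persona: str
    answer: str

    def respond(self, task: str, history: list[str]) -> str:
        return self.answer


def round_robin_policy(agents: list[Agent], task: str, turns: int) -> list[str]:
    """Toy discourse policy: every agent speaks once per turn, in order."""
    history: list[str] = []
    for _ in range(turns):
        for agent in agents:
            history.append(agent.respond(task, history))
    return history


def majority_vote(agents: list[Agent], task: str, history: list[str]) -> str:
    """Toy decision protocol: pick the most common final answer."""
    votes = Counter(agent.respond(task, history) for agent in agents)
    return votes.most_common(1)[0][0]


agents = [
    Agent("linguist", "paraphrase A"),
    Agent("editor", "paraphrase A"),
    Agent("critic", "paraphrase B"),
]
task = "Paraphrase the input text."
history = round_robin_policy(agents, task, turns=2)
decision = majority_vote(agents, task, history)
print(decision)
```

Keeping the policy and the protocol as separate functions means either can be swapped without touching the agents, which is the kind of modularity the three-part layout suggests.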
Arguments
Config Arguments:
```py
input_json_file_path: str = None
output_json_file_path: str = None
task_instruction_prompt: str = None
task_instruction_prompt_template: Optional[str] = None
endpoint_url: str = "https://api.openai.com/v1"
model_name: str = "gpt-3.5-turbo"
api_key: str = "-"
max_turns: int = 10
skip_decision_making: bool = False
discussion_paradigm: str = "memory"
response_generator: str = "simple"
decision_protocol: str = "hybrid_consensus"
visible_turns_in_memory: int = 2
debate_rounds: int = 2
concurrent_api_requests: int = 100
use_baseline: bool = False
use_chain_of_thought: bool = True
num_agents: int = 3
num_neutral_agents: int = 0
agent_generator: str = "expert"
agent_generators_list: list = []
trust_remote_code: bool = False
num_samples: Optional[int] = None
hf_dataset_split: Optional[str] = "test"
hf_token: Optional[str] = None
hf_dataset_version: Optional[str] = None
hf_dataset_input_column: Optional[str] = None
hf_dataset_reference_column: Optional[str] = None
hf_dataset_context_column: Optional[str] = None
use_ablation: bool = False
shuffle_input_samples: bool = False
all_agents_generate_first_draft: bool = False
all_agents_generate_draft: bool = False
voting_protocols_with_alterations: bool = False
calculate_persona_diversity: bool = False
challenge_final_results: bool = False
judge_intervention: Optional[str] = None
judge_metric: Optional[str] = None
judge_endpoint_url: Optional[str] = None
judge_model_name: Optional[str] = None
judge_api_key: str = "-"
judge_always_intervene: bool = False
```
Discussion Parameters:
Response Generators: critical, freetext, reasoning, simple, splitfreetext
Decision Protocols: approval_voting, consensus_voting, cumulative_voting, hybrid_consensus, judge, majority_consensus, ranked_voting, simple_voting, supermajority_consensus, unanimity_consensus
Persona Generators: expert, ipip, mock, nopersona
Discussion Paradigms: collective_refinement, debate, memory, relay, report
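Since these options are passed as plain strings, a typo is easy to miss before a long run starts. The sketch below checks a settings dictionary against the lists above. The helper find_invalid_options is hypothetical, and mapping "Persona Generators" to the agent_generator config key is an assumption based on the shared default value "expert".

```python
# Option names copied from the lists above.
ALLOWED_OPTIONS = {
    "response_generator": {
        "critical", "freetext", "reasoning", "simple", "splitfreetext",
    },
    "decision_protocol": {
        "approval_voting", "consensus_voting", "cumulative_voting",
        "hybrid_consensus", "judge", "majority_consensus", "ranked_voting",
        "simple_voting", "supermajority_consensus", "unanimity_consensus",
    },
    # Assumption: persona generators are selected via agent_generator.
    "agent_generator": {"expert", "ipip", "mock", "nopersona"},
    "discussion_paradigm": {
        "collective_refinement", "debate", "memory", "relay", "report",
    },
}


def find_invalid_options(settings: dict[str, str]) -> dict[str, str]:
    """Return the settings whose value is not in the documented list."""
    return {
        key: value
        for key, value in settings.items()
        if key in ALLOWED_OPTIONS and value not in ALLOWED_OPTIONS[key]
    }


settings = {
    "discussion_paradigm": "memory",
    "response_generator": "simple",
    "decision_protocol": "majority_vote",  # deliberate typo for the demo
    "agent_generator": "expert",
}
invalid = find_invalid_options(settings)
print(invalid)
```

Keys that are not discussion parameters (for example max_turns) are ignored by this check rather than rejected.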
Evaluation
We provide some basic evaluation metrics that can be applied directly to the output JSON of MALLM.
Supported metrics: answerability, bertscore, bleu, ifeval, includes_answer, meteor, multichoice, rouge, squad
From terminal:
mallm-evaluate --input_json_file_path=test_out.json --output_json_file_path=test_out_evaluated.json --metrics=[bleu,rouge]
From script:
```py
from mallm.evaluation.evaluator import Evaluator

evaluator = Evaluator(input_file_path="test_out.json", metrics=["bleu", "rouge"], extensive=False)
evaluator.process()
```
Logging
To enable logging, add a handler to the library logger. This can be done with the following code:
```py
import logging

# Configure logging for the library
library_logger = logging.getLogger("mallm")
library_logger.setLevel(logging.INFO)

# Add handlers to the logger
stream_handler = logging.StreamHandler()

# Optionally set a formatter
formatter = logging.Formatter('%(asctime)s - %(name)s - %(levelname)s - %(message)s')
stream_handler.setFormatter(formatter)

# Attach the handler to the logger
library_logger.addHandler(stream_handler)
```
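The same pattern works for other handlers from the standard library. As a small sketch, the snippet below writes the library's log records to a file instead of the console. The file name mallm_run.log and the child logger name mallm.example are arbitrary choices for this example, not MALLM defaults.

```python
import logging

library_logger = logging.getLogger("mallm")
library_logger.setLevel(logging.INFO)

# Send records to a file; the path is an example, not a MALLM default.
file_handler = logging.FileHandler("mallm_run.log", mode="w", encoding="utf-8")
file_handler.setFormatter(
    logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
)
library_logger.addHandler(file_handler)

# Records from any logger below "mallm" in the hierarchy propagate
# up to this handler. "mallm.example" is a made-up name for the demo.
logging.getLogger("mallm.example").info("logging to file is configured")
file_handler.flush()
```

Long batch runs are a natural fit for file logging, because the console output is otherwise lost when the terminal session ends.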
Using the Batch Executor
The batch executor lets you run multiple configurations of the MALLM scheduler in sequence. This is useful for running experiments with different parameters or processing multiple datasets.
Location
- The batch executor script is located in the mallm/scripts folder and is named batch_mallm.py.
- A template for the batch configuration file is provided as batch.json.template in the same folder.
Setup
- Prepare your configuration file:
  - Copy the batch.json.template file and rename it (e.g., my_batch_config.json).
  - Edit the JSON file to define your configurations. The file has four main sections:
    - name: A descriptive name for the batch of runs. This is optional but will be added to the output filename and can help identify the purpose of the batch.
    - repeats: The number of times to repeat each run. This is useful for running multiple trials with the same configuration.
    - common: Contains settings that apply to all runs unless overridden.
    - runs: An array of run-specific configurations.
Example:
```json
{
  "name": "test",
  "repeats": 2,
  "common": {
    "model_name": "gpt-3.5-turbo",
    "max_turns": 10,
    "num_agents": 3
  },
  "runs": [
    {
      "input_json_file_path": "path/to/data1.json",
      "output_json_file_path": "path/to/output1.json",
      "task_instruction_prompt": "Instruction for run 1"
    },
    {
      "input_json_file_path": "path/to/data2.json",
      "output_json_file_path": "path/to/output2.json",
      "task_instruction_prompt": "Instruction for run 2",
      "model_name": "gpt-4",
      "max_turns": 15
    }
  ]
}
```
In this example, the second run overrides the model_name and max_turns settings from the common configuration.
- Ensure all required dependencies are installed.
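Because the batch file is plain JSON, it can also be generated from a script when the runs differ in only one setting. The sketch below varies decision_protocol across three of the protocols listed under Discussion Parameters. The batch name and file paths are placeholders, and placing input_json_file_path in the common section is an assumption based on the description that common settings apply to all runs.

```python
import json

protocols = ["simple_voting", "majority_consensus", "unanimity_consensus"]

batch_config = {
    "name": "protocol_sweep",  # placeholder batch name
    "repeats": 1,
    "common": {
        "input_json_file_path": "path/to/data.json",  # placeholder path
        "task_instruction_prompt": "Paraphrase the input text.",
        "model_name": "gpt-3.5-turbo",
    },
    "runs": [
        {
            # One output file per protocol so results are easy to tell apart.
            "output_json_file_path": f"path/to/output_{protocol}.json",
            "decision_protocol": protocol,
        }
        for protocol in protocols
    ],
}

with open("my_batch_config.json", "w", encoding="utf-8") as handle:
    json.dump(batch_config, handle, indent=2)
```

Generating the file this way keeps the shared settings in one place and avoids copy-paste errors between runs.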
Running the Batch Executor
To run the batch executor, use the following command from the terminal:
mallm-batch path/to/your/batch_config.json
Behavior
- The batch executor will process each run configuration in the order they appear in the JSON file.
- For each run:
  - It will create a Config object by merging the common settings with the run-specific settings.
  - It will then initialize a Scheduler with this configuration and run it.
  - Progress and any errors will be printed to the console.
- If a configuration is invalid or encounters an error during execution, the batch processor will skip to the next run.
- The process continues until all runs have been attempted.
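The behavior described above can be sketched in a few lines. This is an illustration of the merge-and-skip logic, not the code in batch_mallm.py: run_batch and fake_run are hypothetical names, the run itself is replaced by a stub, and treating a failure during any repeat as a reason to skip the rest of that run is an assumption.

```python
from typing import Any, Callable


def run_batch(
    batch: dict[str, Any],
    run_one: Callable[[dict[str, Any]], None],
) -> list[int]:
    """Run every configuration in order; return indices of failed runs."""
    failed: list[int] = []
    common = batch.get("common", {})
    for index, run in enumerate(batch["runs"]):
        # Run-specific keys override keys from the common section.
        merged = {**common, **run}
        try:
            for _ in range(batch.get("repeats", 1)):
                run_one(merged)
        except Exception as error:
            # Report the problem and continue with the next run.
            print(f"Run {index} failed, skipping: {error}")
            failed.append(index)
    return failed


seen: list[dict[str, Any]] = []


def fake_run(config: dict[str, Any]) -> None:
    """Stub standing in for creating a Config and running a Scheduler."""
    if "input_json_file_path" not in config:
        raise ValueError("missing input_json_file_path")
    seen.append(config)


batch = {
    "repeats": 2,
    "common": {"model_name": "gpt-3.5-turbo", "max_turns": 10},
    "runs": [
        {"input_json_file_path": "path/to/data1.json"},
        {"task_instruction_prompt": "no input file"},  # invalid on purpose
        {"input_json_file_path": "path/to/data2.json", "model_name": "gpt-4"},
    ],
}
failed = run_batch(batch, fake_run)
print(failed)
```

The dictionary merge is what makes the second run in the JSON example pick up gpt-4 while still inheriting num_agents from the common section.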
Tips
- Place settings that are common to most or all runs in the common section to reduce repetition.
- Run-specific settings will override common settings if both are specified.
- Always test your configurations individually before running them in a batch to ensure they work as expected.
- Use descriptive output file names to easily identify the results of each run.
- Monitor the console output for any error messages or skipped configurations.
By using the batch executor with common settings, you can easily manage multiple experiments or process various datasets with shared parameters, saving time and reducing the chance of configuration errors.
Contributing
If you want to contribute, please use this pre-commit hook to ensure the same formatting for everyone.
```bash
pip install pre-commit
pre-commit install
```
Testing
You can run unit tests locally:
pytest ./test/
Citation
If you use this repository for your research work, please cite it in the following way.
Coming soon.
Owner
- Name: Multi Agent LLMs
- Login: Multi-Agent-LLMs
- Kind: organization
- Repositories: 1
- Profile: https://github.com/Multi-Agent-LLMs
Organization for MALLM: Multi Agent LLMs
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Becker"
    given-names: "Jonas"
    orcid: "https://orcid.org/0009-0006-6438-1211"
  - family-names: "Kaesberg"
    given-names: "Lars"
    orcid: "https://orcid.org/0009-0002-1686-3743"
  - family-names: "Bauer"
    given-names: "Niklas"
title: "mallm"
version: 0.1.0
date-released: 2024-06-17
url: "https://github.com/Multi-Agent-LLMs/mallm"
GitHub Events
Total
- Create event: 18
- Issues event: 3
- Release event: 12
- Watch event: 21
- Delete event: 25
- Issue comment event: 10
- Push event: 113
- Pull request review comment event: 13
- Pull request review event: 17
- Pull request event: 17
Last Year
- Create event: 18
- Issues event: 3
- Release event: 12
- Watch event: 21
- Delete event: 25
- Issue comment event: 10
- Push event: 113
- Pull request review comment event: 13
- Pull request review event: 17
- Pull request event: 17
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 2
- Total pull requests: 5
- Average time to close issues: 9 months
- Average time to close pull requests: 23 days
- Total issue authors: 1
- Total pull request authors: 2
- Average comments per issue: 0.0
- Average comments per pull request: 0.6
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 5
- Average time to close issues: 9 months
- Average time to close pull requests: 23 days
- Issue authors: 1
- Pull request authors: 2
- Average comments per issue: 0.0
- Average comments per pull request: 0.6
- Merged pull requests: 3
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- lkaesberg (2)
- jpwahle (1)
- jonas-becker (1)
Pull Request Authors
- jonas-becker (8)
- lkaesberg (2)
Packages
- Total packages: 1
- Total downloads: pypi 39 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 5
- Total maintainers: 1
pypi.org: mallm
Multi-Agent Large Language Models for Collaborative Task-Solving.
- Documentation: https://mallm.readthedocs.io/
- License: Apache-2.0
- Latest release: 1.0.5 (published 8 months ago)
Maintainers (1)
Dependencies
- abatilo/actions-poetry v2 composite
- actions/checkout v4 composite
- actions/github-script v6 composite
- actions/setup-python v3 composite
- 129 dependencies
