walert

Contains all utility code for 'Behind The Scenes' of Walert.

https://github.com/sachinpc1993/walert

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 4 DOI reference(s) in README
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (9.2%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: sachinpc1993
Language: Python
Default Branch: main
Size: 7.62 MB

Statistics

Stars: 0
Watchers: 3
Forks: 3
Open Issues: 0
Releases: 0

Created almost 3 years ago · Last pushed 10 months ago

Metadata Files

Readme, Citation

Walert - A Conversational Agent

We built Walert, a conversational agent that answers FAQs about the programs of study offered by the School of Computing Technologies at RMIT University. The intent-based version, deployed on an Amazon Echo device, was showcased as a demo at RMIT University’s Open Day in August 2023.

Teaser Video: https://drive.google.com/file/d/1Z2ZRveFYlX96v4ncq4RL-gzNbOlCJYGL/view?usp=sharing

Amazon Echo Demo Link: https://bit.ly/chiir24walertdemovideo

Demo Video Link (Intent-Based version deployed on Amazon Echo Device): https://bit.ly/WalertIntentDemo

Demo Video Link (Retrieval Augmented Generation based version): https://bit.ly/WalertRAGDemo

You can view our poster presented at CHIIR24: Walert Poster

Overall Architecture

Note: This repository contains all utility code for 'Behind The Scenes' of Walert.

The quantitative_eval folder contains all the code and files required to rerun the experiments in the paper.

Evaluation Results

Figure: NDCG for Known and Inferred Questions

Figure: % of unanswered out-of-knowledge-base questions

Figure: BERTScore

Figure: ROUGE-1
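
For readers unfamiliar with the metrics named above, here is a minimal, standard-library-only sketch of how NDCG and ROUGE-1 are commonly computed. It is not the code in quantitative_eval. The dependency list (ranx, pytrec-eval) suggests the experiments rely on established evaluation libraries, but that is an inference. The function names, the linear gain in NDCG, and the whitespace tokenisation in ROUGE-1 are all assumptions made for illustration.

```python
import math
from collections import Counter


def dcg(gains):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1), ranks starting at 1."""
    return sum(g / math.log2(rank + 2) for rank, g in enumerate(gains))


def ndcg_at_k(ranked_ids, relevance, k=10):
    """NDCG@k with linear gain (an assumption; some tools use 2**rel - 1 instead).

    ranked_ids: passage ids in the order the system returned them.
    relevance:  dict mapping passage id -> graded relevance judgment.
    """
    gains = [relevance.get(doc_id, 0) for doc_id in ranked_ids[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0


def rouge1_f1(candidate, reference):
    """ROUGE-1 F1: clipped unigram overlap, using lowercase whitespace tokens (an assumption)."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # per-token minimum of the two counts
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical judgments and ranking, for illustration only.
    judgments = {"p1": 2, "p2": 1}
    print(ndcg_at_k(["p2", "p1", "p3"], judgments, k=3))
    print(rouge1_f1("the cat sat", "the cat sat on the mat"))
```

NDCG rewards rankings that place highly relevant passages near the top, while ROUGE-1 measures word-level overlap between a generated answer and a reference answer. BERTScore, by contrast, compares contextual embeddings and needs a pretrained model, so it is not sketched here.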

Citation

If you use or reference this work, please cite it as follows:

@inproceedings{10.1145/3627508.3638309,
author = {Pathiyan Cherumanal, Sachin and Tian, Lin and Abushaqra, Futoon M. and Magnoss\~{a}o de Paula, Angel Felipe and Ji, Kaixin and Ali, Halil and Hettiachchi, Danula and Trippas, Johanne R. and Scholer, Falk and Spina, Damiano},
title = {Walert: Putting Conversational Information Seeking Knowledge into Action by Building and Evaluating a Large Language Model-Powered Chatbot},
year = {2024},
isbn = {9798400704345},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3627508.3638309},
doi = {10.1145/3627508.3638309},
booktitle = {Proceedings of the 2024 Conference on Human Information Interaction and Retrieval},
pages = {401–405},
numpages = {5},
keywords = {conversational information seeking, large language models, retrieval-augmented generation},
location = {Sheffield, United Kingdom},
series = {CHIIR '24}
}

Owner

Name: Sachin Pathiyan Cherumanal
Login: sachinpc1993
Kind: user
Location: Melbourne, Victoria
Company: RMIT University & Five9 Inc.

Website: https://linktr.ee/sachinpc
Twitter: SachinPC10
Repositories: 2
Profile: https://github.com/sachinpc1993

PhD Student @ RMIT University | Data Scientist @Five9

Citation (CITATION.bib)

@inproceedings{pathiyan2024walert,
author = {Pathiyan Cherumanal, Sachin and Tian, Lin and Abushaqra, Futoon M. and Magnoss\~{a}o de Paula, Angel Felipe and Ji, Kaixin and Ali, Halil and Hettiachchi, Danula and Trippas, Johanne R. and Scholer, Falk and Spina, Damiano},
title = {Walert: Putting Conversational Information Seeking Knowledge into Action by Building and Evaluating a Large Language Model-Powered Chatbot},
year = {2024},
isbn = {9798400704345},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3627508.3638309},
doi = {10.1145/3627508.3638309},
abstract = {Creating and deploying customized applications is crucial for operational success and enriching user experiences in the rapidly evolving modern business world. A prominent facet of modern user experiences is the integration of chatbots or voice assistants. The rapid evolution of Large Language Models (LLMs) has provided a powerful tool to build conversational applications. We present Walert, a customized LLM-based conversational agent able to answer frequently asked questions about computer science degrees and programs at RMIT University. Our demo aims to showcase how conversational information-seeking researchers can effectively communicate the benefits of using best practices to stakeholders interested in developing and deploying LLM-based chatbots. These practices are well-known in our community but often overlooked by practitioners who may not have access to this knowledge. The methodology and resources used in this demo serve as a bridge to facilitate knowledge transfer from experts, address industry professionals’ practical needs, and foster a collaborative environment. The data and code of the demo are available at https://github.com/rmit-ir/walert.},
booktitle = {Proceedings of the 2024 Conference on Human Information Interaction and Retrieval},
pages = {401–405},
numpages = {5},
keywords = {conversational information seeking, large language models, retrieval-augmented generation},
location = {Sheffield, United Kingdom},
series = {CHIIR '24}
}

GitHub Events

Total

Push event: 1

Last Year

Push event: 1

Dependencies

quantitative_eval/requirements.txt pypi

Cython ==3.0.3
Jinja2 ==3.1.2
MarkupSafe ==2.1.3
Pillow ==10.1.0
PyYAML ==6.0.1
Pygments ==2.16.1
annotated-types ==0.5.0
autocast ==0.0.1b1
beautifulsoup4 ==4.12.2
blis ==0.7.11
catalogue ==2.0.10
cbor ==1.0.0
cbor2 ==5.5.0
certifi ==2023.7.22
charset-normalizer ==3.3.0
click ==8.1.7
cloudpathlib ==0.15.1
coloredlogs ==15.0.1
confection ==0.1.3
contourpy ==1.1.1
cramjam ==2.7.0
cycler ==0.12.1
cymem ==2.0.8
faiss ==1.7.4
fastparquet ==2023.8.0
filelock ==3.12.4
flatbuffers ==23.5.26
fonttools ==4.43.1
fsspec ==2023.9.2
huggingface-hub ==0.16.4
humanfriendly ==10.0
idna ==3.4
ijson ==3.2.3
importlib-metadata ==6.8.0
importlib-resources ==6.1.0
inscriptis ==2.3.2
ir-datasets ==0.5.5
joblib ==1.3.2
kiwisolver ==1.4.5
langcodes ==3.3.0
lightgbm ==4.1.0
llvmlite ==0.41.1
lxml ==4.9.3
lz4 ==4.3.2
markdown-it-py ==3.0.0
matplotlib ==3.7.3
mdurl ==0.1.2
mpmath ==1.3.0
murmurhash ==1.0.10
networkx ==3.1
nmslib ==2.1.2
numba ==0.58.1
numpy ==1.24.3
onnxruntime ==1.16.0
orjson ==3.9.9
packaging ==23.2
pandas ==2.0.3
pathy ==0.10.2
pip ==23.2.1
preshed ==3.0.9
protobuf ==4.24.4
psutil ==5.9.5
pyautocorpus ==0.1.12
pydantic ==2.4.2
pydantic_core ==2.10.1
pyjnius ==1.5.0
pyparsing ==3.1.1
pyserini ==0.22.0
python-dateutil ==2.8.2
pytrec-eval ==0.5
pytz ==2023.3.post1
ranx ==0.3.18
regex ==2023.10.3
requests ==2.31.0
rich ==13.6.0
safetensors ==0.3.3
scikit-learn ==1.3.1
scipy ==1.10.1
seaborn ==0.13.0
sentencepiece ==0.1.99
setuptools ==68.0.0
six ==1.16.0
smart-open ==6.4.0
soupsieve ==2.5
spacy ==3.7.1
spacy-legacy ==3.0.12
spacy-loggers ==1.0.5
srsly ==2.4.8
sympy ==1.12
tabulate ==0.9.0
thinc ==8.2.1
threadpoolctl ==3.2.0
tokenizers ==0.14.0
torch ==2.1.0
tqdm ==4.66.1
transformers ==4.34.0
trec-car-tools ==2.6
typer ==0.9.0
typing_extensions ==4.8.0
tzdata ==2023.3
unlzw3 ==0.2.2
urllib3 ==2.0.6
warc3-wet ==0.2.3
warc3-wet-clueweb09 ==0.2.5
wasabi ==1.1.2
weasel ==0.3.2
wheel ==0.37.1
zipp ==3.17.0
zlib-state ==0.1.6

quantitative_eval/src/intent-based/lambda/requirements.txt pypi

ask-sdk-core ==1.11.0
boto3 ==1.9.216
