https://github.com/cyberagentailab/cxsimulator

CXSimulator: A User Behavior Simulation using LLM Embeddings for Web-Marketing Campaign Assessment [Kasuga+, CIKM'24]

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.1%) to scientific vocabulary
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: CyberAgentAILab
  • License: apache-2.0
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 902 KB
Statistics
  • Stars: 1
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme License

README.md

CXSimulator: A User Behavior Simulation using LLM Embeddings for Web-Marketing Campaign Assessment

Akira Kasuga   Ryo Yonetani  

CyberAgent, Inc.  

CIKM 2024

📌 Overview

The CXSimulator framework uses LLMs to represent user behavior events as semantic embeddings and predicts transitions between these events. This makes it possible to simulate user reactions to new campaigns without costly online testing, giving marketers insight into how a campaign is likely to perform.
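To make the idea concrete, here is a minimal sketch of "events as embeddings, then transitions between events". It is not the repository's model: the three-dimensional vectors are hand-written stand-ins for LLM embeddings, the event names are hypothetical, and a softmax over cosine similarity replaces whatever learned transition predictor the framework actually trains.

```python
import math
import random

# Hypothetical events with toy 3-d vectors standing in for LLM embeddings.
EMBEDDINGS = {
    "view: home page": [1.0, 0.1, 0.0],
    "view: product page": [0.7, 0.7, 0.1],
    "click: campaign banner": [0.3, 0.9, 0.3],
    "purchase": [0.1, 0.6, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def transition_probs(current, temperature=0.2):
    """Softmax over similarity to every other event (a stand-in predictor)."""
    others = [e for e in EMBEDDINGS if e != current]
    scores = [cosine(EMBEDDINGS[current], EMBEDDINGS[e]) / temperature for e in others]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return {e: w / total for e, w in zip(others, weights)}


def simulate(start, steps, rng):
    """Random walk over events, sampling each step from transition_probs."""
    path = [start]
    for _ in range(steps):
        probs = transition_probs(path[-1])
        events, weights = zip(*probs.items())
        path.append(rng.choices(events, weights=weights, k=1)[0])
    return path


print(simulate("view: home page", steps=5, rng=random.Random(0)))
```

Adding a new campaign in this toy setting would amount to inserting one more entry into `EMBEDDINGS` and re-running the walk, which is the intuition behind simulating campaigns that have never been shown to users.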

🛠 Prerequisites

| Operating System               | Based on             |
| ------------------------------ | -------------------- |
| Debian GNU/Linux 12 (bookworm) | python:3.10-bookworm |

| Software              | Install                  |
| --------------------- | ------------------------ |
| Python >= 3.10,< 3.12 | -                        |
| Poetry >= 1.8.0       | installer                |
| pre-commit >= 3.8.0   | `pip install pre-commit` |

| Cloud Infrastructure | Link | Summary |
| -------------------- | ---- | ------- |
| Cloud BigQuery | Google Analytics Sample | The dataset provides 12 months (August 2016 to August 2017) of obfuscated Google Analytics 360 data from the Google Merchandise Store, a real ecommerce store that sells Google-branded merchandise. |
| AzureOpenAI | Generate embeddings with Azure OpenAI | An embedding is a special format of data representation that can be easily utilized by machine learning models and algorithms. The embedding is an information dense representation of the semantic meaning of a piece of text. |
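The Python constraint in the software table (`>= 3.10,< 3.12`) can be checked before running `poetry install`. The helper below is a small sketch and is not part of the repository; the function name `python_supported` is made up for illustration.

```python
import sys


def python_supported(version=None):
    """Return True if the (major, minor) version satisfies >=3.10,<3.12."""
    major_minor = tuple((version or sys.version_info)[:2])
    return (3, 10) <= major_minor < (3, 12)


if not python_supported():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor} is outside >=3.10,<3.12")
```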

🔧 Setup

```shell
poetry install
```

🚀 Getting started (Using Cache Data)

Help

```shell
poetry run python -m cxsim --help
poetry run task --list
```

Preprocess and Train

```shell
poetry run task model_using_cache
```

Simulation

```shell
poetry run task simulation_using_cache
```

📊 Execute All Steps

Environment Setting

> [!IMPORTANT]
> Authentication for cloud services is a prerequisite for executing all steps and may incur some costs.

Google Cloud

  1. Enable the BigQuery API in your project.

  2. Install the gcloud CLI.

  3. Log in with Application Default Credentials:

```shell
gcloud auth application-default login
```
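After logging in, one way to confirm that BigQuery access works is to pull a few rows from the public Google Analytics sample. The sketch below is an illustration and not code from this repository: the function names are hypothetical, the default date suffixes are assumed from the "August 2016 to August 2017" period stated in the prerequisites, and running the query requires `google-cloud-bigquery` and may incur cost.

```python
def build_sample_query(start="20160801", end="20170801", limit=10):
    """Build a query that unnests hits from the public GA sample tables."""
    for suffix in (start, end):
        if not (len(suffix) == 8 and suffix.isdigit()):
            raise ValueError(f"expected YYYYMMDD, got {suffix!r}")
    return (
        "SELECT fullVisitorId, visitId, h.hitNumber, h.type, h.page.pagePath\n"
        "FROM `bigquery-public-data.google_analytics_sample.ga_sessions_*`,\n"
        "  UNNEST(hits) AS h\n"
        f"WHERE _TABLE_SUFFIX BETWEEN '{start}' AND '{end}'\n"
        f"LIMIT {int(limit)}"
    )


def run_sample_query(project_id, limit=10):
    """Run the query using Application Default Credentials (may incur cost)."""
    # Imported lazily so the query builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    rows = client.query(build_sample_query(limit=limit)).result()
    return [dict(row) for row in rows]


print(build_sample_query(limit=5))
```

Calling `run_sample_query("your-project-id")` with your own project ID should return a handful of hit records if the API is enabled and the login succeeded.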

Microsoft AzureOpenAI

  1. Copy the template:

     ```bash
     cp ./src/cxsim/config/.env.template ./src/cxsim/config/.env
     ```

  2. Add the following content to the .env file:

```bash
# Azure OpenAI
AZURE_OPENAI_US_ENDPOINT=XXXXXXXX
AZURE_OPENAI_US_VERSION=2024-03-01-preview
AZURE_OPENAI_US_KEY=XXXXXXXX
# Google Cloud
GOOGLE_CLOUD_PROJECT_ID=XXXXXXXX
```
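A forgotten placeholder in the .env file tends to surface only as an authentication error later on, so a quick sanity check can help. The following is a sketch using only the standard library; it is not how the repository loads its configuration, and the helper names are hypothetical. It assumes the four keys listed above are all required and that `XXXXXXXX` marks an unfilled value.

```python
from pathlib import Path

# Keys taken from the .env template content above.
REQUIRED_KEYS = (
    "AZURE_OPENAI_US_ENDPOINT",
    "AZURE_OPENAI_US_VERSION",
    "AZURE_OPENAI_US_KEY",
    "GOOGLE_CLOUD_PROJECT_ID",
)


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(values, placeholder="XXXXXXXX"):
    """Return required keys that are absent, empty, or still a placeholder."""
    return [k for k in REQUIRED_KEYS if values.get(k, "") in ("", placeholder)]


def check_env_file(path="./src/cxsim/config/.env"):
    """Read the .env file and report which required keys still need values."""
    return missing_keys(parse_env(Path(path).read_text(encoding="utf-8")))
```

`check_env_file()` returns an empty list when every key has a real value, and otherwise names the keys that still need attention.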

Preprocess and Train

> [!NOTE]
> Once you've completed `poetry run task model_using_cache`, you can skip this step. In the next step, you'll simulate your campaigns using pre-trained models.

```shell
poetry run task model
```

Simulation

```shell
poetry run task simulation --campaign-title "Enjoy 1 month Free of YouTube Premium for Youtube related Product"
```

If you would like to use a new data period:

```shell
poetry run task simulation_for_new --campaign-title "Enjoy 1 month Free of YouTube Premium for Youtube related Product"
```
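To compare several candidate campaigns, the same command can be driven from a short script. This is a convenience sketch, not part of the repository: it only wraps the `poetry run task ... --campaign-title` invocations shown above, the helper names are hypothetical, and the second title in the usage line is a made-up example. It defaults to a dry run that prints the commands without executing them.

```python
import shlex
import subprocess


def simulation_command(campaign_title, task="simulation"):
    """Build the argument list for one simulation run."""
    return ["poetry", "run", "task", task, "--campaign-title", campaign_title]


def run_campaigns(titles, task="simulation", dry_run=True):
    """Print (dry run) or execute one simulation command per campaign title."""
    commands = [simulation_command(title, task) for title in titles]
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            subprocess.run(cmd, check=True)  # raises if the task fails
    return commands


run_campaigns(
    [
        "Enjoy 1 month Free of YouTube Premium for Youtube related Product",
        "Free shipping on all bags this week",  # hypothetical campaign title
    ]
)
```

Passing `task="simulation_for_new"` targets the new-data-period task, and `dry_run=False` actually launches the runs, which calls the cloud services and may incur cost.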

📄 Citation

```bibtex
@inproceedings{kasuga2024CXSimulator,
  title={CXSimulator: A User Behavior Simulation using LLM Embeddings for Web-Marketing Campaign Assessment},
  author={Akira Kasuga and Ryo Yonetani},
  booktitle={Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (CIKM ’24)},
  year={2024},
  url={https://github.com/CyberAgentAILab/CXSimulator.git},
  doi={https://doi.org/10.1145/3627673.3679894}
}
```

License

This project is licensed under the Apache License 2.0.

Owner

  • Name: CyberAgent AI Lab
  • Login: CyberAgentAILab
  • Kind: organization
  • Location: Japan

GitHub Events

Total
  • Watch event: 10
  • Member event: 2
  • Push event: 1
  • Public event: 1
  • Fork event: 1
Last Year
  • Watch event: 10
  • Member event: 2
  • Push event: 1
  • Public event: 1
  • Fork event: 1

Dependencies

poetry.lock pypi
  • annotated-types 0.7.0
  • anyio 4.6.0
  • cachetools 5.5.0
  • certifi 2024.8.30
  • charset-normalizer 3.3.2
  • colorama 0.4.6
  • db-dtypes 1.3.0
  • distro 1.9.0
  • exceptiongroup 1.2.2
  • google-api-core 2.20.0
  • google-auth 2.35.0
  • google-cloud-bigquery 3.25.0
  • google-cloud-core 2.4.1
  • google-crc32c 1.6.0
  • google-resumable-media 2.7.2
  • googleapis-common-protos 1.65.0
  • grpcio 1.66.1
  • grpcio-status 1.66.1
  • h11 0.14.0
  • httpcore 1.0.5
  • httpx 0.27.2
  • idna 3.10
  • imbalanced-learn 0.12.3
  • jiter 0.5.0
  • joblib 1.4.2
  • lightgbm 4.5.0
  • markdown-it-py 3.0.0
  • mdurl 0.1.2
  • mslex 1.2.0
  • mypy 1.11.2
  • mypy-extensions 1.0.0
  • networkx 3.3
  • numpy 1.26.4
  • openai 1.47.1
  • packaging 24.1
  • pandas 2.2.3
  • proto-plus 1.24.0
  • protobuf 5.28.2
  • psutil 5.9.8
  • pyarrow 17.0.0
  • pyasn1 0.6.1
  • pyasn1-modules 0.4.1
  • pydantic 2.9.2
  • pydantic-core 2.23.4
  • pygments 2.18.0
  • python-dateutil 2.9.0.post0
  • python-dotenv 1.0.1
  • pytz 2024.2
  • requests 2.32.3
  • rich 13.8.1
  • rsa 4.9
  • ruff 0.4.10
  • scikit-learn 1.5.2
  • scipy 1.14.1
  • six 1.16.0
  • sniffio 1.3.1
  • taskipy 1.13.0
  • threadpoolctl 3.5.0
  • tomli 2.0.1
  • tqdm 4.66.5
  • typing-extensions 4.12.2
  • tzdata 2024.2
  • urllib3 2.2.3
pyproject.toml pypi
  • mypy ^1.10.0 develop
  • ruff ^0.4.5 develop
  • db_dtypes ^1.2.0
  • google-cloud-bigquery ^3.20.1
  • imbalanced-learn ^0.12.2
  • lightgbm ^4.3.0
  • networkx ^3.3
  • numpy ^1.26.4
  • openai ^1.14.1
  • pandas ^2.2.1
  • python >=3.10,<3.12
  • python-dotenv ^1.0.1
  • rich ^13.7.1
  • scikit-learn ^1.4.1
  • taskipy ^1.13.0