model-quantization-aggregation
Replication package for the paper "Aggregating empirical evidence from data strategies studies: a case on model quantization" published in the 19th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM).
Science Score: 67.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 2 DOI reference(s) in README
- ✓ Academic publication links: links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.4%) to scientific vocabulary
Keywords
Repository
Replication package for the paper "Aggregating empirical evidence from data strategies studies: a case on model quantization" published in the 19th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM).
Basic Info
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 1
Topics
Metadata Files
README.md
model-quantization-aggregation
Replication package for the paper:
"Aggregating empirical evidence from data strategies studies: a case on model quantization" submitted to the 19th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM).
Contents
This replication package contains the following components:
Data:
- Raw, external, interim, and processed data are stored in the data/ directory.
Source Code:
- Located in the src/ directory, it includes scripts for data processing, analysis, and evidence extraction.
- Key modules:
  - data/papers/entities.py & data/papers/knowledge_extraction.py: Define the structure and data extraction logic for the papers analyzed.
  - data/download.py: Downloads the list of papers from arXiv and merges them with the Scopus list.
  - data/selection/llm.py: Implements the logic for selecting studies using large language models.
Jupyter Notebooks:
- Located in the notebooks/ directory, these notebooks contain the analysis and visualization of the data.
- Notebooks include:
  - 1.0-llm-promt-refinement.ipynb: Refines the prompt for LLMs and the selection of the LLM.
  - 2.0-model-quantization-paper-selection.ipynb: Filters the raw list of papers using the selected LLM (Gemini 2.0).
  - 3.0-final-selection-analysis.ipynb: Analyzes the final selection of papers.
  - 4.0-paper-metadata-analysis.ipynb: Analyzes metadata from the selected papers.
  - 5.0-evidence-analysis.ipynb: Analyzes the evidence extracted from the papers and generates the forest plot.
Documentation:
- data/processed/evidence-diagrams-mapping.md: Links to the evidence diagrams generated during the study.
- data/processed/<paper-key>/metadata.json: Contains the metadata for the specific paper.
- data/processed/<paper-key>/systematic-studies-quality-evaluation.md: Contains the filled quality evaluation form for the specific paper.
Project Structure
The project is organized as follows:
```
data/
    raw/                     <- Contains the original list of papers retrieved from Scopus
    external/                <- Contains the raw data obtained from the selected papers
    interim/                 <- Contains the interim data used in the analysis
    processed/               <- Contains the processed data used in the analysis
        evidence-diagrams-mapping.md  <- Contains links to the evidence diagrams
notebooks/
    1.0-llm-promt-refinement.ipynb
    2.0-model-quantization-paper-selection.ipynb
    3.0-second-selection-analysis.ipynb
    4.0-paper-metadata-analysis.ipynb
    5.0-evidence-analysis.ipynb
reports/
    figures/
src/
    data/
        papers/              <- Contains the logic for extracting and analyzing data from papers
            entities.py
            knowledge_extraction.py
        download.py
        selection/           <- Utility functions for selecting studies using LLMs,
            llm.py              including the prompt
    forestplot/              <- Utility functions for generating the forest plot
        effect_intensity.py  <- Definition of the effect intensity thresholds
    run_evidence_extraction.py
    config.py
.pre-commit-config.yaml
dot-env-template             <- Template for environment variables
requirements.txt             <- List of Python dependencies
uv.lock                      <- Environment lock file
LICENSE
pyproject.toml               <- Project configuration file
README.md
```
Usage Instructions
Setup:
- Clone the repository:
  ```bash
  git clone <repository-url>
  cd model-quantization-aggregation
  ```
- Install dependencies:
  The project is managed with uv. To install the dependencies, run:
  ```bash
  uv sync
  ```
  Alternatively, you can use pip to install the dependencies listed in requirements.txt:
  ```bash
  pip install -r requirements.txt
  ```
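The LLM-based selection step relies on external model APIs (the anthropic and google-genai clients are listed as dependencies), which typically require API keys. A minimal setup sketch, assuming the keys are read from a local .env file created from dot-env-template; the variable names below are illustrative, so check dot-env-template and config.py for the actual names:

```bash
# Copy the environment template and fill in your credentials.
cp dot-env-template .env

# Illustrative entries only -- the actual variable names are defined in
# dot-env-template and the project configuration:
# ANTHROPIC_API_KEY=...   # key for the Anthropic API client
# GOOGLE_API_KEY=...      # key for the Google GenAI client
```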
Getting the Data:
- Run the download script to fetch the list of papers from arXiv and merge it with the Scopus list:
  ```bash
  python src/data/download.py
  ```
- We do not provide the raw data from the selected papers to prevent potential copyright issues. However, we provide instructions on how to obtain the data in each paper's README file, located in the data/external/ directory.
Extracting the evidence:
- Use the run_evidence_extraction.py module to extract the evidence from the selected papers.
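A minimal invocation sketch, assuming the module is run from the repository root and takes no mandatory command-line arguments (they are not documented here):

```bash
# Run the evidence extraction inside the uv-managed environment.
uv run python src/run_evidence_extraction.py
```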
Explore the data with Jupyter Notebooks:
- Open the Jupyter notebooks in the notebooks/ directory to explore the data and analysis.
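For example, you can start JupyterLab (installed as part of the project dependencies) from the repository root and work through the notebooks in numbered order; any other Jupyter launcher works equally well:

```bash
# Launch JupyterLab from the repository root, then open the files under notebooks/.
uv run jupyter lab
```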
Notes
- Ensure all required data is placed in the appropriate directories.
- For any issues or questions, please contact the authors of the paper.
License
This project is licensed under the Apache 2.0 License. See the LICENSE file for details.
Owner
- Name: Santiago del Rey
- Login: santidrj
- Kind: user
- Repositories: 11
- Profile: https://github.com/santidrj
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: >-
  Aggregating empirical evidence from data strategy studies:
  a case on model quantization – Replication package
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Santiago
    family-names: del Rey
    email: santiago.del.rey@upc.edu
    affiliation: Universitat Politècnica de Catalunya
    orcid: 'https://orcid.org/0000-0003-4979-414X'
identifiers:
  - type: doi
    value: 10.5281/zenodo.15850734
    description: The Zenodo link for version 1.0.
  - type: doi
    value: 10.48550/arXiv.2505.00816
    description: The ArXiv deposit of the pre-print.
repository-code: 'https://github.com/santidrj/model-quantization-aggregation'
abstract: >-
  Replication package for the paper "Aggregating empirical
  evidence from data strategies studies: a case on model
  quantization" published in the 19th ACM/IEEE International
  Symposium on Empirical Software Engineering and
  Measurement (ESEM).
keywords:
  - Software Engineering
  - Research Synthesis
  - Structured Synthesis Method
  - Green IN AI
  - Model Quantization
license: Apache-2.0
version: '1.0'
date-released: '2025-07-09'
GitHub Events
Total
- Release event: 1
- Push event: 2
- Create event: 1
Last Year
- Release event: 1
- Push event: 2
- Create event: 1
Dependencies
- altair [all]>=5.5.0
- anthropic >=0.44.0
- fastexcel >=0.14.0
- forestplot >=0.4.1
- google-genai >=1.24.0
- hvplot >=0.11.1
- itables >=2.2.4
- jsonlines >=4.0.0
- jupyter >=1.1.1
- matplotlib >=3.10.3
- numpy >=2.2.0
- pandarallel >=1.6.5
- pandas >=2.2.3
- polars >=1.29.0
- python-dotenv >=1.0.1
- requests >=2.32.3
- scikit-learn >=1.6.1
- seaborn >=0.13.2
- statsmodels >=0.14.4
- tiktoken >=0.8.0
- tqdm >=4.67.1
- xlsxwriter >=3.2.0
- xmltodict >=0.14.2
- altair ==5.5.0
- altair-tiles ==0.4.0
- annotated-types ==0.7.0
- anthropic ==0.49.0
- anyio ==4.9.0
- anywidget ==0.9.18
- appnope ==0.1.4
- argon2-cffi ==23.1.0
- argon2-cffi-bindings ==21.2.0
- arro3-core ==0.4.6
- arrow ==1.3.0
- asttokens ==3.0.0
- async-lru ==2.0.5
- attrs ==25.3.0
- autopep8 ==2.3.2
- babel ==2.17.0
- beautifulsoup4 ==4.13.4
- bleach ==6.2.0
- bokeh ==3.7.2
- cachetools ==5.5.2
- certifi ==2025.1.31
- cffi ==1.17.1
- cfgv ==3.4.0
- charset-normalizer ==3.4.1
- click ==8.1.8
- colorama ==0.4.6
- colorcet ==3.1.0
- comm ==0.2.2
- contourpy ==1.3.2
- cycler ==0.12.1
- debugpy ==1.8.14
- decorator ==5.2.1
- defusedxml ==0.7.1
- deptry ==0.23.0
- dill ==0.4.0
- distlib ==0.3.9
- distro ==1.9.0
- et-xmlfile ==2.0.0
- exceptiongroup ==1.2.2
- executing ==2.2.0
- fastexcel ==0.14.0
- fastjsonschema ==2.21.1
- filelock ==3.18.0
- fonttools ==4.57.0
- forestplot ==0.4.1
- fqdn ==1.5.1
- google-ai-generativelanguage ==0.6.15
- google-api-core ==2.24.2
- google-api-python-client ==2.167.0
- google-auth ==2.39.0
- google-auth-httplib2 ==0.2.0
- google-generativeai ==0.8.5
- googleapis-common-protos ==1.70.0
- grpcio ==1.71.0
- grpcio-status ==1.71.0
- h11 ==0.14.0
- holoviews ==1.20.2
- httpcore ==1.0.8
- httplib2 ==0.22.0
- httpx ==0.28.1
- hvplot ==0.11.2
- identify ==2.6.10
- idna ==3.10
- ipykernel ==6.29.5
- ipython ==8.35.0
- ipywidgets ==8.1.6
- isoduration ==20.11.0
- itables ==2.3.0
- jedi ==0.19.2
- jinja2 ==3.1.6
- jiter ==0.9.0
- joblib ==1.4.2
- json5 ==0.12.0
- jsonlines ==4.0.0
- jsonpointer ==3.0.0
- jsonschema ==4.23.0
- jsonschema-specifications ==2024.10.1
- jupyter ==1.1.1
- jupyter-client ==8.6.3
- jupyter-console ==6.6.3
- jupyter-core ==5.7.2
- jupyter-events ==0.12.0
- jupyter-lsp ==2.2.5
- jupyter-server ==2.15.0
- jupyter-server-terminals ==0.5.3
- jupyterlab ==4.4.0
- jupyterlab-pygments ==0.3.0
- jupyterlab-server ==2.27.3
- jupyterlab-widgets ==3.0.14
- kiwisolver ==1.4.8
- linkify-it-py ==2.0.3
- markdown ==3.8
- markdown-it-py ==3.0.0
- markupsafe ==3.0.2
- matplotlib ==3.10.3
- matplotlib-inline ==0.1.3
- mdit-py-plugins ==0.4.2
- mdurl ==0.1.2
- mercantile ==1.2.1
- mistune ==3.1.3
- narwhals ==1.35.0
- nbclient ==0.10.2
- nbconvert ==7.16.6
- nbformat ==5.10.4
- nbqa ==1.9.1
- nest-asyncio ==1.6.0
- nodeenv ==1.9.1
- notebook ==7.4.3
- notebook-shim ==0.2.4
- numpy ==2.2.5
- openpyxl ==3.1.5
- overrides ==7.7.0
- packaging ==25.0
- pandarallel ==1.6.5
- pandas ==2.2.3
- pandocfilters ==1.5.1
- panel ==1.6.2
- param ==2.2.0
- parso ==0.8.4
- patsy ==1.0.1
- pexpect ==4.9.0
- pillow ==11.2.1
- platformdirs ==4.3.7
- polars ==1.29.0
- pre-commit ==4.2.0
- prometheus-client ==0.21.1
- prompt-toolkit ==3.0.51
- proto-plus ==1.26.1
- protobuf ==5.29.4
- psutil ==7.0.0
- psygnal ==0.12.0
- ptyprocess ==0.7.0
- pure-eval ==0.2.3
- pyarrow ==19.0.1
- pyasn1 ==0.6.1
- pyasn1-modules ==0.4.2
- pycodestyle ==2.13.0
- pycparser ==2.22
- pydantic ==2.11.3
- pydantic-core ==2.33.1
- pygments ==2.19.1
- pyparsing ==3.2.3
- python-dateutil ==2.9.0.post0
- python-dotenv ==1.1.0
- python-json-logger ==3.3.0
- pytz ==2025.2
- pyviz-comms ==3.0.4
- pywin32 ==310
- pywinpty ==2.0.15
- pyyaml ==6.0.2
- pyzmq ==26.4.0
- referencing ==0.36.2
- regex ==2024.11.6
- requests ==2.32.3
- requirements-parser ==0.11.0
- rfc3339-validator ==0.1.4
- rfc3986-validator ==0.1.1
- rpds-py ==0.24.0
- rsa ==4.9.1
- ruff ==0.11.6
- scikit-learn ==1.6.1
- scipy ==1.15.2
- seaborn ==0.13.2
- send2trash ==1.8.3
- setuptools ==79.0.0
- six ==1.17.0
- sniffio ==1.3.1
- soupsieve ==2.7
- stack-data ==0.6.3
- statsmodels ==0.14.4
- terminado ==0.18.1
- threadpoolctl ==3.6.0
- tiktoken ==0.9.0
- tinycss2 ==1.4.0
- tokenize-rt ==6.1.0
- tomli ==2.2.1
- tornado ==6.4.2
- tqdm ==4.67.1
- traitlets ==5.14.3
- types-python-dateutil ==2.9.0.20241206
- types-setuptools ==79.0.0.20250422
- typing-extensions ==4.13.2
- typing-inspection ==0.4.0
- tzdata ==2025.2
- uc-micro-py ==1.0.3
- uri-template ==1.3.0
- uritemplate ==4.1.1
- urllib3 ==2.4.0
- vega-datasets ==0.9.0
- vegafusion ==2.0.2
- virtualenv ==20.30.0
- vl-convert-python ==1.7.0
- wcwidth ==0.2.13
- webcolors ==24.11.1
- webencodings ==0.5.1
- websocket-client ==1.8.0
- widgetsnbextension ==4.0.14
- xlsxwriter ==3.2.3
- xmltodict ==0.14.2
- xyzservices ==2025.1.0
- 194 dependencies