retorch-llm-rp
Replication package for LLM System testing experimentation
Science Score: 67.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 12 DOI reference(s) in README
- ✓ Academic publication links: links to springer.com, zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (13.2%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
Replication package for 'Exploratory study of the usefulness of LLMs in System testing'
This repository contains the replication package of the paper Software System Testing assisted by Large Language Models: An Exploratory Study, published at the 36th International Conference on Testing Software and Systems (ICTSS 2024).
The replication package comprises the test scripts used to generate the test scenarios and system test code, as well as the different inputs required: the user requirements, the system test cases provided as examples, the test scenarios used as input, and the scenario used as an example. The replication package also provides the different outputs of our exploratory study in the /docs folder; the original raw data is available in the Zenodo repository.
Replication package structure and naming conventions:
The replication package is structured as follows:
- /docs: contains the experimental outputs as well as the experimental baselines mentioned above.
- /llm-rp-expstudy/src/main: contains all the Java scripting code needed to execute the different prompts against the OpenAI API.
- /llm-rp-expstudy/src/main/resources: contains all the inputs required for the prompts (scenarios, test cases, and user requirements) as well as the examples.
The naming conventions are:
- Test scenarios given as output are named using the research question number together with the OpenAI model and prompting technique (e.g., RQ1-TestScenarios-GPT4oCOT).
- System test cases given as output are named using the research question number, followed by a traceability letter (e.g., A, B, C, or D) and the test case requested (e.g., RQ2-B-AccessCourseViewClasses).
- Prompt inputs are named with the word input followed by the type of input (e.g., inputSystemTestCases.txt, inputTestScenarioExample.txt, etc.).
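As an illustration of the conventions above, output file names can be composed mechanically from their parts. The helper class below is hypothetical (it is not part of the replication package); it only encodes the naming patterns described in this section:

```java
// Hypothetical helper that encodes the naming conventions described above.
// Not part of the replication package; for illustration only.
public class OutputNaming {

    // Test-scenario output: RQ number, OpenAI model, prompting technique.
    static String scenarioName(int rq, String model, String technique) {
        return "RQ" + rq + "-TestScenarios-" + model + technique;
    }

    // System-test-case output: RQ number, traceability letter, test case name.
    static String testCaseName(int rq, char traceability, String testCase) {
        return "RQ" + rq + "-" + traceability + "-" + testCase;
    }

    public static void main(String[] args) {
        System.out.println(scenarioName(1, "GPT4o", "COT"));                  // RQ1-TestScenarios-GPT4oCOT
        System.out.println(testCaseName(2, 'B', "AccessCourseViewClasses")); // RQ2-B-AccessCourseViewClasses
    }
}
```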
Experimental Subject
The experimental subject is a real-world application called FullTeaching, used as a demonstrator of the ElasTest EU project. FullTeaching is an education platform composed of several test resources, such as web servers, databases, and multimedia servers, that allows users to create online classrooms and classes and to create and publish class resources.
To the best of our knowledge, FullTeaching has two test suites available in different repositories [1] [2]. The test suite used to generate the raw datasets provided in this replication package is a compilation of the test cases available in these repositories. The test suite is available as version v1.2.0 in the retorch-st-fullteaching GitHub repository.
The user requirements were extracted from the FullTeaching documentation (Fuente Pérez, P. (2017). FullTeaching: Aplicación Web de docencia con videoconferencia.) and translated into English. The Spanish version can be consulted here and the English version here.
Treatment Replication Overview
The process consists of two distinct parts: the generation of test scenarios, performed through a single script (RQ1Experimentation.java), and the generation of the system test cases (RQ2Experimentation.java) using the best test scenarios of the first part. These two parts are detailed below.
- Test Scenarios Generation: this part is performed by a single script that takes the user requirements as input. The output is provided in the resources folder (llm-rp-expstudy/src/main/resources/outputs), named with the version of the model and the prompting strategy used.
- System Test Cases Generation: this part takes the best previously generated test scenarios and several system test cases as input. The script automatically performs a cross-validation, leaving out the closest test case in terms of Levenshtein distance and asking the model to generate its scenario. The output is provided in the resources folder (llm-rp-expstudy/src/main/resources/outputs), named with the version of the model, the prompting strategy used, and the scenario requested.
In both cases, the prompts used are stored in the target folder (llm-rp-expstudy/src/main/resources/outputs) for debugging purposes.
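The Levenshtein-based selection of the closest test case described above could be sketched roughly as follows. This is a minimal illustration, not the actual RQ2Experimentation.java code; the class and method names are hypothetical:

```java
import java.util.List;

// Hypothetical sketch of selecting the closest example test case by
// Levenshtein distance, as described in the text. Not the project code.
public class ClosestTestCaseSelector {

    // Classic dynamic-programming Levenshtein (edit) distance.
    static int levenshtein(String a, String b) {
        int[][] d = new int[a.length() + 1][b.length() + 1];
        for (int i = 0; i <= a.length(); i++) d[i][0] = i;
        for (int j = 0; j <= b.length(); j++) d[0][j] = j;
        for (int i = 1; i <= a.length(); i++) {
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                d[i][j] = Math.min(Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
                        d[i - 1][j - 1] + cost);
            }
        }
        return d[a.length()][b.length()];
    }

    // Returns the candidate with the smallest edit distance to the target.
    static String closest(String target, List<String> candidates) {
        String best = null;
        int bestDist = Integer.MAX_VALUE;
        for (String c : candidates) {
            int dist = levenshtein(target, c);
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> candidates = List.of("AccessCourseViewClasses", "LoginAndJoinClass");
        System.out.println(closest("AccessCourseViewClass", candidates));
    }
}
```

In a leave-one-out setting, the selected closest test case would be held out and the model asked to regenerate its scenario, as the text describes.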
The comparison baseline and how we selected the test cases from the original test suite are described in the Test Scenarios Baseline and Experimental Set-up documents.
Treatment Replication Procedure
To execute the different Java scripts, your system needs to meet the following requirements:
- Install Java and Maven; this experimentation was performed using the following versions:
  - Maven 3.9.7
  - Java 21 LTS
- Create an environment variable CHATGPT_API_KEY with your OpenAI API token.
- Execute the two Java files.
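Assuming the scripts read the token through the standard Java environment API, a minimal sanity check before running the experimentation could look like this. The class and method names are hypothetical and not part of the replication package:

```java
// Hypothetical pre-flight check that the CHATGPT_API_KEY environment
// variable is set before calling the OpenAI API. Not the project code.
public class ApiKeyCheck {

    // Returns true when a non-blank API key value is present.
    static boolean isConfigured(String key) {
        return key != null && !key.isBlank();
    }

    public static void main(String[] args) {
        String key = System.getenv("CHATGPT_API_KEY");
        if (!isConfigured(key)) {
            System.err.println("CHATGPT_API_KEY is not set; aborting.");
            System.exit(1);
        }
        System.out.println("API key found; proceeding.");
    }
}
```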
Replication procedure outputs
The outputs of the replication procedure are the following:
- Test Scenarios:
- System Test Cases (each file contains 4o/4o-mini and both prompting techniques):
Contributing
See the general contribution policies and guidelines for giis-uniovi at CONTRIBUTING.md.
Contact
Contact any of the researchers who authored the paper; their affiliation and contact information are provided in the paper itself.
Citing this work
- Cristian Augusto, Jesús Morán, Antonia Bertolino, Claudio de la Riva, and Javier Tuya, "Software System Testing assisted by Large Language Models: An Exploratory Study," in 36th International Conference on Testing Software and Systems (ICTSS 2024), London (UK), 2025, LNCS 15383, pp. 239–255, Springer Cham. https://doi.org/10.1007/978-3-031-80889-0_17 - Full Paper available - Authors version - Download citation
Acknowledgments
This work was supported in part by the project PID2022-137646OB-C32 under Grant MCIN/ AEI/10.13039/501100011033/FEDER, UE, by the Ministry of Science and Innovation (SPAIN) and in part by the project MASE RDS-PTR2224_P2.1 Cybersecurity (Italy).
Owner
- Name: GIIS
- Login: giis-uniovi
- Kind: organization
- Location: Spain
- Website: http://giis.uniovi.es
- Repositories: 17
- Profile: https://github.com/giis-uniovi
Software Engineering Research Group - University of Oviedo, ES
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you cite this replication package, please cite it as below."
authors:
  - family-names: "Augusto"
    given-names: "Cristian"
    orcid: "https://orcid.org/0000-0001-6140-1375"
  - family-names: "Morán"
    given-names: "Jesús"
    orcid: "https://orcid.org/0000-0002-7544-3901"
  - family-names: "Bertolino"
    given-names: "Antonia"
    orcid: "https://orcid.org/0000-0001-8749-1356"
  - family-names: "De la Riva"
    given-names: "Claudio"
    orcid: "https://orcid.org/0000-0001-5592-9683"
  - family-names: "Tuya"
    given-names: "Javier"
    orcid: "https://orcid.org/0000-0002-1091-934X"
title: "Replication package for 'Software System Testing assisted by Large Language Models: An Exploratory Study'"
version: "1.0"
doi: "10.5281/zenodo.13761150"
date-released: "2024-09-20"
url: "https://github.com/giis-uniovi/retorch-llmexpstudy-rp"
preferred-citation:
  type: conference-paper
  authors:
    - family-names: "Augusto"
      given-names: "Cristian"
      orcid: "https://orcid.org/0000-0001-6140-1375"
    - family-names: "Morán"
      given-names: "Jesús"
      orcid: "https://orcid.org/0000-0002-7544-3901"
    - family-names: "Bertolino"
      given-names: "Antonia"
      orcid: "https://orcid.org/0000-0001-8749-1356"
    - family-names: "De la Riva"
      given-names: "Claudio"
      orcid: "https://orcid.org/0000-0001-5592-9683"
    - family-names: "Tuya"
      given-names: "Javier"
      orcid: "https://orcid.org/0000-0002-1091-934X"
  title: "Software System Testing assisted by Large Language Models: An Exploratory Study"
  year: 2025
  collection-title: "Proceedings of the 36th International Conference on Testing Software and Systems"
  conference:
    name: "36th International Conference on Testing Software and Systems"
    location: "Royal National Hotel and King's College London"
    address: "38-51 Bedford Way"
    city: "London"
    region: "London"
    post-code: "WC1H 0DG"
    country: "United Kingdom"
    date-start: "2024-10-30"
    date-end: "2024-11-01"
  pages:
    start: 239
    end: 255
  doi: "10.1007/978-3-031-80889-0_17"
GitHub Events
Total
- Issues event: 1
- Delete event: 1
- Issue comment event: 9
- Push event: 14
- Pull request event: 2
- Create event: 1
Last Year
- Issues event: 1
- Delete event: 1
- Issue comment event: 9
- Push event: 14
- Pull request event: 2
- Create event: 1
Committers
Last synced: over 1 year ago
Top Committers
| Name | | Commits |
|---|---|---|
| Cristian Augusto | 4****n | 2 |
| Javier | 1****a | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 2
- Total pull requests: 2
- Average time to close issues: 3 days
- Average time to close pull requests: about 7 hours
- Total issue authors: 1
- Total pull request authors: 1
- Average comments per issue: 1.0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 2
- Pull requests: 2
- Average time to close issues: 3 days
- Average time to close pull requests: about 7 hours
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 1.0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- augustocristian (2)
Pull Request Authors
- augustocristian (4)