universal-judge

Universal judge for educational software testing

https://github.com/dodona-edu/universal-judge

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.3%) to scientific vocabulary

Keywords

dodona educational-software judge
Last synced: 6 months ago

Repository

Universal judge for educational software testing

Basic Info
Statistics
  • Stars: 10
  • Watchers: 4
  • Forks: 7
  • Open Issues: 58
  • Releases: 1
Topics
dodona educational-software judge
Created over 6 years ago · Last pushed 9 months ago
Metadata Files
Readme License Citation

README.md

TESTed: universal judge for educational software testing

TESTed is a software test framework to evaluate submissions for programming exercises across multiple programming languages, using a single test suite per exercise.

TESTed is developed by Team Dodona at Ghent University. If you use this software in research, please cite:

  • Strijbol, N., Van Petegem, C., Maertens, R., Sels, B., Scholliers, C., Dawyndt, P., & Mesuere, B. (2023). TESTed—An educational testing framework with language-agnostic test suites for programming exercises. SoftwareX, 22, 101404. doi:10.1016/j.softx.2023.101404

[!IMPORTANT] The documentation below is intended for running TESTed as a standalone tool. If you are looking to create exercises for Dodona, we have more suitable documentation available.

Installing TESTed

TESTed is implemented in Python, but has various dependencies for its language-specific modules.

To work with all these dependencies on different platforms, we use devcontainers: you can use the provided .devcontainer/devcontainer.json to open a container with all dependencies installed.

Modern IDEs like Visual Studio Code and PyCharm support devcontainers out of the box.

If you prefer installing all dependencies on your local machine, they are listed in the dockerfile. The extra development dependencies are listed in the dev-dependencies.sh file.

Running TESTed

TESTed evaluates a submission for a programming exercise based on a test suite that specifies some test cases for the exercise. In what follows, we guide you through the configuration of a simple programming exercise and running TESTed to evaluate a submission using the test suite of the exercise. The directory ./exercise/ in the root directory of TESTed contains some more examples of programming exercises with test suites for TESTed.

1. Create an exercise

Let's configure a simple programming exercise that asks to implement a function echo. The function takes a single argument and returns its argument.

Start by creating a directory for the configuration of the exercise. To keep things simple, we add the exercise to the exercise subdirectory in the root directory of TESTed.

```bash
mkdir exercise/simple-example
```

Note that you would normally not store your exercises in the TESTed repository. We recommend creating a new repository for your exercises.

2. Create a test suite

The next step is to design a test suite that will be used to evaluate submissions for the exercise. Again, to keep things simple, we will only include a single test case in the test suite.

```yaml
- tab: Echo
  testcases:
    - expression: "echo('input-1')"
      return: "input-1"
```

This test suite describes the following tests: we have one tab, named Echo. Inside this tab, there is one test case, in which we call the function echo with the string argument "input-1". The expected output is a return value (again a string) of "input-1". All other output channels use their defaults: for example, no output is allowed on stderr, while stdout is ignored.
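TESTed generates real, language-specific test code from such a suite; purely as an illustration of what this single test case expresses, here is a plain-Python sketch using the echo function from the exercise description:

```python
# Illustration only: the check this test case expresses is "evaluate the
# expression against the submission and compare the return value to the
# expected one". TESTed itself generates language-specific test code.

def echo(argument):  # a correct submission for this exercise
    return argument

expected = "input-1"          # the "return" field of the test case
generated = echo("input-1")   # the "expression" field, evaluated

assert generated == expected, f"expected {expected!r}, got {generated!r}"
print("test case passed")
```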

Put the file containing the test suite in the following location:

```bash
# Create the file
$ touch exercise/simple-example/suite.yaml
```

Now put the content from above in the file.

3. Create some submissions

Now create two Python submissions for the programming exercise. The first one contains a correct solution, and the second one returns the wrong result.

```bash
$ cat exercise/simple-example/correct.py
def echo(argument):
    return argument
$ cat exercise/simple-example/wrong.py
def echo(argument):
    # Oops, this is wrong.
    return argument * 2
```

4. Evaluate the submissions

To evaluate a submission with TESTed, you need to provide a test suite and configuration information. This information can be piped to TESTed via stdin, but to make things easier, we will add the information to a configuration file in the directory of the exercise. In practice, this configuration file would be created by the learning environment in which TESTed is integrated.

```bash
$ cat exercise/simple-example/config.json
{
    "programming_language": "python",
    "natural_language": "en",
    "resources": "exercise/simple-example/",
    "source": "exercise/simple-example/correct.py",
    "judge": ".",
    "workdir": "workdir/",
    "test_suite": "suite.yaml",
    "memory_limit": 536870912,
    "time_limit": 60
}
```

These attributes are used by TESTed:

  • programming_language: programming language of the submission
  • resources: path of a directory with resources TESTed can use
  • source: path of the submission that must be evaluated
  • judge: path of the root directory of TESTed
  • workdir: path of a temporary directory (see below)
  • test_suite: path of the test suite, relative to the resources directory (as defined above)
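In practice, a learning environment integrating TESTed would generate this configuration programmatically. A minimal sketch, using the same paths and values as the example above:

```python
import json

# Build the configuration from above programmatically, as a learning
# environment integrating TESTed might do.
config = {
    "programming_language": "python",
    "natural_language": "en",
    "resources": "exercise/simple-example/",
    "source": "exercise/simple-example/correct.py",
    "judge": ".",
    "workdir": "workdir/",
    "test_suite": "suite.yaml",   # relative to the resources directory
    "memory_limit": 536870912,    # 512 MiB, in bytes
    "time_limit": 60,             # in seconds
}

# Written to the current directory here; the guide stores it in the
# exercise directory instead.
with open("config.json", "w") as fh:
    json.dump(config, fh, indent=2)
```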

Before evaluating a submission, TESTed generates test code in the workdir. Create that directory:

```bash
$ mkdir workdir/
```

The content in this directory stays in place after TESTed finishes its evaluation, so you can inspect the generated test code. Before running TESTed again, you'll need to clear this directory.
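Clearing and recreating the working directory between runs is easy to script; a small sketch using only the standard library (the workdir path is the one from the configuration above):

```python
import shutil
from pathlib import Path

workdir = Path("workdir")

# Remove leftovers from a previous evaluation, then recreate the
# directory so TESTed starts from a clean slate.
if workdir.exists():
    shutil.rmtree(workdir)
workdir.mkdir()
```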

With this command, TESTed will evaluate the submission and generate feedback on stdout.

```bash
$ python -m tested -c exercise/simple-example/config.json
{"command": "start-judgement"}
{"title": "Echo", "command": "start-tab"}
{"command": "start-context"}
{"description": {"description": "echo('input-1')", "format": "python"}, "command": "start-testcase"}
{"expected": "input-1", "channel": "return (String)", "command": "start-test"}
{"generated": "input-1", "status": {"enum": "correct"}, "command": "close-test"}
{"command": "close-testcase"}
{"command": "close-context"}
{"command": "close-tab"}
{"command": "close-judgement"}
```

By default, TESTed generates its feedback on stdout. The feedback is formatted in the JSON Lines text format, meaning that each line contains a JSON object. Here's how you get an overview of all options supported by TESTed:
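Because each feedback line is a standalone JSON object, the stream is straightforward to post-process. A sketch that tallies test statuses from a captured feedback stream (the sample lines and field names are taken from the output shown above):

```python
import json

# A few feedback lines captured from the run above (JSON Lines format:
# one JSON object per line).
feedback = """\
{"command": "start-judgement"}
{"title": "Echo", "command": "start-tab"}
{"expected": "input-1", "channel": "return (String)", "command": "start-test"}
{"generated": "input-1", "status": {"enum": "correct"}, "command": "close-test"}
{"command": "close-judgement"}
"""

# Count the final status of every closed test.
statuses = {}
for line in feedback.splitlines():
    event = json.loads(line)
    if event.get("command") == "close-test":
        status = event["status"]["enum"]
        statuses[status] = statuses.get(status, 0) + 1

print(statuses)  # → {'correct': 1}
```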

```bash
$ python -m tested --help
usage: main.py [-h] [-c CONFIG] [-o OUTPUT] [-v]

The programming-language-agnostic educational test framework.

optional arguments:
  -h, --help            show this help message and exit
  -c CONFIG, --config CONFIG
                        Where to read the config from
  -o OUTPUT, --output OUTPUT
                        Where the judge output should be written to.
  -v, --verbose         Include verbose logs. It is recommended to also use -o
                        in this case.
```

Adjust the configuration file if you want to evaluate the wrong submission.

For reference, the file tested/dsl/schema.json contains the JSON Schema of the test suite format.

Running TESTed locally

The python -m tested command is intended for production use. However, it is not always convenient to create a config.json file for each exercise you want to run.

There are two ways to run TESTed without a config file. The first is:

```bash
# Run a hard-coded exercise with logs enabled, useful for debugging
$ python -m tested.manual
```

This command is useful when debugging TESTed itself or a particularly challenging exercise. It will execute a hardcoded config, which is set in tested/manual.py.

The second way is:

```bash
# Run an exercise with CLI parameters
$ python -m tested.cli --help
usage: cli.py [-h] -e EXERCISE [-s SUBMISSION] [-t TESTSUITE] [-f] [-v] [-d]
              [-p PROGRAMMING_LANGUAGE]

Simple CLI for TESTed

options:
  -h, --help            show this help message and exit
  -e EXERCISE, --exercise EXERCISE
                        Path to a directory containing an exercise
  -s SUBMISSION, --submission SUBMISSION
                        Path to a submission to evaluate
  -t TESTSUITE, --testsuite TESTSUITE
                        Path to a test suite
  -f, --full            If the output should be shown in full (default: false)
  -v, --verbose         If the judge should be verbose in its output (default: false)
  -d, --debug           If the judge should be outputting the debug messages (default: false)
  -p PROGRAMMING_LANGUAGE, --programming_language PROGRAMMING_LANGUAGE
                        The programming language to use

additional information:
  The CLI only looks at a config.json file in the exercise directory. It does
  not look in folders above the exercise directory.
```

This is the "CLI mode": here you can pass various options as command line parameters. For example, for exercises following a standardized directory structure, the path to the exercise folder is often enough.
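To see how few arguments CLI mode needs for a standardized exercise layout, here is a simplified argparse sketch that mirrors the options listed above; it is an illustration, not the actual tested/cli.py, and the default programming language is an assumption of the sketch:

```python
import argparse

# Simplified mirror of the CLI-mode interface shown above. Only the
# exercise path is required; everything else can be derived or defaulted.
parser = argparse.ArgumentParser(description="Simple CLI for TESTed (sketch)")
parser.add_argument("-e", "--exercise", required=True,
                    help="Path to a directory containing an exercise")
parser.add_argument("-s", "--submission", help="Path to a submission to evaluate")
parser.add_argument("-t", "--testsuite", help="Path to a test suite")
parser.add_argument("-p", "--programming_language", default="python",  # sketch default
                    help="The programming language to use")

# For a standardized exercise layout, the exercise path alone is enough.
args = parser.parse_args(["-e", "exercise/simple-example"])
print(args.exercise)  # → exercise/simple-example
```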

TESTed repository

The repository of TESTed is organized as follows:

  • tested: Python code of the actual judge (run by Dodona)
  • tests: unit tests for TESTed

Useful commands

You can run the basic unit tests with:

```bash
pytest tests/test_functionality.py
```

You can run the full test suite with:

```bash
pytest -n auto tests/
```

We use black and isort for code formatting. pyright is used for type checking. You can run them with:

```bash
black ./tested ./tests
isort ./tested ./tests
pyright ./tested ./tests
```

Owner

  • Name: Dodona
  • Login: dodona-edu
  • Kind: organization
  • Email: dodona@ugent.be
  • Location: Ghent, Belgium

Citation (CITATION.cff)

cff-version: 1.2.0
title: >-
  TESTed—An educational testing framework with
  language-agnostic test suites for programming exercises
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Niko
    family-names: Strijbol
    email: niko.strijbol@ugent.be
    affiliation: >-
      Department of Applied Mathematics, Computer Science
      and Statistics, Ghent University
    orcid: 'https://orcid.org/0000-0002-3161-174X'
  - given-names: Charlotte
    family-names: Van Petegem
    email: charlotte.vanpetegem@ugent.be
    affiliation: >-
      Department of Applied Mathematics, Computer Science
      and Statistics, Ghent University
    orcid: 'https://orcid.org/0000-0003-0779-4897'
  - given-names: Rien
    family-names: Maertens
    email: rien.maertens@ugent.be
    affiliation: >-
      Department of Applied Mathematics, Computer Science
      and Statistics, Ghent University
    orcid: 'https://orcid.org/0000-0002-2927-3032'
  - given-names: Boris
    family-names: Sels
    email: boris.sels@gmail.com
    affiliation: >-
      Department of Applied Mathematics, Computer Science
      and Statistics, Ghent University
    orcid: 'https://orcid.org/0000-0002-0870-9554'
  - given-names: Christophe
    family-names: Scholliers
    email: christophe.scholliers@ugent.be
    affiliation: >-
      Department of Applied Mathematics, Computer Science
      and Statistics, Ghent University
    orcid: 'https://orcid.org/0000-0002-2837-4763'
  - given-names: Peter
    family-names: Dawyndt
    orcid: 'https://orcid.org/0000-0002-1623-9070'
    email: peter.dawyndt@ugent.be
    affiliation: >-
      Department of Applied Mathematics, Computer Science
      and Statistics, Ghent University
  - given-names: Bart
    family-names: Mesuere
    email: bart.mesuere@ugent.be
    orcid: 'https://orcid.org/0000-0003-0610-3441'
    affiliation: >-
      Department of Applied Mathematics, Computer Science
      and Statistics, Ghent University
identifiers:
  - type: doi
    value: 10.1016/j.softx.2023.101404
  - type: url
    value: https://www.sciencedirect.com/science/article/pii/S2352711023001000
repository-code: 'https://github.com/dodona-edu/universal-judge'
url: 'https://docs.dodona.be/en/tested'
license: MIT

GitHub Events

Total
  • Create event: 21
  • Issues event: 36
  • Watch event: 2
  • Delete event: 15
  • Member event: 1
  • Issue comment event: 109
  • Push event: 404
  • Pull request review comment event: 188
  • Pull request event: 34
  • Pull request review event: 170
  • Fork event: 1
Last Year
  • Create event: 21
  • Issues event: 36
  • Watch event: 2
  • Delete event: 15
  • Member event: 1
  • Issue comment event: 109
  • Push event: 404
  • Pull request review comment event: 188
  • Pull request event: 34
  • Pull request review event: 170
  • Fork event: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 81
  • Total pull requests: 38
  • Average time to close issues: 4 months
  • Average time to close pull requests: 12 days
  • Total issue authors: 7
  • Total pull request authors: 6
  • Average comments per issue: 1.44
  • Average comments per pull request: 0.21
  • Merged pull requests: 33
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 13
  • Pull requests: 5
  • Average time to close issues: 5 months
  • Average time to close pull requests: 8 days
  • Issue authors: 6
  • Pull request authors: 3
  • Average comments per issue: 0.46
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • niknetniko (56)
  • pdawyndt (33)
  • bsels (12)
  • BrentBlanckaert (6)
  • DieterPi (3)
  • jorg-vr (3)
  • tibvdm (2)
  • MaybeJustJames (1)
  • BertGuillemyn (1)
Pull Request Authors
  • bsels (28)
  • niknetniko (28)
  • jorg-vr (10)
  • BrentBlanckaert (10)
  • pdawyndt (3)
  • dependabot[bot] (2)
  • Bond-009 (2)
  • DieterPi (1)
  • JSteegmans (1)
  • tibvdm (1)
  • JeroenUH (1)
Top Labels
Issue Labels
bug (18) enhancement (12) programming language (7) dsl (5) chore (3)
Pull Request Labels
bug (4) chore (4) programming language (3) dependencies (2) run tests (1) enhancement (1)

Dependencies

.github/workflows/ci.yml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • actions/setup-java v1 composite
  • actions/setup-node v2 composite
  • actions/setup-python v2 composite
  • haskell/actions/setup v1 composite
  • psf/black stable composite
.github/workflows/codeql.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
tested/languages/csharp/templates/dotnet.csproj nuget
Pipfile pypi
  • pytest develop
  • pytest-cov develop
  • pytest-mock develop
  • pytest-xdist develop
  • attrs ==22.2.0
  • cattrs ==23.1.2
  • jinja2 ==3.1.2
  • jsonschema ==4.18.4
  • marko ==2.0.0
  • psutil ==5.9.5
  • pygments ==2.15.1
  • pylint ==2.17.1
  • python-i18n ==0.3.9
  • pyyaml ==6.0
  • typing-inspect ==0.9.0
Pipfile.lock pypi
  • coverage ==7.3.0 develop
  • execnet ==2.0.2 develop
  • iniconfig ==2.0.0 develop
  • packaging ==23.1 develop
  • pluggy ==1.2.0 develop
  • pytest ==7.4.0 develop
  • pytest-cov ==4.1.0 develop
  • pytest-mock ==3.11.1 develop
  • pytest-xdist ==3.3.1 develop
  • astroid ==2.15.6
  • attrs ==22.2.0
  • cattrs ==23.1.2
  • dill ==0.3.7
  • isort ==5.12.0
  • jinja2 ==3.1.2
  • jsonschema ==4.18.4
  • jsonschema-specifications ==2023.7.1
  • lazy-object-proxy ==1.9.0
  • marko ==2.0.0
  • markupsafe ==2.1.3
  • mccabe ==0.7.0
  • mypy-extensions ==1.0.0
  • platformdirs ==3.10.0
  • psutil ==5.9.5
  • pygments ==2.15.1
  • pylint ==2.17.1
  • python-i18n ==0.3.9
  • pyyaml ==6.0
  • referencing ==0.30.2
  • rpds-py ==0.9.2
  • tomlkit ==0.12.1
  • typing-extensions ==4.7.1
  • typing-inspect ==0.9.0
  • wrapt ==1.15.0
pyproject.toml pypi