Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.1%) to scientific vocabulary
Repository
GrETEL 5
Basic Info
- Host: GitHub
- Owner: CentreForDigitalHumanities
- License: other
- Language: TypeScript
- Default Branch: develop
- Homepage: http://gretel.hum.uu.nl
- Size: 9.97 MB
Statistics
- Stars: 4
- Watchers: 5
- Forks: 2
- Open Issues: 84
- Releases: 3
Metadata Files
README.md
GrETEL 5
GrETEL stands for Greedy Extraction of Trees for Empirical Linguistics. It is a user-friendly search engine for the exploitation of syntactically annotated corpora or treebanks.
GrETEL is publicly available at https://gretel5.hum.uu.nl
Before you start
You need to install the following software:
- PostgreSQL >= 10, client, server and C libraries
- Python >= 3.8, <= 3.10
- virtualenv
- WSGI-compatible webserver (deployment only)
- Visual C++ for Python (Windows only)
- Node.js >= 14.21.2
- Yarn
- WebDriver for at least one browser (only for functional testing)
- BaseX
- Alpino dependency parser. We recommend using the same version that was used to create the treebanks, so that an example-based search produces the same search structure as the one stored in the database.
- Redis
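The prerequisites above can be checked before bootstrapping. The following is a minimal sketch, not part of the project: the executable names in `REQUIRED_TOOLS` are assumptions about how the tools are installed on your system, and the helper names are hypothetical.

```python
import shutil
import sys

# Assumed executable names; adjust them to match your installation.
REQUIRED_TOOLS = ["psql", "virtualenv", "node", "yarn", "basexserver", "redis-server"]


def python_version_supported(version=None):
    """Return True if the (major, minor) version lies between 3.8 and 3.10."""
    major, minor = (version or sys.version_info)[:2]
    return (3, 8) <= (major, minor) <= (3, 10)


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    if not python_version_supported():
        print("Python version outside the supported range:", sys.version.split()[0])
    for tool in missing_tools():
        print("Not found on PATH:", tool)
```

The check only looks for executables on `PATH`; it does not verify the minimum versions of PostgreSQL or Node.js.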
How it works
This project integrates three isolated subprojects, each inside its own subdirectory with its own code, package dependencies and tests:
- backend: the server side web application based on Django and DRF
- frontend: the client side web application based on Angular
- functional-tests: the functional test suite based on Selenium and pytest
Each subproject is configurable from the outside. Integration is achieved using "magic configuration" which is contained inside the root directory together with this README. In this way, the subprojects can stay truly isolated from each other.
If you are reading this README, you'll likely be working with the integrated project as a whole rather than with one of the subprojects in isolation. In this case, this README should be your primary source of information on how to develop or deploy the project. However, we recommend that you also read the "How it works" section in the README of each subproject.
Development
Quickstart
First time after cloning this project:
```console
python bootstrap.py
```
Check or adjust the backend/gretel/settings.py to make sure it points to the correct location of the BaseX and Alpino server.
Alpino can be started in server mode using:
```console
./alpino.sh
```
BaseX server can be started using:
```console
basexserver -s
```
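Once both servers are started, it can help to confirm that they accept connections at the addresses configured in backend/gretel/settings.py. The sketch below is an illustration under assumptions: 1984 is the BaseX default port, while the Alpino port shown is hypothetical and depends on how the Alpino server was started. The function names are not part of the project.

```python
import socket


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed addresses: replace them with the values from your settings.
SERVICES = {
    "BaseX": ("localhost", 1984),   # BaseX default server port
    "Alpino": ("localhost", 7001),  # hypothetical port, check alpino.sh
}


def check_services(services=SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: is_reachable(host, port) for name, (host, port) in services.items()}
```

Calling `check_services()` returns a dictionary such as `{"BaseX": True, "Alpino": False}`, which points at the server that still needs to be started or reconfigured.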
Celery (used for running tasks in the background) can be started using:
```console
sudo service redis start
cd backend
python -m celery -A gretel.celery worker --loglevel=info -B
```
Running the application in development mode (hit ctrl-C to stop):
```console
nvm use
source .env/bin/activate
yarn start
```
This will run the backend and frontend applications, as well as their unittests, and watch all source files for changes. You can visit the frontend on http://localhost:8000/, the browsable backend API on http://localhost:8000/api/ and the backend admin on http://localhost:8000/admin/. On every change, unittests rerun, frontend code rebuilds and open browser tabs refresh automatically (livereload).
Recommended order of development
For each new feature, we suggest that you work through the steps listed below. This could be called a back-to-front or "bottom up" order. Of course, you may have reasons to choose otherwise. For example, if very precise specifications are provided, you could move step 8 to the front for a more test-driven approach.
Steps 1–5 also include updating the unittests. Only functions should be tested, especially critical and nontrivial ones.
1. Backend model changes including migrations.
2. Backend serializer changes and backend admin changes.
3. Backend API endpoint changes.
4. Frontend model changes.
5. Other frontend unit changes (templates, views, routers, FSMs).
6. Frontend integration (globals, event bindings).
7. Run functional tests, repair broken functionality and broken tests.
8. Add functional tests for the new feature.
9. Update technical documentation.
For release branches, we suggest the following checklist.
1. Bump the version number in the package.json next to this README.
2. Run the functional tests in production mode, fix bugs if necessary.
3. Try using the application in production mode, look for problems that may have escaped the tests.
4. Add regression tests (unit or functional) that detect problems from step 3.
5. Work on the code until new regression tests from step 4 pass.
6. Optionally, repeat steps 2–5 with the application running in a real deployment setup (see Deployment).
Commands for common tasks
The package.json next to this README defines several shortcut commands to help streamline development. In total, there are over 30 commands. Most may be regarded as implementation details of other commands, although each command could be used directly. Below, we discuss the commands that are most likely to be useful to you. For full details, consult the package.json.
Install the pinned versions of all package dependencies in all subprojects:
```console
yarn
```
Run backend and frontend in production mode:
```console
yarn start-p
```
Run the functional test suite:
```console
yarn test-func [FUNCTIONAL TEST OPTIONS]
```
The functional test suite by default assumes that you have the application running locally in production mode (i.e., on port 4200). See Configuring the browsers and Configuring the base address in functional-tests/README for options.
Run all tests (mostly useful for continuous integration):
```console
yarn test [FUNCTIONAL TEST OPTIONS]
```
Run an arbitrary command from within the root of a subproject:
```console
yarn back [ARBITRARY BACKEND COMMAND HERE]
yarn front [ARBITRARY FRONTEND COMMAND HERE]
yarn func [ARBITRARY FUNCTIONAL TESTS COMMAND HERE]
```
For example,
```console
yarn back less README.md
```
is equivalent to
```console
cd backend
less README.md
cd ..
```
Run python manage.py within the backend directory:
```console
yarn django [SUBCOMMAND] [OPTIONS]
```
yarn django is a shorthand for yarn back python manage.py. This command is useful for managing database migrations, among other things.
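The idea behind `yarn back`, `yarn front`, `yarn func` and `yarn django` is to run a command with a subproject directory as the working directory. The sketch below is a Python analogue of that idea, not the implementation used in package.json; the function names are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def run_in_subproject(root, subproject, command):
    """Run command (a list of arguments) from within root/subproject."""
    cwd = Path(root) / subproject
    return subprocess.run(command, cwd=cwd, check=True, capture_output=True, text=True)


def django(root, *args):
    """Analogue of `yarn django`: run manage.py inside the backend directory."""
    return run_in_subproject(root, "backend", [sys.executable, "manage.py", *args])
```

For example, `django(".", "showmigrations")` would correspond to `yarn django showmigrations`, assuming it is called from the project root with the virtualenv active.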
Manage the frontend package dependencies:
```console
yarn fyarn (add|remove|upgrade|...) (PACKAGE ...) [OPTIONS]
```
Notes on Python package dependencies
Both the backend and the functional test suite are Python-based and package versions are pinned using pip-tools in both subprojects. For ease of development, you most likely want to use the same virtualenv for both and this is also what the bootstrap.py assumes.
This comes with a small catch: the subprojects each have their own separate requirements.txt. If you run pip-sync in one subproject, the dependencies of the other will be uninstalled. To avoid this, run pip install -r requirements.txt instead. The yarn command does this correctly by default.
Another thing to be aware of is that pip-compile takes the old contents of your requirements.txt into account when building the new version based on your requirements.in. You can use the following trick to keep the requirements in both projects aligned so the versions of common packages don't conflict:
```console
$ yarn back pip-compile
# append contents of backend/requirements.txt to functional-tests/requirements.txt
$ yarn func pip-compile
```
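The manual "append" step between the two pip-compile runs could be scripted. This is a minimal sketch under the assumption that a plain concatenation is what is wanted; the function name is hypothetical and the project does not ship such a helper.

```python
from pathlib import Path


def append_requirements(source, target):
    """Append the contents of source to target, keeping entries on separate lines."""
    source_text = Path(source).read_text(encoding="utf-8")
    target_path = Path(target)
    existing = target_path.read_text(encoding="utf-8") if target_path.exists() else ""
    # Insert a newline only when the existing file does not already end with one.
    separator = "" if existing == "" or existing.endswith("\n") else "\n"
    target_path.write_text(existing + separator + source_text, encoding="utf-8")
```

From the project root, the call would be `append_requirements("backend/requirements.txt", "functional-tests/requirements.txt")`, followed by `yarn func pip-compile` to regenerate the pinned file.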
Development mode vs production mode
The purpose of development mode is to facilitate live development, as the name implies. The purpose of production mode is to simulate deployment conditions as closely as possible, in order to check whether everything still works under such conditions. A complete overview of the differences is given below.
dimension | Development mode | Production mode
-----------|--------------------|-----------------
command | yarn start | yarn start-p
base address | http://localhost:8000 | http://localhost:4200
backend server (Django) | in charge of everything | serves backend only
frontend server (angular-cli) | serves | watch and build
static files | served directly by Django's staticfiles app | collected by Django, served by gulp-connect
backend DEBUG setting | True | False
backend ALLOWED_HOSTS | - | restricted to localhost
frontend sourcemaps | yes | no
frontend optimization | no | yes
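Some of the rows in the table can be captured in code, for instance when a script needs to know which base address belongs to which mode. The sketch below is illustrative only: the `ModeSettings` class and `settings_for` function are hypothetical and the values are copied from the table, not read from the project's configuration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ModeSettings:
    command: str
    base_address: str
    debug: bool
    allowed_hosts: List[str] = field(default_factory=list)


def settings_for(production: bool) -> ModeSettings:
    """Return the table values for production or development mode."""
    if production:
        return ModeSettings("yarn start-p", "http://localhost:4200", False, ["localhost"])
    # In development mode the table lists no ALLOWED_HOSTS restriction.
    return ModeSettings("yarn start", "http://localhost:8000", True)
```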
Deployment
Both the backend and frontend applications have a section dedicated to deployment in their own READMEs. You should read these sections entirely before proceeding. All instructions in these sections still apply, though it is good to know that you can use the following shorthand commands from the integrated project root:
```console
# collect static files of both backend and frontend, with overridden settings
$ yarn django collectstatic --settings SETTINGS --pythonpath path/to/SETTINGS.py
```
You should build the frontend before collecting all static files.
Notes for users
Only the properties of the first node matched by an XPath variable are returned for analysis. For example:
A user searches for //node[node]. Two variables are found in this query: $node1 = //node and $node2 = $node1[node].
The following sentence would match this query:
node[np] (node[det] node[noun])
The node found for $node1 will then be node[np].
The node found for $node2 will then be node[det]. The properties of node[noun] will not be available for analysis using this query.
When searching for a more specific structure, this is unlikely to occur.
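The first-match behaviour can be reproduced on a toy tree with Python's standard library. This is a sketch for illustration, not GrETEL's own code: the `label` attribute and the `alpino_ds` wrapper are assumptions made for the example, and the child step is written as a plain `node` lookup relative to the first match.

```python
import xml.etree.ElementTree as ET

# Toy sentence mirroring node[np] (node[det] node[noun]).
SENTENCE = """
<alpino_ds>
  <node label="np">
    <node label="det"/>
    <node label="noun"/>
  </node>
</alpino_ds>
"""


def first_match_properties(root):
    """Return the properties of $node1, of $node2, and of the children left out."""
    # $node1: the first node that has a child node.
    node1 = root.find(".//node[node]")
    # $node2: every child of $node1 matches, but only the first one is reported.
    children = node1.findall("node")
    node2 = children[0]
    skipped = [dict(child.attrib) for child in children[1:]]
    return dict(node1.attrib), dict(node2.attrib), skipped


if __name__ == "__main__":
    print(first_match_properties(ET.fromstring(SENTENCE.strip())))
```

Here the properties of the `det` node are reported for `$node2`, while the `noun` node ends up in the skipped list, which corresponds to the situation described above.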
Info
- v5.1.0 June 2025: full replacement of GrETEL 4
- v4.2.0 August 2019: federated search, improved configuration and state management, download results with node properties and again many more fixes.
- v4.1.0 February 2019: Fixed support for GrInded corpora, many more fixes, feature complete replacement of version 3.
- v4.0.2 October 2018: GrETEL 4 release with many bugfixes and improvements.
- v4.0.0 June 2018: First GrETEL 4 release with new interface.
- v3.9.99 November 2017: GrETEL 4 currently under development!
- v3.0.2 July 2017: Show error message if the BaseX server is down
- v3.0. November 2016: GrETEL 3 initial release. Available at http://gretel.ccl.kuleuven.be/gretel3
Branches
- main: official version of GrETEL 5, available at https://gretel5.hum.uu.nl/gretel/
- develop: development version
- gretel2.0: official version of GrETEL 2.0, available at https://gretel.ccl.kuleuven.be/gretel-2.0
Credits
- Liesbeth Augustinus and Vincent Vandeghinste: concept and initial implementation;
- Bram Vanroy: GrETEL 3 improvements and design;
- Martijn van der Klis: initial GrETEL 4 functionality and improvements;
- Sheean Spoel, Gerson Foks and Jelte van Boheemen: additional GrETEL 4 functionality and improvements;
- Ben Bonfil and Tijmen Baarda: additional GrETEL 5 functionality and improvements;
- Jan Odijk: project lead for GrETEL 4 and GrETEL 5 developments;
- Koen Mertens: federated search at Instituut voor de Nederlandse Taal;
- Colleagues at the Centre for Computational Linguistics at KU Leuven and the Centre for Digital Humanities at Utrecht University: feedback.
License
This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC-BY-NC-SA-4.0). See the LICENSE file for license rights and limitations.
Owner
- Name: Centre for Digital Humanities
- Login: CentreForDigitalHumanities
- Kind: organization
- Email: cdh@uu.nl
- Location: Netherlands
- Website: https://cdh.uu.nl/
- Repositories: 39
- Profile: https://github.com/CentreForDigitalHumanities
Interdisciplinary centre for research and education in computational and data-driven methods in the humanities.
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: GrETEL
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - name: 'Research Software Lab, Centre for Digital Humanities, Utrecht University'
    website: 'https://cdh.uu.nl/centre-for-digital-humanities/research-software-lab/'
    city: Utrecht
    country: NL
  - name: 'Centre for Computational Linguistics, KU Leuven'
    website: 'http://www.arts.kuleuven.be/ling/ccl'
    country: BE
    city: Leuven
  - website: 'https://ivdnt.org/'
    name: INT
    country: NL
    city: Leiden
  - affiliation: 'Centre for Computational Linguistics, KU Leuven'
    family-names: Augustinus
    given-names: Liesbeth
  - affiliation: 'INL; Centre for Computational Linguistics, KU Leuven'
    family-names: Vandeghinste
    given-names: Vincent
  - affiliation: 'Centre for Computational Linguistics, KU Leuven'
    family-names: Vanroy
    given-names: Bram
    orcid: 'https://orcid.org/0000-0002-5450-201X'
  - given-names: Jan
    family-names: Odijk
    affiliation: 'Institute for Language Sciences, Utrecht University'
    orcid: 'https://orcid.org/0000-0003-3331-1182'
  - affiliation: 'Centre for Digital Humanities, Utrecht University'
    family-names: Klis
    name-particle: van der
    given-names: Martijn
    orcid: 'https://orcid.org/0000-0003-0008-9028'
  - affiliation: 'Centre for Digital Humanities, Utrecht University'
    family-names: Spoel
    given-names: Sheean
    orcid: 'https://orcid.org/0000-0002-6802-4135'
  - affiliation: 'Centre for Digital Humanities, Utrecht University'
    family-names: Foks
    given-names: Gerson
  - affiliation: Instituut voor de Nederlandse Taal
    family-names: Mertens
    given-names: Koen
  - affiliation: 'Centre for Digital Humanities, Utrecht University'
    family-names: Boheemen
    given-names: Jelte
    name-particle: van
  - affiliation: 'Centre for Digital Humanities, Utrecht University'
    family-names: Baarda
    given-names: Tijmen
    orcid: 'https://orcid.org/0000-0002-2577-4948'
  - affiliation: 'Centre for Digital Humanities, Utrecht University'
    family-names: Bonfil
    given-names: Ben
identifiers:
  - type: doi
    value: 10.5281/zenodo.7152769
repository-code: 'https://github.com/CentreForDigitalHumanities/gretel'
url: 'https://gretel.hum.uu.nl'
abstract: >-
  Search engine for the exploitation of syntactically
  annotated corpora or treebanks
keywords:
  - corpus research
  - natural language processing
  - dutch
  - syntax
  - treebanks
license: CC-BY-NC-SA-4.0
version: 5.1.0
date-released: '2025-06-11'
GitHub Events
Total
- Push event: 10
- Pull request review event: 1
- Create event: 2
Last Year
- Push event: 10
- Pull request review event: 1
- Create event: 2
Committers
Last synced: 10 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Sheean Spoel | s****l@u****l | 465 |
| Bram Vanroy | b****y@h****m | 234 |
| Tijmen Baarda | t****a@g****m | 161 |
| ben | b****l@u****l | 132 |
| Gerson Foks | g****s@g****m | 73 |
| Koen Mertens | k****s@i****g | 43 |
| LiesbethA | L****A | 37 |
| Martijn van der Klis | M****s@u****l | 21 |
| Jelte van Boheemen | j****n@g****m | 20 |
| Jelte van Boheemen | j****n@u****l | 6 |
| root | r****t@l****n | 5 |
| Bram.Vanroy@UGent.be | B****y@U****e | 4 |
| Vincent Vandeghinste | v****t@c****e | 3 |
| Donatas Rasiukevicius | d****s@u****l | 2 |
| KCMertens | k****m@g****m | 1 |
| root | r****t@s****c | 1 |
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 62
- Total pull requests: 38
- Average time to close issues: 2 months
- Average time to close pull requests: about 2 months
- Total issue authors: 7
- Total pull request authors: 4
- Average comments per issue: 0.63
- Average comments per pull request: 0.61
- Merged pull requests: 7
- Bot issues: 0
- Bot pull requests: 27
Past Year
- Issues: 2
- Pull requests: 2
- Average time to close issues: about 1 month
- Average time to close pull requests: 6 days
- Issue authors: 2
- Pull request authors: 2
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- bbonf (26)
- tijmenbaarda (20)
- oktaal (3)
- JessedeDoes (1)
Pull Request Authors
- oktaal (1)
Dependencies
- 1180 dependencies
- @angular-devkit/build-angular ^12.1.0 development
- @angular/cli ^12.1.0 development
- @angular/compiler ^12.1.0 development
- @angular/compiler-cli ^12.1.0 development
- @angular/language-service ^12.1.0 development
- @types/file-saver 2.0.1 development
- @types/jasmine ~3.6.0 development
- @types/jasminewd2 ^2.0.8 development
- @types/jquery ^3.3.31 development
- @types/jqueryui ^1.12.10 development
- @types/libxmljs ^0.18.5 development
- @types/lodash ^4.14.149 development
- ajv ^6.10.2 development
- array-flat-polyfill ^1.0.1 development
- jasmine-core ^3.7.1 development
- jasmine-spec-reporter ~5.0.0 development
- karma ^6.3.16 development
- karma-chrome-launcher ~3.1.0 development
- karma-cli ^2.0.0 development
- karma-coverage-istanbul-reporter ~3.0.3 development
- karma-jasmine ~4.0.1 development
- karma-jasmine-html-reporter ^1.6.0 development
- protractor ~7.0.0 development
- ts-node ~8.3.0 development
- tslint ~6.1.0 development
- typescript 4.2.4 development
- webpack ^5.40.0 development
- @angular/animations ^12.1.0
- @angular/cdk ^12.1.0
- @angular/common ^12.1.0
- @angular/core ^12.1.0
- @angular/forms ^12.1.0
- @angular/material ^12.1.0
- @angular/platform-browser ^12.1.0
- @angular/platform-browser-dynamic ^12.1.0
- @angular/platform-server ^12.1.0
- @angular/router ^12.1.0
- @fortawesome/angular-fontawesome ^0.9.0
- @fortawesome/fontawesome-free ^5.15.4
- @fortawesome/fontawesome-svg-core ^1.3.0
- @fortawesome/free-solid-svg-icons ^6.0.0
- @ng-select/ng-select ^7.0.1
- balloon-css ^1.2.0
- bulma ^0.9.3
- bulma-badge ^3.0.1
- bulma-pageloader ^2.2.0
- core-js ^3.6.2
- fast-xml-parser ^3.15.1
- file-saver ^2.0.2
- jquery ^3.4.1
- jquery-ui ^1.12.1
- jszip ^3.6.0
- lassy-xpath ^0.12.0
- lodash ^4.17.15
- ngx-clipboard ^12.3.0
- pivottable ^2.23.0
- primeicons ^4.1.0
- primeng ^12.0.0
- rxjs ^6.5.4
- tslib ^2.0.0
- xlsx ^0.17.0
- zone.js ~0.11.4
- guzzlehttp/guzzle ^6.1 development
- phpunit/phpunit 3.7.14 development
- altorouter/altorouter 1.1.0
- guzzlehttp/guzzle 6.3.0 development
- guzzlehttp/promises v1.3.1 development
- guzzlehttp/psr7 1.4.2 development
- phpunit/php-code-coverage 1.2.18 development
- phpunit/php-file-iterator 1.4.5 development
- phpunit/php-text-template 1.2.1 development
- phpunit/php-timer 1.0.9 development
- phpunit/php-token-stream 1.2.2 development
- phpunit/phpunit 3.7.14 development
- phpunit/phpunit-mock-objects 1.2.3 development
- psr/http-message 1.0.1 development
- symfony/yaml v2.1.13 development
- altorouter/altorouter v1.1.0
- alpino-query >=2.1.4
- alpino-query ==2.1.4
- lxml ==4.7.1
- actions/checkout v2 composite
- actions/setup-node v2 composite