khatt

Software for annotating manuscripts

https://github.com/CentreForDigitalHumanities/khatt

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.3%) to scientific vocabulary

Keywords

annotation-tool manuscript

Repository

Basic Info
  • Host: GitHub
  • Owner: CentreForDigitalHumanities
  • License: bsd-3-clause
  • Language: TypeScript
  • Default Branch: develop
  • Homepage:
  • Size: 860 KB
Statistics
  • Stars: 1
  • Watchers: 5
  • Forks: 1
  • Open Issues: 8
  • Releases: 1
Created almost 7 years ago · Last pushed over 2 years ago
Metadata Files
Readme License Citation

README.md

KHATT

Knowledge Hyperlinking and Text Transcription - this tool was developed by the Research Software Lab of the Centre for Digital Humanities of Utrecht University, in collaboration with Cornelis van Lit.

This draft gives an overview of the intended interface: KHATT4Interface

To Run

Without Docker

  1. Install Node, Yarn, PostgreSQL and Python 3.6 on your computer.
  2. Start PostgreSQL
  3. To keep your Python packages isolated, install virtualenv
  4. Download this directory (using git clone or downloading a zip and extracting it)
  5. Navigate to {name-of-topdirectory}/backend
  6. Run python bootstrap.py
  7. Follow the instructions of the script: setting up a virtual environment, creating the database, creating a superuser
  8. Enable your virtual environment (in the typical case, by running source activate .env)
  9. Navigate to {name-of-topdirectory}
  10. Run yarn. This will install all frontend and backend dependencies. Grab a cup of coffee, tea or other beverage of your choice.
  11. Run yarn start. This will start the server.
  12. Open your browser. At localhost:8000, you should be able to see the application.
  13. As there are no books in the database yet, create them. Navigate to localhost:8000/api/books/ (NB: don't omit the last slash!)
  14. There is a form at the bottom of the page which allows you to define a book with title and author.
  15. Go to localhost:8000/upload to upload a manuscript page for the book (for now: .jpg, .png only).
  16. Start marking and annotating
  17. To download, go to localhost:8000/download. By clicking the download link, you will get all annotations in the database which have been marked as complete, with information about the associated book, manuscript and page.
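Steps 13 and 14 could also be scripted instead of using the form in the browser. The sketch below is a minimal illustration under assumptions: it assumes the endpoint accepts a JSON POST with `title` and `author` keys (the README only says the form has a title and an author) and that no authentication or CSRF token is needed, which may not hold for your setup. The helper name `build_book_request` is hypothetical.

```python
import json
import urllib.request


def build_book_request(base_url, title, author):
    """Build (but do not send) a POST request that would create a book.

    Assumption: the API accepts JSON with "title" and "author" keys.
    """
    # Keep the trailing slash: the steps above warn not to omit it.
    url = base_url.rstrip("/") + "/api/books/"
    payload = json.dumps({"title": title, "author": author}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_book_request("http://localhost:8000", "Example title", "Example author")
print(request.get_method(), request.full_url)
```

To actually create the book, pass the request to `urllib.request.urlopen` while the server from step 11 is running.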

With Docker (requires 10 GB+ hard disk space)

  1. NB: does not run on all Windows licenses, please check this link
  2. Download Docker
  3. Download this directory (using git clone or downloading a zip and extracting it)
  4. Navigate to {name-of-your-khatt-directory}
  5. Run docker-compose up. This will build and start the application. Wait until all four containers (frontend, db, backend, nginx) have been started. Grab a cup of coffee in the meantime.
  6. As there are no books in the database yet, create them. Navigate to localhost/api/books/ (NB: don't omit the last slash!)
  7. There is a form at the bottom of the page which allows you to define a book with title and author.
  8. Go to localhost/upload to upload a manuscript page for the book (for now: .jpg, .png only).
  9. Start marking and annotating
  10. To download, go to localhost/download. By clicking the download link, you will get all annotations in the database which have been marked as complete, with information about the associated book, manuscript and page.
  11. To stop the containers at any point, press ctrl-C.
  12. To start the container again, repeat steps 4 and 5. Startup should go much faster (since the build only needs to be done once).
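The upload step accepts only .jpg and .png files for now. A small pre-check such as the following sketch could filter file names before uploading. The helper name is hypothetical, and matching extensions case-insensitively is an assumption, since the README does not say how the server compares them.

```python
from pathlib import Path

# Extensions named in the upload step above.
SUPPORTED_EXTENSIONS = {".jpg", ".png"}


def is_supported_upload(filename):
    """Return True if the file name ends in one of the supported extensions."""
    # Assumption: upper-case extensions such as ".JPG" are treated like ".jpg".
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported_upload("folio_12r.JPG")` is true under this assumption, while `is_supported_upload("scan.tiff")` is false.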

For developers

You need to install the following software:

  • PostgreSQL >= 9.3, client, server and C libraries
  • Python >= 3.4, <= 3.7
  • virtualenv
  • WSGI-compatible webserver (deployment only)
  • Visual C++ for Python (Windows only)
  • Node.js >= 8
  • Yarn
  • WebDriver for at least one browser (only for functional testing)
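The Python requirement above can be checked before setting up the virtualenv. This is a minimal sketch, not part of the project: the function name is hypothetical, and it reads "<= 3.7" as including every 3.7.x patch release, which is an assumption.

```python
import sys

# Bounds taken from the requirements list above.
MIN_PYTHON = (3, 4)
MAX_PYTHON = (3, 7)


def python_version_supported(version_info=None):
    """Check a (major, minor, ...) version against the documented range."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only major and minor, so 3.7.5 still counts as "<= 3.7".
    major_minor = tuple(version_info[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON
```

Calling `python_version_supported()` without arguments checks the interpreter that runs the script.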

Architecture

This project integrates three isolated subprojects, each inside its own subdirectory with its own code, package dependencies and tests:

  • backend: the server side web application based on Django and DRF

  • frontend: the client side web application based on Angular

  • functional-tests: the functional test suite based on Selenium and pytest

Each subproject is configurable from the outside. Integration is achieved using "magic configuration" which is contained inside the root directory together with this README. In this way, the subprojects can stay truly isolated from each other.

If you are reading this README, you'll likely be working with the integrated project as a whole rather than with one of the subprojects in isolation. In this case, this README should be your primary source of information on how to develop or deploy the project. However, we recommend that you also read the "How it works" section in the README of each subproject.

Quickstart

First time after cloning this project:

```console
$ yarn
```

Running the application in development mode (hit ctrl-C to stop):

```console
$ yarn start
```

This will run the backend and frontend applications, as well as their unittests, and watch all source files for changes. You can visit the frontend on http://localhost:8000/, the browsable backend API on http://localhost:8000/api/ and the backend admin on http://localhost:8000/admin/. On every change, unittests rerun, frontend code rebuilds and open browser tabs refresh automatically (livereload).

Recommended order of development

For each new feature, we suggest that you work through the steps listed below. This could be called a back-to-front or "bottom up" order. Of course, you may have reasons to choose otherwise. For example, if very precise specifications are provided, you could move step 8 to the front for a more test-driven approach.

Steps 1–5 also include updating the unittests. Only functions should be tested, especially critical and nontrivial ones.

  1. Backend model changes including migrations.
  2. Backend serializer changes and backend admin changes.
  3. Backend API endpoint changes.
  4. Frontend model changes.
  5. Other frontend unit changes (templates, views, routers, FSMs).
  6. Frontend integration (globals, event bindings).
  7. Run functional tests, repair broken functionality and broken tests.
  8. Add functional tests for the new feature.
  9. Update technical documentation.

For release branches, we suggest the following checklist.

  1. Bump the version number in the package.json next to this README.
  2. Run the functional tests in production mode, fix bugs if necessary.
  3. Try using the application in production mode, look for problems that may have escaped the tests.
  4. Add regression tests (unit or functional) that detect problems from step 3.
  5. Work on the code until new regression tests from step 4 pass.
  6. Optionally, repeat steps 2–5 with the application running in a real deployment setup (see Deployment).

Commands for common tasks

The package.json next to this README defines several shortcut commands to help streamline development. In total, there are over 30 commands. Most may be regarded as implementation details of other commands, although each command could be used directly. Below, we discuss the commands that are most likely to be useful to you. For full details, consult the package.json.

Install the pinned versions of all package dependencies in all subprojects:

```console
$ yarn
```

Run backend and frontend in production mode:

```console
$ yarn start-p
```

Run the functional test suite:

```console
$ yarn test-func [FUNCTIONAL TEST OPTIONS]
```

The functional test suite by default assumes that you have the application running locally in production mode (i.e., on port 4200). See Configuring the browsers and Configuring the base address in functional-tests/README for options.

Run all tests (mostly useful for continuous integration):

```console
$ yarn test [FUNCTIONAL TEST OPTIONS]
```

Run an arbitrary command from within the root of a subproject:

```console
$ yarn back [ARBITRARY BACKEND COMMAND HERE]
$ yarn front [ARBITRARY FRONTEND COMMAND HERE]
$ yarn func [ARBITRARY FUNCTIONAL TESTS COMMAND HERE]
```

For example,

```console
$ yarn back less README.md
```

is equivalent to

```console
$ cd backend
$ less README.md
$ cd ..
```
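The same "run a command inside a subproject, then return" pattern can be sketched in Python. This only illustrates the equivalence described above; it is not how the yarn shortcuts are implemented (those are defined in package.json), and the function name is hypothetical.

```python
import subprocess
from pathlib import Path


def run_in_subproject(root, subproject, command):
    """Run `command` (a list of arguments) with `root/subproject` as working directory."""
    workdir = Path(root) / subproject
    # cwd= changes directory only for the child process,
    # so no "cd .." is needed afterwards.
    return subprocess.run(
        command,
        cwd=str(workdir),
        check=True,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )
```

For instance, `run_in_subproject(".", "backend", ["python", "manage.py", "showmigrations"])` would roughly correspond to `yarn django showmigrations`, assuming the virtualenv is active.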

Run python manage.py within the backend directory:

```console
$ yarn django [SUBCOMMAND] [OPTIONS]
```

yarn django is a shorthand for yarn back python manage.py. This command is useful for managing database migrations, among other things.

Manage the frontend package dependencies:

```console
$ yarn fyarn (add|remove|upgrade|...) (PACKAGE ...) [OPTIONS]
```

Notes on Python package dependencies

Both the backend and the functional test suite are Python-based, and package versions are pinned using pip-tools in both subprojects. For ease of development, you most likely want to use the same virtualenv for both; this is also what bootstrap.py assumes.

This comes with a small catch: the subprojects each have their own separate requirements.txt. If you run pip-sync in one subproject, the dependencies of the other will be uninstalled. To avoid this, run pip install -r requirements.txt instead. The yarn command does this correctly by default.
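The catch can be illustrated with a deliberately simplified model: a sync-style install removes everything that is installed but not listed in the requirements file it is given. The sketch below ignores versions and other details of the real tool, and the function name and the shortened package sets are illustrative only.

```python
def removed_by_sync(installed, pinned):
    """Packages a sync-style install would remove: installed but not pinned."""
    return sorted(set(installed) - set(pinned))


# Shortened stand-ins for the two requirements.txt files.
backend = {"django", "djangorestframework", "pytest"}
functional = {"pytest", "selenium"}

# A shared virtualenv contains the packages of both subprojects.
environment = backend | functional

print(removed_by_sync(environment, functional))
# prints ['django', 'djangorestframework']
```

Installing with pip install -r requirements.txt only adds or updates packages, so in this model nothing from the other subproject is removed.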

Another thing to be aware of is that pip-compile takes the old contents of your requirements.txt into account when building the new version based on your requirements.in. You can use the following trick to keep the requirements in both projects aligned, so that the versions of common packages don't conflict:

```console
$ yarn back pip-compile
# append contents of backend/requirements.txt to functional-tests/requirements.txt
$ yarn func pip-compile
```
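To see whether the two requirements.txt files currently disagree on a shared package, a check like the following could be run on their contents. It is a rough sketch, not part of the project: it only understands plain `name==version` lines, and the function names are hypothetical.

```python
def parse_pins(requirements_text):
    """Map package name to pinned version for every `name==version` line."""
    pins = {}
    for line in requirements_text.splitlines():
        # Drop comments and environment markers, then surrounding whitespace.
        line = line.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def conflicting_pins(first_text, second_text):
    """Return {name: (first_version, second_version)} for differing pins."""
    first, second = parse_pins(first_text), parse_pins(second_text)
    return {
        name: (first[name], second[name])
        for name in sorted(first.keys() & second.keys())
        if first[name] != second[name]
    }
```

Given backend pins `attrs==19.3.0` and `pytest==5.2.1` and functional-test pins `attrs==19.1.0` and `pytest==4.5.0`, the result reports both packages as conflicting.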

Development mode vs production mode

The purpose of development mode is to facilitate live development, as the name implies. The purpose of production mode is to simulate deployment conditions as closely as possible, in order to check whether everything still works under such conditions. A complete overview of the differences is given below.

dimension | Development mode | Production mode
-----------|--------------------|-----------------
command | yarn start | yarn start-p
base address | http://localhost:8000 | http://localhost:4200
backend server (Django) | in charge of everything | serves backend only
frontend server (angular-cli) | serves | watch and build
static files | served directly by Django's staticfiles app | collected by Django, served by gulp-connect
backend DEBUG setting | True | False
backend ALLOWED_HOSTS | - | restricted to localhost
frontend sourcemaps | yes | no
frontend optimization | no | yes
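Scripts that need to talk to the running application, for example when pointing a test run at it, have to use the base address that matches the mode. The lookup below is a small sketch that encodes only the command and base address rows of the overview; the names `MODES` and `base_address` are hypothetical.

```python
# Command and base address per mode, copied from the overview above.
MODES = {
    "development": {"command": "yarn start", "base_address": "http://localhost:8000"},
    "production": {"command": "yarn start-p", "base_address": "http://localhost:4200"},
}


def base_address(mode):
    """Return the base address for "development" or "production"."""
    try:
        return MODES[mode]["base_address"]
    except KeyError:
        raise ValueError("unknown mode: {!r}".format(mode)) from None
```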

Deployment

Both the backend and frontend applications have a section dedicated to deployment in their own READMEs. You should read these sections entirely before proceeding. All instructions in these sections still apply, though it is good to know that you can use the following shorthand commands from the integrated project root:

```console
# collect static files of both backend and frontend, with overridden settings
$ yarn django collectstatic --settings SETTINGS --pythonpath path/to/SETTINGS.py
```

You should build the frontend before collecting all static files.

Owner

  • Name: Centre for Digital Humanities
  • Login: CentreForDigitalHumanities
  • Kind: organization
  • Email: cdh@uu.nl
  • Location: Netherlands

Interdisciplinary centre for research and education in computational and data-driven methods in the humanities.

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: khatt
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - name: >-
      Research Software Lab, Centre for Digital Humanities,
      Utrecht University
identifiers:
  - type: doi
    value: 10.5281/zenodo.8325276
repository-code: 'https://github.com/UUDigitalHumanitieslab/khatt'
abstract: Knowledge Hyperlinking and Text Transcription
license: BSD-3-Clause
version: 1.0.0

Dependencies

backend/Dockerfile docker
  • python 3.6 build
frontend/Dockerfile docker
  • node 12.2.0 build
frontend/package-lock.json npm
  • 1076 dependencies
frontend/package.json npm
  • @angular-devkit/build-angular ~0.803.14 development
  • @angular/cli ~8.3.14 development
  • @angular/compiler-cli ~8.2.11 development
  • @angular/language-service ~8.2.11 development
  • @types/jasmine ~3.3.8 development
  • @types/jasminewd2 ~2.0.3 development
  • @types/node ~8.9.4 development
  • codelyzer ^5.0.0 development
  • jasmine-core ~3.4.0 development
  • jasmine-spec-reporter ~4.2.1 development
  • karma ~4.1.0 development
  • karma-chrome-launcher ~2.2.0 development
  • karma-coverage-istanbul-reporter ~2.0.1 development
  • karma-jasmine ~2.0.1 development
  • karma-jasmine-html-reporter ^1.4.0 development
  • protractor ~5.4.0 development
  • ts-node ~7.0.0 development
  • tslint ~5.15.0 development
  • typescript ~3.5.3 development
  • @angular/animations ~8.2.11
  • @angular/common ~8.2.11
  • @angular/compiler ~8.2.11
  • @angular/core ~8.2.11
  • @angular/forms ~8.2.11
  • @angular/platform-browser ~8.2.11
  • @angular/platform-browser-dynamic ~8.2.11
  • @angular/router ~8.2.11
  • @fortawesome/angular-fontawesome ^0.5.0
  • @fortawesome/fontawesome-svg-core ^1.2.24
  • @fortawesome/free-solid-svg-icons ^5.11.1
  • @ngrx/effects ^8.3.0
  • @ngrx/store ^8.3.0
  • @ngx-resource/core ^7.1.3
  • @types/uuid ^3.4.5
  • bulma ^0.7.5
  • core-js ^3.5.0
  • ngx-restangular ^5.0.0
  • primeicons ^2.0.0
  • primeng ^8.0.3
  • rxjs ~6.4.0
  • tslib ^1.10.0
  • uuid ^3.3.3
  • zone.js ~0.9.1
frontend/yarn.lock npm
  • 1080 dependencies
package.json npm
  • @angular/cli >=8 <9 development
yarn.lock npm
  • 242 dependencies
backend/requirements.in pypi
  • django >=3.0a1,<4
  • django-livereload-server *
  • djangorestframework *
  • psycopg2 --no-binary psycopg2 *
  • pytest *
  • pytest-django *
  • pytest-xdist *
backend/requirements.txt pypi
  • apipkg ==1.5
  • asgiref ==3.2.3
  • atomicwrites ==1.3.0
  • attrs ==19.3.0
  • beautifulsoup4 ==4.8.1
  • django ==3.0b1
  • django-livereload-server ==0.3.2
  • djangorestframework ==3.10.3
  • execnet ==1.7.1
  • gunicorn ==20.0.4
  • importlib-metadata ==0.23
  • more-itertools ==7.2.0
  • packaging ==19.2
  • pluggy ==0.13.0
  • psycopg2 ==2.8.4
  • py ==1.8.0
  • pyparsing ==2.4.2
  • pytest ==5.2.1
  • pytest-django ==3.6.0
  • pytest-forked ==1.1.3
  • pytest-xdist ==1.30.0
  • pytz ==2019.3
  • six ==1.12.0
  • soupsieve ==1.9.4
  • sqlparse ==0.3.0
  • tornado ==6.0.3
  • urllib3 ==1.25.6
  • wcwidth ==0.1.7
  • zipp ==0.6.0
functional-tests/requirements.in pypi
  • pytest * test
  • selenium * test
functional-tests/requirements.txt pypi
  • atomicwrites ==1.3.0 test
  • attrs ==19.1.0 test
  • importlib-metadata ==0.15 test
  • more-itertools ==7.0.0 test
  • pluggy ==0.12.0 test
  • py ==1.8.0 test
  • pytest ==4.5.0 test
  • selenium ==3.141.0 test
  • six ==1.12.0 test
  • urllib3 ==1.25.3 test
  • wcwidth ==0.1.7 test
  • zipp ==0.5.1 test