tscan

T-Scan: an analysis tool for Dutch texts to assess the complexity of the text, based on original work by Rogier Kraf

https://github.com/centrefordigitalhumanities/tscan

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (2.1%) to scientific vocabulary

Keywords

dutch-language feature-extraction nlp text-difficulty
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: CentreForDigitalHumanities
  • License: agpl-3.0
  • Language: C++
  • Default Branch: master
  • Size: 37.2 MB
Statistics
  • Stars: 18
  • Watchers: 10
  • Forks: 7
  • Open Issues: 27
  • Releases: 18
Created almost 11 years ago · Last pushed 9 months ago
Metadata Files
Readme, Changelog, License, Citation, Authors, Codemeta

README

see README.md for more information

Owner

  • Name: Centre for Digital Humanities
  • Login: CentreForDigitalHumanities
  • Kind: organization
  • Email: cdh@uu.nl
  • Location: Netherlands

Interdisciplinary centre for research and education in computational and data-driven methods in the humanities.

Citation (CITATION.cff)

cff-version: 1.2.0
title: T-Scan
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Rogier
    family-names: Kraf
  - given-names: Ko
    name-particle: van der
    family-names: Sloot
    email: ko.vandersloot@let.ru.nl
    affiliation: >-
      Centre for Language and Speech Technology, Radboud
      University
  - given-names: Maarten
    name-particle: van
    family-names: Gompel
    email: proycon@anaproy.nl
    affiliation: >-
      Centre for Language and Speech Technology, Radboud
      University
    orcid: 'https://orcid.org/0000-0002-1046-0006'
  - given-names: Martijn
    name-particle: van der
    family-names: Klis
    email: m.h.vanderklis@uu.nl
    affiliation: 'Institute for Language Sciences, Utrecht University'
  - given-names: Sheean
    family-names: Spoel
    email: s.j.j.spoel@uu.nl
    affiliation: >-
      Research Software Lab, Centre for Digital Humanities,
      Utrecht University
  - given-names: Luka
    name-particle: van der
    family-names: Plas
    email: l.p.vanderplas@uu.nl
    affiliation: >-
      Research Software Lab, Centre for Digital Humanities,
      Utrecht University
    orcid: 'https://orcid.org/0000-0002-0217-7948'
identifiers:
  - type: doi
    value: 10.5281/zenodo.595912
repository-code: 'https://github.com/CentreForDigitalHumanities/tscan'
url: 'https://tscan.hum.uu.nl'
abstract: >-
  T-Scan is an analysis tool for dutch texts to assess the
  complexity of the text
license: AGPL-3.0
references:
  - authors:
    - given-names: Rogier
      family-names: Kraf
    - given-names: Henk
      family-names: Pander Maat
    title: 'Leesbaarheidsonderzoek: oude problemen, nieuwe kansen'
    type: article
    identifiers:
      - type: doi
        value: 10.5117/TVT2009.2.LEES356
  - authors:
    - given-names: Henk
      family-names: Pander Maat
    - given-names: Rogier
      family-names: Kraf
    - given-names: Antal
      name-particle: van den
      family-names: Bosch
    - given-names: Maarten
      name-particle: van
      family-names: Gompel
    - given-names: Ko
      name-particle: van der
      family-names: Sloot
    - given-names: Nick
      family-names: Dekker
    - given-names: Suzanne
      family-names: Kleijn
    - given-names: Ted
      family-names: Sanders
    title: 'T-Scan: a new tool for analyzing Dutch text'
    type: article
    url: 'http://www.clinjournal.org/sites/default/files/05-PanderMaat-etal-CLIN2014.pdf'
  - authors:
    - given-names: Rogier
      family-names: Kraf
    - given-names: Henk
      family-names: Pander Maat
    title: 'Handleiding T-Scan'
    type: article
    url: 'https://github.com/CentreForDigitalHumanities/tscan/raw/master/docs/tscanhandleiding.pdf'
keywords:
  - nlp
  - natural language processing
  - readability
  - feature extraction
  - dutch

CodeMeta (codemeta.json)

{
  "@context": [
    "https://doi.org/10.5063/schema/codemeta-2.0",
    "http://schema.org",
    {
      "entryPoints": {
        "@reverse": "schema:actionApplication"
      },
      "interfaceType": {
        "@id": "codemeta:interfaceType"
      }
    }
  ],
  "@type": "SoftwareSourceCode",
  "identifier": "tscan",
  "name": "T-Scan",
  "version": "0.10.0",
  "description": "T-Scan is an analysis tool for Dutch texts to assess the complexity of the text, and is based on original work by Rogier Kraf",
  "license": "https://opensource.org/licenses/AGPL-3.0",
  "url": "https://github.com/CentreForDigitalHumanities/tscan",
  "producer": [
    {
      "@id": "https://dig.hum.uu.nl",
      "@type": "Organization",
      "name": "Digital Humanities Lab",
      "url": "https://dig.hum.uu.nl",
      "parentOrganization": {
        "@id": "https://www.uu.nl",
        "@type": "Organization",
        "name": "Utrecht University",
        "url": "https://www.uu.nl"
      }
    },
    {
      "@id": "https://www.uu.nl/en/research/utrecht-institute-of-linguistics-ots",
      "@type": "Organization",
      "name": "Utrecht Institute of Linguistics OTS",
      "url": "https://www.uu.nl/en/research/utrecht-institute-of-linguistics-ots",
      "parentOrganization": {
        "@id": "https://www.uu.nl",
        "@type": "Organization",
        "name": "Utrecht University",
        "url": "https://www.uu.nl"
      }
    },
    {
      "@id": "https://www.ru.nl/clst",
      "@type": "Organization",
      "name": "Centre for Language and Speech Technology",
      "url": "https://www.ru.nl/clst",
      "parentOrganization": {
        "@id": "https://www.ru.nl/cls",
        "@type": "Organization",
        "name": "Centre for Language Studies",
        "url": "https://www.ru.nl/cls",
        "parentOrganization": {
          "@id": "https://www.ru.nl",
          "name": "Radboud University",
          "@type": "Organization",
          "url": "https://www.ru.nl",
          "location": {
            "@type": "Place",
            "name": "Nijmegen"
          }
        }
      }
    }
  ],
  "author": [
    {
      "@type": "Person",
      "givenName": "Sheean",
      "familyName": "Spoel",
      "email": "s.j.j.spoel@uu.nl",
      "affiliation": {
        "@id": "https://dig.hum.uu.nl/"
      }
    },
    {
      "@type": "Person",
      "givenName": "Luka",
      "familyName": "van der Plas",
      "email": "l.p.vanderplas@uu.nl",
      "affiliation": {
        "@id": "https://dig.hum.uu.nl/"
      }
    },
    {
      "@type": "Person",
      "givenName": "Martijn",
      "familyName": "van der Klis",
      "email": "m.h.vanderklis@uu.nl",
      "affiliation": {
        "@id": "https://www.uu.nl/en/research/utrecht-institute-of-linguistics-ots"
      }
    },
    {
      "@type": "Person",
      "givenName": "Ko",
      "familyName": "van der Sloot",
      "email": "ko.vandersloot@let.ru.nl",
      "affiliation": {
        "@id": "https://www.ru.nl/clst"
      }
    },
    {
      "@id": "https://orcid.org/0000-0002-1046-0006",
      "@type": "Person",
      "givenName": "Maarten",
      "familyName": "van Gompel",
      "email": "proycon@anaproy.nl",
      "affiliation": {
        "@id": "https://www.ru.nl/clst"
      }
    }
  ],
  "sourceOrganization": {
    "@id": "https://dig.hum.uu.nl"
  },
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "identifier": "c++",
    "name": "C++"
  },
  "operatingSystem": "POSIX",
  "codeRepository": "https://github.com/CentreForDigitalHumanities/tscan",
  "softwareRequirements": [
    {
      "@type": "SoftwareApplication",
      "identifier": "libfolia",
      "name": "libfolia"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "wopr",
      "name": "Wopr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "ucto",
      "name": "ucto"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "frog",
      "name": "frog"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "alpino",
      "name": "Alpino"
    }
  ],
  "funder": [
    {
      "@type": "Organization",
      "name": "Leesbaarheids-Index Nederlands (LIN) (NWO grant)"
    }
  ],
  "readme": "https://github.com/CentreForDigitalHumanities/tscan/blob/master/README.md",
  "issueTracker": "https://github.com/CentreForDigitalHumanities/tscan/issues",
  "contIntegration": "https://github.com/CentreForDigitalHumanities/tscan/actions",
  "releaseNotes": "https://github.com/CentreForDigitalHumanities/tscan/releases",
  "developmentStatus": "active",
  "keywords": [
    "nlp",
    "natural language processing",
    "readability",
    "feature extraction",
    "dutch"
  ],
  "referencePublication": [
    {
      "@type": "TechArticle",
      "name": "Handleiding T-Scan",
      "author": [
        "Henk Pander Maat",
        "Rogier Kraf",
        "Nick Dekker"
      ],
      "url": "https://github.com/CentreForDigitalHumanities/tscan/raw/master/docs/tscanhandleiding.pdf"
    },
    {
      "@id": "http://hdl.handle.net/2066/134833",
      "@type": "ScholarlyArticle",
      "name": "T-Scan: a new tool for analyzing Dutch text",
      "author": [
        "Henk Pander Maat",
        "Rogier Kraf",
        "Antal van den Bosch",
        "Maarten van Gompel",
        "Ko van der Sloot",
        "Nick Dekker",
        "Suzanne Kleijn",
        "Ted Sanders"
      ],
      "pageStart": "53",
      "pageEnd": 74,
      "isPartOf": {
        "@type": "PublicationIssue",
        "datePublished": "2014",
        "name": "Computational Linguistics in the Netherlands Journal",
        "issue": "4",
        "location": "Nijmegen, the Netherlands"
      },
      "url": "http://www.clinjournal.org/sites/default/files/05-PanderMaat-etal-CLIN2014.pdf"
    }
  ],
  "dateCreated": "2012-09-12",
  "entryPoints": [
    {
      "@type": "EntryPoint",
      "name": "tscan",
      "urlTemplate": "file:///tscan",
      "description": "Command-line interface",
      "interfaceType": "CLI"
    }
  ]
}
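
The codemeta.json above is plain JSON, so its fields can be read with any JSON parser. The following is a minimal sketch, not part of the T-Scan repository: it embeds a shortened excerpt of the file (only the `name`, `version`, and `softwareRequirements` fields, copied from the listing above) and lists the declared dependencies. The helper name `requirement_names` is a hypothetical name chosen for illustration.

```python
import json

# Shortened excerpt of the codemeta.json shown above; only the fields
# used in this sketch are kept.
CODEMETA_EXCERPT = """
{
  "name": "T-Scan",
  "version": "0.10.0",
  "softwareRequirements": [
    {"@type": "SoftwareApplication", "identifier": "libfolia", "name": "libfolia"},
    {"@type": "SoftwareApplication", "identifier": "wopr", "name": "Wopr"},
    {"@type": "SoftwareApplication", "identifier": "ucto", "name": "ucto"},
    {"@type": "SoftwareApplication", "identifier": "frog", "name": "frog"},
    {"@type": "SoftwareApplication", "identifier": "alpino", "name": "Alpino"}
  ]
}
"""


def requirement_names(codemeta):
    """Return the display names listed under softwareRequirements.

    Hypothetical helper for illustration; returns an empty list when
    the key is absent.
    """
    return [req["name"] for req in codemeta.get("softwareRequirements", [])]


meta = json.loads(CODEMETA_EXCERPT)
print(meta["name"], meta["version"])
print(requirement_names(meta))
```

To run this against the full file instead of the excerpt, the string can be replaced by the contents of a local copy of codemeta.json, loaded with `json.load`.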

GitHub Events

Total
  • Create event: 3
  • Issues event: 1
  • Watch event: 1
  • Delete event: 3
  • Issue comment event: 3
  • Member event: 2
  • Push event: 9
  • Pull request review event: 1
  • Pull request review comment event: 4
  • Pull request event: 2
  • Fork event: 1
Last Year
  • Create event: 3
  • Issues event: 1
  • Watch event: 1
  • Delete event: 3
  • Issue comment event: 3
  • Member event: 2
  • Push event: 9
  • Pull request review event: 1
  • Pull request review comment event: 4
  • Pull request event: 2
  • Fork event: 1

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 81
  • Total pull requests: 9
  • Average time to close issues: 5 months
  • Average time to close pull requests: about 1 month
  • Total issue authors: 14
  • Total pull request authors: 4
  • Average comments per issue: 1.98
  • Average comments per pull request: 0.22
  • Merged pull requests: 9
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 18 minutes
  • Issue authors: 2
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • oktaal (48)
  • lukavdplas (10)
  • proycon (6)
  • mhkuu (4)
  • janwillemb (3)
  • Duchadian (2)
  • naiaden (1)
  • AntheSevenants (1)
  • JozefienPiersoul (1)
  • peterATixly (1)
  • WillSkywalker (1)
  • kosloot (1)
  • joepfranssen (1)
Pull Request Authors
  • oktaal (6)
  • janwillemb (1)
  • mhkuu (1)
  • Duchadian (1)
Top Labels
Issue Labels
  • bug (12)
  • enhancement (4)
  • wontfix (4)
  • help wanted (1)

Dependencies

webservice/setup.py pypi
  • CLAM *
.github/workflows/cpp.yml actions
  • actions/checkout v2 composite
.github/workflows/webservice.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
Dockerfile docker
  • proycon/lamachine@sha256 8eacbcba4cbd2b73de2148f1353f0661bbbd7db4742b90684cc0ac3449f1774a build