alud

A Go package for deriving Universal Dependencies from Dutch sentences parsed with Alpino

https://github.com/rug-compling/alud

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    2 of 4 committers (50.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (2.4%) to scientific vocabulary
Last synced: 9 months ago

Repository


Basic Info
  • Host: GitHub
  • Owner: rug-compling
  • License: other
  • Language: Go
  • Default Branch: master
  • Homepage:
  • Size: 982 KB
Statistics
  • Stars: 8
  • Watchers: 3
  • Forks: 1
  • Open Issues: 2
  • Releases: 19
Created almost 7 years ago · Last pushed 10 months ago
Metadata Files
Readme Changelog License Codemeta

README.md

Van Alpino naar Universal Dependencies

A Go package for deriving Universal Dependencies from Dutch sentences parsed with Alpino.

GoDoc · Technology Readiness: Stage 4/4 (the technology is complete, stable and deployed in production scenarios for end-users)

Example input:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<alpino_ds version="1.6">
  <node begin="0" cat="top" end="8" id="0" rel="top">
    <node begin="0" cat="smain" end="7" id="1" rel="--">
      <node begin="0" cat="pp" end="3" id="2" rel="mod">
        <node begin="0" end="1" frame="preposition(in,[])" his="normal" his_1="decap" his_1_1="normal" id="3" lcat="pp" lemma="in" pos="prep" postag="VZ(init)" pt="vz" rel="hd" root="in" sense="in" vztype="init" word="In"/>
        <node begin="1" cat="np" end="3" id="4" rel="obj1">
          <node begin="1" end="2" frame="determiner(het,nwh,nmod,pro,nparg,wkpro)" his="normal" his_1="normal" id="5" infl="het" lcat="detp" lemma="het" lwtype="bep" naamval="stan" npagr="evon" pos="det" postag="LID(bep,stan,evon)" pt="lid" rel="det" root="het" sense="het" wh="nwh" word="het"/>
          <node begin="2" end="3" frame="proper_name(sg,'LOC')" genus="onz" getal="ev" graad="basis" his="normal" his_1="names_dictionary" id="6" lcat="np" lemma="Waddengebied" naamval="stan" neclass="LOC" ntype="eigen" num="sg" pos="name" postag="N(eigen,ev,basis,onz,stan)" pt="n" rel="hd" rnum="sg" root="Waddengebied" sense="Waddengebied" word="Waddengebied"/>
        </node>
      </node>
      <node begin="3" end="4" frame="verb(unacc,sg_heeft,copula)" his="normal" his_1="normal" id="7" infl="sg_heeft" lcat="smain" lemma="zijn" pos="verb" postag="WW(pv,tgw,ev)" pt="ww" pvagr="ev" pvtijd="tgw" rel="hd" root="ben" sc="copula" sense="ben" stype="declarative" tense="present" word="is" wvorm="pv"/>
      <node begin="4" cat="np" end="6" id="8" rel="su">
        <node begin="4" end="5" frame="determiner(de)" his="normal" his_1="normal" id="9" infl="de" lcat="detp" lemma="de" lwtype="bep" naamval="stan" npagr="rest" pos="det" postag="LID(bep,stan,rest)" pt="lid" rel="det" root="de" sense="de" word="de"/>
        <node begin="5" end="6" frame="noun(de,count,sg)" gen="de" genus="zijd" getal="ev" graad="basis" his="normal" his_1="normal" id="10" lcat="np" lemma="wind" naamval="stan" ntype="soort" num="sg" pos="noun" postag="N(soort,ev,basis,zijd,stan)" pt="n" rel="hd" rnum="sg" root="wind" sense="wind" word="wind"/>
      </node>
      <node aform="base" begin="6" buiging="zonder" end="7" frame="adjective(no_e(nonadv))" graad="basis" his="normal" his_1="normal" id="11" infl="no_e" lcat="ap" lemma="veranderlijk" pos="adj" positie="vrij" postag="ADJ(vrij,basis,zonder)" pt="adj" rel="predc" root="veranderlijk" sense="veranderlijk" vform="adj" word="veranderlijk"/>
    </node>
    <node begin="7" end="8" frame="punct(punt)" his="normal" his_1="normal" id="12" lcat="punct" lemma="." pos="punct" postag="LET()" pt="let" rel="--" root="." sense="." special="punt" word="."/>
  </node>
  <sentence sentid="knmi.1">In het Waddengebied is de wind veranderlijk .</sentence>
</alpino_ds>
```
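To make the shape of the alpino_ds input more concrete, here is a minimal sketch that reads such a document with Go's standard `encoding/xml` package and lists the words in sentence order. It does not use the alud package itself. The type names `node` and `alpinoDS`, the helper functions, and the cut-down `sample` document are assumptions made for illustration; only a handful of the attributes shown above are mapped.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"sort"
)

// node mirrors a small subset of the attributes on an alpino_ds <node>.
// Only the attributes used below are mapped; all others are ignored.
type node struct {
	Begin int    `xml:"begin,attr"`
	Rel   string `xml:"rel,attr"`
	Word  string `xml:"word,attr"`
	Lemma string `xml:"lemma,attr"`
	Pt    string `xml:"pt,attr"`
	Nodes []node `xml:"node"`
}

// alpinoDS is the document root, holding the top node of the tree.
type alpinoDS struct {
	Top node `xml:"node"`
}

// leaves collects all word-bearing nodes at or below n, in tree order.
func leaves(n node, out []node) []node {
	if n.Word != "" {
		out = append(out, n)
	}
	for _, child := range n.Nodes {
		out = leaves(child, out)
	}
	return out
}

// words parses an alpino_ds document and returns its tokens sorted by
// their begin position, which is the order of the words in the sentence.
func words(doc string) ([]node, error) {
	var ds alpinoDS
	if err := xml.Unmarshal([]byte(doc), &ds); err != nil {
		return nil, err
	}
	toks := leaves(ds.Top, nil)
	sort.Slice(toks, func(i, j int) bool { return toks[i].Begin < toks[j].Begin })
	return toks, nil
}

// sample is a hypothetical, heavily reduced document in the same shape as
// the example input; it is not real Alpino output.
const sample = `<alpino_ds version="1.6">
  <node begin="0" cat="top" end="3" id="0" rel="top">
    <node begin="1" end="2" id="1" lemma="zijn" pt="ww" rel="hd" word="is"/>
    <node begin="0" end="1" id="2" lemma="wind" pt="n" rel="su" word="wind"/>
    <node begin="2" end="3" id="3" lemma="veranderlijk" pt="adj" rel="predc" word="veranderlijk"/>
  </node>
</alpino_ds>`

func main() {
	toks, err := words(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, t := range toks {
		fmt.Printf("%d\t%s\t%s\t%s\t%s\n", t.Begin+1, t.Word, t.Lemma, t.Pt, t.Rel)
	}
}
```

Sorting by `begin` matters because the tree order of an Alpino parse does not have to match the surface order of the words.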

Output (reformatted):

```
# source = knmi.1.xml
# sent_id = knmi.1
# text = In het Waddengebied is de wind veranderlijk.
# auto = ALUD 1
1  In            in            ADP    VZ|init                     _                                    3  case   3:case    _
2  het           het           DET    LID|bep|stan|evon           Definite=Def                         3  det    3:det     _
3  Waddengebied  Waddengebied  PROPN  N|eigen|ev|basis|onz|stan   Gender=Neut|Number=Sing              7  obl    7:obl:in  _
4  is            zijn          AUX    WW|pv|tgw|ev                Number=Sing|Tense=Pres|VerbForm=Fin  7  cop    7:cop     _
5  de            de            DET    LID|bep|stan|rest           Definite=Def                         6  det    6:det     _
6  wind          wind          NOUN   N|soort|ev|basis|zijd|stan  Gender=Com|Number=Sing               7  nsubj  7:nsubj   _
7  veranderlijk  veranderlijk  ADJ    ADJ|vrij|basis|zonder       Degree=Pos                           0  root   0:root    SpaceAfter=No
8  .             .             PUNCT  LET                         _                                    7  punct  7:punct   _
```
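Each token line in the output has the ten CoNLL-U columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). As a rough illustration of how such a line can be consumed, the sketch below splits one line into a small struct. The `token` type and `parseToken` function are hypothetical names, not part of alud, and the sketch only handles ordinary token lines with a numeric HEAD.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// token holds the columns of one CoNLL-U token line that this sketch uses.
type token struct {
	ID, Form, Lemma, UPOS string
	Head                  int
	DepRel                string
}

// parseToken splits one token line into its ten CoNLL-U columns.
// It accepts any run of whitespace as a separator so that it also works on
// the whitespace-separated listing above; real CoNLL-U files use single tabs.
// Lines whose HEAD is not a number (for example "_") are rejected.
func parseToken(line string) (token, error) {
	f := strings.Fields(line)
	if len(f) != 10 {
		return token{}, fmt.Errorf("expected 10 columns, got %d", len(f))
	}
	head, err := strconv.Atoi(f[6])
	if err != nil {
		return token{}, fmt.Errorf("bad HEAD %q: %w", f[6], err)
	}
	return token{ID: f[0], Form: f[1], Lemma: f[2], UPOS: f[3], Head: head, DepRel: f[7]}, nil
}

func main() {
	line := "3 Waddengebied Waddengebied PROPN N|eigen|ev|basis|onz|stan Gender=Neut|Number=Sing 7 obl 7:obl:in _"
	tok, err := parseToken(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s (%s) is %s of token %d\n", tok.Form, tok.UPOS, tok.DepRel, tok.Head)
}
```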

For visualisation of the result, see http://www.let.rug.nl/kleiweg/conllu/

Other functionality

  • Derive Universal Dependencies and insert into the alpino_ds format
  • Insert given Universal Dependencies into the alpino_ds format

Example programs

The package comes with two example programs, alud and alud-dact.

The demo program alud-dact processes dact files. To install it, you need to have Oracle Berkeley DB XML installed.

Install both:

make bin

... or install alud only:

make alud

... or install alud-dact only:

make alud-dact

You will find the programs in one of these directories:

  1. in $GOBIN, if that variable is set
  2. else in $GOPATH/bin, if GOPATH is set
  3. else in ~/go/bin
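The lookup order above can be sketched in Go as follows. This is only an illustration of the three rules, not code from the alud Makefile. The function name `installDir` is hypothetical, and using the first entry when GOPATH lists several directories is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// installDir returns the directory in which the programs are expected,
// following the three-step lookup listed above. getenv is passed in so the
// logic can be exercised without touching the real environment.
func installDir(getenv func(string) string, home string) string {
	if dir := getenv("GOBIN"); dir != "" {
		return dir
	}
	if gopath := getenv("GOPATH"); gopath != "" {
		// Assumption: when GOPATH lists several directories, use the first.
		first := filepath.SplitList(gopath)[0]
		return filepath.Join(first, "bin")
	}
	return filepath.Join(home, "go", "bin")
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(installDir(os.Getenv, home))
}
```

Running the program prints the directory where `alud` and `alud-dact` should end up for the current environment.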

Owner

  • Name: Computational Linguistics, University of Groningen
  • Login: rug-compling
  • Kind: organization
  • Location: Groningen, The Netherlands

CodeMeta (codemeta.json)

{
  "@context": [
    "https://schema.org",
    "https://doi.org/10.5063/schema/codemeta-2.0",
    "https://w3id.org/software-iodata",
    "https://w3id.org/software-types",
    "https://raw.githubusercontent.com/jantman/repostatus.org/master/badges/latest/ontology.jsonld"
  ],
  "@id": "https://www.let.rug.nl/alfa/rdf/alud",
  "@type": "SoftwareSourceCode",
  "name": "alud",
  "description": "A Go package for deriving Universal Dependencies from Dutch sentences parsed with Alpino",
  "sameAs": "https://tools.clariah.nl/alud",
  "dateCreated": "2019-06-30",
  "datePublished": "2019-11-27",
  "dateModified": "2025-10-29",
  "version": "2.14.4",
  "releaseNotes": "https://github.com/rug-compling/alud/blob/master/Changes.txt",
  "codeRepository": "https://github.com/rug-compling/alud.git",
  "readme": "https://github.com/rug-compling/alud/blob/master/README.md",
  "issueTracker": "https://github.com/rug-compling/alud/issues",
  "license": "https://spdx.org/licenses/BSD-2-Clause",
  "developmentStatus": [
    "https://www.repostatus.org/#active",
    "https://w3id.org/research-technology-readiness-levels#Stage4Complete"
  ],
  "applicationCategory": [
    "https://w3id.org/nwo-research-fields#SoftwareForHumanities",
    "https://w3id.org/nwo-research-fields#Linguistics",
    "https://w3id.org/nwo-research-fields#ComputationalLinguisticsandPhilology",
    "https://vocabs.dariah.eu/tadirah/structuralAnalysis"
  ],
  "producer": {
    "@id": "_:cl",
    "@type": "Organization",
    "name": [
      {
        "@value": "Computationele Taalkunde, Faculteit der Letteren, Rijksuniversiteit Groningen",
        "@language": "nl"
      },
      {
        "@value": "Computational Linguistics, Faculty of Arts, Groningen University",
        "@language": "en"
      }
    ],
    "url": "https://www.rug.nl/research/clcg/research/cl/",
    "parentOrganization": {
      "@id": "_:rug",
      "@type": "Organization",
      "name": [
        {
          "@value": "Rijksuniversiteit Groningen",
          "@language": "nl"
        },
        {
          "@value": "Groningen University",
          "@language": "en"
        }
      ],
      "url": "https://www.rug.nl/",
      "sameAs": "http://www.wikidata.org/entity/Q850730",
      "location": {
        "@id": "_:groningen",
        "@type": "Place",
        "name": "Groningen",
        "address": {
          "@id": "_:grnNL",
          "@type": "PostalAddress",
          "addressLocality": "Groningen",
          "addressRegion": "Groningen",
          "addressCountry": "NL"
        },
        "sameAs": [
          "http://www.wikidata.org/entity/Q749",
          "https://sws.geonames.org/2755251/"
        ]
      }
    }
  },
  "downloadUrl": "https://github.com/rug-compling/alud/archive/refs/heads/master.zip",
  "keywords": [
    "UD: Universal Dependencies",
    "Alpino"
  ],
  "programmingLanguage": [
    "Go"
  ],
  "operatingSystem": [
    "aix",
    "android",
    "darwin",
    "dragonfly",
    "freebsd",
    "illumos",
    "ios",
    "js",
    "linux",
    "netbsd",
    "openbsd",
    "plan9",
    "solaris",
    "windows"
  ],
  "softwareHelp": [
    {
      "@type": "WebSite",
      "name": "alud package - github.com/rug-compling/alud/v2 - Go Packages",
      "url": "https://pkg.go.dev/github.com/rug-compling/alud/v2"
    }
  ],
  "author": [
    {
      "@type": "Person",
      "@id": "_:gbouma",
      "sameAs": "https://orcid.org/0000-0003-1106-3181",
      "givenName": "Gosse",
      "familyName": "Bouma",
      "email": "mailto:g.bouma@rug.nl",
      "affiliation": {
        "@id": "_:cl"
      }
    },
    {
      "@type": "Person",
      "@id": "_:pkleiweg",
      "sameAs": "https://orcid.org/0000-0001-8364-3201",
      "givenName": "Peter",
      "familyName": "Kleiweg",
      "email": "mailto:p.c.j.kleiweg@rug.nl",
      "affiliation": {
        "@id": "_:cl"
      }
    }
  ],
  "maintainer": [
    {
      "@id": "_:gbouma"
    },
    {
      "@id": "_:pkleiweg"
    }
  ],
  "targetProduct": [
    {
      "@type": "CommandLineApplication",
      "name": "alud",
      "executableName": "alud"
    },
    {
      "@type": "SoftwareLibrary",
      "name": "github.com/rug-compling/alud"
    }
  ],
  "referencePublication": [
    {
      "@type": "ScholarlyArticle",
      "sameAs": "https://research.rug.nl/en/publications/increasing-return-on-annotation-investment-the-automatic-construc",
      "name": "Increasing return on annotation investment: the automatic construction of a Universal Dependency treebank for Dutch",
      "author": [
        "Gosse Bouma",
        "Gertjan van Noord"
      ],
      "isPartOf": {
        "@type": "PublicationIssue",
        "datePublished": "2017",
        "name": "Proceedings of the NoDaLiDa 2017 Workshop on Universal Dependencies (UDW 2017)",
        "pageStart": "19",
        "pageEnd": "26"
      },
      "url": "https://research.rug.nl/files/50364749/Increasing_return_on_annotation_investment.pdf"
    },
    {
      "@type": "ScholarlyArticle",
      "sameAs": "https://research.rug.nl/en/publications/comparing-two-methods-for-adding-enhanced-dependencies-to-ud-tree",
      "name": "Comparing two methods for adding Enhanced Dependencies to UD treebanks",
      "author": "Gosse Bouma",
      "isPartOf": {
        "@type": "PublicationIssue",
        "datePublished": "2018",
        "name": "Proceedings of the 17th International Workshop on Treebanks and Linguistic Theories (TLT 2018), December 13–14, 2018, Oslo University, Norway",
        "location": "Linköping, Sweden",
        "pageStart": "17",
        "pageEnd": "30"
      }
    }
  ],
  "funding": {
    "@type": "Grant",
    "name": "Projectnummer CP-WP3-19-004, PaQuUd, Parse and Query with Universal Dependencies",
    "url": "https://www.clariah.nl/wp3-linguistics",
    "funder": {
      "@type": "Organization",
      "name": "Clariah",
      "url": "https://www.clariah.nl/"
    }
  }
}
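As a small illustration of how the metadata above can be consumed, the following sketch decodes a few top-level fields of a codemeta.json document with Go's `encoding/json`. The `codeMeta` struct, the `parseCodeMeta` function, and the shortened `excerpt` are assumptions for this example; only fields that appear in the file above are mapped, and everything else is ignored.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// codeMeta maps a few top-level fields of the codemeta.json shown above.
type codeMeta struct {
	Name                string   `json:"name"`
	Version             string   `json:"version"`
	License             string   `json:"license"`
	CodeRepository      string   `json:"codeRepository"`
	ProgrammingLanguage []string `json:"programmingLanguage"`
}

// parseCodeMeta decodes the mapped fields; unknown fields are skipped.
func parseCodeMeta(data []byte) (codeMeta, error) {
	var m codeMeta
	err := json.Unmarshal(data, &m)
	return m, err
}

// excerpt is a shortened copy of the metadata above, for demonstration.
const excerpt = `{
  "@type": "SoftwareSourceCode",
  "name": "alud",
  "version": "2.14.4",
  "codeRepository": "https://github.com/rug-compling/alud.git",
  "license": "https://spdx.org/licenses/BSD-2-Clause",
  "programmingLanguage": ["Go"]
}`

func main() {
	m, err := parseCodeMeta([]byte(excerpt))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%s %s (%s), written in %v\n", m.Name, m.Version, m.License, m.ProgrammingLanguage)
}
```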

GitHub Events

Total
  • Watch event: 1
  • Push event: 1
  • Fork event: 1
Last Year
  • Watch event: 1
  • Push event: 1
  • Fork event: 1

Committers

Last synced: about 3 years ago

All Time
  • Total Commits: 262
  • Total Committers: 4
  • Avg Commits per committer: 65.5
  • Development Distribution Score (DDS): 0.206
Top Committers
Name            Email            Commits
Peter Kleiweg   p****g@r****l    208
Gosse Bouma     g****a@r****l    39
Peter Kleiweg   k****g@z****l    14
Peter Kleiweg   p****g@x****l    1
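The two derived figures above can be reproduced from the commit counts. The sketch below assumes that the Development Distribution Score is defined as one minus the share of commits made by the most active committer; that definition is an assumption, though it is consistent with the numbers listed here. The function names are hypothetical.

```go
package main

import "fmt"

// average returns the mean number of commits per committer.
func average(commits []int) float64 {
	if len(commits) == 0 {
		return 0
	}
	total := 0
	for _, c := range commits {
		total += c
	}
	return float64(total) / float64(len(commits))
}

// dds computes 1 - (largest commit count / total commits).
// Assumption: this is the definition behind the DDS figure above.
func dds(commits []int) float64 {
	total, top := 0, 0
	for _, c := range commits {
		total += c
		if c > top {
			top = c
		}
	}
	if total == 0 {
		return 0
	}
	return 1 - float64(top)/float64(total)
}

func main() {
	commits := []int{208, 39, 14, 1} // commit counts from the table above
	fmt.Printf("average: %.1f\n", average(commits)) // 262 / 4 = 65.5
	fmt.Printf("DDS: %.3f\n", dds(commits))         // 1 - 208/262, about 0.206
}
```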

Issues and Pull Requests

Last synced: almost 3 years ago

All Time
  • Total issues: 13
  • Total pull requests: 0
  • Average time to close issues: about 1 month
  • Average time to close pull requests: N/A
  • Total issue authors: 3
  • Total pull request authors: 0
  • Average comments per issue: 1.15
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: 13 days
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 2.5
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • pebbe (10)
  • lemontheme (2)
  • danieldk (1)
Pull Request Authors

Packages

  • Total packages: 3
  • Total downloads: unknown
  • Total dependent packages: 2
    (may contain duplicates)
  • Total dependent repositories: 2
    (may contain duplicates)
  • Total versions: 43
proxy.golang.org: github.com/rug-compling/alud/v2/cmd/alud-dact
  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 1.6%
Average: 4.1%
Dependent packages count: 6.5%
Last synced: 9 months ago
proxy.golang.org: github.com/rug-compling/alud
  • Versions: 20
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 7.0%
Average: 8.2%
Dependent repos count: 9.3%
Last synced: 9 months ago
proxy.golang.org: github.com/rug-compling/alud/v2

Package alud derives Universal Dependencies from sentences parsed with Alpino.

Usually, the input is XML in the alpino_ds format. The output is in the CoNLL-U format, or the Universal Dependencies can be embedded into the alpino_ds format (version 1.10), making them available for XPath queries. It is also possible to embed a user-provided file in the CoNLL-U format into the alpino_ds format.

When empty heads are reconstructed (resulting in lines whose ID contains a dot), the ID of the original line is added to the last field of the CoNLL-U line, in the form CopiedFrom=ID. This information is necessary for correct embedding into the alpino_ds format.

The package is based on a translation of an XQuery script written by Gosse Bouma.

  • Alpino: https://www.let.rug.nl/vannoord/alp/Alpino/
  • Universal Dependencies: https://universaldependencies.org/
  • CoNLL-U: https://universaldependencies.org/format.html
  • XQuery script: https://github.com/gossebouma/lassy2ud
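The CopiedFrom convention described in the package documentation above can be illustrated with a short sketch that inspects the last (MISC) column of a CoNLL-U line. The function `copiedFrom` and the example line are hypothetical and are not taken from alud or from real output. The sketch assumes tab-separated columns, as in actual CoNLL-U files.

```go
package main

import (
	"fmt"
	"strings"
)

// copiedFrom inspects one tab-separated CoNLL-U token line. If the ID
// contains a dot (an empty node) and the MISC column carries an item of
// the form CopiedFrom=ID, it returns that ID and true.
func copiedFrom(line string) (string, bool) {
	cols := strings.Split(line, "\t")
	if len(cols) != 10 || !strings.Contains(cols[0], ".") {
		return "", false
	}
	// MISC is a "|"-separated list of Key=Value items, or "_" when empty.
	for _, item := range strings.Split(cols[9], "|") {
		if strings.HasPrefix(item, "CopiedFrom=") {
			return strings.TrimPrefix(item, "CopiedFrom="), true
		}
	}
	return "", false
}

func main() {
	// Hypothetical empty-node line, made up for illustration.
	line := "2.1\tleest\tlezen\tVERB\t_\t_\t_\t_\t0:root\tCopiedFrom=2"
	if id, ok := copiedFrom(line); ok {
		fmt.Println("empty node copied from line", id)
	}
}
```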

  • Versions: 21
  • Dependent Packages: 2
  • Dependent Repositories: 2
Rankings
Dependent repos count: 3.5%
Dependent packages count: 4.2%
Average: 9.4%
Stargazers count: 11.1%
Forks count: 18.7%
Last synced: 9 months ago

Dependencies

v2/cmd/alud-dact/go.mod go
  • github.com/pebbe/dbxml v1.3.1
  • github.com/rug-compling/alpinods v1.12.4
  • github.com/rug-compling/alud/v2 v2.0.0-00010101000000-000000000000
v2/cmd/alud-dact/go.sum go
  • github.com/pebbe/dbxml v1.3.1
  • github.com/rug-compling/alpinods v1.12.4
v2/go.mod go
  • github.com/rug-compling/alpinods v1.12.4
v2/go.sum go
  • github.com/rug-compling/alpinods v1.12.4