https://github.com/acdh-oeaw/basex-utils

This module contains a variety of utility functions that proved useful in some of the projects at ACDH-CH.

BaseX utility functions

This module contains a variety of utility functions that proved useful in some of the projects at ACDH-CH.

Another kind of eval functions

This module contains a few wrappers around jobs:eval and jobs:wait that make it easy to write small XQuery snippets in which most or all of the otherwise variable data appears as literals. This is useful because of BaseX's straightforward and easy-to-understand locking mechanism: whenever the parser cannot determine which databases a query actually uses, a global read lock is acquired. For updating queries, a global write lock is acquired. In RestXQ functions this can severely impact how you can design your RESTful API. Furthermore, it is easy to overlook that a query holds a global lock, so that no other write or read operations can take place and the API seems stuck.

BaseX's locking design is sane and easy to understand, so it is usually best left alone. The parser sometimes seems to give up very quickly on finding the collection or DB actually used, but a human reading through the code does not share the parser's perspective, so its behavior is good enough most of the time.

Nevertheless, the eval functions in this module allow a batch of smaller XQueries to be scripted from a larger XQuery, so the RestXQ API design is no longer dictated by BaseX's locking. Each snippet is also more likely to hold a lock on only one particular database. At least, such XQuery snippets can be created most of the time.
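The idea of turning variable data into literals can be sketched outside of XQuery. The following Python snippet is only an illustration, not the module's API: the helper names, the `entry` element, and the `@id` attribute are hypothetical. It generates a small XQuery string in which the database name is a literal, which is what lets a parser see the database being used.

```python
def xquery_string_literal(value: str) -> str:
    """Quote a Python string as a double-quoted XQuery string literal."""
    # In XQuery string literals '&' starts an entity reference, and a double
    # quote inside a double-quoted literal is escaped by doubling it.
    escaped = value.replace("&", "&amp;").replace('"', '""')
    return f'"{escaped}"'


def build_snippet(db_name: str, entry_id: str) -> str:
    """Return a small XQuery whose database name and ID are literals.

    The element name 'entry' and the attribute 'id' are made up for this sketch.
    """
    return (
        f"collection({xquery_string_literal(db_name)})"
        f"//entry[@id = {xquery_string_literal(entry_id)}]"
    )


snippet = build_snippet("dict_part_042", "e1")
print(snippet)
```

A string generated this way would then be handed to an eval function. Because the database name is no longer hidden in a variable, the evaluated snippet has a better chance of locking only that database.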

This module predates BaseX's enforce index feature, so it was also used to make it easier for the parser to "see" that it can use the indexes of a particular DB.

There are also two eval functions that execute a sequence of similar (probably generated) XQueries in batches and return the results or errors of all the XQueries passed to the function. One use of this is, for example, to query a few hundred DBs containing similar data that was split so that updates to a particular part of the data do not exhaust resources or take forever to finish (e.g. because rebuilding the indices takes a long time for all the data).
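The batching pattern described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the module's implementation: `run_in_batches` and `fake_run_query` are hypothetical names, and the stand-in query runner replaces a real call to a database. Each batch runs concurrently, batches run one after another, and an error in one query is recorded instead of aborting the rest.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def run_in_batches(
    queries: List[str],
    run_query: Callable[[str], str],
    batch_size: int,
) -> List[Tuple[str, str, str]]:
    """Run queries batch by batch; return (query, status, result or error text)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    outcomes = []
    for start in range(0, len(queries), batch_size):
        batch = queries[start:start + batch_size]
        # One worker per query in the batch; the pool is closed before the
        # next batch starts, so at most batch_size queries run at once.
        with ThreadPoolExecutor(max_workers=len(batch)) as pool:
            futures = [pool.submit(run_query, query) for query in batch]
            for query, future in zip(batch, futures):
                error = future.exception()  # waits until the query finished
                if error is not None:
                    outcomes.append((query, "error", str(error)))
                else:
                    outcomes.append((query, "ok", future.result()))
    return outcomes


def fake_run_query(query: str) -> str:
    """Hypothetical stand-in for sending a query to a database."""
    if "broken" in query:
        raise RuntimeError("query failed: " + query)
    return "result of " + query


outcomes = run_in_batches(["q1", "broken q2", "q3"], fake_run_query, batch_size=2)
for query, status, payload in outcomes:
    print(query, status, payload)
```

Keeping the outcomes in input order makes it easy to match each result or error to the generated query that produced it.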

RestXQ utility functions

This module contains

  • a function to decode a Basic Auth header to username:password
  • a function to get the correct base URI when BaseX is behind a reverse proxy
  • a function to get the correct scheme and hostname when BaseX is behind a reverse proxy
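What these helpers have to do can be sketched in Python. This is an illustration only: the function names are hypothetical, and the use of the `X-Forwarded-Proto` and `X-Forwarded-Host` headers is an assumption about how the reverse proxy is configured, not something the README states.

```python
import base64
from typing import Mapping, Tuple


def decode_basic_auth(header_value: str) -> Tuple[str, str]:
    """Split an 'Authorization: Basic ...' header value into (username, password)."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic" or not encoded:
        raise ValueError("not a Basic Authorization header")
    decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    # Only the first ':' separates the two parts; a password may contain ':'.
    username, separator, password = decoded.partition(":")
    if not separator:
        raise ValueError("missing ':' between username and password")
    return username, password


def external_origin(headers: Mapping[str, str], default_scheme: str = "http") -> str:
    """Return scheme and host as seen by the client in front of the proxy.

    Assumes the proxy sets X-Forwarded-Proto and X-Forwarded-Host and that
    header names in the mapping use exactly this spelling.
    """
    scheme = headers.get("X-Forwarded-Proto", default_scheme).split(",")[0].strip()
    host = headers.get("X-Forwarded-Host", headers.get("Host", "localhost"))
    return f"{scheme}://{host.split(',')[0].strip()}"


token = base64.b64encode(b"alice:s3cret:x").decode("ascii")
print(decode_basic_auth("Basic " + token))
print(external_origin({"X-Forwarded-Proto": "https", "X-Forwarded-Host": "example.org"}))
```

Forwarded headers should only be trusted when requests can reach the server exclusively through the proxy, since clients can otherwise set them freely.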

Functions to deal with a huge amount of nodes returned from a query

We had a case where a RestXQ-initiated query would, in the worst case, return a few million result nodes (from a few hundred DBs). Normally this would lead to all the nodes being serialized in memory and so would exhaust the memory.

To handle this situation, this module provides functions that convert result nodes to XML representing references (pre numbers) to those nodes plus a small amount of data for sorting or filtering them (dehydrate). Only a small subsequence of those nodes is then actually read from the DBs and presented to the user (hydrate).
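The dehydrate and hydrate steps can be sketched with plain Python data structures. Everything here is a hypothetical stand-in: the in-memory `DATABASES` dictionary replaces real DBs, a list index plays the role of a pre number, and the sort key is simply the text of a made-up `entry` element.

```python
from typing import Dict, List, NamedTuple


class NodeRef(NamedTuple):
    """Lightweight reference to a node: where it lives plus data to sort by."""
    db: str
    pre: int
    sort_key: str


# Stand-in for a few databases; each list index acts as the node's pre number.
DATABASES: Dict[str, List[str]] = {
    "part_a": ["<entry>pear</entry>", "<entry>apple</entry>"],
    "part_b": ["<entry>fig</entry>", "<entry>cherry</entry>"],
}


def sort_key_of(node: str) -> str:
    """Extract the text content of the hypothetical <entry> element."""
    return node[len("<entry>"):-len("</entry>")]


def dehydrate(databases: Dict[str, List[str]]) -> List[NodeRef]:
    """Replace every node with a small reference instead of the full node."""
    return [
        NodeRef(db, pre, sort_key_of(node))
        for db, nodes in databases.items()
        for pre, node in enumerate(nodes)
    ]


def hydrate(databases: Dict[str, List[str]], refs: List[NodeRef]) -> List[str]:
    """Load the full nodes for the given references only."""
    return [databases[ref.db][ref.pre] for ref in refs]


refs = sorted(dehydrate(DATABASES), key=lambda ref: ref.sort_key)
page = hydrate(DATABASES, refs[:2])  # only the first page is read in full
print(page)
```

Sorting and paging happen on the small references, so memory use is driven by the page size rather than by the total number of result nodes.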

Get some XML document node or else another without global locking

Two functions in this module return one of two documents passed as parameters (the first or else the other) without needing a global lock.

Owner

  • Name: Austrian Centre for Digital Humanities & Cultural Heritage
  • Login: acdh-oeaw
  • Kind: organization
  • Email: acdh@oeaw.ac.at
  • Location: Vienna, Austria

Committers

All Time
  • Total Commits: 3
  • Total Committers: 1
  • Avg Commits per committer: 3.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
  • Omar Siam (O****m@o****t): 3 commits
