https://github.com/dasch-swiss/ark-resolver
DSP ARK Resolver
The DSP ARK Resolver
Resolves ARK URLs referring to resources in DSP (formerly called Knora) repositories.
Project Status
The DSP ARK Resolver is a hybrid Python/Rust application currently undergoing migration from Python to Rust in three phases:
- Phase 1 (Current): Add functionality to Rust and run in parallel with Python implementation to verify correct behavior in production, while Python behavior remains user-facing. Rust functions are exposed as Python extensions via PyO3/Maturin.
- Phase 2: Change user-facing behavior to Rust implementation and start removing Python components.
- Phase 3: Refactor Rust code into a standalone service using Axum, completely removing Python dependencies.
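The Phase 1 arrangement, in which the Rust code runs alongside the Python code while the Python result stays user-facing, can be sketched as a shadow comparison. This is only an illustration of the idea, not the project's actual code: `shadow_call`, `python_impl`, and `rust_impl` are hypothetical names, and `rust_impl` merely stands in for a function that would be exposed through PyO3.

```python
import logging
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("shadow")


def shadow_call(primary: Callable[..., T], shadow: Callable[..., T], *args) -> T:
    """Return the primary result; run the shadow path only to compare outputs."""
    result = primary(*args)
    try:
        shadow_result = shadow(*args)
        if shadow_result != result:
            log.warning("mismatch for %r: primary=%r shadow=%r", args, result, shadow_result)
    except Exception:
        # A failure in the shadow path must never change the user-facing response.
        log.exception("shadow implementation raised for %r", args)
    return result


# Hypothetical stand-ins for a Python function and its Rust counterpart.
def python_impl(project_id: str) -> str:
    return project_id.upper()


def rust_impl(project_id: str) -> str:
    return project_id.upper()
```

Calling `shadow_call(python_impl, rust_impl, "080e")` returns the Python result and only logs if the two implementations disagree, which is the property that makes it safe to verify new code in production.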
Architecture
- Python (Sanic): Main HTTP server, routing, and business logic
- Rust (PyO3): Performance-critical functions exposed as Python extensions
- Environment-driven configuration: Uses environment variables with defaults; the registry is loaded from ARK_REGISTRY
- HTTPS via Rustls: The Rust HTTP client uses rustls with embedded Mozilla roots, avoiding a dependency on system CA bundles
Modes of operation
The program ark.py has two modes of operation:
- When run as an HTTP server, it resolves DSP ARK URLs by redirecting to the actual location of each resource. Redirect URLs are generated from templates in a configuration file. The hostname used in the redirect URL, as well as the whole URL template, can be configured per project.
To start the ark-resolver as a server, type:
```bash
python ark.py -s
```
- The ark-resolver can also be used as a command-line tool for converting between resource IRIs and ARK URLs, using the same configuration file.
For usage information, run ./ark.py --help. The application is configured entirely through environment variables, with a sample registry file available at tests/ark-registry.ini for local testing.
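As a rough illustration of how a redirect URL could be generated from a per-project template, here is a minimal sketch using Python's `string.Template`. The setting names (`host`, `redirect_template`), the placeholder names, and the project `0803` are assumptions made for this example; they are not taken from the registry file format.

```python
from string import Template

# Hypothetical settings; key and placeholder names are assumptions for illustration.
DEFAULTS = {
    "host": "0.0.0.0:4200",
    "redirect_template": "http://$host/resource/$project_id/$resource_id",
}
PROJECTS = {
    "0002": {},  # uses the defaults
    "0803": {"host": "data.example.org"},  # overrides only the hostname
}


def redirect_url(project_id: str, resource_id: str) -> str:
    """Build the redirect URL for a resource from its project's template."""
    # Project-specific values take precedence over the defaults.
    settings = {**DEFAULTS, **PROJECTS.get(project_id, {})}
    return Template(settings["redirect_template"]).substitute(
        host=settings["host"],
        project_id=project_id,
        resource_id=resource_id,
    )
```

With these assumed settings, `redirect_url("0002", "70aWaB2kWsuiN6ujYgM0ZQ")` yields `http://0.0.0.0:4200/resource/0002/70aWaB2kWsuiN6ujYgM0ZQ`. Merging defaults with project overrides is one simple way to let either the hostname or the whole template vary per project.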
Environment Variables
The application can be configured using the following environment variables:
- ARK_EXTERNAL_HOST: External hostname used in ARK URLs (default: ark.example.org)
- ARK_INTERNAL_HOST: Internal hostname for the server (default: 0.0.0.0)
- ARK_INTERNAL_PORT: Port for the server to bind to (default: 3336)
- ARK_NAAN: Name Assigning Authority Number (default: 00000)
- ARK_HTTPS_PROXY: Whether the server runs behind an HTTPS proxy (default: true)
- ARK_REGISTRY: Path or URL to the project registry file (required)
- ARK_GITHUB_SECRET: Secret for GitHub webhook authentication
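The following sketch shows one way such variables could be read with their documented defaults. It is not the resolver's own loader: the `Settings` class and `load_settings` function are hypothetical, and treating only the string `true` as true for ARK_HTTPS_PROXY is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    external_host: str
    internal_host: str
    internal_port: int
    naan: str
    https_proxy: bool
    registry: str
    github_secret: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, applying the documented defaults."""
    registry = env.get("ARK_REGISTRY")
    if not registry:
        # ARK_REGISTRY is the only variable documented as required.
        raise ValueError("ARK_REGISTRY is required")
    return Settings(
        external_host=env.get("ARK_EXTERNAL_HOST", "ark.example.org"),
        internal_host=env.get("ARK_INTERNAL_HOST", "0.0.0.0"),
        internal_port=int(env.get("ARK_INTERNAL_PORT", "3336")),
        naan=env.get("ARK_NAAN", "00000"),
        # Assumption: only the string "true" (case-insensitive) counts as true.
        https_proxy=env.get("ARK_HTTPS_PROXY", "true").strip().lower() == "true",
        registry=registry,
        github_secret=env.get("ARK_GITHUB_SECRET"),
    )
```

Passing a plain dictionary instead of `os.environ`, for example `load_settings({"ARK_REGISTRY": "tests/ark-registry.ini"})`, makes the defaults easy to check in a test.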
Rust HTTP Client Configuration (Advanced)
Additional environment variables for debugging and timeout control in containerized environments:
- ARK_RUST_LOAD_TIMEOUT_MS: Application-level timeout for settings loading in milliseconds (default: 15000); prevents container SIGTERM
- ARK_RUST_HTTP_TIMEOUT_MS: HTTP request total timeout in milliseconds (default: 10000)
- ARK_RUST_HTTP_CONNECT_TIMEOUT_MS: HTTP connection timeout in milliseconds (default: 5000)
- ARK_RUST_FORCE_IPV4: Force IPv4-only connections and disable IPv6 (default: false); fixes container IPv6 connectivity issues
- RUST_LOG: Controls tracing verbosity (e.g., RUST_LOG=ark_resolver=debug,reqwest=debug,hyper=debug)
- ARK_SENTRY_DEBUG: Enable Sentry debug mode (default: false); accepts "true"/"1"/"yes"/"on" for true
The Rust HTTP client also supports standard proxy environment variables (HTTPS_PROXY, HTTP_PROXY, ALL_PROXY).
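To make the accepted values concrete, the sketch below parses a boolean flag such as ARK_SENTRY_DEBUG and a millisecond timeout such as ARK_RUST_HTTP_TIMEOUT_MS. The actual parsing lives in the Rust code; this Python version is only an illustration, the helper names are hypothetical, and case-insensitive matching and the fallback on invalid numbers are assumptions.

```python
from typing import Mapping

# Values documented as meaning "true" for ARK_SENTRY_DEBUG.
TRUE_VALUES = {"true", "1", "yes", "on"}


def env_flag(env: Mapping[str, str], name: str, default: bool = False) -> bool:
    """Interpret a boolean flag; anything outside TRUE_VALUES counts as false."""
    raw = env.get(name)
    if raw is None:
        return default
    # Assumption: surrounding whitespace and letter case are ignored.
    return raw.strip().lower() in TRUE_VALUES


def env_millis(env: Mapping[str, str], name: str, default_ms: int) -> float:
    """Read a millisecond value and return it in seconds."""
    try:
        ms = int(env.get(name, default_ms))
    except ValueError:
        # Assumption: fall back to the default when the value is not an integer.
        ms = default_ms
    return ms / 1000.0
```

For example, `env_millis({}, "ARK_RUST_HTTP_TIMEOUT_MS", 10000)` gives a 10-second timeout, and `env_flag({"ARK_SENTRY_DEBUG": "on"}, "ARK_SENTRY_DEBUG")` is true.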
For production deployments, ARK_REGISTRY should point to the appropriate registry file from the ark-resolver-data repository.
In the sample registry file, the redirect URLs are DSP-API URLs. In production, redirect URLs should instead point to human-readable representations provided by a user interface.
Requirements / local setup
First, install uv, which will automatically handle your Python installations,
virtual environments, and dependencies:
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```
Then, create the virtual environment and install the dependencies with:
```bash
uv sync
```
Local Development
For local development and testing, set the registry file environment variable:
```bash
export ARK_REGISTRY="tests/ark-registry.ini"
```
You can then run the server locally:
```bash
./ark.py -s
```
Or use the convenient just command:
```bash
just run
```
Examples for using the ark-resolver on the command-line
Converting a DSP resource IRI to an ARK URL
$ ./ark.py -i http://rdfh.ch/0002/70aWaB2kWsuiN6ujYgM0ZQ
https://ark.example.org/ark:/00000/1/0002/70aWaB2kWsuiN6ujYgM0ZQD
Converting a DSP resource IRI to an ARK URL with a timestamp
$ ./ark.py -i http://rdfh.ch/0002/70aWaB2kWsuiN6ujYgM0ZQ -d 20220119T101727886178Z
https://ark.example.org/ark:/00000/1/0002/70aWaB2kWsuiN6ujYgM0ZQD.20220119T101727886178Z
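The two ARK URLs above share a visible layout: NAAN, ARK version, project ID, resource part, and an optional timestamp after a dot. The sketch below splits a URL of that shape into its parts. The regular expression is inferred from these two examples only and is not the resolver's own parser; `ArkParts` and `parse_ark` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional


class ArkParts(NamedTuple):
    naan: str
    version: str
    project: str
    resource: str  # kept exactly as it appears in the URL
    timestamp: Optional[str]


# Pattern inferred from the examples above; an assumption, not a specification.
ARK_V1 = re.compile(
    r"ark:/(?P<naan>\d+)/(?P<version>\d+)/(?P<project>[^/]+)/"
    r"(?P<resource>[^/.]+)(?:\.(?P<timestamp>[0-9TZ]+))?$"
)


def parse_ark(url: str) -> Optional[ArkParts]:
    """Return the parts of an ARK URL of the shape shown above, or None."""
    match = ARK_V1.search(url)
    if match is None:
        return None
    return ArkParts(**match.groupdict())
```

Applied to the second example, `parse_ark` reports NAAN `00000`, version `1`, project `0002`, and timestamp `20220119T101727886178Z`. URLs of a different shape, such as the older salsah.org ARKs, do not match and return `None`.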
Converting an ARK URL from a project on salsah.org to a custom resource IRI for import into DSP
$ ./ark.py -a http://ark.example.org/ark:/00000/0002-751e0b8a-6.2021519 -r
http://rdfh.ch/0002/70aWaB2kWsuiN6ujYgM0ZQ
Redirecting an ARK URL from a resource created on salsah.org to the location of the resource on DSP
$ ./ark.py -a http://ark.example.org/ark:/00000/0002-751e0b8a-6.2021519
http://0.0.0.0:4200/resource/0002/70aWaB2kWsuiN6ujYgM0ZQ
A note about the creation of Resource IRIs from Salsah ARK URLs
As permanent identifiers, ARKs need to be valid for an unlimited period of time. So, after resources have been migrated from salsah.org to DSP, their ARK URLs need to stay valid. This means that the same ARK URL that formerly was redirected to a resource on salsah.org, now has to be redirected to the same resource on DSP.
To redirect ARK URLs coming from salsah.org to the correct resources on DSP, the DSP resource IRI
(which contains a UUID) has to be calculated from the resource ID provided in the ARK. Version 5 UUIDs
are used for this. The DaSCH-specific namespace used to create these UUIDs is cace8b00-717e-50d5-bcb9-486f39d733a2.
It is itself a version 5 UUID, created from the generic uuid.NAMESPACE_URL provided by the Python uuid library
and the string https://dasch.swiss.
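The UUID calculation described above can be sketched with Python's standard `uuid` module. The namespace constant and its derivation come from the text. The exact string passed as the resource ID and the URL-safe base64 encoding of the result are assumptions for illustration, and the function names are hypothetical.

```python
import base64
import uuid

# DaSCH-specific namespace as stated in the text.
DASCH_NAMESPACE = uuid.UUID("cace8b00-717e-50d5-bcb9-486f39d733a2")


def derive_namespace() -> uuid.UUID:
    """Derivation described in the text: UUIDv5 of the DaSCH URL under NAMESPACE_URL."""
    return uuid.uuid5(uuid.NAMESPACE_URL, "https://dasch.swiss")


def resource_uuid(salsah_resource_id: str) -> uuid.UUID:
    """Deterministically map a salsah.org resource ID to a version 5 UUID."""
    # Assumption: the ID is hashed as the plain string taken from the ARK.
    return uuid.uuid5(DASCH_NAMESPACE, salsah_resource_id)


def encode_uuid(value: uuid.UUID) -> str:
    """Assumption: URL-safe base64 without padding, which gives 22 characters."""
    return base64.urlsafe_b64encode(value.bytes).decode("ascii").rstrip("=")
```

Because version 5 UUIDs are name-based, `resource_uuid` returns the same UUID for the same resource ID on every call. That determinism is what allows an old ARK URL to keep pointing at the same migrated resource.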
Projects migrated from salsah.org to DSP must have the parameter AllowVersion0 set to true in their project
configuration (registry file). Otherwise, their version 0 ARK URLs are rejected.
Server routes
GET /config
Returns the server's configuration, including the project registry, but not
including ArkGitHubSecret (set through ARK_GITHUB_SECRET).
POST /reload
Accepts a GitHub webhook request in JSON and validates it according to
Securing your webhooks, using
the secret configured as ArkGitHubSecret (ARK_GITHUB_SECRET). If the request is valid, reloads the
configuration, including the project registry. Changes to ArkInternalHost and
ArkInternalPort (ARK_INTERNAL_HOST and ARK_INTERNAL_PORT) are not taken into account.
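GitHub's webhook documentation describes signing the raw request body with HMAC and sending the hex digest in a request header. A minimal sketch of such a check is shown below. It assumes the SHA-256 variant (`X-Hub-Signature-256` header, value prefixed with `sha256=`); whether the resolver uses this header or the older SHA-1 one is not stated here, and `signature_is_valid` is a hypothetical name.

```python
import hashlib
import hmac


def signature_is_valid(secret: str, body: bytes, signature_header: str) -> bool:
    """Check a GitHub-style 'sha256=<hexdigest>' signature over the raw request body."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode("utf-8"), signature_header.encode("utf-8"))
```

The signature has to be computed over the exact bytes received, before any JSON parsing, since re-serializing the payload can change whitespace and invalidate the digest.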
All other GET requests are interpreted as ARK URLs.
Using Docker
Images are published to the daschswiss/ark-resolver Docker Hub repository.
Basic Usage
```bash
docker run -p 3336:3336 daschswiss/ark-resolver
```
Environment Configuration
The Docker container can be configured using environment variables:
```bash
docker run -p 3336:3336 \
  -e ARK_EXTERNAL_HOST="ark.example.org" \
  -e ARK_INTERNAL_HOST="0.0.0.0" \
  -e ARK_INTERNAL_PORT="3336" \
  -e ARK_NAAN="72163" \
  -e ARK_HTTPS_PROXY="true" \
  -e ARK_REGISTRY="tests/ark-registry.ini" \
  -e ARK_GITHUB_SECRET="your-webhook-secret" \
  daschswiss/ark-resolver
```
Production Deployment
For staging and production deployments, set the registry file to load from the external repository:
```bash
# Staging
docker run -p 3336:3336 \
  -e ARK_REGISTRY="https://raw.githubusercontent.com/dasch-swiss/ark-resolver-data/master/data/dasch_ark_registry_staging.ini" \
  daschswiss/ark-resolver
```
Note on TLS: The Rust settings loader fetches the registry over HTTPS using reqwest with rustls. No ca-certificates package is required in the runtime image.
Note on SIGTERM Prevention: The Rust HTTP client includes application-level timeouts (15s default) to prevent container orchestrators from killing the service during slow HTTP requests. Use ARK_RUST_LOAD_TIMEOUT_MS to adjust if needed.
```bash
# Production
docker run -p 3336:3336 \
  -e ARK_REGISTRY="https://raw.githubusercontent.com/dasch-swiss/ark-resolver-data/master/data/dasch_ark_registry_prod.ini" \
  daschswiss/ark-resolver
```
Docker Compose
See docker-compose.yml for a complete example configuration.
Building Images
Multi-architecture images can be built using the provided just commands:
```bash
# For linux/amd64
just docker-build-intel
# For linux/arm64
just docker-build-arm
```
Owner
- Name: DaSCH - Swiss National Data and Service Center for the Humanities
- Login: dasch-swiss
- Kind: organization
- Email: info@dasch.swiss
- Location: Switzerland
- Website: https://dasch.swiss
- Twitter: DaSCHSwiss
- Repositories: 35
- Profile: https://github.com/dasch-swiss
Development repositories of the DaSCH.
Dependencies
- Sanic-Cors ==1.0.1
- Sanic-Plugins-Framework *
- aiofiles ==0.7.0
- certifi ==2021.5.30
- chardet ==4.0.0
- h11 ==0.9.0
- httpcore ==0.11.1
- httptools ==0.3.0
- httpx ==0.15.4
- idna ==2.10
- multidict ==5.1.0
- requests ==2.25.1
- rfc3986 ==1.5.0
- sanic ==21.9.1
- sanic-routing *
- sniffio ==1.2.0
- toml ==0.10.2
- types-requests ==2.25.9
- typing-extensions ==3.10.0.2
- ujson ==4.2.0
- urllib3 ==1.26.7
- uvloop ==0.16.0
- websockets ==10.0