pathling

Tools that make it easier to use FHIR® and clinical terminology within data analytics, built on Apache Spark.

https://github.com/aehrc/pathling

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    5 of 13 committers (38.5%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.3%) to scientific vocabulary

Keywords

analytics fhir spark standards terminology

Keywords from Contributors

interactive annotation optimizer xunit-test xunit-framework yolov5s embedded mesh cryptocurrencies graph-generation
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: aehrc
  • License: apache-2.0
  • Language: Java
  • Default Branch: main
  • Homepage: https://pathling.csiro.au
  • Size: 169 MB
Statistics
  • Stars: 108
  • Watchers: 12
  • Forks: 15
  • Open Issues: 74
  • Releases: 39
Topics
analytics fhir spark standards terminology
Created almost 6 years ago · Last pushed 6 months ago
Metadata Files
Readme, Contributing, License, Code of conduct, Citation

README.md

Pathling is a set of tools that make it easier to use FHIR® and clinical terminology within health data analytics. It is built on Apache Spark, and it implements the SQL on FHIR view specification and the Bulk Data Access implementation guide.

Read the documentation →

What can it do?

Query and transformation of FHIR data

FHIR R4 is the dominant standard for exchanging health data. It comes in both JSON and XML formats, and can contain over 140 different types of resources, such as Patient, Observation, Condition, Procedure, and many more.

Pathling is capable of reading all the different types of FHIR resources into a format suitable for data analysis tasks. This makes the following things possible:

  • Creating SQL-friendly views from FHIR data
  • Transforming data into other formats, such as CSV or Parquet
  • Performing terminology queries against coded fields within the FHIR data

See Data in and out and Running queries for more information.
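
As a rough illustration of what "creating SQL-friendly views" and "transforming data into other formats" involve, the following minimal sketch flattens FHIR Patient resources from NDJSON into rows and writes them as CSV. It uses only the Python standard library and is not Pathling's API; the sample resources, the column names and the helper functions (`patient_row`, `ndjson_to_rows`, `rows_to_csv`) are all assumptions made for this example.

```python
import csv
import io
import json

# Two illustrative Patient resources in NDJSON form (one JSON object per line).
# The data is made up for this sketch.
NDJSON = "\n".join([
    json.dumps({"resourceType": "Patient", "id": "p1", "gender": "female",
                "birthDate": "1980-04-02",
                "name": [{"family": "Nguyen", "given": ["Linh"]}]}),
    json.dumps({"resourceType": "Patient", "id": "p2", "gender": "male",
                "birthDate": "1975-11-30"}),
])

# Hypothetical column layout for the flattened view.
COLUMNS = ["id", "gender", "birth_date", "family_name"]


def patient_row(resource):
    """Flatten one Patient resource into a single row of scalar columns."""
    names = resource.get("name") or [{}]
    return {
        "id": resource.get("id"),
        "gender": resource.get("gender"),
        "birth_date": resource.get("birthDate"),
        # Only the first recorded name is kept; missing names become None.
        "family_name": names[0].get("family"),
    }


def ndjson_to_rows(text):
    """Parse NDJSON text and return one flattened row per Patient resource."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        resource = json.loads(line)
        if resource.get("resourceType") == "Patient":
            rows.append(patient_row(resource))
    return rows


def rows_to_csv(rows):
    """Serialise the rows as CSV text; None values are written as empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = ndjson_to_rows(NDJSON)
print(rows_to_csv(rows))
```

Pathling performs this kind of flattening at scale on Apache Spark and across all resource types; the sketch only shows the shape of the transformation for a single resource type.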

Terminology queries

Health data often contains codes from systems such as SNOMED CT, LOINC or ICD. These codes contain a great deal of information about diagnoses, procedures, observations and many other aspects of a patient's clinical record.

It is common to group these codes based upon their properties, relationships to other codes, or membership within a pre-defined set. Pathling can automate the task of calling out to a FHIR terminology server to ask questions about the codes within your data.

Examples of the types of questions that can be answered include:

  • Is this SNOMED CT procedure code a type of endoscopy?
  • Does this LOINC test result code have an analyte of bilirubin?
  • Is this ICD-10 code within the pre-defined list of codes within my cohort definition?

See Terminology functions for more information.
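
To make the first question above concrete, here is a minimal sketch of the kind of request that can be sent to a FHIR terminology server, using the standard FHIR `CodeSystem/$subsumes` operation. It is not Pathling's API. The server base URL `https://tx.example.org/fhir` is hypothetical, the helper names are assumptions, and the two SNOMED CT codes (423827005 for endoscopy, 73761001 for colonoscopy) are used for illustration and should be checked against your own SNOMED CT edition.

```python
from urllib.parse import urlencode

SNOMED = "http://snomed.info/sct"


def subsumes_url(base_url, system, code_a, code_b):
    """Build a GET URL for the FHIR CodeSystem/$subsumes operation."""
    query = urlencode({"system": system, "codeA": code_a, "codeB": code_b})
    return f"{base_url.rstrip('/')}/CodeSystem/$subsumes?{query}"


def is_a(parameters):
    """Interpret a $subsumes Parameters response.

    Returns True when codeB is the same concept as codeA or one of its
    descendants, i.e. when the outcome is "equivalent" or "subsumes".
    """
    for param in parameters.get("parameter", []):
        if param.get("name") == "outcome":
            return param.get("valueCode") in ("subsumes", "equivalent")
    return False


# Hypothetical server; codeA is the broader concept, codeB the code under test.
url = subsumes_url("https://tx.example.org/fhir", SNOMED, "423827005", "73761001")
print(url)

# A hand-written response of the shape the operation returns, for illustration.
sample_response = {
    "resourceType": "Parameters",
    "parameter": [{"name": "outcome", "valueCode": "subsumes"}],
}
print(is_a(sample_response))
```

Issuing one such request per code does not scale to large datasets, which is the task Pathling automates when it calls out to a terminology server on your behalf.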

What happened to the server and some of the query functionality?

As part of the version 8 release, we decided to significantly change the focus and scope of Pathling in order to rebuild it around the SQL on FHIR specification.

This means that the server implementation has been temporarily removed. It also means that the scope of FHIRPath functions has been temporarily reduced to the minimal FHIRPath subset defined within the SQL on FHIR Shareable View Definition specification (with the exception of the terminology functions).

We have released this functionality as version 8, and we have spawned three work streams to build upon this new foundation:

  • Implementation of a new server focused upon the Bulk Data Access IG and the draft SQL on FHIR server API. This server will not include the aggregate or extract operations.
  • Expansion of the scope of the FHIRPath implementation to achieve full or close to full coverage of the FHIRPath spec.
  • Implementation of Parquet on FHIR as the new schema for lossless persistence of FHIR data for analytics.

We think that this is the best way to align Pathling to user needs, and also to make sure that the code base is sustainable going forwards.

If you are a current user of the server, aggregate or extract operations, please continue using the v7.x series. We are happy to continue maintaining and accepting contributions to this series as requested by users, but will be focusing our enhancement efforts on v8.

Artifact signing

Published Maven artifacts are signed with the following GPG key:

  • Key ID: ED48678D
  • Fingerprint: F814 751C 64B5 F5E7 08A8 C73F C3C6 291F ED48 678D
  • User ID: Pathling Developers <pathling@csiro.au>

The public key is available on keys.openpgp.org.
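
When checking a downloaded key against the details above, it helps to compare fingerprints rather than the short key ID alone. The sketch below is a small, assumed helper (not part of Pathling or GnuPG) that normalises fingerprints for comparison and confirms that a key ID is the trailing portion of a 40-digit fingerprint, which is how key IDs are derived for version 4 OpenPGP keys.

```python
def normalise(hex_text):
    """Strip whitespace and an optional 0x prefix, and upper-case the hex digits."""
    cleaned = "".join(hex_text.split()).upper()
    return cleaned[2:] if cleaned.startswith("0X") else cleaned


def fingerprints_equal(a, b):
    """Compare two fingerprints, ignoring spacing and letter case."""
    return normalise(a) == normalise(b)


def key_id_matches(fingerprint, key_id):
    """Check that a short (8-digit) or long (16-digit) key ID is the tail of a fingerprint."""
    fp = normalise(fingerprint)
    kid = normalise(key_id)
    return len(fp) == 40 and len(kid) in (8, 16) and fp.endswith(kid)


# Values as published in this document.
PUBLISHED_FINGERPRINT = "F814 751C 64B5 F5E7 08A8 C73F C3C6 291F ED48 678D"
PUBLISHED_KEY_ID = "ED48678D"

print(key_id_matches(PUBLISHED_FINGERPRINT, PUBLISHED_KEY_ID))
```

A matching key ID is only a consistency check; the full fingerprint reported by your OpenPGP tool for the imported key is what should be compared with the published value.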

Licensing and attribution

Pathling is copyright 2018-2025, Commonwealth Scientific and Industrial Research Organisation (CSIRO) ABN 41 687 119 230. Licensed under the Apache License, version 2.0.

This means that you are free to use, modify and redistribute the software as you wish, even for commercial purposes.

If you use this software in your research, please consider citing our paper, Pathling: analytics on FHIR.

Pathling is experimental software; use it at your own risk. You can get a full description of the current set of known issues here.

Owner

  • Name: The Australian e-Health Research Centre
  • Login: aehrc
  • Kind: organization

The Australian e-Health Research Centre (AEHRC) is CSIRO’s digital health research program.

GitHub Events

Total
  • Create event: 250
  • Release event: 4
  • Issues event: 178
  • Watch event: 17
  • Delete event: 332
  • Issue comment event: 434
  • Member event: 1
  • Push event: 470
  • Pull request review event: 56
  • Pull request review comment event: 39
  • Pull request event: 564
  • Fork event: 3
Last Year
  • Create event: 250
  • Release event: 4
  • Issues event: 178
  • Watch event: 17
  • Delete event: 332
  • Issue comment event: 434
  • Member event: 1
  • Push event: 470
  • Pull request review event: 56
  • Pull request review comment event: 39
  • Pull request event: 564
  • Fork event: 3

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 3,034
  • Total Committers: 13
  • Avg Commits per committer: 233.385
  • Development Distribution Score (DDS): 0.268
Past Year
  • Commits: 105
  • Committers: 4
  • Avg Commits per committer: 26.25
  • Development Distribution Score (DDS): 0.114
Top Committers
| Name | Email | Commits |
| --- | --- | --- |
| John Grimes | J****s@c****u | 2,221 |
| Piotr Szul | p****l@c****u | 486 |
| dependabot[bot] | 4****] | 214 |
| dependabot-preview[bot] | 2****] | 68 |
| Sean Fong | y****g@g****m | 14 |
| Burgess, Mark (H&B, Black Mountain) | M****s@c****u | 13 |
| Lorenz Kapsner | l****r@u****e | 4 |
| Jens Kristian Villadsen | j****n@g****m | 4 |
| chgl | c****l | 3 |
| Alejandro Metke | a****e@c****u | 3 |
| Kai Kewley | k****y@g****m | 2 |
| dionmcm | d****e@g****m | 1 |
| Mark Burgess | m****s@c****u | 1 |
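
The all-time figures above can be reproduced from the per-committer commit counts. The sketch below assumes that the Development Distribution Score is computed as one minus the top committer's share of all commits; this page does not state the formula, but that definition yields the reported all-time value of 0.268.

```python
# All-time commit counts per committer, in the order listed above.
commit_counts = [2221, 486, 214, 68, 14, 13, 4, 4, 3, 3, 2, 1, 1]

total_commits = sum(commit_counts)
total_committers = len(commit_counts)

# Average commits per committer.
average = total_commits / total_committers

# Assumed DDS definition: 1 - (commits by the top committer / total commits).
dds = 1 - max(commit_counts) / total_commits

print(total_commits)      # 3034
print(round(average, 3))  # 233.385
print(round(dds, 3))      # 0.268
```

A score close to 0 indicates that one committer accounts for almost all commits, while higher values indicate that work is spread more evenly.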

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 193
  • Total pull requests: 1,385
  • Average time to close issues: 11 months
  • Average time to close pull requests: about 1 month
  • Total issue authors: 15
  • Total pull request authors: 6
  • Average comments per issue: 0.77
  • Average comments per pull request: 0.79
  • Merged pull requests: 94
  • Bot issues: 1
  • Bot pull requests: 1,263
Past Year
  • Issues: 132
  • Pull requests: 526
  • Average time to close issues: 3 months
  • Average time to close pull requests: 12 days
  • Issue authors: 7
  • Pull request authors: 5
  • Average comments per issue: 0.35
  • Average comments per pull request: 0.7
  • Merged pull requests: 71
  • Bot issues: 0
  • Bot pull requests: 439
Top Authors
Issue Authors
  • johngrimes (122)
  • piotrszul (51)
  • chgl (5)
  • lakime (2)
  • liquid36 (2)
  • MartinBernstorff (2)
  • cyrilzakka (1)
  • paulolaup (1)
  • yehtunkhine (1)
  • jkiddo (1)
  • sentry-io[bot] (1)
  • fhnaumann (1)
  • bwalsh (1)
  • brucosper (1)
  • alejosv (1)
Pull Request Authors
  • dependabot[bot] (1,263)
  • piotrszul (82)
  • johngrimes (35)
  • jkiddo (3)
  • chgl (1)
  • fhnaumann (1)
Top Labels
Issue Labels
new feature (80), fhirpath (59), bug (41), refactoring (11), testing (9), optimisation (7), ci (5), breaking change (4), server (4), encoders (4), library-api (3), R (3), documentation (3), deprecation (2), dependencies (2), python (1), helm (1)
Pull Request Labels
dependencies (1,262), javascript (696), java (471), python (96), bug (33), new feature (31), fhirpath (14), testing (10), release (9), refactoring (7), breaking change (6), optimisation (5), deprecation (5), ci (5), encoders (2), library-api (1), documentation (1)

Packages

  • Total packages: 14
  • Total downloads:
    • npm 26 last-month
    • pypi 1,616 last-month
  • Total dependent packages: 22
    (may contain duplicates)
  • Total dependent repositories: 3
    (may contain duplicates)
  • Total versions: 300
  • Total maintainers: 1
pypi.org: pathling

Python API for Pathling

  • Versions: 61
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 1,616 Last month
Rankings
Stargazers count: 8.5%
Dependent packages count: 10.1%
Forks count: 11.4%
Downloads: 12.8%
Average: 12.9%
Dependent repos count: 21.6%
Maintainers (1)
Last synced: 6 months ago
npmjs.org: pathling-client

Client library for the Pathling FHIR API.

  • Versions: 23
  • Dependent Packages: 1
  • Dependent Repositories: 0
  • Downloads: 24 Last month
Rankings
Stargazers count: 8.3%
Forks count: 8.6%
Average: 15.8%
Dependent packages count: 16.2%
Downloads: 20.8%
Dependent repos count: 25.3%
Maintainers (1)
Last synced: 6 months ago
npmjs.org: pathling-import

A set of functions for performing bulk export from a FHIR server and importing to Pathling.

  • Versions: 21
  • Dependent Packages: 1
  • Dependent Repositories: 0
  • Downloads: 2 Last month
Rankings
Stargazers count: 8.3%
Forks count: 8.6%
Average: 16.0%
Dependent packages count: 16.2%
Downloads: 21.5%
Dependent repos count: 25.3%
Maintainers (1)
Last synced: 6 months ago
repo1.maven.org: au.csiro.pathling:utilities

Utility functions used by different components of Pathling.

  • Versions: 27
  • Dependent Packages: 6
  • Dependent Repositories: 0
Rankings
Dependent packages count: 11.0%
Stargazers count: 22.7%
Average: 23.1%
Forks count: 26.8%
Dependent repos count: 32.0%
Last synced: 6 months ago
repo1.maven.org: au.csiro.pathling:terminology

Interact with a FHIR terminology server from Spark.

  • Versions: 26
  • Dependent Packages: 4
  • Dependent Repositories: 0
Rankings
Dependent packages count: 16.9%
Stargazers count: 22.7%
Average: 24.6%
Forks count: 26.8%
Dependent repos count: 32.0%
Last synced: 6 months ago
repo1.maven.org: au.csiro.pathling:encoders

Encoders for transforming FHIR data into Spark Datasets.

  • Versions: 30
  • Dependent Packages: 4
  • Dependent Repositories: 1
Rankings
Dependent packages count: 13.9%
Dependent repos count: 20.7%
Average: 26.2%
Stargazers count: 31.3%
Forks count: 38.8%
Last synced: 6 months ago
repo1.maven.org: au.csiro.pathling:library-api

An API that exposes Pathling functionality to language libraries.

  • Versions: 26
  • Dependent Packages: 1
  • Dependent Repositories: 0
Rankings
Stargazers count: 22.7%
Forks count: 26.8%
Average: 28.4%
Dependent repos count: 32.0%
Dependent packages count: 32.0%
Last synced: 6 months ago
repo1.maven.org: au.csiro.pathling:fhir-server

A server that exposes Pathling functionality through a FHIR API.

  • Versions: 24
  • Dependent Packages: 1
  • Dependent Repositories: 1
Rankings
Dependent repos count: 20.7%
Average: 30.9%
Stargazers count: 31.3%
Dependent packages count: 33.0%
Forks count: 38.8%
Last synced: 6 months ago
repo1.maven.org: au.csiro.pathling:pathling

A set of tools that make it easier to use FHIR® within data analytics, built on Apache Spark.

  • Versions: 30
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Stargazers count: 22.2%
Forks count: 26.8%
Dependent repos count: 32.0%
Average: 32.5%
Dependent packages count: 48.9%
Last synced: 6 months ago
repo1.maven.org: au.csiro.pathling:python

A library for using Pathling with Python.

  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 32.3%
Average: 39.2%
Dependent packages count: 46.2%
Last synced: 6 months ago
repo1.maven.org: au.csiro.pathling:r

A library for using Pathling with R.

  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 32.3%
Average: 39.2%
Dependent packages count: 46.2%
Last synced: 6 months ago
repo1.maven.org: au.csiro.pathling:site

A website that contains documentation for Pathling.

  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 32.3%
Average: 39.2%
Dependent packages count: 46.2%
Last synced: 6 months ago
repo1.maven.org: au.csiro.pathling:fhirpath

A library that can translate FHIRPath expressions into Spark queries.

  • Versions: 14
  • Dependent Packages: 3
  • Dependent Repositories: 0
Rankings
Dependent repos count: 35.3%
Average: 42.7%
Dependent packages count: 50.1%
Last synced: 6 months ago
repo1.maven.org: au.csiro.pathling:library-runtime

A Spark package that bundles the Pathling Library API and its runtime dependencies for use in applications and clusters.

  • Versions: 12
  • Dependent Packages: 1
  • Dependent Repositories: 0
Rankings
Dependent repos count: 35.3%
Average: 42.7%
Dependent packages count: 50.1%
Last synced: 6 months ago