https://github.com/althonos/torch-treecrf

A PyTorch implementation of Tree-structured Conditional Random Fields.

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 6 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
    1 of 1 committers (100.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.0%) to scientific vocabulary

Keywords

conditional-random-fields machine-learning python-library pytorch torch
Last synced: 5 months ago

Repository

A PyTorch implementation of Tree-structured Conditional Random Fields.

Basic Info
  • Host: GitHub
  • Owner: althonos
  • License: mit
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 50.8 KB
Statistics
  • Stars: 7
  • Watchers: 3
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Topics
conditional-random-fields machine-learning python-library pytorch torch
Created about 3 years ago · Last pushed 10 months ago
Metadata Files
Readme Changelog Contributing License

README.md

🌲 torch-treecrf

A PyTorch implementation of Tree-structured Conditional Random Fields.

🗺️ Overview

Conditional Random Fields (CRFs) are a family of discriminative graphical learning models that can be used to model the dependencies between variables. The most common form of CRF is the linear-chain CRF, where a prediction depends on an observed variable as well as on the predictions before and after it (the context). Linear-chain CRFs are widely used in Natural Language Processing.

$$ P(Y | X) = \frac{1}{Z(X)} \prod_{i=1}^{n}{ \Psi_i(y_i, x_i) } \prod_{i=2}^{n}{ \Psi_{i-1,i}(y_{i-1}, y_i)} $$
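
To make the role of the normalization constant $Z(X)$ concrete, here is a minimal pure-Python sketch of the linear-chain formula above. The potentials are made-up numbers chosen for illustration (they are not taken from this package), and the pairwise potential is assumed to be shared by all positions. The sketch computes $Z(X)$ both by enumerating every labeling and with the usual forward recursion, which should agree.

```python
import itertools
import math

# Hypothetical toy model: n = 3 positions, 2 classes per position.
# unary[i][y] plays the role of Psi_i(y_i, x_i) (the observation is folded in).
unary = [[1.0, 2.0], [0.5, 1.5], [2.0, 1.0]]
# pairwise[a][b] plays the role of Psi_{i-1,i}(y_{i-1}=a, y_i=b).
pairwise = [[2.0, 1.0], [1.0, 3.0]]


def score(labels):
    """Unnormalized score: product of all unary and pairwise potentials."""
    s = 1.0
    for i, y in enumerate(labels):
        s *= unary[i][y]
        if i > 0:
            s *= pairwise[labels[i - 1]][y]
    return s


def partition_brute_force():
    """Z(X) as an explicit sum over every possible labeling."""
    n, c = len(unary), len(unary[0])
    return sum(score(ys) for ys in itertools.product(range(c), repeat=n))


def partition_forward():
    """Z(X) with the forward recursion, in O(n * C^2) instead of O(C^n)."""
    c = len(unary[0])
    alpha = list(unary[0])
    for i in range(1, len(unary)):
        alpha = [
            unary[i][b] * sum(alpha[a] * pairwise[a][b] for a in range(c))
            for b in range(c)
        ]
    return sum(alpha)


Z = partition_forward()
# Probability of one particular labeling under the toy model.
prob = score((1, 1, 0)) / Z
```

The forward recursion is the chain-shaped special case of the message passing used for trees later in this document.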

In 2006, Tang et al.[1] introduced Tree-structured CRFs to model hierarchical relationships between predicted variables, allowing dependencies between a prediction variable and its parents and children.

$$ P(Y | X) = \frac{1}{Z(X)} \prod_{i=1}^{n}{ \Psi_i(y_i, x_i) } \prod_{j \in \mathcal{N}(i)}{ \Psi_{j,i}(y_j, y_i)} $$

This package implements a generic Tree-structured CRF layer in PyTorch. The layer can be stacked on top of a linear layer to implement a proper Tree-structured CRF, or on any other kind of model producing emission scores in log-space for every class of each label. Computation of marginals is implemented using Belief Propagation[2], allowing for exact inference on trees[3]:

$$ \begin{aligned} P(y_i | X) & = \frac{1}{Z(X)} \Psi_i(y_i, x_i) & \underbrace{\prod_{j \in \mathcal{C}(i)}{\mu_{j \to i}(y_i)}} & & \underbrace{\prod_{j \in \mathcal{P}(i)}{\mu_{j \to i}(y_i)}} \\ & = \frac{1}{Z(X)} \Psi_i(y_i, x_i) & \alpha_i(y_i) & & \beta_i(y_i) \end{aligned} $$

where, for every node $i$, the messages from the parents $\mathcal{P}(i)$ and the children $\mathcal{C}(i)$ are computed recursively with the sum-product algorithm[4]:

$$ \begin{aligned} \forall j \in \mathcal{C}(i), \mu_{j \to i}(y_i) & = \sum_{y_j}{ \Psi_{i,j}(y_i, y_j) \Psi_j(y_j, x_j) \prod_{k \in \mathcal{C}(j)}{\mu_{k \to j}(y_j)} } \\ \forall j \in \mathcal{P}(i), \mu_{j \to i}(y_i) & = \sum_{y_j}{ \Psi_{i,j}(y_i, y_j) \Psi_j(y_j, x_j) \prod_{k \in \mathcal{P}(j)}{\mu_{k \to j}(y_j)} } \end{aligned} $$
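
The recursion can be illustrated with a small pure-Python sketch on a root with two children. This is not the package's vectorized PyTorch code: the potentials are made-up numbers, the dictionaries `parents` and `children` and the function names are hypothetical, and a single pairwise potential is assumed to be shared by all edges. The sketch follows the textbook sum-product algorithm, so the message sent by a parent to a child also multiplies in the upward messages of that child's siblings. The resulting marginals are compared against brute-force enumeration.

```python
import itertools
import math

# Hypothetical toy hierarchy: node 0 is the root, nodes 1 and 2 are its children.
parents = {0: [], 1: [0], 2: [0]}
children = {0: [1, 2], 1: [], 2: []}
C = 2  # number of classes per label

# unary[i][y] stands for Psi_i(y_i, x_i); pairwise[yp][yc] is the edge potential
# for a parent in class yp and a child in class yc (shared by all edges here).
unary = {0: [1.0, 2.0], 1: [3.0, 1.0], 2: [1.0, 1.0]}
pairwise = [[2.0, 1.0], [0.5, 3.0]]


def upward(j):
    """Message sent by node j to its parent, indexed by the parent class."""
    inner = [
        unary[j][yj] * math.prod(upward(k)[yj] for k in children[j])
        for yj in range(C)
    ]
    return [sum(pairwise[yp][yj] * inner[yj] for yj in range(C)) for yp in range(C)]


def downward(j, i):
    """Message sent by parent j to its child i, indexed by the child class."""
    inner = [
        unary[j][yj]
        * math.prod(downward(k, j)[yj] for k in parents[j])
        * math.prod(upward(s)[yj] for s in children[j] if s != i)
        for yj in range(C)
    ]
    return [sum(pairwise[yj][yi] * inner[yj] for yj in range(C)) for yi in range(C)]


def marginal(i):
    """Normalized marginal: Psi_i * alpha_i * beta_i divided by Z."""
    alpha = [math.prod(upward(j)[y] for j in children[i]) for y in range(C)]
    beta = [math.prod(downward(j, i)[y] for j in parents[i]) for y in range(C)]
    unnorm = [unary[i][y] * alpha[y] * beta[y] for y in range(C)]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def marginal_brute_force(i):
    """Same marginal obtained by enumerating every joint labeling."""
    totals = [0.0] * C
    for ys in itertools.product(range(C), repeat=len(unary)):
        s = math.prod(unary[n][ys[n]] for n in unary)
        s *= math.prod(pairwise[ys[p]][ys[n]] for n in unary for p in parents[n])
        totals[ys[i]] += s
    z = sum(totals)
    return [t / z for t in totals]
```

Calling `marginal(i)` and `marginal_brute_force(i)` for each node should give matching distributions, which is the sense in which inference is exact on trees. The recursive functions recompute messages on every call; a practical implementation would cache them or compute them in a fixed node order.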

The implementation should be generic enough that any kind of directed acyclic graph (DAG) can be used as a label hierarchy, not just trees.
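
Message passing over such a hierarchy needs the nodes to be visited in an order compatible with the parent/child relation, which only exists when the graph has no cycle. The following sketch shows one way to check this with Kahn's algorithm. The function name `topological_order` is hypothetical and not part of this package, and the matrix convention (entry `[i][j]` is 1 when `j` is a parent of `i`) is assumed to match the one used in the example section.

```python
from collections import deque


def topological_order(adjacency):
    """Return the nodes ordered parents-first, or raise if there is a cycle."""
    n = len(adjacency)
    # adjacency[i][j] == 1 means that j is a parent of i (assumed convention).
    remaining = [sum(row) for row in adjacency]  # unprocessed parents per node
    queue = deque(i for i in range(n) if remaining[i] == 0)
    order = []
    while queue:
        j = queue.popleft()
        order.append(j)
        # Every child of j has one fewer parent left to wait for.
        for i in range(n):
            if adjacency[i][j]:
                remaining[i] -= 1
                if remaining[i] == 0:
                    queue.append(i)
    if len(order) != n:
        raise ValueError("hierarchy contains a cycle")
    return order


# A diamond-shaped DAG: node 3 has two parents (1 and 2), so it is not a tree.
diamond = [
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
]
order = topological_order(diamond)
```

Traversing `order` forwards visits parents before children, and traversing it backwards visits children before parents.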

🔧 Installing

Install the torch-treecrf package directly from PyPI, which hosts universal wheels that can be installed with pip:

```console
$ pip install torch-treecrf
```

📋 Features

  • Encoding of directed graphs in an adjacency matrix, with $\mathcal{O}(1)$ retrieval of children and parents for any node, and $\mathcal{O}(N+E)$ storage.
  • Support for any acyclic hierarchy representable as a Directed Acyclic Graph and not just directed trees, allowing prediction of classes such as the Gene Ontology.
  • Multiclass output, provided all the target labels have the same number of classes: $Y \in \left\{ 0, \dots, C-1 \right\}^L$.
  • Minibatch support, with vectorized computation of the messages $\alpha_i(y_i)$ and $\beta_i(y_i)$.
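
As an illustration of the first feature, the sketch below shows one possible compressed encoding with $\mathcal{O}(N+E)$ storage, built from a dense adjacency matrix. The class `SparseHierarchy` and its methods are hypothetical names for this example only; the package's actual data structure may be organized differently.

```python
class SparseHierarchy:
    """Hypothetical compressed encoding of a label hierarchy."""

    def __init__(self, adjacency):
        # adjacency[i][j] == 1 means that j is a parent of i.
        n = len(adjacency)
        up = [[j for j in range(n) if adjacency[i][j]] for i in range(n)]
        down = [[i for i in range(n) if adjacency[i][j]] for j in range(n)]
        self._parents = self._compress(up)
        self._children = self._compress(down)

    @staticmethod
    def _compress(rows):
        # Flatten the per-node lists into one index array (E entries)
        # plus an offset array (N + 1 entries).
        offsets, indices = [0], []
        for row in rows:
            indices.extend(row)
            offsets.append(len(indices))
        return offsets, indices

    def parents(self, i):
        offsets, indices = self._parents
        return indices[offsets[i]:offsets[i + 1]]

    def children(self, i):
        offsets, indices = self._children
        return indices[offsets[i]:offsets[i + 1]]


# Root 0 with two children, 1 and 2.
hierarchy = SparseHierarchy([[0, 0, 0], [1, 0, 0], [1, 0, 0]])
```

Locating the parents or children of a node only requires reading two offsets, so the lookup itself does not depend on the size of the graph.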

💡 Example

To create a Tree-structured CRF, you must first define the tree encoding the relationships between variables. Let's build a simple CRF for a root variable with two children:

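First, define an adjacency matrix $M$ representing the hierarchy, such that $M_{i,j}$ is $1$ if $j$ is a parent of $i$:

```python
import torch

# Row i marks the parents of node i: nodes 1 and 2 both have node 0 as parent.
adjacency = torch.tensor([
    [0, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
])
```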

Then create the CRF by passing the adjacency matrix as a hyperparameter:

```python
import torch_treecrf
crf = torch_treecrf.TreeCRF(adjacency)
```

The TreeCRF expects local emission scores as a tensor of shape $(\star, L)$ where $\star$ is the minibatch size and $L$ the number of labels, and returns a tensor of logits of the same shape.

The CRF layer also supports labels with more than two classes. In that case, use the TreeCRFLayer module, which expects an emission tensor of shape $(\star, C, L)$, where $\star$ is the minibatch size, $L$ the number of labels and $C$ the number of classes per label, and returns a tensor $\log P(Y | X)$ of the same shape.

💭 Feedback

⚠️ Issue Tracker

Found a bug? Have an enhancement request? Head over to the GitHub issue tracker if you need to report or ask something. If you are filing a bug report, please include as much information as you can about the issue, and try to recreate the same bug in a simple, easily reproducible situation.

🏗️ Contributing

Contributions are more than welcome! See CONTRIBUTING.md for more details.

⚖️ License

This library is provided under the MIT License.

This library was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.

📚 References

  • [1] Tang, Jie, Mingcai Hong, Juanzi Li, and Bangyong Liang. ‘Tree-Structured Conditional Random Fields for Semantic Annotation’. In The Semantic Web - ISWC 2006, edited by Isabel Cruz, Stefan Decker, Dean Allemang, Chris Preist, Daniel Schwabe, Peter Mika, Mike Uschold, and Lora M. Aroyo, 640–53. Lecture Notes in Computer Science. Berlin, Heidelberg: Springer, 2006. doi:10.1007/11926078_46.
  • [2] Pearl, Judea. ‘Reverend Bayes on Inference Engines: A Distributed Hierarchical Approach’. In Proceedings of the Second AAAI Conference on Artificial Intelligence, 133–136. AAAI’82. Pittsburgh, Pennsylvania: AAAI Press, 1982.
  • [3] Bach, Francis, and Guillaume Obozinski. ‘Sum Product Algorithm and Hidden Markov Model’, ENS Course Material, 2016. http://imagine.enpc.fr/%7Eobozinsg/teaching/mvagm/lecturenotes/lecture7.pdf.
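  • [4] Kschischang, Frank R., Brendan J. Frey, and Hans-Andrea Loeliger. ‘Factor Graphs and the Sum-Product Algorithm’. IEEE Transactions on Information Theory 47, no. 2 (2001): 498–519. doi:10.1109/18.910572.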

Owner

  • Name: Martin Larralde
  • Login: althonos
  • Kind: user
  • Location: Heidelberg, Germany
  • Company: EMBL / LUMC, @zellerlab

PhD candidate in Bioinformatics, passionate about programming, SIMD-enthusiast, Pythonista, Rustacean. I write poems, and sometimes they are executable.

GitHub Events

Total
  • Watch event: 1
  • Push event: 7
Last Year
  • Watch event: 1
  • Push event: 7

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 47
  • Total Committers: 1
  • Avg Commits per committer: 47.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 7
  • Committers: 1
  • Avg Commits per committer: 7.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Martin Larralde m****e@e****e 47
Committer Domains (Top 20 + Academic)
embl.de: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 13 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 3
  • Total maintainers: 1
pypi.org: torch-treecrf

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 13 Last month
Rankings
Dependent packages count: 6.6%
Stargazers count: 25.5%
Average: 26.9%
Forks count: 30.5%
Dependent repos count: 30.6%
Downloads: 41.1%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/test.yml actions
  • actions/checkout v2 composite
  • actions/checkout v1 composite
  • actions/download-artifact v2 composite
  • actions/setup-python v2 composite
  • actions/upload-artifact v2 composite
  • codecov/codecov-action v1 composite
  • pypa/gh-action-pypi-publish master composite
  • rasmus-saks/release-a-changelog-action v1.0.1 composite
.github/workflows/requirements.txt pypi
  • auditwheel *
  • codecov *
  • coverage *
  • setuptools >=46.4
  • wheel *
setup.py pypi