https://github.com/cthoyt/embeddingdb

A database for storing and comparing entity embeddings

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.0%) to scientific vocabulary

Keywords

database network-representation-learning representation-learning

Keywords from Contributors

ontologies
Last synced: 5 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: cthoyt
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 32.2 KB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 1
Topics
database network-representation-learning representation-learning
Created over 6 years ago · Last pushed over 6 years ago
Metadata Files
Readme License

README.rst

Embedding Database |zenodo|
===========================
This package provides a database schema and Python wrapper
for storing the embeddings generated through various representation
learning packages.

Currently, this package focuses on using a SQL database with SQLAlchemy,
but might be extended to use a NoSQL database as an alternative.

Installation
------------
Install ``embeddingdb`` from PyPI with:

.. code-block:: sh

   $ pip install embeddingdb

Alternatively, install the latest development version of ``embeddingdb`` directly
from GitHub with:

.. code-block:: sh

   $ pip install git+https://github.com/cthoyt/embeddingdb

For developers, install ``embeddingdb`` in development mode from GitHub with:

.. code-block:: sh

   $ git clone https://github.com/cthoyt/embeddingdb.git
   $ cd embeddingdb
   $ pip install -e .

Set the environment variable ``EMBEDDINGDB_CONNECTION`` to a valid
SQLAlchemy connection string for a PostgreSQL instance, as this package uses
the PostgreSQL-specific ``ARRAY`` type.
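
As a minimal sketch of how the variable could be checked before use, the snippet
below reads ``EMBEDDINGDB_CONNECTION`` and verifies that its dialect is PostgreSQL.
The helper name ``get_connection`` and the example credentials are hypothetical and
are not part of ``embeddingdb``; only the general SQLAlchemy URL shape
(``dialect+driver://user:password@host:port/database``) is relied on.

```python
import os


def get_connection(environ=os.environ):
    """Return the connection string from EMBEDDINGDB_CONNECTION.

    Hypothetical helper for illustration; not part of embeddingdb.
    """
    connection = environ.get("EMBEDDINGDB_CONNECTION")
    if not connection:
        raise RuntimeError("EMBEDDINGDB_CONNECTION is not set")

    # SQLAlchemy URLs start with "dialect://" or "dialect+driver://".
    scheme, separator, _rest = connection.partition("://")
    dialect = scheme.split("+")[0].lower()
    if not separator or dialect != "postgresql":
        raise ValueError("expected a PostgreSQL connection string, got dialect %r" % dialect)
    return connection


# Example with made-up credentials, passed explicitly instead of via the shell.
example = {"EMBEDDINGDB_CONNECTION": "postgresql+psycopg2://user:password@localhost:5432/embeddingdb"}
print(get_connection(example))
```

Failing early on a non-PostgreSQL URL gives a clearer error than one raised later
when a table using ``ARRAY`` is created.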

Command Line Interface
----------------------
This package installs an ``embeddingdb`` entry point that can be used directly from
the shell.

Uploading Entity Embeddings
~~~~~~~~~~~~~~~~~~~~~~~~~~~
Entity embeddings can come from various kinds of representation learning,
including network representation learning, knowledge graph embedding, and
representation learning on text.

Upload embeddings generated by ``word2vec`` by specifying the file path with:

.. code-block:: sh

   $ embeddingdb upload --fmt word2vec --path ~/path/to/file.txt
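
For orientation, the ``word2vec`` text format is a header line with the number of
vectors and their dimension, followed by one line per token with its space-separated
values. The sketch below parses that layout using only the standard library. The
function name ``read_word2vec_text`` and the sample identifiers are hypothetical, and
whether ``embeddingdb`` reads the file in exactly this way is an assumption.

```python
import io


def read_word2vec_text(lines):
    """Parse word2vec text format into a dict of token -> list of floats."""
    iterator = iter(lines)
    # Header line: "<number of vectors> <dimension>"
    header = next(iterator).split()
    count, dimension = int(header[0]), int(header[1])

    vectors = {}
    for line in iterator:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        token, values = parts[0], [float(value) for value in parts[1:]]
        if len(values) != dimension:
            raise ValueError("%s has %d values, expected %d" % (token, len(values), dimension))
        vectors[token] = values

    if len(vectors) != count:
        raise ValueError("header declares %d vectors, found %d" % (count, len(vectors)))
    return vectors


# Made-up sample data standing in for ~/path/to/file.txt
sample = io.StringIO("2 3\nP12345 0.1 0.2 0.3\nQ67890 0.4 0.5 0.6\n")
embeddings = read_word2vec_text(sample)
print(sorted(embeddings))
```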

Upload embeddings generated by ``pykeen`` by specifying the output directory
with:

.. code-block:: sh

   $ embeddingdb upload --fmt keen --path ~/path/to/directory/

Listing Entity Embeddings
~~~~~~~~~~~~~~~~~~~~~~~~~
After uploading, the collections can be listed with:

.. code-block:: sh

   $ embeddingdb ls

Analyzing Entity Embeddings' Correlations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
One motivation for building this package was to make it convenient to compare
embeddings of the same entities generated through orthogonal embedding techniques.
For example, we wanted to know to what extent the embeddings for proteins generated from
their sequences with ``ratvec`` contained the same information as the embeddings generated
from protein-protein interaction networks with ``pykeen`` or ``nrl``.

Run the analysis by passing the identifiers of two collections in the database as
positional arguments:

.. code-block:: sh

   $ embeddingdb analyze 1 2
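
One way to ask whether two embedding spaces carry similar information is sketched
below: for the entities present in both collections, compute the pairwise cosine
similarities inside each space and then correlate the two lists. This is only an
illustration of the idea and not necessarily what ``embeddingdb analyze`` computes.
The function names and the toy vectors are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


def compare_collections(first, second):
    """Correlate within-space pairwise similarities over the shared entities."""
    shared = sorted(first.keys() & second.keys())
    pairs = list(combinations(shared, 2))
    similarities_first = [cosine(first[a], first[b]) for a, b in pairs]
    similarities_second = [cosine(second[a], second[b]) for a, b in pairs]
    return pearson(similarities_first, similarities_second)


# Toy collections; the two spaces may have different dimensions.
collection_1 = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
collection_2 = {"A": [2.0, 0.0, 0.0], "B": [0.0, 2.0, 0.0], "C": [2.0, 2.0, 0.0]}
print(compare_collections(collection_1, collection_2))
```

Because similarities are computed within each space separately, the two collections
do not need to share a dimension or a coordinate system. The sketch needs at least
three shared entities and assumes no zero vectors.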

Running with Docker
-------------------
After installing Docker, the entire web application can be started with:

.. code-block:: sh

   $ docker-compose up

Send a GET request to the ``/test`` endpoint to instantiate the database and add a
test collection.

.. |zenodo| image:: https://zenodo.org/badge/192898201.svg
   :target: https://zenodo.org/badge/latestdoi/192898201

Owner

  • Name: Charles Tapley Hoyt
  • Login: cthoyt
  • Kind: user
  • Location: Bonn, Germany
  • Company: RWTH Aachen University

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 15
  • Total Committers: 2
  • Avg Commits per committer: 7.5
  • Development Distribution Score (DDS): 0.067
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name                  Email           Commits
Charles Tapley Hoyt   c****t@g****m   14
Charles Hoyt          c****t@m****p   1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 10 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 1
  • Total maintainers: 1
pypi.org: embeddingdb

A package for storing and querying knowledge graph embeddings

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 10 Last month
Rankings
Dependent packages count: 10.0%
Dependent repos count: 21.7%
Average: 25.9%
Downloads: 46.0%
Maintainers (1)
Last synced: 6 months ago

Dependencies

Dockerfile docker
  • python 3.7 build
docker-compose.yml docker
  • postgres 11.1