Efficiently Learning Relative Similarity Embeddings with Crowdsourcing

Efficiently Learning Relative Similarity Embeddings with Crowdsourcing - Published in JOSS (2023)

https://github.com/stsievert/salmon

Science Score: 98.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

active-learning crowdsourcing embedding machine-learning triplet-loss triplets

Scientific Fields

Computer Science Computer Science - 84% confidence
Mathematics Computer Science - 84% confidence
Last synced: 4 months ago

Repository

A tool to collect triplet queries

Statistics
  • Stars: 8
  • Watchers: 1
  • Forks: 2
  • Open Issues: 0
  • Releases: 78
Topics
active-learning crowdsourcing embedding machine-learning triplet-loss triplets
Created about 6 years ago · Last pushed over 1 year ago
Metadata Files
Readme Contributing License Citation

README.md

Salmon

Salmon is a tool for efficiently generating ordinal embeddings. It relies on "active" machine learning algorithms to choose the most informative queries for humans to answer.
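
To make the terms concrete: a triplet query shows a "head" item and two alternatives, and asks which alternative is more similar to the head. An ordinal embedding places items in space so that distances agree with those answers. The sketch below is illustrative only and uses made-up coordinates; the names `predicts_left` and `agreement` are hypothetical and are not part of Salmon's API.

```python
import math


def predicts_left(embedding, head, left, right):
    """Return True if `left` is closer to `head` than `right` is."""
    d_left = math.dist(embedding[head], embedding[left])
    d_right = math.dist(embedding[head], embedding[right])
    return d_left < d_right


def agreement(embedding, answers):
    """Fraction of answers (head, winner, loser) the embedding agrees with."""
    hits = sum(predicts_left(embedding, h, w, l) for h, w, l in answers)
    return hits / len(answers)


# Toy 2-D embedding of four items (made-up coordinates).
embedding = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (0.0, 2.0)]
# Each answer reads: "for item `head`, `winner` is more similar than `loser`".
answers = [(0, 1, 2), (0, 3, 2), (1, 2, 0)]
print(agreement(embedding, answers))
```

An active sampler would, roughly speaking, prefer to ask the queries whose answers are least predictable from the current embedding, rather than asking triplets at random.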

Documentation

This documentation is available at these locations:

Please file an issue if you cannot access the documentation.

Running Salmon offline

Visit the documentation at https://docs.stsievert.com/salmon/offline.html. Briefly, this should work:

``` shell
$ cd path/to/salmon
$ conda env create -f salmon.lock.yml
$ conda activate salmon
(salmon) $ pip install -e .
```

The online documentation explains in more detail how to generate an embedding offline: https://docs.stsievert.com/salmon/offline.html#generate-embeddings

With Salmon installed this way, it's also possible to write a script that imports and uses Salmon:

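``` python
from salmon.triplets.samplers import TSTE
import numpy as np

# Embed n items into d dimensions.
n, d = 85, 2
sampler = TSTE(n=n, d=d)

# Start the optimizer from a user-specified embedding of shape (n, d).
em_init = np.array([[i, -i] for i in range(n)])
sampler.opt.initialize(embedding=em_init)

# Ask the sampler for queries and their scores under that embedding.
queries, scores, meta = sampler.get_queries(num=10_000)
```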

This script allows the data scientist to score queries for an embedding they specify.
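
One way to use the scores is to keep only the highest-scoring queries. The sketch below works on small stand-in lists rather than real sampler output. It assumes that each query is a `(head, left, right)` triple of item indices and that a higher score marks a more informative query; both are assumptions about the returned values, so check them against the Salmon documentation. `top_queries` is a hypothetical helper, not a Salmon function.

```python
def top_queries(queries, scores, k):
    """Return the k queries with the highest scores, best first."""
    if len(queries) != len(scores):
        raise ValueError("queries and scores must have the same length")
    # Pair each score with its query, then sort by score in descending order.
    ranked = sorted(zip(scores, queries), key=lambda pair: pair[0], reverse=True)
    return [query for _, query in ranked[:k]]


# Stand-in data; real values would come from the sampler.
queries = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (1, 5, 9)]
scores = [0.2, 0.9, -0.1, 0.5]
print(top_queries(queries, scores, k=2))
```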

Owner

  • Name: Scott Sievert
  • Login: stsievert
  • Kind: user

JOSS Publication

Efficiently Learning Relative Similarity Embeddings with Crowdsourcing
Published
April 17, 2023
Volume 8, Issue 84, Page 4517
Authors
Scott Sievert ORCID
University of Wisconsin–Madison
Robert Nowak
University of Wisconsin–Madison
Timothy Rogers ORCID
University of Wisconsin–Madison
Editor
Andrew Stewart ORCID
Tags
crowdsourcing, active machine learning, relative similarity, adaptive sampling

Citation (CITATION.cff)

cff-version: "1.2.0"
authors:
- family-names: Sievert
  given-names: Scott
  orcid: "https://orcid.org/0000-0002-4275-3452"
- family-names: Nowak
  given-names: Robert
- family-names: Rogers
  given-names: Timothy
  orcid: "https://orcid.org/0000-0001-6304-755X"
doi: 10.5281/zenodo.7832431
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Sievert
    given-names: Scott
    orcid: "https://orcid.org/0000-0002-4275-3452"
  - family-names: Nowak
    given-names: Robert
  - family-names: Rogers
    given-names: Timothy
    orcid: "https://orcid.org/0000-0001-6304-755X"
  date-published: 2023-04-17
  doi: 10.21105/joss.04517
  issn: 2475-9066
  issue: 84
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 4517
  title: Efficiently Learning Relative Similarity Embeddings with
    Crowdsourcing
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.04517"
  volume: 8
title: Efficiently Learning Relative Similarity Embeddings with
  Crowdsourcing

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 163
  • Total Committers: 3
  • Avg Commits per committer: 54.333
  • Development Distribution Score (DDS): 0.012
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
| Name | Email | Commits |
| --- | --- | --- |
| Scott Sievert | s****t | 161 |
| Jason Sicotte | s****n@g****m | 1 |
| Christopher Cox | c****c@g****m | 1 |
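
The all-time committer figures can be reproduced from the per-committer commit counts listed here. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits; the page does not spell out the formula, but this common definition matches the reported value. `committer_stats` is a hypothetical helper name.

```python
def committer_stats(commits_per_committer):
    """Return (total commits, average per committer, DDS), rounded to 3 places."""
    total = sum(commits_per_committer)
    average = total / len(commits_per_committer)
    # Assumed definition: DDS = 1 - (commits by top committer / total commits).
    dds = 1 - max(commits_per_committer) / total
    return total, round(average, 3), round(dds, 3)


# Commit counts for the three committers listed above.
print(committer_stats([161, 1, 1]))  # -> (163, 54.333, 0.012)
```

A score this close to zero means nearly all commits come from a single committer.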

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 31
  • Total pull requests: 69
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 5 days
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 1.29
  • Average comments per pull request: 0.46
  • Merged pull requests: 68
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • stsievert (31)
Pull Request Authors
  • stsievert (70)
Top Labels
Issue Labels
enhancement (2) bug (1)
Pull Request Labels

Dependencies

.github/workflows/docs.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • peaceiris/actions-gh-pages v3 composite
.github/workflows/python-publish.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
  • pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite
.github/workflows/test.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • docker/setup-buildx-action v1 composite
.github/workflows/test_offline.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/test_pip.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • docker/setup-buildx-action v1 composite
Dockerfile docker
  • continuumio/miniconda3 4.10.3 build
docker/redismonitor/Dockerfile docker
  • redislabs/rejson latest build
docker-compose.yml docker
  • prom/prometheus latest
  • redislabs/rejson latest
requirements.txt pypi
  • Cython *
  • aiofiles *
  • altair *
  • autodoc_pydantic *
  • blosc *
  • bokeh ==2.0.1
  • cloudpickle *
  • cytoolz *
  • dask >=2021.02.0
  • dask-ml *
  • distributed >=2021.02.0
  • fastapi *
  • fastparquet *
  • gunicorn *
  • httpx *
  • ipykernel *
  • jinja2 <3.1.0
  • jupyter-server-proxy *
  • lz4 *
  • matplotlib *
  • numpy >=1.18.0
  • numpydoc *
  • pandas >=1.0.1
  • pyarrow *
  • pytest *
  • python-multipart *
  • pyyaml *
  • rejson *
  • scikit-learn *
  • scipy *
  • seaborn *
  • skorch >=0.8.0
  • sphinx >=4.0.0
  • sphinx_rtd_theme *
  • starlette-prometheus *
  • torch *
  • ujson *
setup.py pypi
  • for *
docs/requirements.txt pypi
  • numpydoc *