Efficiently Learning Relative Similarity Embeddings with Crowdsourcing
Efficiently Learning Relative Similarity Embeddings with Crowdsourcing - Published in JOSS (2023)
Science Score: 98.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 4 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: links to joss.theoj.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ✓ JOSS paper metadata: published in Journal of Open Source Software
Repository
A tool to collect triplet queries
Basic Info
- Host: GitHub
- Owner: stsievert
- License: bsd-3-clause
- Language: Python
- Default Branch: master
- Homepage: https://docs.stsievert.com/salmon/
- Size: 119 MB
Statistics
- Stars: 8
- Watchers: 1
- Forks: 2
- Open Issues: 0
- Releases: 78
Metadata Files
README.md
Salmon
Salmon is a tool for efficiently generating ordinal embeddings. It relies on "active" machine learning algorithms to choose the most informative queries for humans to answer.
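For readers new to the terminology: a triplet query asks which of two items is more similar to a reference item, and an ordinal embedding places items in space so that distances agree with those answers. The sketch below illustrates the idea only and does not use Salmon's API. The function name `answer_triplet`, the use of Euclidean distance, and the toy coordinates are all assumptions made for illustration.

```python
import math


def answer_triplet(embedding, head, left, right):
    """Return whichever of `left` or `right` lies closer to `head`.

    `embedding` is a sequence of coordinate tuples indexed by item number.
    Euclidean distance is an assumption made for this illustration.
    """
    d_left = math.dist(embedding[head], embedding[left])
    d_right = math.dist(embedding[head], embedding[right])
    return left if d_left <= d_right else right


# Toy 2-D embedding of three items (hypothetical coordinates)
embedding = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0)]

# "Is item 1 or item 2 more similar to item 0?"
winner = answer_triplet(embedding, head=0, left=1, right=2)
print(winner)
```

Here item 1 is chosen because it sits closer to item 0 than item 2 does.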
Documentation
This documentation is available at these locations:
- Primary source: https://docs.stsievert.com/salmon/
- Secondary source: as a raw PDF (and as a slower loading PDF).
- Secondary source: as a zipped HTML directory, which requires unzipping the directory and then opening index.html.
Please file an issue if you cannot access the documentation.
Running Salmon offline
Visit the documentation at https://docs.stsievert.com/salmon/offline.html. Briefly, this should work:
```shell
$ cd path/to/salmon
$ conda env create -f salmon.lock.yml
$ conda activate salmon
(salmon) $ pip install -e .
```
The online documentation has more detail on generating an embedding offline: https://docs.stsievert.com/salmon/offline.html#generate-embeddings
With Salmon installed this way, it is also possible to write a script that imports and uses Salmon:
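```python
from salmon.triplets.samplers import TSTE
import numpy as np

# n items embedded into d dimensions
n, d = 85, 2
sampler = TSTE(n=n, d=d)

# Start the optimizer from a user-specified embedding of shape (n, d)
em_init = np.array([[i, -i] for i in range(n)])
sampler.opt.initialize(embedding=em_init)

# Score candidate queries under that embedding
queries, scores, meta = sampler.get_queries(num=10_000)
```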
This script allows the data scientist to score queries for an embedding they specify.
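Once queries have been scored, a natural next step is to look at the highest-scoring ones. The following is a minimal sketch, not part of Salmon. It assumes that `queries` is a sequence of index triples, that `scores` is a parallel sequence of numbers, and that a higher score marks a more informative query. The helper `top_queries` and the stand-in data are hypothetical, chosen so the example runs without Salmon installed.

```python
def top_queries(queries, scores, k=3):
    """Return the k (score, query) pairs with the highest scores.

    Assumes `queries` and `scores` are parallel sequences and that a
    higher score means a more informative query (an assumption here).
    """
    ranked = sorted(zip(scores, queries), key=lambda pair: pair[0], reverse=True)
    return ranked[:k]


# Stand-in data (hypothetical) in place of the sampler's real output
queries = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (9, 10, 11)]
scores = [0.1, 0.9, 0.4, 0.7]

best = top_queries(queries, scores, k=2)
print(best)
```

If the real return values are NumPy arrays, converting them with `.tolist()` before calling the helper should keep this pure-Python version usable.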
Owner
- Name: Scott Sievert
- Login: stsievert
- Kind: user
- Website: https://stsievert.com
- Repositories: 45
- Profile: https://github.com/stsievert
JOSS Publication
Efficiently Learning Relative Similarity Embeddings with Crowdsourcing
Authors
University of Wisconsin--Madison
Tags
crowdsourcing, active machine learning, relative similarity, adaptive sampling
Citation (CITATION.cff)
cff-version: "1.2.0"
authors:
- family-names: Sievert
  given-names: Scott
  orcid: "https://orcid.org/0000-0002-4275-3452"
- family-names: Nowak
  given-names: Robert
- family-names: Rogers
  given-names: Timothy
  orcid: "https://orcid.org/0000-0001-6304-755X"
doi: 10.5281/zenodo.7832431
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Sievert
    given-names: Scott
    orcid: "https://orcid.org/0000-0002-4275-3452"
  - family-names: Nowak
    given-names: Robert
  - family-names: Rogers
    given-names: Timothy
    orcid: "https://orcid.org/0000-0001-6304-755X"
  date-published: 2023-04-17
  doi: 10.21105/joss.04517
  issn: 2475-9066
  issue: 84
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 4517
  title: Efficiently Learning Relative Similarity Embeddings with
    Crowdsourcing
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.04517"
  volume: 8
title: Efficiently Learning Relative Similarity Embeddings with
  Crowdsourcing
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Committers
Last synced: 5 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Scott Sievert | s****t | 161 |
| Jason Sicotte | s****n@g****m | 1 |
| Christopher Cox | c****c@g****m | 1 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 31
- Total pull requests: 69
- Average time to close issues: about 2 months
- Average time to close pull requests: 5 days
- Total issue authors: 1
- Total pull request authors: 1
- Average comments per issue: 1.29
- Average comments per pull request: 0.46
- Merged pull requests: 68
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- stsievert (31)
Pull Request Authors
- stsievert (70)
Dependencies
- actions/checkout v2 composite
- actions/setup-python v2 composite
- peaceiris/actions-gh-pages v3 composite
- actions/checkout v3 composite
- actions/setup-python v3 composite
- pypa/gh-action-pypi-publish 27b31702a0e7fc50959f5ad993c78deac1bdfc29 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- docker/setup-buildx-action v1 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- docker/setup-buildx-action v1 composite
- continuumio/miniconda3 4.10.3 build
- redislabs/rejson latest build
- prom/prometheus latest
- redislabs/rejson latest
- Cython *
- aiofiles *
- altair *
- autodoc_pydantic *
- blosc *
- bokeh ==2.0.1
- cloudpickle *
- cytoolz *
- dask >=2021.02.0
- dask-ml *
- distributed >=2021.02.0
- fastapi *
- fastparquet *
- gunicorn *
- httpx *
- ipykernel *
- jinja2 <3.1.0
- jupyter-server-proxy *
- lz4 *
- matplotlib *
- numpy >=1.18.0
- numpydoc *
- pandas >=1.0.1
- pyarrow *
- pytest *
- python-multipart *
- pyyaml *
- rejson *
- scikit-learn *
- scipy *
- seaborn *
- skorch >=0.8.0
- sphinx >=4.0.0
- sphinx_rtd_theme *
- starlette-prometheus *
- torch *
- ujson *
- numpydoc *
