oasis

A Python package for efficient evaluation based on OASIS (Optimal Asymptotic Sequential Importance Sampling).

https://github.com/ngmarchant/oasis

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.1%) to scientific vocabulary

Keywords

classification entity-resolution evaluation-method record-linkage sampling-schemes
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: ngmarchant
  • License: MIT
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 16.3 MB
Statistics
  • Stars: 15
  • Watchers: 3
  • Forks: 3
  • Open Issues: 0
  • Releases: 0
Created almost 9 years ago · Last pushed over 4 years ago

README.rst

=====
OASIS
=====

.. image:: https://travis-ci.org/ngmarchant/oasis.svg?branch=master
    :target: https://travis-ci.org/ngmarchant/oasis
.. image:: https://img.shields.io/badge/License-MIT-yellow.svg
    :target: https://opensource.org/licenses/MIT
.. image:: https://badge.fury.io/py/oasis.svg
    :target: https://pypi.python.org/pypi/oasis

OASIS is a tool for evaluating binary classifiers when ground truth class
labels are not immediately available, but can be obtained at some cost (e.g.
by asking humans). The tool takes an unlabelled test set as input and
intelligently selects items to label so as to provide a *precise* estimate of
the classifier's performance, whilst *minimising* the amount of labelling
required. The underlying strategy for selecting the items to label is based on
a technique called *adaptive importance sampling*, which is optimised for the
classifier performance measure of interest. Currently, OASIS supports
estimation of the weighted F-measure, which includes the F1-score, precision
and recall as special cases.
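
To make the relationship between these measures concrete, here is a minimal sketch in plain Python. It assumes the weighted F-measure is defined as ``TP / (alpha * (TP + FP) + (1 - alpha) * (TP + FN))``, the convention we understand the technical paper to use. The function name ``weighted_f_measure`` and the counts are illustrative only and are not part of the OASIS API.

```python
def weighted_f_measure(tp, fp, fn, alpha):
    """Weighted F-measure from confusion counts (assumed convention).

    alpha = 1.0 reduces to precision, alpha = 0.0 to recall,
    and alpha = 0.5 to the F1-score.
    """
    denom = alpha * (tp + fp) + (1 - alpha) * (tp + fn)
    if denom == 0:
        # Undefined when there are no positives of the relevant kind
        return float("nan")
    return tp / denom


# Illustrative counts, not taken from any real dataset
tp, fp, fn = 30, 10, 20

precision = weighted_f_measure(tp, fp, fn, alpha=1.0)  # TP / (TP + FP)
recall = weighted_f_measure(tp, fp, fn, alpha=0.0)     # TP / (TP + FN)
f1 = weighted_f_measure(tp, fp, fn, alpha=0.5)         # 2TP / (2TP + FP + FN)

print(precision, recall, f1)
```

Under this convention a single ``alpha`` parameter selects the measure, which is why one estimator can cover all three.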

Important links
===============
Documentation: https://ngmarchant.github.io/oasis

Source: https://www.github.com/ngmarchant/oasis

Technical paper: https://arxiv.org/pdf/1703.00617.pdf

Example
=======
See the Jupyter notebook under ``docs/tutorial/tutorial.ipynb`` for the full
tutorial. In brief::

    >>> import oasis
    >>> data = oasis.Data()
    >>> data.read_h5('Amazon-GoogleProducts-test.h5')
    >>> def oracle(idx):
    ...     # Return the ground truth label of item ``idx``
    ...     return data.labels[idx]
    ...
    >>> alpha = 0.5  # F-measure weight; 0.5 is assumed here to give the F1-score
    >>> smplr = oasis.OASISSampler(alpha, data.preds, data.scores, oracle)
    >>> smplr.sample_distinct(5000)  # query labels for 5000 distinct items
    >>> print("Current estimate is {}.".format(smplr.estimate_[smplr.t_ - 1]))
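
For intuition about what a sampler of this kind does, the sketch below estimates the F-measure with plain, non-adaptive importance sampling in NumPy. It is not the OASIS algorithm: the proposal distribution is a fixed, hypothetical heuristic rather than the adaptively optimised one, and the function name ``importance_sampled_f_measure``, the synthetic data and the ``[0, 1]`` score range are all assumptions made for illustration.

```python
import numpy as np


def importance_sampled_f_measure(preds, scores, oracle, n_samples,
                                 alpha=0.5, seed=0):
    """Estimate the weighted F-measure from a biased sample of labels."""
    rng = np.random.default_rng(seed)
    preds = np.asarray(preds)
    n = len(preds)

    # Hypothetical proposal: favour predicted positives and high scores,
    # mixed with a uniform component so every item can be drawn.
    raw = preds + np.asarray(scores, dtype=float)
    total = raw.sum()
    q = 0.5 * raw / total + 0.5 / n if total > 0 else np.full(n, 1.0 / n)

    idx = rng.choice(n, size=n_samples, p=q)
    weights = (1.0 / n) / q[idx]            # uniform target over proposal
    y = np.array([oracle(i) for i in idx])  # queried ground truth labels
    yhat = preds[idx]

    # Importance-weighted confusion counts
    tp = np.sum(weights * y * yhat)
    fp = np.sum(weights * (1 - y) * yhat)
    fn = np.sum(weights * y * (1 - yhat))
    denom = alpha * (tp + fp) + (1 - alpha) * (tp + fn)
    return tp / denom if denom > 0 else float("nan")


# Synthetic, imbalanced test set (illustrative only)
rng = np.random.default_rng(1)
labels = (rng.random(5000) < 0.05).astype(int)
scores = np.clip(0.5 * labels + 0.6 * rng.random(5000), 0.0, 1.0)
preds = (scores > 0.5).astype(int)


def oracle(i):
    return labels[i]


estimate = importance_sampled_f_measure(preds, scores, oracle, n_samples=500)
print("Importance-sampled estimate:", estimate)
```

Because the estimate is a ratio of weighted sums, the normalising constant of the weights cancels, so only relative sampling probabilities matter. Drawing more often from likely positives is what makes such schemes label-efficient when the positive class is rare.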


License and disclaimer
======================
The code is released under the MIT license. Please see the LICENSE file for
details.

Owner

  • Name: Neil Marchant
  • Login: ngmarchant
  • Kind: user
  • Location: Melbourne, Australia
  • Company: University of Melbourne

Committers

Last synced: over 2 years ago

All Time
  • Total Commits: 46
  • Total Committers: 2
  • Avg Commits per committer: 23.0
  • Development Distribution Score (DDS): 0.022
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name           Email          Commits
Neil Marchant  n****t@g****m  45
Neil Marchant  n****c@g****m  1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: over 2 years
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • benxiao (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 133 last-month
  • Total docker downloads: 16
  • Total dependent packages: 1
  • Total dependent repositories: 3
  • Total versions: 2
  • Total maintainers: 1
pypi.org: oasis

Optimal Asymptotic Sequential Importance Sampling

  • Versions: 2
  • Dependent Packages: 1
  • Dependent Repositories: 3
  • Downloads: 133 Last month
  • Docker Downloads: 16
Rankings
  • Docker downloads count: 3.6%
  • Dependent packages count: 4.7%
  • Dependent repos count: 9.0%
  • Average: 12.9%
  • Stargazers count: 16.0%
  • Forks count: 16.9%
  • Downloads: 27.1%
Maintainers (1)
Last synced: 6 months ago

Dependencies

setup.py pypi
  • numpy *
  • scipy *
  • sklearn *
  • tables *