robust-outlier-rejection

Robust Chauvenet Outlier Rejection: RCR is advanced, but easy to use, outlier rejection.

https://github.com/nickk124/robust-outlier-rejection

Science Score: 10.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org, zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.3%) to scientific vocabulary

Keywords

outlier-detection outlier-rejection regression robust robust-statistics

Repository

Basic Info
  • Host: GitHub
  • Owner: nickk124
  • License: other
  • Language: C++
  • Default Branch: master
  • Homepage: https://rcr.readthedocs.io
  • Size: 18.6 MB
Statistics
  • Stars: 7
  • Watchers: 2
  • Forks: 1
  • Open Issues: 0
  • Releases: 2
Created almost 6 years ago · Last pushed almost 2 years ago

README.rst

Robust Chauvenet Outlier Rejection (RCR)
========================================
.. image:: https://readthedocs.org/projects/rcr/badge/?version=latest
   :target: https://rcr.readthedocs.io/en/latest/?badge=latest
   :alt: Documentation Status

.. image:: https://static.pepy.tech/badge/rcr
   :target: https://pepy.tech/project/rcr
   :alt: Downloads

.. image:: https://travis-ci.com/nickk124/RCR.svg?branch=master
    :target: https://travis-ci.com/nickk124/RCR
    :alt: Build Status
    
.. image:: https://img.shields.io/badge/arXiv-1807.05276-orange.svg?style=flat
    :target: https://arxiv.org/abs/1807.05276
    :alt: arXiv Paper

.. image:: https://zenodo.org/badge/246971427.svg
   :target: https://zenodo.org/badge/latestdoi/246971427
   :alt: Zenodo (DOI)

.. image:: https://img.shields.io/badge/ascl-2302.006-blue.svg?colorB=262255
   :target: https://ascl.net/2302.006
   :alt: astrophysics source code library:2302.006

What is RCR?
============
RCR is an advanced, but easy-to-use, outlier rejection method.

The simplest form of outlier rejection is sigma clipping, where measurements that are more than a specified number of standard deviations from the mean are rejected from the sample. This number of standard deviations should not be chosen arbitrarily; it should be a function of your sample’s size. A simple prescription for this was introduced by William Chauvenet in 1863. Sigma clipping plus this prescription, applied iteratively, is what we call traditional Chauvenet rejection.
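As a rough illustration of the procedure described above, here is a minimal sketch of traditional Chauvenet rejection in plain Python. It assumes the usual statement of Chauvenet's criterion (reject a point when the expected number of equally deviant points in a Gaussian sample of size N falls below 0.5). The function name ``chauvenet_reject`` is hypothetical and is not part of the RCR API.

```python
import math
import statistics


def chauvenet_reject(values):
    """Iteratively reject outliers with the traditional (non-robust) criterion."""
    kept = list(values)
    rejected = []
    while len(kept) > 2:
        mu = statistics.fmean(kept)
        sigma = statistics.stdev(kept)
        if sigma == 0:
            break
        # Test only the most discrepant point in each iteration.
        worst = max(kept, key=lambda v: abs(v - mu))
        z = abs(worst - mu) / sigma
        # Expected number of points at least this deviant in a Gaussian
        # sample of the current size (two-sided tail probability times N).
        expected = len(kept) * math.erfc(z / math.sqrt(2))
        if expected >= 0.5:
            break  # nothing left that the criterion would reject
        kept.remove(worst)
        rejected.append(worst)
    return kept, rejected


data = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1, 9.9, 25.0]
kept, rejected = chauvenet_reject(data)
print(rejected)
```

Because the tail probability is multiplied by the current sample size, the effective clipping threshold grows as the sample grows, rather than being fixed at an arbitrary number of standard deviations.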

However, both sigma clipping and traditional Chauvenet rejection make use of non-robust quantities: the mean and the standard deviation are both sensitive to the very outliers that they are being used to reject. This limits such techniques to samples with small contaminants or small contamination fractions.

Robust Chauvenet Rejection (RCR) instead first makes use of robust replacements for the mean, such as the median and the half-sample mode, and similar robust replacements that we have developed for the standard deviation.
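To illustrate the idea, the simplified sketch below swaps the mean for the median and the standard deviation for the scaled median absolute deviation (MAD). The MAD is a common textbook stand-in chosen here for brevity; it is not one of the calibrated estimators that RCR itself uses, and the function name ``robust_chauvenet_sketch`` is hypothetical.

```python
import math
import statistics


def robust_chauvenet_sketch(values):
    """Chauvenet-style rejection using the median and scaled MAD (illustration only)."""
    kept = list(values)
    rejected = []
    while len(kept) > 2:
        center = statistics.median(kept)
        # 1.4826 * MAD estimates the standard deviation for Gaussian data.
        sigma = 1.4826 * statistics.median(abs(v - center) for v in kept)
        if sigma == 0:
            break
        # Test only the most discrepant point in each iteration.
        worst = max(kept, key=lambda v: abs(v - center))
        z = abs(worst - center) / sigma
        # Expected number of points at least this deviant in a Gaussian
        # sample of the current size.
        if len(kept) * math.erfc(z / math.sqrt(2)) >= 0.5:
            break
        kept.remove(worst)
        rejected.append(worst)
    return kept, rejected


# Six plausible measurements plus four contaminants (40% of the sample).
data = [10.0, 10.1, 9.9, 10.2, 9.8, 10.0, 50.0, 55.0, 60.0, 52.0]
kept, rejected = robust_chauvenet_sketch(data)
print(sorted(rejected))
```

On this sample the sketch removes all four contaminants. Applying the same criterion with the mean and standard deviation rejects none of them, because the contaminants inflate the standard deviation to roughly 23.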

RCR has been carefully calibrated and extensively simulated (see the full paper `here <https://arxiv.org/abs/1807.05276>`_). It can be applied to samples with both large contaminants and large contaminant fractions (sometimes in excess of 90% contaminated).

We have also posted a short preprint that covers the essentials of the algorithm.

Documentation/How to Use RCR
============================

The `documentation <https://rcr.readthedocs.io>`_ covers all of the RCR API, and provides thorough examples for using RCR in all of its forms.

We've also built a web calculator for quick use of RCR, including interactive visualizations. The calculator can be used for either

1. one-dimensional dataset outlier rejection, or
2. outlier rejection combined with model fitting.

Installation
============

Linux and macOS
---------------

RCR can be used most easily via Python, installed using ``python3 -m pip install rcr`` in the command line.

The C++ source code is also included here in ``/src``, with documentation in ``/docs/cpp_docs``.

Windows
-------

Before installing, you'll need to have **Microsoft Visual C++ 14.0**, found under the Microsoft Visual C++ Build Tools. If that doesn't work, you may need the latest Windows SDK. (Both can be installed through the Visual Studio Installer.)

After that, run ``python3 -m pip install rcr`` in the command line.


Licensing and Citation
======================

RCR is free to use for academic and non-commercial applications (see license in this repository). We only ask that you cite `Maples et al. 2018 <https://arxiv.org/abs/1807.05276>`_ as:

.. code-block:: bib

   @article{maples2018robust,
       title={Robust Chauvenet Outlier Rejection},
       author={{Maples}, M.P. and {Reichart}, D.E. and {Konz}, N.C. and {Berger}, T.A. and {Trotter}, A.S. and {Martin}, J.R. and {Dutton}, D.A. and {Paggen}, M.L. and {Joyner}, R.E. and {Salemi}, C.P.},
       journal={The Astrophysical Journal Supplement Series},
       volume={238},
       number={1},
       pages={2},
       year={2018},
       publisher={IOP Publishing}
   }

For commercial applications, or consultation, feel free to contact us.

There is no more fundamental act in science than measurement. There is no more fundamental problem in science than contaminated measurements. RCR is not a complete solution...but it is very close! We hope that you enjoy it.

Nick Konz, Dan Reichart, Michael Maples

Department of Physics and Astronomy

University of North Carolina at Chapel Hill

Owner

  • Name: Nick Konz
  • Login: nickk124
  • Kind: user
  • Location: Duke University
  • Company: Mazurowski Lab

Machine learning Ph.D. student with medical image analysis applications.

Dependencies

requirements.txt pypi
  • pybind11 *
setup.py pypi