coffea

Basic tools and wrappers for enabling not-too-alien syntax when running columnar Collider HEP analysis.

https://github.com/scikit-hep/coffea

Science Score: 46.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
    34 of 74 committers (45.9%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.0%) to scientific vocabulary

Keywords from Contributors

scikit-hep hep histogram hep-ex root-cern file-format root hep-py bigdata high-energy-physics
Last synced: 7 months ago

Repository

Basic Info
Statistics
  • Stars: 137
  • Watchers: 11
  • Forks: 132
  • Open Issues: 77
  • Releases: 165
Created over 7 years ago · Last pushed 7 months ago
Metadata Files: Readme, Contributing, License, Code of conduct, Citation

README.rst

.. image:: docs/source/logo/coffea_logo.svg
    :align: center
    :width: 250px
    :alt: logo


coffea - Columnar Object Framework For Effective Analysis
=========================================================

.. image:: https://zenodo.org/badge/159673139.svg
   :target: https://zenodo.org/badge/latestdoi/159673139

.. image:: https://github.com/scikit-hep/coffea/actions/workflows/ci.yml/badge.svg
    :target: https://github.com/scikit-hep/coffea/actions?query=workflow%3ACI%2FCD+event%3Aschedule+branch%3Amaster

.. image:: https://codecov.io/gh/scikit-hep/coffea/branch/master/graph/badge.svg?event=schedule
    :target: https://codecov.io/gh/scikit-hep/coffea

.. image:: https://badge.fury.io/py/coffea.svg
    :target: https://badge.fury.io/py/coffea

.. image:: https://img.shields.io/pypi/dm/coffea.svg
    :target: https://img.shields.io/pypi/dm/coffea

.. image:: https://img.shields.io/conda/vn/conda-forge/coffea.svg
    :target: https://anaconda.org/conda-forge/coffea

.. image:: https://badges.gitter.im/scikit-hep/coffea.svg
    :target: https://matrix.to/#/#coffea-hep_community:gitter.im

.. image:: https://mybinder.org/badge_logo.svg
   :target: https://mybinder.org/v2/gh/scikit-hep/coffea/master?filepath=binder/

.. inclusion-marker-1-do-not-remove

Basic tools and wrappers for enabling not-too-alien syntax when running columnar Collider HEP analysis.

.. inclusion-marker-1-5-do-not-remove

coffea is a prototype package for pulling together all the typical needs
of a high-energy collider physics (HEP) experiment analysis using the scientific
Python ecosystem. It uses uproot and awkward-array to provide an
array-based syntax for manipulating HEP event data in an efficient and numpythonic
way. Sub-packages implement the histogramming, plotting, and look-up
table functionality needed to convey scientific insight, apply transformations
to data, and correct for discrepancies between Monte Carlo simulations and data.
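
As a rough illustration of what "array-based" means here, the sketch below
selects muons, looks up a per-muon scale factor, and fills a weighted histogram
without an explicit loop over events. It uses plain numpy rather than coffea or
awkward-array (which handle the jagged per-event structure natively), and all
names and numbers in it are made up for the example.

.. code-block:: python

    import numpy as np

    # Toy data (hypothetical): a flat array of muon pT values in GeV and the
    # number of muons in each of four events (the second event has none).
    muon_pt = np.array([45.0, 12.0, 30.5, 8.0, 60.2, 25.0])
    muons_per_event = np.array([2, 0, 3, 1])
    n_events = len(muons_per_event)

    # Map every muon back to the event it came from.
    event_index = np.repeat(np.arange(n_events), muons_per_event)

    # Columnar selection: one boolean mask over all muons at once.
    good = muon_pt > 20.0

    # Number of selected muons per event, computed without an event loop.
    n_good = np.bincount(event_index[good], minlength=n_events)

    # Look-up table: one scale factor per pT bin (illustrative values only).
    pt_edges = np.array([20.0, 30.0, 50.0, np.inf])
    scale_factors = np.array([0.95, 0.98, 1.01])
    bin_index = np.digitize(muon_pt[good], pt_edges) - 1
    muon_sf = scale_factors[bin_index]

    # Weighted histogram of the selected muons.
    weighted_counts, hist_edges = np.histogram(
        muon_pt[good], bins=[20.0, 40.0, 60.0, 80.0], weights=muon_sf
    )

    print("selected muons per event:", n_good)
    print("weighted histogram:", weighted_counts)

The same pattern (mask, look up, histogram) is what the sub-packages mentioned
above are meant to make convenient for real, deeply nested event data.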

coffea also supplies facilities for horizontally scaling an analysis in order to reduce
time-to-insight in a way that is largely independent of the resource the analysis
is being executed on. By making use of modern *big-data* technologies like
Apache Spark, parsl, Dask, and Work Queue,
it is possible with coffea to scale a HEP analysis from testing
on a laptop to a large multi-core server, computing clusters, and supercomputers without
the need to alter or otherwise adapt the analysis code itself.
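
The idea of keeping the analysis code fixed while swapping the execution
backend can be sketched with the Python standard library alone. This is not
coffea's executor API; ``process_chunk`` and ``run`` are hypothetical names
chosen for the illustration, and a thread pool stands in for a real cluster
backend.

.. code-block:: python

    from concurrent.futures import ThreadPoolExecutor


    def process_chunk(chunk):
        """Analysis code: count the values above threshold in one chunk."""
        return sum(1 for pt in chunk if pt > 20.0)


    def run(map_function, chunks):
        """Apply the analysis to every chunk and merge the partial results."""
        return sum(map_function(process_chunk, chunks))


    # Toy input split into chunks, as a stand-in for pieces of a dataset.
    chunks = [[45.0, 12.0], [30.5, 8.0, 60.2], [25.0]]

    # Serial execution with the built-in map, e.g. for testing on a laptop.
    serial_total = run(map, chunks)

    # Parallel execution: only the map function changes, not the analysis.
    with ThreadPoolExecutor(max_workers=2) as pool:
        parallel_total = run(pool.map, chunks)

    print(serial_total, parallel_total)

Because ``process_chunk`` only sees one chunk and returns a mergeable result,
the choice of backend stays outside the analysis code.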

coffea is a HEP community project collaborating with iris-hep
and is currently a prototype. We welcome input to improve its quality as we progress towards
a sensible refactoring into the scientific Python ecosystem and a first release. Please
feel free to contribute at our `github repo <https://github.com/scikit-hep/coffea>`_!

.. inclusion-marker-2-do-not-remove

Installation
============

Install coffea like any other Python package:

.. code-block:: bash

    pip install coffea

or similar (use ``sudo``, ``--user``, ``virtualenv``, or pip-in-conda if you wish).
For more details, see the "Installing coffea" section of the `documentation <https://coffea-hep.readthedocs.io/>`_.

Strict dependencies
===================

- Python (3.9+)

The following are installed automatically when you install coffea with pip:

- numpy (1.22+);
- uproot for interacting with ROOT files and handling their data transparently;
- awkward-array to manipulate complex-structured columnar data, such as jagged arrays;
- numba for just-in-time compilation of Python functions;
- scipy for many statistical functions;
- matplotlib as a plotting backend;
- and other utility packages, as enumerated in ``pyproject.toml``.
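
If you want to confirm that an existing environment meets the version floors
listed above, a small standard-library check along the following lines may
help. The helper names ``leading_version`` and ``check_minimums`` are
hypothetical, and the version parsing is deliberately simplistic (it only
compares leading numeric components).

.. code-block:: python

    import sys
    from importlib import metadata


    def leading_version(version_string):
        """Return the leading numeric components of a version as a tuple."""
        parts = []
        for piece in version_string.split("."):
            digits = ""
            for char in piece:
                if not char.isdigit():
                    break
                digits += char
            if not digits:
                break
            parts.append(int(digits))
        return tuple(parts)


    def check_minimums(minimums):
        """Report, per package, the installed version and whether it is new enough."""
        report = {}
        for name, minimum in minimums.items():
            try:
                installed = metadata.version(name)
            except metadata.PackageNotFoundError:
                report[name] = "not installed"
                continue
            status = "ok" if leading_version(installed) >= minimum else "too old"
            report[name] = f"{installed} ({status})"
        return report


    print("Python 3.9+:", sys.version_info >= (3, 9))
    print(check_minimums({"numpy": (1, 22)}))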

.. inclusion-marker-3-do-not-remove

Documentation
=============
All documentation is hosted at https://coffea-hep.readthedocs.io/

Citation
========
If you would like to cite this code in your work, you can use the zenodo DOI indicated in ``CITATION.cff``, or the `latest DOI <https://zenodo.org/badge/latestdoi/159673139>`__. You may also cite the proceedings:

- "N. Smith et al 2020 EPJ Web Conf. 245 06012"
- "L. Gray et al 2023 J. Phys.: Conf. Ser. 2438 012033"

Owner

  • Name: Scikit-HEP Project
  • Login: scikit-hep
  • Kind: organization
  • Email: scikit-hep-forum@googlegroups.com

A community project for High Energy Physics data analysis in Python

GitHub Events

Total
  • Fork event: 4
  • Create event: 89
  • Commit comment event: 2
  • Release event: 17
  • Issues event: 49
  • Watch event: 4
  • Delete event: 62
  • Member event: 1
  • Issue comment event: 255
  • Push event: 304
  • Pull request review comment event: 42
  • Pull request review event: 45
  • Pull request event: 219
Last Year
  • Fork event: 4
  • Create event: 89
  • Commit comment event: 2
  • Release event: 17
  • Issues event: 49
  • Watch event: 4
  • Delete event: 62
  • Member event: 1
  • Issue comment event: 255
  • Push event: 304
  • Pull request review comment event: 42
  • Pull request review event: 45
  • Pull request event: 219

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 2,913
  • Total Committers: 74
  • Avg Commits per committer: 39.365
  • Development Distribution Score (DDS): 0.681
Past Year
  • Commits: 348
  • Committers: 16
  • Avg Commits per committer: 21.75
  • Development Distribution Score (DDS): 0.739
Top Committers
Name                      Email           Commits
Lindsey Gray              l****y@g****m   929
Nick Smith                n****h@c****h   403
pre-commit-ci[bot]        6****]          210
Lindsey Gray              L****y@c****h   209
Benjamin Tovar            b****r@n****u   141
Yi-Mu Chen                e****l@g****m   126
iasonkrom                 i****m@g****m   103
andrzejnovak              n****j@g****m   58
Nikolai Hartmann          n****e@p****e   54
Jayjeet Chakraborty       j****b@r****m   54
Peter Fackeldey           p****y@r****e   50
dependabot[bot]           4****]          48
Saransh Chopra            s****1@g****m   37
Ryan Simeon               r****4@g****m   34
Davide Valsecchi          d****i@c****h   33
Douglas Thain             d****n@n****u   32
Gordon Watts              g****n@g****t   32
prayagyadav               p****6@g****m   27
Giordon Stark             k****g@g****m   25
Jonas Rübenach            j****h@d****e   25
Ben Galewsky              b****n@p****t   23
Benjamin Fischer          b****r@c****h   20
Devin Taylor              d****r@u****u   17
maly                      m****i@g****m   14
Joosep Pata               j****a@g****m   13
Angus Hollands            g****5@g****m   13
Paul Gessinger            h****o@p****m   12
camicarballo              c****l@n****u   12
Piero Viscone             p****e@g****m   11
Anna Elizabeth Woodard    a****d@n****u   10
and 44 more...

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 32
  • Total pull requests: 204
  • Average time to close issues: 4 months
  • Average time to close pull requests: 5 days
  • Total issue authors: 18
  • Total pull request authors: 18
  • Average comments per issue: 1.19
  • Average comments per pull request: 0.82
  • Merged pull requests: 155
  • Bot issues: 0
  • Bot pull requests: 57
Past Year
  • Issues: 31
  • Pull requests: 203
  • Average time to close issues: 20 days
  • Average time to close pull requests: 5 days
  • Issue authors: 18
  • Pull request authors: 17
  • Average comments per issue: 1.1
  • Average comments per pull request: 0.77
  • Merged pull requests: 155
  • Bot issues: 0
  • Bot pull requests: 57
Top Authors
Issue Authors
  • ikrommyd (8)
  • NJManganelli (5)
  • kratsg (3)
  • alexander-held (2)
  • pfackeldey (1)
  • sebastien-rettie (1)
  • GaetanLepage (1)
  • toicca (1)
  • jpearkes (1)
  • andresailer (1)
  • ParticleChef (1)
  • miozdemi (1)
  • oshadura (1)
  • JuanDuarte2003 (1)
  • rkansal47 (1)
Pull Request Authors
  • ikrommyd (65)
  • lgray (41)
  • pre-commit-ci[bot] (39)
  • dependabot[bot] (22)
  • pfackeldey (7)
  • btovar (6)
  • NJManganelli (6)
  • yimuchen (4)
  • prayagyadav (4)
  • matthewfeickert (3)
  • andrzejnovak (2)
  • nsmith- (2)
  • maxymnaumchyk (2)
  • ariostas (2)
  • lauridsj (1)
Top Labels
Issue Labels
  • bug (20)
  • enhancement (7)
  • question (1)
  • dependencies (1)
Pull Request Labels
  • dependencies (22)
  • github_actions (8)

Packages

  • Total packages: 1
  • Total downloads: unknown
  • Total dependent packages: 0
  • Total dependent repositories: 11
  • Total versions: 28
conda-forge.org: coffea
  • Versions: 28
  • Dependent Packages: 0
  • Dependent Repositories: 11
Rankings
Dependent repos count: 10.7%
Forks count: 18.3%
Average: 28.4%
Stargazers count: 32.8%
Dependent packages count: 51.6%
Last synced: 7 months ago

Dependencies

binder/environment.yml conda
  • coffea >=0.7.0
  • xrootd
  • xxhash
.github/workflows/ci.yml actions
  • actions/checkout v3 composite
  • actions/create-release v1 composite
  • actions/setup-java v3 composite
  • actions/setup-python v4 composite
  • conda-incubator/setup-miniconda v2.2.0 composite
  • crazy-max/ghaction-github-pages v3.1.0 composite
  • kamiazya/setup-graphviz v1 composite
  • pypa/gh-action-pypi-publish v1.6.4 composite
  • r-lib/actions/setup-pandoc v2 composite
docker/kubernetes/spark/Dockerfile docker
  • $base_img latest build
docker/skyhook/Dockerfile docker
  • uccross/skyhookdm-arrow v0.4.0 build