pytables

A Python package to manage extremely large amounts of data

https://github.com/pytables/pytables

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (16.0%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: PyTables
License: bsd-3-clause
Language: Python
Default Branch: master
Homepage: http://www.pytables.org
Size: 39.4 MB

Statistics

Stars: 1,341
Watchers: 61
Forks: 276
Open Issues: 156
Releases: 19

Created about 15 years ago · Last pushed 12 months ago

Metadata Files

Readme, Contributing, Funding, License, Code of conduct, Citation, Security

README.rst

===========================================
 PyTables: hierarchical datasets in Python
===========================================

.. image:: https://badges.gitter.im/Join%20Chat.svg
   :alt: Join the chat at https://gitter.im/PyTables/PyTables
   :target: https://gitter.im/PyTables/PyTables

.. image:: https://github.com/PyTables/PyTables/workflows/CI/badge.svg
   :target: https://github.com/PyTables/PyTables/actions?query=workflow%3ACI

.. image:: https://img.shields.io/pypi/v/tables.svg
  :target: https://pypi.org/project/tables/

.. image:: https://img.shields.io/pypi/pyversions/tables.svg
  :target: https://pypi.org/project/tables/

.. image:: https://img.shields.io/pypi/l/tables
  :target: https://github.com/PyTables/PyTables/


:URL: http://www.pytables.org/


PyTables is a package for managing hierarchical datasets, designed
to efficiently cope with extremely large amounts of data.

It is built on top of the HDF5 library and the NumPy package. It
features an object-oriented interface that, combined with C extensions
for the performance-critical parts of the code (generated using
Cython), makes it a fast yet extremely easy-to-use tool for
interactively saving and retrieving very large amounts of data. One
important feature of PyTables is that it optimizes memory and disk
resources so that data takes much less space (3 to 5 times less,
and less still if the data is compressible) than with other solutions
such as relational or object-oriented databases.
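
As a minimal sketch of what "hierarchical" means here, the snippet below
creates a group, stores two arrays under it, and walks the resulting tree.
The file name, the group name ``run1``, and the sample values are
assumptions made for illustration only.

```python
import os
import tempfile

import numpy as np
import tables

# Hypothetical file location used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "hierarchy_demo.h5")

with tables.open_file(path, mode="w", title="Demo") as h5:
    # Groups play the role of directories in the HDF5 file.
    run1 = h5.create_group(h5.root, "run1", title="First run")

    # Leaves (here plain arrays) hang from a group, addressed either
    # by node object or by path string.
    h5.create_array(run1, "temperature", np.array([20.5, 21.0, 21.7]))
    h5.create_array("/run1", "pressure", np.array([1.01, 1.02]))

    # Walk the whole tree and collect the full path of every node.
    node_paths = sorted(node._v_pathname for node in h5.walk_nodes("/"))

    # Natural naming: nodes are reachable as attributes of their parent.
    temps = h5.root.run1.temperature.read()

print(node_paths)
```

Natural naming (``h5.root.run1.temperature``) is convenient interactively;
path strings are usually a better fit when node names are computed at run time.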

State-of-the-art compression
----------------------------

PyTables supports the Blosc compressor out of the box.
This allows for extremely high compression speed while keeping decent
compression ratios. As a result, I/O can be accelerated to a large extent, and
you may end up achieving higher performance than the bandwidth provided by your
I/O subsystem. See the "Tuning The Chunksize" section of the
"Optimization Tips" chapter
of the user documentation for some benchmarks.
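
The following sketch shows one way to request Blosc compression when
creating a chunked array. The compression level (5), the file name, and
the all-zeros sample data are arbitrary assumptions chosen so that the
effect is easy to see; real data will compress less dramatically.

```python
import os
import tempfile

import numpy as np
import tables

# Highly compressible sample data (an assumption for illustration).
data = np.zeros((1000, 1000), dtype=np.float64)

path = os.path.join(tempfile.mkdtemp(), "blosc_demo.h5")

# Filters describes the compression pipeline applied to each chunk.
filters = tables.Filters(complevel=5, complib="blosc")

with tables.open_file(path, mode="w") as h5:
    # CArray is a chunked array, so it can be compressed.
    carray = h5.create_carray(h5.root, "data", obj=data, filters=filters)
    stored_complib = carray.filters.complib

# Reading back decompresses transparently.
with tables.open_file(path, mode="r") as h5:
    roundtrip = h5.root.data[:]

print(stored_complib, os.path.getsize(path), data.nbytes)
```

Comparing the file size on disk with ``data.nbytes`` gives a rough idea of
the compression ratio for a given dataset.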

Not an RDBMS replacement
------------------------

PyTables is not designed to work as a relational database replacement,
but rather as a teammate. If you want to work with large datasets of
multidimensional data (for example, for multidimensional analysis), or
just provide a categorized structure for some portions of your
cluttered RDBMS, then give PyTables a try. It works well for storing
data from data acquisition systems, simulation software, network
data monitoring systems (for example, traffic measurements of IP
packets on routers), or as a centralized repository for system logs,
to name only a few possible use cases.

Tables
------

A table is defined as a collection of records whose values are stored
in fixed-length fields. All records have the same structure, and all
values in each field have the same data type. "Fixed-length" fields
and strict data types may seem like strange requirements for an
interpreted language like Python, but they serve a useful purpose if
the goal is to save very large quantities of data (such as those
generated by many scientific applications) in an
efficient manner that reduces demand on CPU time and I/O.
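
A minimal sketch of such a table follows. The ``Reading`` description, its
column names, and the sample values are hypothetical; the point is that
each column declares a fixed size and type, so every record occupies the
same number of bytes.

```python
import os
import tempfile

import tables


class Reading(tables.IsDescription):
    """Hypothetical record layout: every field has a fixed size and type."""
    sensor = tables.StringCol(16)    # fixed-length 16-byte string
    timestamp = tables.Int64Col()    # 8-byte signed integer
    value = tables.Float64Col()      # 8-byte float


path = os.path.join(tempfile.mkdtemp(), "table_demo.h5")

with tables.open_file(path, mode="w") as h5:
    table = h5.create_table(h5.root, "readings", Reading,
                            title="Sensor readings")

    # Fill the table one record at a time through the Row accessor.
    row = table.row
    for i in range(5):
        row["sensor"] = f"s{i}"
        row["timestamp"] = 1000 + i
        row["value"] = i * 0.5
        row.append()
    table.flush()  # push buffered rows to disk

    nrows = table.nrows
    rowsize = table.rowsize            # bytes per record
    field_types = dict(table.coltypes) # column name -> PyTables type name

print(nrows, rowsize, field_types)
```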

Arrays
------

There are other useful objects, such as arrays, enlargeable arrays, and
variable-length arrays, that can cope with different use cases in your
project.
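
The three kinds of array can be sketched as follows. Node names, shapes,
and values are assumptions made for illustration.

```python
import os
import tempfile

import numpy as np
import tables

path = os.path.join(tempfile.mkdtemp(), "arrays_demo.h5")

with tables.open_file(path, mode="w") as h5:
    # Array: fixed shape, written in one go.
    fixed = h5.create_array(h5.root, "fixed", np.arange(6).reshape(2, 3))

    # EArray: enlargeable along the dimension declared with length 0.
    growing = h5.create_earray(h5.root, "growing",
                               atom=tables.Float64Atom(), shape=(0, 3))
    growing.append(np.ones((2, 3)))
    growing.append(np.zeros((4, 3)))

    # VLArray: each row may hold a different number of elements.
    ragged = h5.create_vlarray(h5.root, "ragged", atom=tables.Int32Atom())
    ragged.append([1, 2, 3])
    ragged.append([4])

    fixed_shape = fixed.shape
    growing_shape = growing.shape
    ragged_lengths = [len(r) for r in ragged.read()]

print(fixed_shape, growing_shape, ragged_lengths)
```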

Easy to use
-----------

One of the principal objectives of PyTables is to be user-friendly.
In addition, many different iterators have been implemented to
make interactive work as productive as possible.
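
As an illustration of the iterators mentioned above, the sketch below
uses ``Table.where()`` for an in-kernel query and ``Table.iterrows()`` for
strided iteration. The ``Particle`` description and its values are
hypothetical.

```python
import os
import tempfile

import tables


class Particle(tables.IsDescription):
    """Hypothetical record; pos fixes the column order explicitly."""
    idx = tables.Int32Col(pos=0)
    energy = tables.Float64Col(pos=1)


path = os.path.join(tempfile.mkdtemp(), "iter_demo.h5")

with tables.open_file(path, mode="w") as h5:
    table = h5.create_table(h5.root, "particles", Particle)
    # Append ten records at once as (idx, energy) tuples.
    table.append([(i, i * 1.5) for i in range(10)])
    table.flush()

    # where() evaluates the condition string without loading
    # the whole table into memory.
    high = [int(row["idx"]) for row in table.where("energy > 9.0")]

    # iterrows() supports start/stop/step, like slicing.
    evens = [int(row["idx"]) for row in table.iterrows(step=2)]

print(high, evens)
```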

Platforms
---------

We use Linux on top of Intel32 and Intel64 boxes as the main
development platforms, but PyTables should be easy to compile/install
on other UNIX (including macOS) or Windows machines.

Compiling
---------

To compile PyTables, you will need a recent version of the HDF5
(C flavor) library, the Zlib compression library, and the NumPy and
Numexpr packages. In addition, PyTables supports the Blosc, LZO,
and bzip2 compression libraries. Blosc is mandatory, but PyTables ships
with the Blosc sources, so although it is recommended to have Blosc
installed on your system, you do not need to install it
separately. The LZO and bzip2 compression libraries are
optional.

Make sure you have HDF5 version 1.10.5 or above. On Debian-based Linux
distributions, you can install it with::

   $ sudo apt install libhdf5-serial-dev

Installation
------------

1. Install with pip::

       $ python3 -m pip install tables

2. To run the test suite::

       $ python3 -m tables.tests.test_all

   If any test fails, please send us the
   complete output using the
   GitHub Issue Tracker.


**Enjoy data!** -- The PyTables Team

.. Local Variables:
.. mode: text
.. coding: utf-8
.. fill-column: 70
.. End:

Owner

Name: PyTables
Login: PyTables
Kind: organization

Website: http://www.pytables.org
Repositories: 7
Profile: https://github.com/PyTables

GitHub Events

Total

Create event: 22
Release event: 1
Issues event: 17
Watch event: 33
Delete event: 20
Issue comment event: 58
Push event: 60
Gollum event: 1
Pull request review event: 27
Pull request review comment event: 2
Pull request event: 60
Fork event: 8

Last Year

Create event: 22
Release event: 1
Issues event: 17
Watch event: 33
Delete event: 20
Issue comment event: 58
Push event: 60
Gollum event: 1
Pull request review event: 27
Pull request review comment event: 2
Pull request event: 60
Fork event: 8

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 132
Total pull requests: 176
Average time to close issues: 8 months
Average time to close pull requests: 29 days
Total issue authors: 108
Total pull request authors: 40
Average comments per issue: 5.73
Average comments per pull request: 2.17
Merged pull requests: 147
Bot issues: 0
Bot pull requests: 51

Past Year

Issues: 4
Pull requests: 24
Average time to close issues: 3 days
Average time to close pull requests: about 17 hours
Issue authors: 4
Pull request authors: 3
Average comments per issue: 1.75
Average comments per pull request: 0.08
Merged pull requests: 20
Bot issues: 0
Bot pull requests: 22

Top Authors

Issue Authors

mgorny (6)
alpae (4)
andreabedini (3)
joycebrum (3)
ivilata (3)
froody (2)
kostasmarkakis (2)
dependabot[bot] (2)
jpjarnoux (2)
FrancescAlted (2)
maxnoe (2)
sunilshah (2)
joshmoore (2)
KoStehner (2)
ulfllorenz (2)

Pull Request Authors

dependabot[bot] (84)
KoStehner (54)
ivilata (19)
avalentino (19)
larsoner (13)
maxnoe (7)
FrancescAlted (7)
matham (6)
pyssling (4)
cbrnr (3)
xmatthias (3)
eumiro (3)
graingert (2)
Joshuaalbert (2)
joycebrum (2)

Top Labels

Issue Labels

defect (37), help wanted (24), enhancement (19), setup (12), good first issues (11), wheel (9), osx-arm64 (8), invalid (5), windows (4), website (3), dependencies (3), duplicate (3), python (2), documentation (2), osx (2), from_trac (2), tests (2), python3 (2), strings (1), iteration (1), CI (1), aarch64/arm (1)

Pull Request Labels

dependencies (87), python (60), enhancement (58), typing (29), github_actions (24), defect (17), documentation (8), CI (7), help wanted (4), python3 (3), setup (3), good first issues (2), osx-arm64 (2), tests (1)

Packages

Total packages: 5
Total downloads:
- pypi 1,681,462 last-month
Total docker downloads: 363,811,515

Total dependent packages: 500
(may contain duplicates)
Total dependent repositories: 11,445
(may contain duplicates)
Total versions: 98
Total maintainers: 7

pypi.org: tables

Hierarchical datasets for Python

Documentation: https://tables.readthedocs.io/
License: BSD 3-Clause License
Latest release: 3.10.2
published over 1 year ago

Versions: 47
Dependent Packages: 496
Dependent Repositories: 11,004
Downloads: 1,681,462 Last month
Docker Downloads: 363,811,515

Rankings

Dependent packages count: 0.0%

Dependent repos count: 0.1%

Average: 0.2%

Docker downloads count: 0.3%

Downloads: 0.4%

Maintainers (7)

matthew.brett, antonio.valentino, falted, andreabedini, jsancho, tomkooij, ivilata

Last synced: 10 months ago

proxy.golang.org: github.com/PyTables/PyTables

Documentation: https://pkg.go.dev/github.com/PyTables/PyTables#section-documentation
License: bsd-3-clause
Latest release: v3.10.2+incompatible
published over 1 year ago

Versions: 19
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Forks count: 1.7%

Stargazers count: 1.9%

Average: 6.0%

Dependent packages count: 9.6%

Dependent repos count: 10.8%

Last synced: 11 months ago

proxy.golang.org: github.com/pytables/pytables

Documentation: https://pkg.go.dev/github.com/pytables/pytables#section-documentation
License: bsd-3-clause
Latest release: v3.10.2+incompatible
published over 1 year ago

Versions: 19
Dependent Packages: 0
Dependent Repositories: 0

Rankings

Forks count: 1.7%

Stargazers count: 1.9%

Average: 6.0%

Dependent packages count: 9.6%

Dependent repos count: 10.8%

Last synced: 10 months ago

anaconda.org: pytables

PyTables is a package for managing hierarchical datasets and designed to efficiently and easily cope with extremely large amounts of data. PyTables is built on top of the HDF5 library, using the Python language and the NumPy package.

Homepage: https://www.pytables.org
License: BSD-3-Clause
Latest release: 3.10.2
published over 1 year ago

Versions: 11
Dependent Packages: 4
Dependent Repositories: 440

Rankings

Dependent repos count: 8.0%

Dependent packages count: 11.1%

Average: 15.3%

Forks count: 20.7%

Stargazers count: 21.6%

Last synced: 10 months ago

anaconda.org: tables

Homepage: https://www.pytables.org
License: BSD-3-Clause
Latest release: 3.9.2
published over 2 years ago

Versions: 2
Dependent Packages: 0
Dependent Repositories: 1

Rankings

Forks count: 20.3%

Stargazers count: 21.4%

Average: 36.1%

Dependent packages count: 51.2%

Dependent repos count: 51.4%

Last synced: 10 months ago

Dependencies

environment.yml conda

hdf5

requirements.txt pypi

numexpr >=2.6.2
numpy >=1.19.0
packaging *

.github/workflows/ci.yml actions

actions/checkout v3 composite
conda-incubator/setup-miniconda v2 composite

.github/workflows/ubuntu.yml actions

actions/checkout v3 composite

.github/workflows/wheels.yml actions

actions/cache v2 composite
actions/checkout v3 composite
actions/download-artifact v3 composite
actions/setup-python v4 composite
actions/upload-artifact v3 composite
conda-incubator/setup-miniconda v2 composite
docker/setup-qemu-action v1 composite

pyproject.toml pypi

numexpr >= 2.6.2
numpy >= 1.19.0
packaging *
py-cpuinfo *
