cottoncandy
cottoncandy: scientific python package for easy cloud storage - Published in JOSS (2018)
Science Score: 95.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 6 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: links to joss.theoj.org
- ✓ Committers with academic emails: 5 of 21 committers (23.8%) from academic institutions
- ○ Institutional organization owner
- ✓ JOSS paper metadata: published in Journal of Open Source Software
Repository
sugar for s3
Basic Info
- Host: GitHub
- Owner: gallantlab
- License: bsd-2-clause
- Language: Jupyter Notebook
- Default Branch: main
- Homepage: http://gallantlab.github.io/cottoncandy/
- Size: 2.35 MB
Statistics
- Stars: 35
- Watchers: 14
- Forks: 17
- Open Issues: 23
- Releases: 7
Metadata Files
README.md
Welcome to cottoncandy!
sugar for s3
https://gallantlab.github.io/cottoncandy
What is cottoncandy?
A scientific Python library for storing and accessing numpy array data on S3. Arrays are uploaded directly from memory and downloaded directly into memory, so you do not have to download an array to disk and then load it from disk into your Python session.
This library relies heavily on boto3.
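The idea of moving arrays between memory and an object store without a temporary file can be tried with plain numpy. The sketch below is not cottoncandy's implementation: the helper names `array_to_payload` and `payload_to_array` and the metadata layout are assumptions made for illustration.

```python
import numpy as np


def array_to_payload(arr):
    """Serialize an array to raw bytes plus the metadata needed to rebuild it."""
    arr = np.ascontiguousarray(arr)
    metadata = {"dtype": arr.dtype.str, "shape": list(arr.shape)}
    return arr.tobytes(), metadata


def payload_to_array(body, metadata):
    """Rebuild an array from raw bytes without touching the disk."""
    flat = np.frombuffer(body, dtype=np.dtype(metadata["dtype"]))
    return flat.reshape(metadata["shape"])


arr = np.random.randn(4, 5)
body, metadata = array_to_payload(arr)        # bytes that would be sent to the object store
restored = payload_to_array(body, metadata)   # array rebuilt in memory after a download
assert np.array_equal(arr, restored)
```

Storing the dtype and shape next to the raw bytes is what makes the round trip possible; the bytes alone do not say how to interpret them.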
Try it out!
Jupyter Notebook examples using cottoncandy to:
- Explore the Allen Brain Observatory data: view notebook (launch using Google Colab)
- Explore OpenNeuro NIfTI data: view notebook (launch using Google Colab)
Installation
Directly from the repo:
Clone the repo from GitHub and run the usual Python install from the command line:
$ git clone https://github.com/gallantlab/cottoncandy.git
$ cd cottoncandy
$ sudo python setup.py install
With pip:
$ pip install cottoncandy
Configuration file
Upon first use, cottoncandy will create a configuration file. This configuration file allows you to enter your S3 and Google Drive credentials and set many other options. See the default configuration file.
The configuration file is created the first time you import cottoncandy and it is stored under:
* Linux: ~/.config/cottoncandy/options.cfg
* macOS: ~/Library/Application Support/cottoncandy/options.cfg
* Windows (not supported): C:\Users\<username>\AppData\Local\<AppAuthor>\cottoncandy\options.cfg
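To check where the file should be on a given machine, the locations listed above can be encoded in a few lines. This is a minimal sketch: `expected_config_path` is a hypothetical helper, not part of cottoncandy, and it only covers the Linux and macOS paths from the list.

```python
import os
import sys


def expected_config_path(platform=None, home=None):
    """Return the documented options.cfg location for Linux or macOS."""
    platform = sys.platform if platform is None else platform
    home = os.path.expanduser("~") if home is None else home
    if platform.startswith("linux"):
        base = os.path.join(home, ".config")
    elif platform == "darwin":
        base = os.path.join(home, "Library", "Application Support")
    else:
        # Windows is listed as unsupported, so no path is guessed here.
        raise ValueError("no documented location for platform %r" % platform)
    return os.path.join(base, "cottoncandy", "options.cfg")


# Example with an explicit platform and a made-up home directory.
print(expected_config_path("linux", home="/home/alice"))
```

Calling `expected_config_path()` with no arguments uses the current platform and home directory.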
By default, cottoncandy sets object and bucket permissions to authenticated-read. If you wish to keep all your objects private, modify your configuration file and set default_acl = private. See AWS ACL overview for more information on S3 permissions.
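The change can also be scripted with the standard-library `configparser`. The sketch below works on an in-memory excerpt; the section name `basic` is an assumption about the file layout, and `set_default_acl` is a hypothetical helper that simply updates `default_acl` wherever it finds it.

```python
import configparser
import io

# Hypothetical excerpt of an options.cfg; the section name "basic" is an assumption.
EXAMPLE_CFG = """\
[basic]
default_acl = authenticated-read
"""


def set_default_acl(cfg_text, acl="private"):
    """Return cfg_text with every `default_acl` option set to `acl`."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    changed = False
    for section in parser.sections():
        if parser.has_option(section, "default_acl"):
            parser.set(section, "default_acl", acl)
            changed = True
    if not changed:
        raise KeyError("no default_acl option found")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


print(set_default_acl(EXAMPLE_CFG))
```

Note that `configparser` drops comments when it rewrites a file, so editing the configuration by hand may be preferable if the comments matter.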
Advanced (for admins): One can customize the cottoncandy system install by cloning the repo and modifying defaults.cfg. For example, one can set the default encryption key across the system for all users (key = SoMeEncypTionKey). When a user first uses cottoncandy, this default value will be copied to their personal configuration file. Note, however, that the user can still overwrite that value.
Getting started
Set up the connection (endpoint, access and secret keys can be specified in the configuration file instead):
```python
>>> import cottoncandy as cc
>>> cci = cc.get_interface('my_bucket',
...                        ACCESS_KEY='FAKEACCESSKEYTEXT',
...                        SECRET_KEY='FAKESECRETKEYTEXT',
...                        endpoint_url='https://s3.amazonaws.com')
```
Storing numpy arrays
```python
>>> import numpy as np
>>> arr = np.random.randn(100)
>>> s3_response = cci.upload_raw_array('myarray', arr)
>>> arr_down = cci.download_raw_array('myarray')
>>> assert np.allclose(arr, arr_down)
```
Storing dask arrays
```python
>>> arr = np.random.randn(100, 600, 1000)
>>> s3_response = cci.upload_dask_array('testdim', arr, axis=-1)
>>> dask_object = cci.download_dask_array('testdim')
>>> dask_object
dask.array<...>
>>> dask_slice = dask_object[..., :200]
>>> dask_slice
dask.array<...>
>>> downloaded_data = np.asarray(dask_slice)  # this downloads the array
>>> downloaded_data.shape
(100, 600, 200)
```
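The `axis=-1` argument means the array is split along its last axis, so a slice only has to fetch the pieces it overlaps. The sketch below illustrates that idea with plain numpy on a smaller array; the chunk size of 100 and the helpers `chunk_bounds` and `chunks_needed` are assumptions for illustration, not cottoncandy's actual chunking logic.

```python
import numpy as np


def chunk_bounds(length, chunk_size):
    """Return (start, stop) pairs covering range(length) in chunk_size steps."""
    return [(start, min(start + chunk_size, length))
            for start in range(0, length, chunk_size)]


def chunks_needed(bounds, start, stop):
    """Indices of the chunks that overlap the half-open slice [start, stop)."""
    return [i for i, (lo, hi) in enumerate(bounds) if lo < stop and hi > start]


arr = np.random.randn(10, 6, 1000)
bounds = chunk_bounds(arr.shape[-1], chunk_size=100)   # 10 chunks along the last axis
pieces = [arr[..., lo:hi] for lo, hi in bounds]        # pieces that would be stored separately

needed = chunks_needed(bounds, 0, 200)                 # chunks touched by [..., :200]
partial = np.concatenate([pieces[i] for i in needed], axis=-1)
assert needed == [0, 1]
assert np.array_equal(partial, arr[..., :200])
```

Only two of the ten pieces are read to build the slice, which is why chunked storage pays off for large arrays.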
Command-line search
```python
>>> cci.glob('/path/to/*/file01*.grp/image_data')
['/path/to/my/file01a.grp/image_data',
 '/path/to/my/file01b.grp/image_data',
 '/path/to/your/file01a.grp/image_data',
 '/path/to/your/file01b.grp/image_data']
>>> cci.glob('/path/to/my/file02*.grp/*')
['/path/to/my/file02a.grp/image_data',
 '/path/to/my/file02a.grp/text_data',
 '/path/to/my/file02b.grp/image_data',
 '/path/to/my/file02b.grp/text_data']
```
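Wildcard patterns of this kind can be tried offline against a list of names with the standard-library `fnmatch` module. This is only a sketch of shell-style matching: the key names are hypothetical, `match_keys` is not part of cottoncandy, and `fnmatch` lets `*` match `/` as well, which may differ from how `cci.glob` treats path separators.

```python
from fnmatch import fnmatchcase

# Hypothetical object names in the spirit of the example above.
keys = [
    "/path/to/my/file01a.grp/image_data",
    "/path/to/my/file01b.grp/image_data",
    "/path/to/my/file02a.grp/image_data",
    "/path/to/my/file02a.grp/text_data",
    "/path/to/your/file01a.grp/image_data",
]


def match_keys(keys, pattern):
    """Return the keys that match a shell-style wildcard pattern."""
    return [key for key in keys if fnmatchcase(key, pattern)]


print(match_keys(keys, "/path/to/*/file01*.grp/image_data"))
print(match_keys(keys, "/path/to/my/file02*.grp/*"))
```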
File system-like object browsing
```python
>>> import cottoncandy as cc
>>> browser = cc.get_browser('my_bucket_name',
...                          ACCESS_KEY='FAKEACCESSKEYTEXT',
...                          SECRET_KEY='FAKESECRETKEYTEXT',
...                          endpoint_url='https://s3.amazonaws.com')
>>> browser.sweet_project.sub<TAB>
browser.sweet_project.sub01_awesome_analysis_DOT_grp
browser.sweet_project.sub02_awesome_analysis_DOT_grp
>>> browser.sweet_project.sub01_awesome_analysis_DOT_grp
<... (sub01_awesome_analysis.grp: 3 keys)>
>>> browser.sweet_project.sub01_awesome_analysis_DOT_grp.result_model01
```
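Object names such as `sub01_awesome_analysis.grp` contain a dot, which cannot appear inside a Python attribute name, so the browser output above spells the dot as `DOT`. A minimal sketch of that kind of name mapping follows; the helper names and the exact `_DOT_` spelling are assumptions for illustration, not cottoncandy's code.

```python
def key_to_attribute(name):
    """Turn an object-name component into a valid Python attribute name."""
    return name.replace(".", "_DOT_")


def attribute_to_key(attr):
    """Inverse of key_to_attribute."""
    return attr.replace("_DOT_", ".")


attr = key_to_attribute("sub01_awesome_analysis.grp")
assert attr == "sub01_awesome_analysis_DOT_grp"
assert attr.isidentifier()                      # usable with tab completion
assert attribute_to_key(attr) == "sub01_awesome_analysis.grp"
```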
Connection settings (S3 only)
cottoncandy allows users to modify connection settings via botocore. For example, the user can define the connection timeout for downloads and the number of times to retry dropped S3 requests:
from botocore.client import Config
config = Config(connect_timeout=60, read_timeout=60, retries=dict(max_attempts=10))
cci = cc.get_interface('my_bucket_name', config=config)
Google Drive backend
cottoncandy can also use Google Drive as a back-end. This requires a client_secrets.json file in your ~/.config/cottoncandy folder and the pydrive package.
See the Google Drive setup instructions for more details.
```python
>>> import cottoncandy as cc
>>> cci = cc.get_interface(backend='gdrive')
```
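Before requesting the Google Drive backend, it can help to confirm that both prerequisites named above are in place. The sketch below uses only the standard library; `missing_gdrive_requirements` is a hypothetical helper, not part of cottoncandy.

```python
from pathlib import Path


def missing_gdrive_requirements(config_dir=None):
    """List what is still missing before the Google Drive backend can be used."""
    if config_dir is None:
        config_dir = Path.home() / ".config" / "cottoncandy"
    config_dir = Path(config_dir)
    missing = []
    if not (config_dir / "client_secrets.json").is_file():
        missing.append("client_secrets.json in %s" % config_dir)
    try:
        import pydrive  # noqa: F401  (only checking that the package is importable)
    except ImportError:
        missing.append("the pydrive package")
    return missing


print(missing_gdrive_requirements() or "ready for backend='gdrive'")
```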
Contributing
- If you find any issues with cottoncandy, please report them by submitting an issue on GitHub.
- If you wish to contribute, please submit a pull request. Include information on how you ran the tests and the full output log if possible. Running tests on AWS can incur costs.
Cite as
Nunez-Elizalde AO, Gao JS, Zhang T, Gallant JL (2018). cottoncandy: scientific python package for easy cloud storage. Journal of Open Source Software, 3(28), 890, https://doi.org/10.21105/joss.00890
Owner
- Name: gallantlab
- Login: gallantlab
- Kind: organization
- Repositories: 26
- Profile: https://github.com/gallantlab
JOSS Publication
cottoncandy: scientific python package for easy cloud storage
Authors
Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA
Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA
Program in Bioengineering, UCSF and UC Berkeley, CA, USA
Helen Wills Neuroscience Institute, University of California, Berkeley, CA, USA, Program in Bioengineering, UCSF and UC Berkeley, CA, USA, Department of Psychology, University of California, Berkeley, CA, USA
Tags
S3 cloud storage
GitHub Events
Total
- Issues event: 3
- Watch event: 1
- Delete event: 1
- Issue comment event: 6
- Push event: 5
- Pull request event: 11
- Create event: 4
Last Year
- Issues event: 3
- Watch event: 1
- Delete event: 1
- Issue comment event: 6
- Push event: 5
- Pull request event: 11
- Create event: 4
Committers
Last synced: 5 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Anwar Nunez-Elizalde | a****z@g****m | 178 |
| Tianjiao Zhang | t****g@b****u | 54 |
| Tom Dupré la Tour | t****r@m****g | 32 |
| MarkLescroart | m****t@b****u | 12 |
| Storm Slivkoff | s****f@g****m | 7 |
| fatma | f****u@g****m | 7 |
| carson | c****n@c****b | 6 |
| cchen23 | c****7@g****m | 6 |
| Sara Popham | s****m@b****u | 5 |
| robert | g****o@g****m | 5 |
| Alex Huth | a****h@b****u | 5 |
| Carson McNeil | c****l@g****m | 4 |
| Jen Holmberg | 8****g | 3 |
| arokem | a****m@g****m | 3 |
| Matteo Visconti di Oleggio Castello | m****c@b****u | 3 |
| carson | c****n@n****b | 3 |
| Aditya Vaidya | 6****8 | 1 |
| Michael Oliver | m****r@g****m | 1 |
| Thomas J. Leeper | t****r@g****m | 1 |
| Ubuntu | u****u@i****l | 1 |
| moflo | g****b@m****e | 1 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 47
- Total pull requests: 60
- Average time to close issues: 5 months
- Average time to close pull requests: about 1 month
- Total issue authors: 15
- Total pull request authors: 18
- Average comments per issue: 1.57
- Average comments per pull request: 1.15
- Merged pull requests: 39
- Bot issues: 0
- Bot pull requests: 2
Past Year
- Issues: 2
- Pull requests: 17
- Average time to close issues: 5 months
- Average time to close pull requests: 3 months
- Issue authors: 2
- Pull request authors: 4
- Average comments per issue: 0.5
- Average comments per pull request: 0.88
- Merged pull requests: 4
- Bot issues: 0
- Bot pull requests: 1
Top Authors
Issue Authors
- anwarnunez (27)
- alexhuth (7)
- chrisgorgo (1)
- majure (1)
- jamesgao (1)
- cmcneil (1)
- spopham (1)
- vsoch (1)
- kroq-gar78 (1)
- mvdoc (1)
- jenholmberg (1)
- r-b-g-b (1)
- the-moliver (1)
- DeepakSahoo-Reflektion (1)
- TomDLT (1)
Pull Request Authors
- kroq-gar78 (8)
- mvdoc (8)
- anwarnunez (7)
- fatmai (5)
- marklescroart (5)
- cmcneil (4)
- arokem (3)
- r-b-g-b (3)
- eickenberg (3)
- spopham (2)
- jenholmberg (2)
- dependabot[bot] (2)
- TomDLT (2)
- cchen23 (2)
- sslivkoff (1)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 60 last month
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 7
- Total maintainers: 4
pypi.org: cottoncandy
sugar for S3
- Homepage: http://gallantlab.github.io/cottoncandy/
- Documentation: https://cottoncandy.readthedocs.io/
- License: bsd-2-clause
- Latest release: 0.2.0 (published about 6 years ago)
Maintainers (4)
Dependencies
- boto3 >=1.2.3
- cloudpickle >=0.2.2
- dask *
- numcodecs >=0.5.5
- numpy >=1.6.0
- pycrypto >=2.6.1
- pydrive >=1.3.1
- python-dateutil >=2.7.3
- scipy >=0.9.0
- six >=1.11.0
- toolz >=0.7.4
- wheel >=0.31.1
- PyDrive *
- boto3 *
- botocore *
- pycrypto *
- python-dateutil *
- six *
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite