Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (16.9%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: centrebalearbiodiversitat
- License: other
- Language: Python
- Default Branch: master
- Size: 771 KB
Statistics
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 1
- Releases: 0
Metadata Files
README.md

biodumpy: A Comprehensive Biological Data Downloader
Overview
biodumpy is a versatile Python package designed to simplify retrieving biological information from several public databases. With biodumpy, researchers can download and manage data from multiple sources, giving them access to up-to-date and comprehensive biological information.
Note: This package is currently under development.
Key Features
biodumpy offers dedicated modules for each supported database, with each module featuring functions specifically designed for retrieving information from its respective source. The modules implemented so far are:
- Barcode of Life Data System v4 (BOLD)
- Catalogue of Life (COL)
- Crossref
- Global Biodiversity Information Facility (GBIF)
- iNaturalist
- International Union for Conservation of Nature (IUCN)
- National Center for Biotechnology Information (NCBI)
- Ocean Biodiversity Information System (OBIS)
- World Register of Marine Species (WoRMS)
- ZooBank
This list can be expanded, so suggestions and feedback are greatly appreciated.
Main functionalities and workflow
Before using biodumpy, users need to install the package in their Python environment with the following command:
pip install biodumpy
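After installation, one quick way to confirm that the package is visible to the current interpreter is to query its installed version. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is hypothetical and is not part of biodumpy.

``` python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("biodumpy")
    if found is None:
        print("biodumpy is not installed in this environment")
    else:
        print(f"biodumpy {found} is installed")
```

If the script reports that the package is missing, the most common cause is that `pip` installed it into a different environment than the one running the script.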
Usage
To keep biodumpy simple to use, all modules follow the same general workflow:
1) Load the package. Import biodumpy into your Python environment.
2) Load the desired modules. Import one or more specific modules needed to retrieve the data.
3) Set up the configuration of one or more modules. Configure each module with the required parameters.
4) Start the download. Execute the function to begin retrieving the data.
The two examples below illustrate the general structure of a biodumpy script:
- Single Module Example: This example demonstrates how to use a single biodumpy module (for example, GBIF).
- Multiple Modules Example: This example shows how to use multiple biodumpy modules (for example, GBIF and IUCN).
Example N.1
``` python
# Import biodumpy package
from biodumpy import Biodumpy
# Import GBIF module
from biodumpy.inputs import GBIF
# Create a list of taxa
taxa = [
    'Alytes muletensis (Sanchíz & Adrover, 1979)',
    'Bufotes viridis (Laurenti, 1768)',
    'Hyla meridionalis Boettger, 1874',
    'Anax imperator Leach, 1815',
]
# Set the Biodumpy function with the specific parameters
bdp = Biodumpy([GBIF(sleep=3, bulk=False, accepted_only=True)])
# Start the download
bdp.start(taxa, output_path='./downloads/{date}/{module}/{name}')
```
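The `output_path` argument in the example contains the placeholders `{date}`, `{module}` and `{name}`. How biodumpy fills them is not described here, so the following is only a sketch of the general idea, assuming they behave like ordinary `str.format` fields. The helper `expand_output_path`, the ISO date format, and the sample values are all assumptions made for illustration.

``` python
from datetime import date


def expand_output_path(template: str, module: str, name: str, day: date) -> str:
    """Fill the {date}, {module} and {name} placeholders of a path template.

    Hypothetical helper: it mimics a plausible expansion and is not biodumpy code.
    """
    return template.format(date=day.isoformat(), module=module, name=name)


template = "./downloads/{date}/{module}/{name}"
# Sample values chosen for illustration only.
path = expand_output_path(template, module="GBIF", name="example", day=date(2024, 10, 10))
print(path)  # ./downloads/2024-10-10/GBIF/example
```

Under this assumption, each module writes into its own dated subfolder, which keeps repeated downloads from overwriting each other.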
Example N.2
``` python
# Import biodumpy package
from biodumpy import Biodumpy
# Import GBIF and IUCN modules
from biodumpy.inputs import GBIF, IUCN
# Set your IUCN API key (replace the placeholder with your own key)
api_key = 'YOUR_IUCN_API_KEY'
# Create a list of taxa
taxa = ['Alytes muletensis', 'Bufotes viridis', 'Hyla meridionalis', 'Anax imperator']
# Set the Biodumpy functions with the specific parameters
bdp = Biodumpy([
    GBIF(bulk=False, accepted_only=True),
    IUCN(api_key=api_key, bulk=True, region=['Global']),
])
# Start the download
bdp.start(taxa, output_path='./downloads/{date}/{module}/{name}')
```
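Example N.2 stores the IUCN API key directly in the script. A common alternative is to read it from an environment variable so that the key is not committed to version control. The sketch below shows one way to do that; the helper `load_api_key` and the variable name `IUCN_API_KEY` are hypothetical choices, not something biodumpy defines.

``` python
import os


def load_api_key(env_var: str = "IUCN_API_KEY") -> str:
    """Read an API key from the environment, failing early if it is missing.

    Hypothetical helper: the environment variable name is an assumption.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running")
    return key
```

With the variable set in the shell, `api_key = load_api_key()` can replace the hard-coded string, and the result is passed to `IUCN(api_key=api_key, ...)` exactly as in Example N.2.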
Documentation and Support
For detailed documentation and tutorials, please visit the biodumpy documentation on Read the Docs (https://biodumpy.readthedocs.io/).
Contribution
biodumpy is an open-source project, and contributions are welcome!
If you have ideas for new features, bug fixes, or improvements, please submit an issue or pull request in our GitHub repository, or contact the support team ✉️ here.
License
biodumpy is licensed under the MIT License for its software components. Additionally, any creative works associated with this project—such as documentation, visual assets, or other non-code materials—are licensed under the Creative Commons Attribution (CC BY 4.0) license. See the LICENSE file for full details.
Acknowledgments
The project is developed by the "Centre Balear de Biodiversitat" (CBB) at the University of the Balearic Islands, with support from MCIN and funding from the European Union—NextGenerationEU (PRTR-C17.I1), as well as the Government of the Balearic Islands.
Owner
- Name: centrebalearbiodiversitat
- Login: centrebalearbiodiversitat
- Kind: organization
- Repositories: 1
- Profile: https://github.com/centrebalearbiodiversitat
Citation (CITATIONS.md)
# Citation for Biodumpy

If you use `biodumpy` in your research, please cite it as follows:

**Cancellario T., Golomb T., Roldán A., Far, A.** `biodumpy`, Version 0.1.4, 2024. Available at: [biodumpy](https://pypi.org/project/biodumpy/).

## Description

`biodumpy` is a powerful and versatile Python package designed to simplify the process of retrieving biological information from several public databases. With `biodumpy`, researchers can easily download and manage data from multiple sources, ensuring access to the most up to date and comprehensive biological information available.

# Citations for Packages Used

If you use `biodumpy`, please also consider citing the following packages that it depends on:

---

`beautifulsoup4`

**Richardson, L.** - *beautifulsoup4* - Version 4.12.3, 2024. Available at: [beautifulsoup4](https://pypi.org/project/beautifulsoup4/).

**Citation**: -

---

`biopython`

**Chapman B., Chang J** - *Biopython* - Version 1.84, 2024. Available at: [biopython](https://pypi.org/project/biopython/).

**Citation**: Chapman, B., Chang, J.: Biopython: Python tools for computational biology. ACM SIGBIO Newslett. 2000, 20: 15-19. 10.1145/360262.360268.

---

`lxml`

**lxml dev team** - *lxml* - Version 5.2.2, 2024. Available at: [lxml](https://pypi.org/project/lxml/).

**Citation**: -

---

`numpy`

**Harris et al.** - *numpy* - Version 1.26.4, 2024. Available at: [numpy](https://pypi.org/project/numpy/1.26.4/).

**Citation**: Harris, C.R., Millman, K.J., van der Walt, S.J. et al. Array programming with NumPy. Nature 585, 357–362 (2020). https://doi.org/10.1038/s41586-020-2649-2

---

`pandas`

**The Pandas Development Team** - *pandas* - Version 2.2.2, 2024. Available at: [pandas](https://pypi.org/project/pandas/).

**Citation**: The pandas development team. (2024). pandas-dev/pandas: Pandas (v2.2.3). Zenodo. https://doi.org/10.5281/zenodo.13819579

---

`pytest`

**Richardson, L.** - *pytest* - Version 8.3.3, 2024. Available at: [pytest](https://pypi.org/project/pytest/).

**Citation**: -

---

`Python`

**Van Rossum, G., & Drake, F. L.** - *Python* - Version 8.3.3, 2024. Available at: [Python](https://www.python.org/).

**Citation**: Van Rossum, G., & Drake, F. L. (2009). Python 3 Reference Manual. Scotts Valley, CA: CreateSpace.

---

`requests`

**Reitz, K. et al.** - *requests* - Version 2.32.3, 2024. Available at: [requests](https://pypi.org/project/requests/).

**Citation**: -

---

`selenium`

**Muthukadan, B.** - *selenium* - Version 4.23.0, 2024. Available at: [selenium](https://pypi.org/project/selenium/).

**Citation**: -

---

`tqdm`

**Costa-Luis et al.** - *tqdm* - Version 4.66.4, 2024. Available at: [tqdm](https://pypi.org/project/tqdm/).

**Citation**: Costa-Luis, D., Larroque, S. K., Altendorf, K., Mary, H., Korobov, M., Yorav-Raphael, N., ... & Malmgren, S. (2020). tqdm: A fast, Extensible Progress Bar for Python and CLI. Zenodo. https://doi.org/10.5281/zenodo.13207611

---

# Citations for API Used

If you use `biodumpy`, please also consider citing the following API sources for each module utilized:

---

**Barcode of life data system (BOLD)**

**Ratnasingham et al.** Available at: [BOLD](https://boldsystems.org/index.php).

**Citation**: Ratnasingham S, Wei C, Chan D, Agda J, Agda J, Ballesteros-Mejia L, Ait Boutou H, El Bastami ZM, Ma E, Manjunath R, Rea D, Ho C, Telfer A, McKeown J, Rahulan M, Steinke C, Dorsheimer J, Milton M, Hebert PDN. "BOLD v4: A Centralized Bioinformatics Platform for DNA-Based Biodiversity Data." In DNA Barcoding: Methods and Protocols, pp. 403-441, Chapter 26. New York, NY: Springer US, 2024.

---

**Catalogue of life (COL)**

**Bánki et al.** Available at: [COL](https://www.catalogueoflife.org/).

**Citation**: Bánki, O., Roskov, Y., Döring, M., Ower, G., Hernández Robles, D. R., Plata Corredor, C. A., Stjernegaard Jeppesen, T., Örn, A., Vandepitte, L., Pape, T., Hobern, D., Garnett, S., Little, H., DeWalt, R. E., Ma, K., Miller, J., Orrell, T., Aalbu, R., Abbott, J., et al. (2024). Catalogue of Life (Version 2024-09-25). Catalogue of Life, Amsterdam, Netherlands. https://doi.org/10.48580/dgh3g

---

**Crossref**

Available at: [Crossref](https://www.crossref.org/).

**Citation**: Crossref. (2024).

---

**Global Biodiversity Information Facility (GBIF)**

Available at: [GBIF](https://www.gbif.org/).

**Citation**: GBIF. (2024).

---

**iNaturalist**

Available at: [iNaturalist](https://www.inaturalist.org/).

**Citation**: iNaturalist. (2024).

---

**IUCN Red List**

Available at: [IUCN](https://www.iucnredlist.org/).

**Citation**: IUCN. (2024). The IUCN Red List of Threatened Species. Version 2024-1. https://www.iucnredlist.org.

---

**National Center for Biotechnology Information (NCBI)**

Available at: [NCBI](https://www.ncbi.nlm.nih.gov/).

**Citation**: National Center for Biotechnology Information (NCBI). Bethesda (MD): National Library of Medicine (US), National Center for Biotechnology Information; [2024].

---

**Ocean biodiversity information system (OBIS)**

Available at: [OBIS](https://obis.org/).

**Citation**: OBIS. (2024). Ocean Biodiversity Information System. Intergovernmental Oceanographic Commission of UNESCO. https://obis.org.

---

**World register of marine species (WoRMS)**

Available at: [WoRMS](https://www.marinespecies.org/).

**Citation**: Ahyong et al., (2024). World Register of Marine Species. Available from https://www.marinespecies.org at VLIZ. Accessed 2024-10-10. doi:10.14284/170

---

**ZooBank**

Available at: [ZooBank](https://zoobank.org/).

**Citation**: ZooBank. (2024). International Commission on Zoological Nomenclature.

---
GitHub Events
Total
- Delete event: 5
- Push event: 30
- Pull request event: 11
- Create event: 4
Last Year
- Delete event: 5
- Push event: 30
- Pull request event: 11
- Create event: 4
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 5
- Average time to close issues: N/A
- Average time to close pull requests: 2 days
- Total issue authors: 0
- Total pull request authors: 3
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 4
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 5
- Average time to close issues: N/A
- Average time to close pull requests: 2 days
- Issue authors: 0
- Pull request authors: 3
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 4
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- ToGo347 (7)
- TommasoCanc (3)
- ToniJosep (1)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 79 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 6
- Total maintainers: 1
pypi.org: biodumpy
A Comprehensive Biological Data Downloader.
- Homepage: https://github.com/centrebalearbiodiversitat/biodumpy
- Documentation: https://biodumpy.readthedocs.io/
- License: MIT License
- Latest release: 0.1.6 (published 9 months ago)
Rankings
Maintainers (1)
Dependencies
- actions/checkout v4 composite
- chartboost/ruff-action v1 composite
- stefanzweifel/git-auto-commit-action v5 composite
- sphinx ==7.1.2
- sphinx-rtd-theme ==1.3.0rc1
- beautifulsoup4 ==4.12.3
- biopython ==1.84
- lxml ==5.2.2
- numpy ==2.0.0
- pandas ==2.2.2
- pytest ==8.3.2
- requests ==2.32.3
- selenium ==4.23.0
- tqdm ==4.66.4