Pyafscgap.org

Pyafscgap.org: Open source multi-modal Python-based tools for NOAA AFSC RACE GAP - Published in JOSS (2023)

https://github.com/schmidtdse/afscgap

Science Score: 100.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org, zenodo.org
  • Committers with academic emails
    1 of 2 committers (50.0%) from academic institutions
  • Institutional organization owner
    Organization schmidtdse has institutional domain (dse.berkeley.edu)
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

biodiversity biology fish fisheries noaa

Scientific Fields

Mathematics Computer Science - 84% confidence
Computer Science Computer Science - 42% confidence
Last synced: 6 months ago

Repository

Python-based tools for NOAA AFSC GAP marine species surveys data

Basic Info
  • Host: GitHub
  • Owner: SchmidtDSE
  • License: bsd-3-clause
  • Language: Python
  • Default Branch: main
  • Homepage: https://pyafscgap.org
  • Size: 4.2 MB
Statistics
  • Stars: 5
  • Watchers: 3
  • Forks: 4
  • Open Issues: 6
  • Releases: 6
Topics
biodiversity biology fish fisheries noaa
Created about 3 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Citation Zenodo

README.md

Python Tools for AFSC GAP

| Group | Badges |
|-------|--------|
| Status | build workflow status docs workflow status Project Status: Active – The project has reached a stable, usable state and is being actively developed. |
| Usage | Python 3.7+ Pypi Badge License Binder |
| Publication | pyOpenSci DOI |
| Archive | Open in Code Ocean DOI |


Python-based tool chain ("Pyafscgap.org") for working with the public bottom trawl data from the NOAA AFSC GAP. The data come from multiple survey programs and record where and when certain species were seen and under what conditions, information useful for research in ocean health.

See webpage, project Github, and example notebook.



Quickstart

Taking your first step is easy!

Explore the data in a UI: To learn about the datasets, try out an in-browser visual analytics app at https://app.pyafscgap.org without writing any code.

Try out a tutorial in your browser: Learn from and modify an in-depth tutorial notebook in a free hosted academic environment (all without installing any local software).

Jump into code: Ready to build your own scripts? Here's an example querying for Pacific cod in the Gulf of Alaska for 2021:

```python
import afscgap  # install with pip install afscgap

query = afscgap.Query()
query.filter_year(eq=2021)
query.filter_srvy(eq='GOA')
query.filter_scientific_name(eq='Gadus macrocephalus')
results = query.execute()
```
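The same calls can be wrapped in a small helper so that the year, survey, and species become parameters. This is only a sketch: `configure_query` is a hypothetical name and is not part of afscgap. It uses only the filter methods from the quickstart plus `set_presence_only`, which this README describes under absence vs presence data, and it takes the query object as an argument so it can be tried without network access.

```python
def configure_query(query, year, survey, scientific_name, presence_only=True):
    """Apply the quickstart filters to an afscgap-style query object.

    Hypothetical helper (not part of afscgap). It assumes `query` exposes the
    methods named in this README: filter_year, filter_srvy,
    filter_scientific_name, and set_presence_only.
    """
    query.filter_year(eq=year)
    query.filter_srvy(eq=survey)
    query.filter_scientific_name(eq=scientific_name)
    # Passing False requests inference of zero catch records.
    query.set_presence_only(presence_only)
    return query


# Example use (requires afscgap to be installed and network access):
#   import afscgap
#   query = configure_query(afscgap.Query(), 2021, 'GOA', 'Gadus macrocephalus')
#   results = query.execute()
```

Keeping the query construction in one function makes it easier to rerun the same request for a different year or survey.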

Continue your exploration in the developer docs.



Installation

Ready to take it to your own machine? Install the open source tools for accessing the AFSC GAP via Pypi / Pip:

```bash
$ pip install afscgap
```

The library's only dependency is requests. Pandas and numpy are not required but are supported. The above will install the release version of the library. However, you can also install the development version via:

```bash
$ pip install git+https://github.com/SchmidtDSE/afscgap.git@main
```

Installing directly from the repo provides the "edge" version of the library which should be treated as pre-release.



Purpose

Unofficial Python-based tool set for interacting with bottom trawl surveys from the Groundfish Assessment Program (GAP). It offers:

  • Pythonic access to the NOAA AFSC GAP datasets.
  • Tools for inference of the "negative" observations not provided by the API service.
  • Visualization tools for quickly exploring and creating comparisons within the datasets, including for audiences with limited programming experience.

Note that GAP is an excellent collection of datasets produced by the Resource Assessment and Conservation Engineering (RACE) Division of the Alaska Fisheries Science Center (AFSC) as part of the National Oceanic and Atmospheric Administration's Fisheries organization (NOAA Fisheries).

Please see our objectives documentation for additional information about the purpose, developer needs addressed, and goals of the project.



Usage

This library provides access to the AFSC GAP data with optional zero catch ("absence") record inference.


Examples / tutorial

One of the best ways to learn is through our examples / tutorials series. For more details see our usage guide.


API Docs

Full formalized API documentation is available as generated by pdoc in CI / CD.


Data structure

Detailed information about our data structures and their relationship to the data structures found in NOAA's upstream database is available in our data model documentation.


Absence vs presence data

By default, the NOAA service will only return information on hauls matching a query. So, for example, requesting data on Pacific cod will only return information on hauls in which Pacific cod is found. This can complicate the calculation of important metrics like catch per unit effort (CPUE), because hauls where the species was absent never enter the calculation. With that in mind, one of the most important features in afscgap is the ability to infer "zero catch" records, enabled by set_presence_only(False). See more information in our inference docs.
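To see why missing zero catch records matter, consider a minimal sketch with made-up numbers. The function name, haul identifiers, and weights below are all hypothetical, and the simple per-haul average is only one way to summarize CPUE; it is not a description of how afscgap performs inference internally.

```python
def mean_cpue(hauls, catches_by_haul, infer_zeros):
    """Average catch per unit effort (kg per hectare) across hauls.

    hauls: dict of haul id -> area swept in hectares.
    catches_by_haul: dict of haul id -> catch weight in kg, present only for
        hauls where the species was found (presence-only data).
    infer_zeros: if True, hauls missing from catches_by_haul count as 0 kg.
    """
    values = []
    for haul_id, area_ha in hauls.items():
        if haul_id in catches_by_haul:
            values.append(catches_by_haul[haul_id] / area_ha)
        elif infer_zeros:
            values.append(0.0)
    return sum(values) / len(values) if values else 0.0


# Made-up data for illustration only: four hauls, species found in two.
hauls = {'h1': 2.0, 'h2': 2.0, 'h3': 2.0, 'h4': 2.0}
catches = {'h1': 10.0, 'h3': 6.0}

presence_only = mean_cpue(hauls, catches, infer_zeros=False)  # (5 + 3) / 2 = 4.0
with_zeros = mean_cpue(hauls, catches, infer_zeros=True)  # (5 + 0 + 3 + 0) / 4 = 2.0
```

In this toy case the presence-only average is twice the average that includes the two empty hauls, which is the kind of overestimate that zero catch inference is meant to avoid.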


Data quality and completeness

There are a few caveats for working with these data that are important for researchers to understand. These are detailed in our limitations docs.


Community flat files

The upstream datasets changed starting in 2024. One important change decomposes the dataset into hauls, catches, and species. Without the ability to join through the API endpoint, retrieving complete records requires either querying the entire catch dataset or naming catches individually in requests. Therefore, starting with the 2.x releases, this library uses pre-joined community Avro files to speed up requests, offering precomputed indices such that, where available, hauls can be pre-filtered to reduce download payload size and running time. See flat file documentation for more details about this service.
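The idea of pre-joining and indexing can be sketched in plain Python. Everything here is an assumption made for illustration: the field names, species codes, and rows are invented, and the sketch says nothing about the actual Avro schema or index layout used by the community files.

```python
from collections import defaultdict

# Made-up rows standing in for the three decomposed upstream tables.
hauls = [
    {'haul_id': 1, 'year': 2021, 'survey': 'GOA'},
    {'haul_id': 2, 'year': 2021, 'survey': 'GOA'},
    {'haul_id': 3, 'year': 2019, 'survey': 'GOA'},
]
catches = [
    {'haul_id': 1, 'species_code': 100, 'weight_kg': 10.0},
    {'haul_id': 3, 'species_code': 100, 'weight_kg': 4.0},
    {'haul_id': 3, 'species_code': 200, 'weight_kg': 1.5},
]
species = {100: 'Gadus macrocephalus', 200: 'Gadus chalcogrammus'}


def prejoin(hauls, catches, species):
    """Join each catch to its haul and species, producing flat records."""
    hauls_by_id = {haul['haul_id']: haul for haul in hauls}
    joined = []
    for catch in catches:
        record = dict(hauls_by_id[catch['haul_id']])  # copy the haul fields
        record['scientific_name'] = species[catch['species_code']]
        record['weight_kg'] = catch['weight_kg']
        joined.append(record)
    return joined


def build_year_index(hauls):
    """Map each year to the set of haul ids surveyed in that year."""
    index = defaultdict(set)
    for haul in hauls:
        index[haul['year']].add(haul['haul_id'])
    return index


flat = prejoin(hauls, catches, species)
year_index = build_year_index(hauls)

# Pre-filter: look up the 2021 hauls first, then keep only their records.
wanted = year_index[2021]
records_2021 = [record for record in flat if record['haul_id'] in wanted]
```

With an index like this, a client can decide which hauls it needs before downloading or scanning the larger joined records.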



License

We are happy to make this library available under the BSD 3-Clause license. See LICENSE for more details. (c) 2023 Regents of University of California. See the Eric and Wendy Schmidt Center for Data Science and the Environment at UC Berkeley.



Developing

Interested in contributing to the project or want to build manually? Please see our build docs for details.



People

Sam Pottinger is the primary contact with additional development from Giulia Zarpellon. Additionally some acknowledgements:

This is a project of The Eric and Wendy Schmidt Center for Data Science and the Environment at UC Berkeley where Kevin Koy is Executive Director. Please contact us via dse@berkeley.edu.



Open Source

We are happy to be part of the open source community. We use the following:

In addition to Github-provided Github Actions, our build and documentation systems also use the following but are not distributed with or linked to the project itself:

Next, the visualization tool has additional dependencies as documented in the visualization readme. Similarly, the community flat files snapshot updater has additional dependencies as documented in the snapshot readme.

Finally, note that the website uses assets from The Noun Project under the NounPro plan. If used outside of https://pyafscgap.org, they may be subject to a different license.

Thank you to all of these projects for their contribution.



Version history

Annotated version history:

  • 2.0.1: Some minor changes to better support weaker internet connections.
  • 2.0.0: Switch to support new NOAA endpoints.
  • 1.0.4: Minor documentation typo fix.
  • 1.0.3: Documentation edits for journal article.
  • 1.0.2: Minor documentation touch ups for pyopensci.
  • 1.0.1: Minor documentation fix.
  • 1.0.0: Release with pyopensci.
  • 0.0.9: Fix for an issue with certain import modalities and the http module.
  • 0.0.8: New query syntax (builder / chaining) and units conversions.
  • 0.0.7: Visual analytics tools.
  • 0.0.6: Performance and size improvements.
  • 0.0.5: Changes to documentation.
  • 0.0.4: Negative / zero catch inference.
  • 0.0.3: Minor updates in documentation.
  • 0.0.2: License under BSD.
  • 0.0.1: Initial release.

The community files were last updated on Jan 7, 2025.

Owner

  • Name: DSE
  • Login: SchmidtDSE
  • Kind: organization
  • Email: dse@berkeley.edu
  • Location: United States of America

The Eric and Wendy Schmidt Center for Data Science & Environment at Berkeley

JOSS Publication

Pyafscgap.org: Open source multi-modal Python-based tools for NOAA AFSC RACE GAP
Published
June 28, 2023
Volume 8, Issue 86, Page 5593
Authors
A Samuel Pottinger ORCID
University of California, Berkeley, California, United States of America
Giulia Zarpellon ORCID
University of California, Berkeley, California, United States of America
Editor
Kevin M. Moerman ORCID
Tags
fishery alaska groundfish biodiversity food visualization

Citation (CITATION.cff)

cff-version: 1.1.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Pottinger
    given-names: A Samuel
    orcid: 0000-0002-0458-4985
  - family-names: Zarpellon
    given-names: Giulia
    orcid: 0000-0002-9122-4709
title: Pyafscgap.org
abstract: Python-based tools for NOAA AFSC GAP marine species surveys data.
license: BSD-3-Clause
version: 1.0.2
date-released: 2023-06-02

GitHub Events

Total
  • Create event: 11
  • Release event: 1
  • Issues event: 7
  • Issue comment event: 1
  • Push event: 99
  • Pull request event: 18
Last Year
  • Create event: 11
  • Release event: 1
  • Issues event: 7
  • Issue comment event: 1
  • Push event: 99
  • Pull request event: 18

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 643
  • Total Committers: 2
  • Avg Commits per committer: 321.5
  • Development Distribution Score (DDS): 0.005
Past Year
  • Commits: 117
  • Committers: 1
  • Avg Commits per committer: 117.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Sam Pottinger s****r@b****u 640
gizarp g****n@p****a 3

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 27
  • Total pull requests: 104
  • Average time to close issues: about 1 month
  • Average time to close pull requests: 1 day
  • Total issue authors: 2
  • Total pull request authors: 2
  • Average comments per issue: 0.7
  • Average comments per pull request: 0.44
  • Merged pull requests: 101
  • Bot issues: 0
  • Bot pull requests: 3
Past Year
  • Issues: 4
  • Pull requests: 17
  • Average time to close issues: about 3 hours
  • Average time to close pull requests: 5 days
  • Issue authors: 1
  • Pull request authors: 2
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 14
  • Bot issues: 0
  • Bot pull requests: 3
Top Authors
Issue Authors
  • sampottinger (24)
  • 7yl4r (1)
Pull Request Authors
  • sampottinger (98)
  • dependabot[bot] (3)
Top Labels
Issue Labels
enhancement (1)
Pull Request Labels
dependencies (3) javascript (2) python (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 211 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 2
  • Total versions: 16
  • Total maintainers: 1
pypi.org: afscgap

Tools for interacting with the public bottom trawl surveys data from the NOAA AFSC GAP.

  • Versions: 16
  • Dependent Packages: 0
  • Dependent Repositories: 2
  • Downloads: 211 Last month
Rankings
Dependent packages count: 9.8%
Dependent repos count: 11.7%
Forks count: 15.4%
Average: 15.7%
Downloads: 18.6%
Stargazers count: 23.1%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/build.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
pyproject.toml pypi
  • requests ~= 2.28.2
.github/workflows/deploy.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/docs.yml actions
  • Creepios/sftp-action v1.0.3 composite
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v1 composite
  • appleboy/ssh-action v0.1.8 composite
  • openjournals/openjournals-draft-action master composite
Dockerfile docker
  • ubuntu jammy-20230301 build
package-lock.json npm
  • 218 dependencies
package.json npm
  • grunt ^1.5.2
  • grunt-contrib-connect ^3.0.0
  • grunt-contrib-qunit ^6.1.0
afscgapviz/requirements.txt pypi
  • Flask *
  • afscgap *
  • geolib *
  • toolz *
notebooks/requirements.txt pypi
  • Cartopy *
  • afscgap *
  • geolib *
  • jupyterlab *
  • matplotlib *
  • notebook *
  • pandas *
  • scipy *