gov.nasa.pds:harvest
Standalone Harvest client application providing the functionality for capturing and indexing product metadata into the PDS Registry system (https://github.com/nasa-pds/registry).
Science Score: 59.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to zenodo.org
- ✓ Committers with academic emails: 13 of 25 committers (52.0%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.6%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: NASA-PDS
- License: other
- Language: Java
- Default Branch: main
- Homepage: https://nasa-pds.github.io/registry
- Size: 153 MB
Statistics
- Stars: 6
- Watchers: 6
- Forks: 3
- Open Issues: 33
- Releases: 57
Metadata Files
README.md
Harvest Tool
The Harvest Tool captures and indexes product metadata. Each discipline node of the Planetary Data System runs the tool to crawl the local data repositories, discovering products and indexing associated metadata into the Registry Service. As such, it's a sub-component of the PDS Registry Application (https://github.com/NASA-PDS/registry).
For more detailed documentation on this tool, see the PDS Registry Documentation: https://nasa-pds.github.io/registry/.
Documentation
The documentation for the latest release of the Harvest Tool, including release notes, installation, and operation of the software is ready to browse online.
If you would like to get the latest documentation, including any updates since the last release, you can execute the "mvn site:run" command and view the documentation locally at http://localhost:8080/.
👥 Contributing
Within the NASA Planetary Data System, we value the health of our community as much as the code. Towards that end, we ask that you read and practice what's described in these documents:
- Our contributor's guide delineates the kinds of contributions we accept.
- Our code of conduct outlines the standards of behavior we practice and expect from everyone who participates in our software.
🔢 Versioning
We use the SemVer philosophy for versioning this software.
🪛 Development
To develop this project, use your favorite text editor, or an integrated development environment with Java support, such as Eclipse. You'll also need Apache Maven version 3. With these tools, you can typically run
mvn package
to produce a complete package. This runs all the phases necessary, including compilation, testing, and package assembly. Other common Maven phases include:
- compile: just compile the source code
- test: just run unit tests
- install: install into your local repository
- deploy: deploy to a remote repository (note that the Roundup action does this automatically for releases)
💂 Secrets Detection Setup and Update
The PDS uses Detect Secrets to help prevent committing information to a repository that should remain secret.
For Detect Secrets to work, you need a one-time setup of your personal global Git configuration, plus several steps to create or update the .secrets.baseline file that prevents false-positive failures. See the wiki entry on Detect Secrets to learn how to do this.
🪝 Pre-Commit Hooks
This package comes with a configuration for Pre-Commit, a system for automating and standardizing git hooks for code linting, security scanning, etc. In this Java repository, we use Pre-Commit with Detect Secrets to prevent accidentally committing files or commit messages that contain secrets such as API keys and passwords.
Pre-Commit and detect-secrets are language-neutral, but they themselves are written in Python. To take advantage of these features, you'll need a nearby Python installation. A recommended way to do this is with a virtual Python environment. Using the command line interface, run:
console
$ python -m venv .venv
$ source .venv/bin/activate # Use source .venv/bin/activate.csh if you're using a C-style shell
$ pip install pre-commit git+https://github.com/NASA-AMMOS/slim-detect-secrets.git@exp
See the Detect Secrets information above to set up your secrets baseline before proceeding.
Finally, install the pre-commit hooks:
pre-commit install
pre-commit install -t pre-push
pre-commit install -t prepare-commit-msg
pre-commit install -t commit-msg
You can then work normally. Pre-commit will run automatically during git commit and git push so long as the Python virtual environment is active.
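Because the hooks only run while the virtual environment is active, a quick check before committing can help. The following is a minimal sketch, assuming the environment was activated with the standard venv activate script (which sets the VIRTUAL_ENV variable). The function name venv_active is hypothetical and not part of Pre-Commit or Detect Secrets.

```shell
# Hypothetical helper (not part of Pre-Commit or Detect Secrets).
# Succeeds only when a Python virtual environment appears to be active,
# judged by the VIRTUAL_ENV variable that venv's activate script sets.
venv_active() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

# Example use before committing:
#   venv_active || echo "Activate the environment first: source .venv/bin/activate"
```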
👉 Note: For Detect Secrets to work, there is a one-time setup required to your personal global Git configuration. See the wiki entry on Detect Secrets to learn how to do this.
🚅 Continuous Integration & Deployment
Thanks to GitHub Actions and the Roundup Action, this software undergoes continuous integration and deployment. Every time a change is merged into the main branch, an "unstable" build (known in Java software development circles as a "SNAPSHOT") is created and delivered to the releases page and to the OSSRH.
You can make an official delivery by pushing a release/X.Y.Z branch to GitHub, replacing X with the major version number, Y with the minor version number, and Z with the micro version number. This produces a stable (non-SNAPSHOT) release that is generated and cryptographically signed (by an automated process, so adjust trust expectations accordingly) and made available on the releases page and OSSRH. The website is also published, the changelogs and requirements are updated, and a new version number is prepared in the main branch for future development.
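The release/X.Y.Z naming convention can be checked locally before pushing. This is only an illustrative sketch: the function names are hypothetical, and the pattern assumes purely numeric version components. How the Roundup Action itself validates branch names is not described here.

```shell
# Hypothetical helpers, not part of the Roundup Action.

# Succeeds when the given branch name follows the release/X.Y.Z convention,
# where X, Y, and Z are non-negative integers.
is_release_branch() {
    printf '%s\n' "$1" | grep -Eq '^release/[0-9]+\.[0-9]+\.[0-9]+$'
}

# Prints the X.Y.Z part of a valid release branch name; fails otherwise.
release_version() {
    is_release_branch "$1" || return 1
    printf '%s\n' "${1#release/}"
}
```

For example, `release_version release/1.15.0` prints `1.15.0`, while `release_version main` fails without printing anything.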
The following sections detail how to do this manually should the automated steps fail.
🔧 Manual Publication
👉 Note: This requires using the PDS Maven Parent POM to ensure the release profile is set.
Update Version Numbers
Update pom.xml for the release version or use the Maven Versions Plugin, e.g.:
console
$ # Skip this step if this is a RELEASE CANDIDATE, we will deploy as SNAPSHOT version for testing
$ VERSION=1.15.0
$ mvn -DnewVersion=$VERSION versions:set
$ git add pom.xml
$ git add */pom.xml
Update Changelog
Update the changelog using GitHub Changelog Generator. Note: Make sure you set $CHANGELOG_GITHUB_TOKEN in your .bash_profile or use the --token flag.
console
$ # For RELEASE CANDIDATE, set VERSION to future release version.
$ GITHUB_ORG=NASA-PDS
$ GITHUB_REPO=harvest
$ github_changelog_generator --future-release v$VERSION --user $GITHUB_ORG --project $GITHUB_REPO --configure-sections '{"improvements":{"prefix":"**Improvements:**","labels":["Epic"]},"defects":{"prefix":"**Defects:**","labels":["bug"]},"deprecations":{"prefix":"**Deprecations:**","labels":["deprecation"]}}' --no-pull-requests --token $GITHUB_TOKEN
$ git add CHANGELOG.md
Commit Changes
Commit the changes using the following commit message template:
console
$ # For operational release
$ git commit -m "[RELEASE] Harvest v$VERSION"
$ # Push changes to main
$ git push --set-upstream origin main
Build and Deploy Software to Maven Central Repo
console
$ # For operational release
$ mvn --activate-profiles release clean site site:stage package deploy
$ # For release candidate
$ mvn clean site site:stage package deploy
Push Tagged Release
```console
$ # For Release Candidate, you may need to delete old SNAPSHOT tag
$ git push origin :v$VERSION
$ # Now tag and push
$ REPO=harvest
$ git tag v${VERSION} -m "[RELEASE] $REPO v$VERSION" -m "See CHANGELOG for more details."
$ git push --tags
```
Deploy Site to GitHub Pages
From cloned repo:
console
$ git checkout gh-pages
$ # Copy the staged site over to the version-specific and default sites
$ rsync --archive --verbose target/staging/ .
$ git add .
$ # For operational release
$ git commit -m "Deploy v$VERSION docs"
$ # For release candidate
$ git commit -m "Deploy v${VERSION}-SNAPSHOT docs"
$ git push origin gh-pages
Update Versions For Development
Update pom.xml with the next SNAPSHOT version, either manually or using the Maven Versions Plugin.
For RELEASE CANDIDATE, ignore this step.
console
$ git checkout main
$ # For release candidates, skip to push changes to main
$ VERSION=1.16.0-SNAPSHOT
$ mvn -DnewVersion=$VERSION versions:set
$ git add pom.xml
$ git commit -m "Update version for $VERSION development"
$ # Push changes to main
$ git push --set-upstream origin main
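The next development version can also be derived from the release version instead of typed by hand. This is a minimal sketch: the function name is hypothetical, and bumping the minor number is an assumption that simply mirrors the example values above (1.15.0 followed by 1.16.0-SNAPSHOT). It expects a plain X.Y.Z input with no leading zeros.

```shell
# Hypothetical helper: derives the next development version from a release
# version by incrementing the minor number and resetting the micro number,
# mirroring the 1.15.0 -> 1.16.0-SNAPSHOT example in this README.
next_snapshot_version() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text between the first and second dots
    printf '%s.%s.0-SNAPSHOT\n' "$major" "$((minor + 1))"
}
```

The result can then be passed to the Maven Versions Plugin, for example `mvn -DnewVersion="$(next_snapshot_version 1.15.0)" versions:set`.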
Complete Release in GitHub
Currently, creating more formal release notes and attaching assets is done manually through the GitHub UI.
NOTE: Be sure to add the tar.gz and zip from the target/ directory to the release assets, and use the CHANGELOG generated above to create the RELEASE NOTES.
📃 License
The project is licensed under the Apache version 2 license.
Maven JAR Dependency Reference
- Operational Releases: https://search.maven.org/search?q=g:gov.nasa.pds%20AND%20a:harvest&core=gav
- Snapshots: https://oss.sonatype.org/content/repositories/snapshots/gov/nasa/pds/harvest/
If you want to access snapshots, add the following to your ~/.m2/settings.xml:
xml
<profiles>
<profile>
<id>allow-snapshots</id>
<activation><activeByDefault>true</activeByDefault></activation>
<repositories>
<repository>
<id>snapshots-repo</id>
<url>https://oss.sonatype.org/content/repositories/snapshots</url>
<releases><enabled>false</enabled></releases>
<snapshots><enabled>true</enabled></snapshots>
</repository>
</repositories>
</profile>
</profiles>
Owner
- Name: NASA Planetary Data System Software
- Login: NASA-PDS
- Kind: organization
- Email: pds-operator@jpl.nasa.gov
- Website: https://nasa-pds.github.io/
- Repositories: 106
- Profile: https://github.com/NASA-PDS
GitHub Events
Total
- Create event: 55
- Release event: 26
- Issues event: 62
- Watch event: 1
- Delete event: 79
- Issue comment event: 231
- Push event: 89
- Pull request review event: 9
- Pull request event: 35
Last Year
- Create event: 55
- Release event: 26
- Issues event: 62
- Watch event: 1
- Delete event: 79
- Issue comment event: 231
- Push event: 89
- Pull request review event: 9
- Pull request event: 35
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| PDSEN CI Bot | p****i@j****v | 182 |
| shardman | s****n@4****1 | 127 |
| Jordan Padams | j****s@j****v | 119 |
| mcayanan | m****n@4****1 | 113 |
| thomas loubrieu | t****u@j****v | 55 |
| PDS dev admin | p****i@g****m | 45 |
| Eugene | t****t@t****m | 42 |
| Al Niessner | A****r@x****x | 27 |
| Eugene | t****2@y****m | 22 |
| dependabot[bot] | 4****] | 21 |
| Eugene | k****o@R****v | 21 |
| al-niessner | 1****r | 19 |
| Sean Kelly | k****y@s****z | 9 |
| Sean Hardman | S****n@j****v | 7 |
| Alex Dunn | a****n@j****v | 7 |
| thomas loubrieu | 6****l | 6 |
| Michael Cayanan | m****n@j****v | 5 |
| Thomas Loubrieu | l****u@j****v | 5 |
| Mike Cayanan | m****n@j****v | 3 |
| Jimmie Young | j****g@j****v | 2 |
| Galen A Hollins | G****s@j****v | 1 |
| Ramesh Maddegoda | 9****a | 1 |
| GitHub Action | a****n@g****m | 1 |
| Lyle Barner | l****r@j****v | 1 |
| jpadams | j****s@4****1 | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 125
- Total pull requests: 108
- Average time to close issues: 4 months
- Average time to close pull requests: 11 days
- Total issue authors: 14
- Total pull request authors: 9
- Average comments per issue: 3.68
- Average comments per pull request: 0.99
- Merged pull requests: 87
- Bot issues: 0
- Bot pull requests: 35
Past Year
- Issues: 41
- Pull requests: 33
- Average time to close issues: 24 days
- Average time to close pull requests: 20 days
- Issue authors: 7
- Pull request authors: 4
- Average comments per issue: 3.54
- Average comments per pull request: 0.85
- Merged pull requests: 19
- Bot issues: 0
- Bot pull requests: 14
Top Authors
Issue Authors
- jordanpadams (61)
- tloubrieu-jpl (29)
- tdddblog (7)
- plawton-umd (7)
- scholes-ds (4)
- rchenatjpl (3)
- alexdunnjpl (3)
- tariqksoliman (2)
- mdrum (2)
- gxtchen (2)
- al-niessner (1)
- nutjob4life (1)
- imoon-ucla (1)
- msbentley (1)
- dependabot[bot] (1)
Pull Request Authors
- dependabot[bot] (55)
- al-niessner (35)
- tdddblog (30)
- alexdunnjpl (6)
- jordanpadams (6)
- nutjob4life (5)
- tloubrieu-jpl (4)
- ramesh-maddegoda (1)
- lylebarner (1)
Packages
- Total packages: 1
- Total downloads: unknown
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 30
repo1.maven.org: gov.nasa.pds:harvest
The Harvest Tool provides functionality for capturing and indexing product metadata. The tool will run locally at the Discipline Node to crawl the local data repository in order to discover products and index associated metadata with the Registry Service.
- Homepage: https://nasa-pds.github.io/harvest/
- Documentation: https://appdoc.app/artifact/gov.nasa.pds/harvest/
- License: The Apache License, Version 2.0
- Latest release: 4.0.7 (published 10 months ago)
Dependencies
- actions/cache v3 composite
- actions/checkout v3 composite
- actions/setup-java v2 composite
- actions/checkout v3 composite
- actions/upload-artifact v3 composite
- github/codeql-action/analyze v2 composite
- github/codeql-action/autobuild v2 composite
- github/codeql-action/init v2 composite
- NASA-PDS/roundup-action stable composite
- actions/cache v3 composite
- actions/checkout v3 composite
- NASA-PDS/roundup-action stable composite
- actions/cache v3 composite
- actions/checkout v3 composite
- com.google.code.gson:gson 2.8.9
- commons-cli:commons-cli 1.4
- commons-codec:commons-codec 1.15
- commons-lang:commons-lang 2.6
- gov.nasa.pds:registry-common 1.3.1
- org.apache.tika:tika-core 1.23
- org.json:json 20210307
- actions/checkout v4 composite