oss4climate
Listing of open source software for energy and climate, with both scraper and app-based search engine implementations
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.9%) to scientific vocabulary
Keywords
Keywords from Contributors
Repository
Listing of open source software for energy and climate, with both scraper and app-based search engine implementations
Basic Info
- Host: GitHub
- Owner: Pierre-VF
- License: other
- Language: Python
- Default Branch: main
- Homepage: https://oss4climate.pierrevf.consulting
- Size: 821 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 17
- Releases: 0
Topics
Metadata Files
README.md
Listing of open-source software for climate and energy applications
TL;DR: if you're just looking for the search engine, you will find it here: https://oss4climate.pierrevf.consulting/ .
What is the vision of this project? Is it just yet another listing?
The vision is that this project should provide a list of open-source software for energy and climate applications, along with insight into the following aspects, which are key to successful open-source adoption:
- maintenance
- security
- tech stack
- context data (who uses it, maintains it, ...)
All of this should be provided in a way that makes it easy to search and to interface with (e.g. via a structured, machine-readable registry).
However, at the current stage not all of these features are available yet. Help is appreciated to get there (see the open issues).
Do you just want to carry out a search for open-source software matching your needs?
To carry out a search without installing anything, you can simply use the web app here: https://oss4climate.pierrevf.consulting/
What is the license of the code and the data?
The license differs between the source code and the listing data. Please check the license file for all details.
The underlying listing data comes from a variety of repositories, some of which are under a Creative Commons license. This means that while you are free to reuse and adapt the software, there are restrictions on the usage of the listing data. That noted, the listing data can be downloaded as TOML, CSV or Feather (for Python pandas).
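Once downloaded, the exported data can be loaded for local searching and filtering. A minimal sketch with pandas, assuming a CSV export; the file name and column names here are placeholders, not the actual export schema (the Feather export would be loaded analogously with `pd.read_feather`):

```python
import pandas as pd

# Hypothetical column names for illustration; a tiny stand-in file is
# created here so the example is self-contained.
sample = pd.DataFrame(
    {"name": ["oss4climate"], "language": ["Python"], "license": ["other"]}
)
sample.to_csv("listing_data.csv", index=False)

# Loading the CSV export into a DataFrame for filtering:
df = pd.read_csv("listing_data.csv")
python_projects = df[df["language"] == "Python"]
print(len(python_projects))  # 1
```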
Where is the data coming from?
Inputs to the discovery process are given in the files in the indexes folder (you are welcome to add your own contributions):
- the listings to scrape are listed in listings.json (if you are interested in project listings, you should check all URLs in this file)
- repositories found in the listings or added manually are found in repositories.toml
- the associated scrapers are in the folder "src/oss4climate/src/parsers"
The following projects are credited as major contributors to the underlying dataset:
- OpenSustain.tech (which kindly licensed its dataset under Creative Commons Attribution 4.0 International)
- other listings found in listings.json
Data from the repositories themselves is fetched directly from the hosting platforms. The platforms currently supported are given in the table below.
| Platform  | Identified in discovery | Data scraped |
|-----------|-------------------------|--------------|
| GitLab    | Yes                     | Yes          |
| GitHub    | Yes                     | Yes          |
| Bitbucket | Yes                     | Not yet      |
| Codeberg  | Yes                     | Not yet      |
Development and contribution
Installation
The installation is straightforward if you are used to Python.
You have two options:
1. Simple installation: create a virtual environment with Python 3.12 (the code is not tested on earlier versions), then install the package:
> pip install .
2. Development-oriented installation (with Poetry), which only works on Unix systems: run the makefile command:
> make install
It is highly recommended to operate with a GitHub token (which you can create here) in order to avoid being blocked by GitHub's API rate limit, which is much lower for unauthenticated requests. The same consideration applies to GitLab (token generation here).
Make sure to generate these tokens with permissions to access public repositories.
The token can be imported by generating a .env file in the root of your repository with the following content:
```bash
# This is your token generated here: https://github.com/settings/tokens/new
GITHUB_API_TOKEN="...[add your token here]..."

# This is your token generated here: https://gitlab.com/-/user_settings/personal_access_tokens
GITLAB_ACCESS_TOKEN="...[add your token here]..."

# For app operation, a key to refresh the data (to avoid undesirable refreshing)
DATA_REFRESH_KEY="...[add a random key here]..."

# You can adjust the position of the cache database here (leave to default if you don't need adjustment)
SQLITE_DB=".data/db.sqlite"

# If you want to enable publication of the data to FTP, you can also set these variables
EXPORT_FTP_URL=""
EXPORT_FTP_USER=""
EXPORT_FTP_PASSWORD=""
```
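At runtime, the values from the .env file end up in the process environment. A minimal sketch of how the token could be read, with a hand-rolled parser standing in for a loader such as python-dotenv; the parsing rules here are an assumption for illustration:

```python
import os

def read_env_file(path: str) -> dict[str, str]:
    """Minimal .env parser: KEY="value" lines; '#' comment lines are ignored."""
    values = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip('"')
    return values

# Stand-in .env file so the example is self-contained:
with open(".env", "w") as f:
    f.write('# comment\nGITHUB_API_TOKEN="abc123"\n')

env = read_env_file(".env")
os.environ["GITHUB_API_TOKEN"] = env["GITHUB_API_TOKEN"]
print(os.environ["GITHUB_API_TOKEN"])  # abc123
```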
Running the code
Once you have completed the steps above, you can run the following commands (only valid on Unix systems):
Typical use-cases:
- To download the dataset: > make download_data
- To search in CLI mode (note that this is a very basic CLI): > make search
Advanced use-cases (to regenerate listings; avoid unless necessary, as this is very resource-intensive):
- To generate an output dataset: > make generate_listing
- To add new resources: > make add
- To refresh the list of targets to be scraped: > make discover
- To export the datasets to FTP (using the credentials from the environment): > make publish
Note: the indexing is heavy and involves a series of web (and API) calls. A caching mechanism is therefore included in the implementation of the requests (with a simple SQLite database). This means that you might end up with a large file stored locally on your disk (currently under 500 MB).
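The general shape of such a cache can be sketched with the standard-library sqlite3 module. This is an illustration of the technique, not the project's actual code; the table and function names are assumptions, and an in-memory database is used here so the sketch leaves no file behind:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # the real cache would use a file on disk
conn.execute("CREATE TABLE IF NOT EXISTS cache (url TEXT PRIMARY KEY, body TEXT)")

def cached_fetch(url: str, fetch) -> str:
    """Return the cached body for url, calling fetch(url) only on a miss."""
    row = conn.execute("SELECT body FROM cache WHERE url = ?", (url,)).fetchone()
    if row is not None:
        return row[0]
    body = fetch(url)
    conn.execute("INSERT INTO cache (url, body) VALUES (?, ?)", (url, body))
    conn.commit()
    return body

calls = []
def fake_fetch(url):  # stand-in for a real HTTP request
    calls.append(url)
    return "payload"

cached_fetch("https://example.org/a", fake_fetch)
cached_fetch("https://example.org/a", fake_fetch)
print(len(calls))  # 1: the second call was served from the cache
```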
Need new features or found a bug?
Please open an issue on the repository here.
If you have a use-case that you would like to develop based upon this, or need new features, please get in touch with PierreVF Consulting for support.
Owner
- Name: Pierre V-F
- Login: Pierre-VF
- Kind: user
- Company: PierreVF Consulting
- Website: pierrevf.consulting
- Repositories: 1
- Profile: https://github.com/Pierre-VF
Somewhat practical idealist, with a commitment to sustainability.
GitHub Events
Total
- Issues event: 31
- Delete event: 106
- Issue comment event: 60
- Push event: 158
- Pull request review event: 7
- Pull request event: 203
- Create event: 116
Last Year
- Issues event: 31
- Delete event: 106
- Issue comment event: 60
- Push event: 158
- Pull request review event: 7
- Pull request event: 203
- Create event: 116
Committers
Last synced: 6 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Pierre V-F | p****e@p****g | 85 |
| Pierre V-F | 7****F | 63 |
| github-actions[bot] | 4****] | 16 |
| dependabot[bot] | 4****] | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 28
- Total pull requests: 190
- Average time to close issues: 4 days
- Average time to close pull requests: 3 days
- Total issue authors: 3
- Total pull request authors: 3
- Average comments per issue: 0.32
- Average comments per pull request: 0.39
- Merged pull requests: 101
- Bot issues: 0
- Bot pull requests: 111
Past Year
- Issues: 28
- Pull requests: 190
- Average time to close issues: 4 days
- Average time to close pull requests: 3 days
- Issue authors: 3
- Pull request authors: 3
- Average comments per issue: 0.32
- Average comments per pull request: 0.39
- Merged pull requests: 101
- Bot issues: 0
- Bot pull requests: 111
Top Authors
Issue Authors
- Pierre-VF (25)
- Ly0n (1)
- urstrom (1)
- dependabot[bot] (1)
Pull Request Authors
- Pierre-VF (91)
- github-actions[bot] (74)
- dependabot[bot] (61)