wolfsoftware.github-extractor
Extract various information from the GitHub API.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.3%) to scientific vocabulary
Repository
Statistics
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 3
- Releases: 3
Metadata Files
README.md
Overview
The GitHub Extractor package is a Python library designed to facilitate the extraction of data from GitHub.
This package provides functions to fetch information about repositories, including languages used, releases, contributors, topics, workflows, and more, with robust error handling and configuration support.
Features
- List organizations for a user from GitHub.
- List repositories for a user from GitHub.
- List repositories for a specified organization from GitHub.
- Support for authentication using GitHub API tokens.
- Filtering of organizations and repositories based on given patterns.
- Pagination handling for API requests.
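The pagination item above refers to the GitHub REST API returning large result sets one page at a time, with the URL of the following page advertised in the `Link` response header. As a rough illustration of that mechanism (not this package's internal code; the helper name `next_page_url` is hypothetical), the next-page URL can be pulled out of a header value like this:

```python
import re
from typing import Optional

# Matches one entry of a Link header, e.g. <https://...?page=2>; rel="next"
_LINK_ENTRY = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def next_page_url(link_header: Optional[str]) -> Optional[str]:
    """Return the URL tagged rel="next" in a Link header, or None on the last page."""
    if not link_header:
        return None
    for url, rel in _LINK_ENTRY.findall(link_header):
        if rel == "next":
            return url
    return None


header = (
    '<https://api.github.com/user/repos?page=2>; rel="next", '
    '<https://api.github.com/user/repos?page=5>; rel="last"'
)
print(next_page_url(header))
```

A caller would keep requesting the returned URL until the helper yields `None`.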
Installation
You can install GitHub Extractor via pip:
```bash
pip install wolfsoftware.github-extractor
```
Usage
Getting Token information
You can get basic information relating to the given token.
There is also a dedicated command-line tool for this: GitHub Token Validator.
```python
from wolfsoftware.github_extractor import get_token_information

config = {
    "token": "your_github_token",
}

# Call shape assumed to match the other functions in this README.
token_info = get_token_information(config)
```
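Hard-coding a token in source makes it easy to leak, so in practice the `token` value is often read from the environment instead. The sketch below is one way to build the config dictionary; the `GITHUB_TOKEN` variable name and the `build_config` helper are assumptions for illustration and are not part of this package. The `timeout` of 10 mirrors the default documented in the parameter table for this function.

```python
import os
from typing import Any, Dict, Mapping


def build_config(env: Mapping[str, str] = os.environ, timeout: int = 10) -> Dict[str, Any]:
    """Build a config dict, taking the token from the environment (hypothetical helper)."""
    token = env.get("GITHUB_TOKEN", "").strip()  # assumed variable name
    if not token:
        raise ValueError("GITHUB_TOKEN is not set")
    return {"token": token, "timeout": timeout}


# Example with an explicit mapping so the snippet runs without a real token.
config = build_config({"GITHUB_TOKEN": "your_github_token"})
print(config)
```

Failing early when the variable is empty gives a clearer message than letting the first API call fail.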
Parameters
| Name    | Required | Purpose                                                                    |
| :------ | :------: | :------------------------------------------------------------------------- |
| token   |   Yes    | Authentication for the GitHub API.                                         |
| timeout |    No    | The timeout to use when talking to the GitHub API (default is 10 seconds). |
| slugs   |    No    | Should we return the results as slugs. (List of names and nothing else).   |

Getting User Information
You can get basic information relating to the authenticated user (the owner of the token). The information returned is limited by the scope of the token.
```python
from wolfsoftware.github_extractor import get_authenticated_user

config = {
    "token": "your_github_token",
}

# Call shape assumed to match the other functions in this README.
user = get_authenticated_user(config)
```
Parameters
| Name    | Required | Purpose                                                                    |
| :------ | :------: | :------------------------------------------------------------------------- |
| token   |   Yes    | Authentication for the GitHub API.                                         |
| timeout |    No    | The timeout to use when talking to the GitHub API (default is 10 seconds). |
| slugs   |    No    | Should we return the results as slugs. (List of names and nothing else).   |

Listing Organizations
You can list the organizations that you are a member of. The function is available under both British and American English spellings.
```python
from wolfsoftware.github_extractor import list_organisations, list_organizations

config = {
    "token": "your_github_token",
    "ignore_orgs": ["Test*"]
}

# Using British English spelling
organisations = list_organisations(config)

# Using American English spelling
organisations_us = list_organizations(config)
```
Parameters
| Name    | Required | Purpose                                                                    |
| :------ | :------: | :------------------------------------------------------------------------- |
| token   |   Yes    | Authentication for the GitHub API.                                         |
| timeout |    No    | The timeout to use when talking to the GitHub API (default is 10 seconds). |
| slugs   |    No    | Should we return the results as slugs. (List of names and nothing else).   |

Filtering Parameters

| Name         | Required | Purpose                                                   |
| :----------- | :------: | :-------------------------------------------------------- |
| include_orgs |    No    | A list of organisation names to include in the results.   |
| ignore_orgs  |    No    | A list of organisation names to exclude from the results. |
| get_members  |    No    | Should we include organisation members in the results.    |

Listing User Repositories
You can list repositories for a user with optional filters:
```python
from wolfsoftware.github_extractor import list_user_repositories

config = {
    "token": "your_github_token",
    "ignore_repos": ["Test*"],
    "include_repos": ["Project*"]
}

repositories = list_user_repositories(config)
```
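The `Test*` and `Project*` values above look like shell-style wildcards. As a sketch of how include/ignore filtering of that kind might behave (an assumption about the semantics, not the package's actual implementation; `matches_any` and `filter_names` are hypothetical helpers), Python's standard `fnmatch` module can test names against a list of patterns:

```python
from fnmatch import fnmatchcase
from typing import Iterable, List


def matches_any(name: str, patterns: Iterable[str]) -> bool:
    """True if name matches at least one shell-style pattern."""
    return any(fnmatchcase(name, pattern) for pattern in patterns)


def filter_names(names: Iterable[str], include: List[str], ignore: List[str]) -> List[str]:
    """Keep names matching an include pattern (if any are given) and no ignore pattern."""
    kept = []
    for name in names:
        if include and not matches_any(name, include):
            continue
        if matches_any(name, ignore):
            continue
        kept.append(name)
    return kept


names = ["ProjectAlpha", "ProjectTestbed", "TestHarness", "docs"]
print(filter_names(names, include=["Project*"], ignore=["Test*"]))
print(filter_names(names, include=[], ignore=["Test*"]))
```

In this sketch an ignore match always wins over an include match, and an empty include list means "include everything"; the package may resolve such conflicts differently.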
Parameters
| Name     | Required | Purpose                                                                                                   |
| :------- | :------: | :-------------------------------------------------------------------------------------------------------- |
| token    |    No    | Authentication for the GitHub API.                                                                        |
| timeout  |    No    | The timeout to use when talking to the GitHub API (default is 10 seconds).                                |
| slugs    |    No    | Should we return the results as slugs. (List of names and nothing else).                                  |
| username |    No    | The GitHub username to list repositories for. (The authenticated user will be used if this is not supplied). |

Additional Data Parameter

| Name             | Required | Purpose                                                   |
| :--------------- | :------: | :-------------------------------------------------------- |
| get_branches     |    No    | Add details about all branches to each repository.        |
| get_contributors |    No    | Add details about all contributors to each repository.    |
| get_languages    |    No    | Add the list of identified languages for each repository. |
| get_releases     |    No    | Add details about all releases to each repository.        |
| get_tags         |    No    | Add details about all tags to each repository.            |
| get_topics       |    No    | Add the list of defined topics to each repository.        |
| get_workflows    |    No    | Add details about all workflows to each repository.       |

Filtering Parameter

| Name          | Required | Purpose                                                                       |
| :------------ | :------: | :---------------------------------------------------------------------------- |
| include_names |    No    | A list of repository names to include in the results.                         |
| ignore_names  |    No    | A list of repository names to exclude from the results.                       |
| include_repos |    No    | A list of organisation names/repository names to include in the results.      |
| ignore_repos  |    No    | A list of organisation names/repository names to exclude from the results.    |
| skip_private  |    No    | Do not include private repositories, this is for the authenticated user only. |

> ignore and include names use the full name of the repository, which is the organisation name / repository name, e.g. GitHubToolbox/github-extractor-package

Listing Repositories by Organization
You can list repositories for a specific organization with optional filters:
```python
from wolfsoftware.github_extractor import list_repositories_by_org

config = {
    "token": "your_github_token",
    "org_name": "your_organization",
    "ignore_repos": ["Test*"],
    "include_repos": ["Project*"]
}

repositories = list_repositories_by_org(config)
```
Parameters
| Name     | Required | Purpose                                                                    |
| :------- | :------: | :------------------------------------------------------------------------- |
| token    |    No    | Authentication for the GitHub API.                                         |
| timeout  |    No    | The timeout to use when talking to the GitHub API (default is 10 seconds). |
| slugs    |    No    | Should we return the results as slugs. (List of names and nothing else).   |
| org_name |    No    | The GitHub organisation to list repositories for.                          |

Additional Data Parameter

| Name             | Required | Purpose                                                   |
| :--------------- | :------: | :-------------------------------------------------------- |
| get_branches     |    No    | Add details about all branches to each repository.        |
| get_contributors |    No    | Add details about all contributors to each repository.    |
| get_languages    |    No    | Add the list of identified languages for each repository. |
| get_releases     |    No    | Add details about all releases to each repository.        |
| get_tags         |    No    | Add details about all tags to each repository.            |
| get_topics       |    No    | Add the list of defined topics to each repository.        |
| get_workflows    |    No    | Add details about all workflows to each repository.       |

Filtering Parameter

| Name          | Required | Purpose                                                                       |
| :------------ | :------: | :---------------------------------------------------------------------------- |
| include_names |    No    | A list of repository names to include in the results.                         |
| ignore_names  |    No    | A list of repository names to exclude from the results.                       |
| include_repos |    No    | A list of organisation names/repository names to include in the results.      |
| ignore_repos  |    No    | A list of organisation names/repository names to exclude from the results.    |
| skip_private  |    No    | Do not include private repositories, this is for the authenticated user only. |

> ignore and include names use the full name of the repository, which is the organisation name / repository name, e.g. GitHubToolbox/github-extractor-package

Listing all Organisation Repositories
You can list all repositories for all organisations you're a member of.
```python
from wolfsoftware.github_extractor import list_all_org_repositories

config = {
    "token": "your_github_token",
    "ignore_repos": ["Test*"],
    "include_repos": ["Project*"]
}

repositories = list_all_org_repositories(config)
```
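The example above only sets filters. The Additional Data Parameter table for this function lists flags that attach extra details to each repository; a config enabling a few of them might look like the following. Treating the flags as booleans is an assumption based on the table wording, and each enabled flag likely costs extra API requests per repository.

```python
# Sketch only: flag names come from the tables in this README;
# boolean values are an assumption.
config = {
    "token": "your_github_token",
    "timeout": 10,
    "ignore_repos": ["Test*"],
    "get_languages": True,
    "get_topics": True,
    "get_releases": True,
}

# List which optional data sets this config asks for.
requested = sorted(key for key, value in config.items() if key.startswith("get_") and value)
print(requested)
```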
Parameters
| Name    | Required | Purpose                                                                    |
| :------ | :------: | :------------------------------------------------------------------------- |
| token   |   Yes    | Authentication for the GitHub API.                                         |
| timeout |    No    | The timeout to use when talking to the GitHub API (default is 10 seconds). |
| slugs   |    No    | Should we return the results as slugs. (List of names and nothing else).   |

Additional Data Parameter

| Name             | Required | Purpose                                                   |
| :--------------- | :------: | :-------------------------------------------------------- |
| get_branches     |    No    | Add details about all branches to each repository.        |
| get_contributors |    No    | Add details about all contributors to each repository.    |
| get_languages    |    No    | Add the list of identified languages for each repository. |
| get_releases     |    No    | Add details about all releases to each repository.        |
| get_tags         |    No    | Add details about all tags to each repository.            |
| get_topics       |    No    | Add the list of defined topics to each repository.        |
| get_workflows    |    No    | Add details about all workflows to each repository.       |

Filtering Parameter

| Name          | Required | Purpose                                                                       |
| :------------ | :------: | :---------------------------------------------------------------------------- |
| include_names |    No    | A list of repository names to include in the results.                         |
| ignore_names  |    No    | A list of repository names to exclude from the results.                       |
| include_repos |    No    | A list of organisation names/repository names to include in the results.      |
| ignore_repos  |    No    | A list of organisation names/repository names to exclude from the results.    |
| skip_private  |    No    | Do not include private repositories, this is for the authenticated user only. |

> ignore and include names use the full name of the repository, which is the organisation name / repository name, e.g. GitHubToolbox/github-extractor-package

Listing all Visible Repositories
You can list repositories that you are able to access.
```python
from wolfsoftware.github_extractor import list_all_visible_repositories

config = {
    "token": "your_github_token",
    "ignore_repos": ["Test*"],
    "include_repos": ["Project*"]
}

repositories = list_all_visible_repositories(config)
```
Parameters
| Name    | Required | Purpose                                                                    |
| :------ | :------: | :------------------------------------------------------------------------- |
| token   |   Yes    | Authentication for the GitHub API.                                         |
| timeout |    No    | The timeout to use when talking to the GitHub API (default is 10 seconds). |
| slugs   |    No    | Should we return the results as slugs. (List of names and nothing else).   |

Additional Data Parameter

| Name             | Required | Purpose                                                   |
| :--------------- | :------: | :-------------------------------------------------------- |
| get_branches     |    No    | Add details about all branches to each repository.        |
| get_contributors |    No    | Add details about all contributors to each repository.    |
| get_languages    |    No    | Add the list of identified languages for each repository. |
| get_releases     |    No    | Add details about all releases to each repository.        |
| get_tags         |    No    | Add details about all tags to each repository.            |
| get_topics       |    No    | Add the list of defined topics to each repository.        |
| get_workflows    |    No    | Add details about all workflows to each repository.       |

Filtering Parameter

| Name          | Required | Purpose                                                                       |
| :------------ | :------: | :---------------------------------------------------------------------------- |
| include_names |    No    | A list of repository names to include in the results.                         |
| ignore_names  |    No    | A list of repository names to exclude from the results.                       |
| include_repos |    No    | A list of organisation names/repository names to include in the results.      |
| ignore_repos  |    No    | A list of organisation names/repository names to exclude from the results.    |
| skip_private  |    No    | Do not include private repositories, this is for the authenticated user only. |

> ignore and include names use the full name of the repository, which is the organisation name / repository name, e.g. GitHubToolbox/github-extractor-package

Exceptions
The following custom exceptions are used:
| Name                   | Purpose                                                                                        |
| :--------------------- | :--------------------------------------------------------------------------------------------- |
| AuthenticationError    | Raised when authentication fails. This is caused by an invalid token.                          |
| MissingOrgNameError    | Raised when the organization name is missing.                                                  |
| MissingTokenError      | Raised when the GitHub API token is missing but is required.                                   |
| NotFoundError          | Raised when a requested resource is not found. This is caused by incorrect scope of the token. |
| RateLimitExceededError | Raised when the GitHub API rate limit is exceeded.                                             |
| RequestError           | Raised for general request errors.                                                             |
| RequestTimeoutError    | Raised when a request times out.                                                               |
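
Of the exceptions listed, the rate-limit and timeout errors are the ones a caller would typically retry. The sketch below shows one possible retry wrapper. So that it runs without the package installed, it defines local stand-in classes named after two exceptions from the table; in real use these would be imported from the package instead (the import path is not shown in this README, so it is left out here). The `call_with_retry` helper, the attempt count, and the backoff values are all illustrative assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


# Local stand-ins for the package's exceptions, so the sketch is self-contained.
class RateLimitExceededError(Exception):
    pass


class RequestTimeoutError(Exception):
    pass


def call_with_retry(
    func: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying transient errors with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except (RateLimitExceededError, RequestTimeoutError):
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay * 2 ** (attempt - 1))
    raise ValueError("attempts must be at least 1")


# Demo: a fake call that times out twice, then succeeds.
calls = {"count": 0}


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RequestTimeoutError("simulated timeout")
    return "ok"


delays = []
result = call_with_retry(flaky, sleep=delays.append)
print(result, delays)
```

With the real package, `func` would be something like `lambda: list_organisations(config)`. Authentication and missing-token errors are deliberately not retried here, since repeating the call with the same config would not fix them.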
Owner
- Name: GitHub Toolbox
- Login: GitHubToolbox
- Kind: organization
- Email: github@wolfsoftware.com
- Location: United Kingdom
- Website: https://wolfsoftware.com
- Twitter: wolfsoftware
- Repositories: 2
- Profile: https://github.com/GitHubToolbox
An assortment of tools for interacting with GitHub. Created by Wolf Software
Citation (CITATION.cff)
cff-version: 1.2.0
message: If you use this software, please cite it using these metadata.
title: GitHub Extractor
abstract: Extract various information from the GitHub API.
type: software
version: 0.1.1
date-released: 2024-06-26
repository-code: https://github.com/GitHubToolbox/github-extractor-package
keywords:
  - "Wolf Software"
  - "Software"
license: MIT
authors:
  - family-names: "Wolf"
    orcid: "https://orcid.org/0009-0007-0983-2072"
GitHub Events
Total
- Watch event: 1
- Delete event: 83
- Issue comment event: 167
- Push event: 130
- Pull request review event: 121
- Pull request event: 161
- Create event: 86
Last Year
- Watch event: 1
- Delete event: 83
- Issue comment event: 167
- Push event: 130
- Pull request review event: 121
- Pull request event: 161
- Create event: 86
Committers
Last synced: 5 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| dependabot[bot] | 4****] | 111 |
| Wolf | w****f@t****m | 2 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 0
- Total pull requests: 160
- Average time to close issues: N/A
- Average time to close pull requests: 3 days
- Total issue authors: 0
- Total pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 1.83
- Merged pull requests: 123
- Bot issues: 0
- Bot pull requests: 159
Past Year
- Issues: 0
- Pull requests: 128
- Average time to close issues: N/A
- Average time to close pull requests: 3 days
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 1.83
- Merged pull requests: 96
- Bot issues: 0
- Bot pull requests: 128
Top Authors
Issue Authors
- dependabot[bot] (1)
Pull Request Authors
- dependabot[bot] (222)
- TGWolf (2)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 9 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 2
- Total maintainers: 1
pypi.org: wolfsoftware.github-extractor
Extract various information from the GitHub API.
- Homepage: https://github.com/GitHubToolbox/github-extractor-package
- Documentation: https://github.com/GitHubToolbox/github-extractor-package
- License: MIT
- Latest release: 0.1.1 (published over 1 year ago)
Rankings
Maintainers (1)
Dependencies
- ActionsToolbox/get-language-versions-action 446919617fd774095b5dd3ed71c39dd3fd0d8f4f composite
- actions/checkout a5ac7e51b41094c92402da3b24376905380afc29 composite
- actions/setup-python 82c7e631bb3cdc910f68e0081d67478d79c6982d composite
- ActionsToolbox/get-language-versions-action 446919617fd774095b5dd3ed71c39dd3fd0d8f4f composite
- actions/checkout a5ac7e51b41094c92402da3b24376905380afc29 composite
- citation-file-format/cffconvert-github-action 4cf11baa70a673bfdf9dad0acc7ee33b3f4b6084 composite
- ruby/setup-ruby 0cde4689ba33c09f1b890c1725572ad96751a3fc composite
- Gamesight/slack-workflow-status 68bf00d0dbdbcb206c278399aa1ef6c14f74347a composite
- actions/checkout a5ac7e51b41094c92402da3b24376905380afc29 composite
- github/codeql-action/analyze a57c67b89589d2d13d5ac85a9fc4679c7539f94c composite
- github/codeql-action/autobuild a57c67b89589d2d13d5ac85a9fc4679c7539f94c composite
- github/codeql-action/init a57c67b89589d2d13d5ac85a9fc4679c7539f94c composite
- Gamesight/slack-workflow-status 68bf00d0dbdbcb206c278399aa1ef6c14f74347a composite
- actions/checkout a5ac7e51b41094c92402da3b24376905380afc29 composite
- github/codeql-action/analyze a57c67b89589d2d13d5ac85a9fc4679c7539f94c composite
- github/codeql-action/autobuild a57c67b89589d2d13d5ac85a9fc4679c7539f94c composite
- github/codeql-action/init a57c67b89589d2d13d5ac85a9fc4679c7539f94c composite
- Gamesight/slack-workflow-status 68bf00d0dbdbcb206c278399aa1ef6c14f74347a composite
- Mattraks/delete-workflow-runs 39f0bbed25d76b34de5594dceab824811479e5de composite
- dependabot/fetch-metadata 5e5f99653a5b510e8555840e80cbf1514ad4af38 composite
- ActionsToolbox/get-language-versions-action 446919617fd774095b5dd3ed71c39dd3fd0d8f4f composite
- actions/checkout a5ac7e51b41094c92402da3b24376905380afc29 composite
- actions/setup-node 60edb5dd545a775178f52524783378180af0d1f8 composite
- ruby/setup-ruby 0cde4689ba33c09f1b890c1725572ad96751a3fc composite
- ActionsToolbox/get-language-versions-action 446919617fd774095b5dd3ed71c39dd3fd0d8f4f composite
- Bullrich/generate-release-changelog 6b60f004b4bf12ff271603dc32dbd261965ad2f2 composite
- actions/checkout a5ac7e51b41094c92402da3b24376905380afc29 composite
- actions/setup-python 82c7e631bb3cdc910f68e0081d67478d79c6982d composite
- softprops/action-gh-release 69320dbe05506a9a39fc8ae11030b214ec2d1f87 composite
- ActionsToolbox/get-language-versions-action 446919617fd774095b5dd3ed71c39dd3fd0d8f4f composite
- Bullrich/generate-release-changelog 6b60f004b4bf12ff271603dc32dbd261965ad2f2 composite
- actions/checkout a5ac7e51b41094c92402da3b24376905380afc29 composite
- actions/setup-python 82c7e631bb3cdc910f68e0081d67478d79c6982d composite
- softprops/action-gh-release 69320dbe05506a9a39fc8ae11030b214ec2d1f87 composite
- actions/first-interaction 34f15e814fe48ac9312ccf29db4e74fa767cbab7 composite
- Gamesight/slack-workflow-status 68bf00d0dbdbcb206c278399aa1ef6c14f74347a composite
- otto-de/purge-deprecated-workflow-runs 31a4e821d43e9a354cbd65845922c76e4b4b3633 composite
- ActionsToolbox/get-language-versions-action 446919617fd774095b5dd3ed71c39dd3fd0d8f4f composite
- actions/checkout a5ac7e51b41094c92402da3b24376905380afc29 composite
- actions/setup-go cdcb36043654635271a94b9a6d1392de5bb323a7 composite
- actions/setup-python 82c7e631bb3cdc910f68e0081d67478d79c6982d composite
- actions/checkout a5ac7e51b41094c92402da3b24376905380afc29 composite
- zgosalvez/github-actions-ensure-sha-pinned-actions 2f2ebc6d914ab515939dc13f570f91baeb2c194c composite
- Gamesight/slack-workflow-status 68bf00d0dbdbcb206c278399aa1ef6c14f74347a composite
- actions/stale 28ca1036281a5e5922ead5184a1bbf96e5fc984e composite
- pytest ==8.2.1
- requests ==2.32.2
- setuptools ==70.0.0