Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (9.6%) to scientific vocabulary
Repository
GitHub Innovation Graph
Basic Info
- Host: GitHub
- Owner: github
- License: cc0-1.0
- Language: Python
- Default Branch: main
- Homepage: https://innovationgraph.github.com/
- Size: 14.6 MB
Statistics
- Stars: 493
- Watchers: 160
- Forks: 52
- Open Issues: 3
- Releases: 8
Metadata Files
README.md
GitHub Innovation Graph
This repo contains structured data files of public activity on GitHub, aggregated by economy on a quarterly basis from 2020 onward.
Through offerings such as the GitHub Innovation Graph, we hope to inform research and public policy that could benefit from data on global software development activity. We welcome developers, data analysts, researchers, policymakers, and all other interested stakeholders to explore the data, discover insights, create visualizations, and more.
The GitHub Innovation Graph provides data on the following areas:
See the datasheet for more information.
Exploring Innovation Graph data
For an overview of the dataset, check out the charts and tables at the GitHub Innovation Graph website.
To dive deeper into the data and run your own analyses, feel free to fork this repo, explore the structured data files using the exploratory data analysis tool of your choice, and share your findings on our Discussions page.
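As a starting point for that kind of exploration, here is a minimal sketch that uses only the Python standard library. The column names (`developers`, `iso2_code`, `year`, `quarter`), the economy codes, and the inline sample rows are assumptions made for illustration and are not taken from this README; check the header row of the actual data file before relying on them.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample in an assumed layout; the values are made up.
SAMPLE_CSV = """developers,iso2_code,year,quarter
1200,AA,2020,1
1500,AA,2020,2
300,BB,2020,1
450,BB,2020,2
"""


def load_rows(text):
    """Parse CSV text into dicts with the numeric fields converted to int."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append({
            "economy": row["iso2_code"],
            "period": (int(row["year"]), int(row["quarter"])),
            "developers": int(row["developers"]),
        })
    return rows


def series_by_economy(rows):
    """Group counts into one chronologically sorted series per economy."""
    series = defaultdict(list)
    for row in sorted(rows, key=lambda r: r["period"]):
        series[row["economy"]].append((row["period"], row["developers"]))
    return dict(series)


def quarter_over_quarter_growth(points):
    """Fractional change between consecutive quarters.

    Assumes every earlier count is nonzero.
    """
    return [
        (later[0], (later[1] - earlier[1]) / earlier[1])
        for earlier, later in zip(points, points[1:])
    ]


rows = load_rows(SAMPLE_CSV)
series = series_by_economy(rows)
growth = {eco: quarter_over_quarter_growth(pts) for eco, pts in series.items()}
```

To run this against a real file, replace `SAMPLE_CSV` with the contents of the file read from disk. The same grouping works for any of the per-economy quarterly files, provided the column names are adjusted to match.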
Limitations
The GitHub Innovation Graph dataset contains data on (1) public activity (2) on GitHub (3) aggregated by economy (4) on a quarterly basis. As such, this dataset would not be useful for understanding activity:
- in private repositories;
- outside of GitHub;
- at a more granular geographic level than economy; or
- at a more granular temporal level than quarterly.
Additionally, economies that have fewer developers on GitHub (which generally correlates with the population of an economy) will have less data associated with them in this dataset.
See the datasheet for more information on limitations.
Representativeness of Innovation Graph data
How many economies are included?
We endeavor to publish as much data about public activity on GitHub as possible. However, the number of developers varies considerably by economy. Out of an abundance of caution for developers’ privacy, in some cases we decline to publish specific statistics for economies with fewer than 100 unique developers performing the relevant activity during the specified quarter. You can find more information on our methodology in the datasheet.
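The effect of such a minimum-count threshold can be sketched as follows. This is an illustration only, not GitHub's actual pipeline: the record layout, the field name `unique_developers`, and the economy codes are hypothetical, and the sketch applies the threshold uniformly even though the text above says it is applied only in some cases.

```python
MIN_DEVELOPERS = 100  # threshold stated in the text above


def publishable(records, minimum=MIN_DEVELOPERS):
    """Keep only records whose unique-developer count meets the minimum.

    "Fewer than 100" is read as a strict comparison, so a count of
    exactly 100 is kept.
    """
    return [r for r in records if r["unique_developers"] >= minimum]


# Hypothetical records; the values are made up.
records = [
    {"economy": "AA", "quarter": "2020-Q1", "unique_developers": 250},
    {"economy": "BB", "quarter": "2020-Q1", "unique_developers": 99},
    {"economy": "CC", "quarter": "2020-Q1", "unique_developers": 100},
]

kept = publishable(records)
suppressed = [r for r in records if r not in kept]
```

For analysis, this means that a missing economy-quarter row in a data file should be read as "not published" rather than as zero activity.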
The heatmap below shows the count of economies reported for each data file by quarter:
[Figure: Count of economies by data file by quarter]
You can also find the CSV for this heatmap in the data/representativeness_data directory.
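A count like the one in this heatmap could be derived along the following lines. This is a sketch under stated assumptions: the `(data_file, quarter, economy)` tuple layout, the data file names, and the economy codes are hypothetical and do not describe how the published CSV was actually produced.

```python
from collections import defaultdict


def economies_per_quarter(observations):
    """Count distinct economies for each (data_file, quarter) pair."""
    seen = defaultdict(set)
    for data_file, quarter, economy in observations:
        seen[(data_file, quarter)].add(economy)
    return {key: len(economies) for key, economies in seen.items()}


# Hypothetical observations; names and values are made up.
observations = [
    ("developers", "2020-Q1", "AA"),
    ("developers", "2020-Q1", "BB"),
    ("developers", "2020-Q1", "AA"),  # repeated economy is counted once
    ("developers", "2020-Q2", "AA"),
    ("git_pushes", "2020-Q1", "AA"),
]

counts = economies_per_quarter(observations)
```

Using a set per cell keeps the count robust when a data file has several rows per economy and quarter, for example one row per programming language.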
Which economies are included?
We aggregate GitHub activity for economies using a definition broader than recognized UN member states. For example, AQ reports activity from developers stationed in Antarctica. The heatmap below reports the count of data files for each economy by quarter:

You can also find the CSV for this heatmap in the data/representativeness_data directory.
License
This project is released under CC0-1.0.
Maintainers
See CODEOWNERS
Support
See SUPPORT
Owner
- Name: GitHub
- Login: github
- Kind: organization
- Location: San Francisco, CA
- Website: https://github.com/about
- Repositories: 462
- Profile: https://github.com/github
How people build software.
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this dataset, please cite it as below."
authors:
  - name: "GitHub"
contact:
  - affilation: "GitHub"
    email: policy@github.com
title: "GitHub Innovation Graph"
version: 1.0.7
date-released: 2025-08-14
license: CC0-1.0
url: https://github.com/github/innovationgraph
type: dataset
repository: "https://github.com/github/innovationgraph"
keywords:
  - "open source"
  - "open data"
  - dataset
  - github
GitHub Events
Total
- Create event: 7
- Release event: 3
- Issues event: 7
- Watch event: 112
- Delete event: 5
- Issue comment event: 7
- Push event: 6
- Pull request review event: 9
- Pull request event: 13
- Fork event: 21
Last Year
- Create event: 7
- Release event: 3
- Issues event: 7
- Watch event: 112
- Delete event: 5
- Issue comment event: 7
- Push event: 6
- Pull request review event: 9
- Pull request event: 13
- Fork event: 21
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Kevin Xu | k****u@g****m | 10 |
| dependabot[bot] | 4****] | 9 |
| Mike Linksvayer | m****a@g****m | 1 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 21
- Total pull requests: 26
- Average time to close issues: about 1 month
- Average time to close pull requests: about 5 hours
- Total issue authors: 19
- Total pull request authors: 4
- Average comments per issue: 1.38
- Average comments per pull request: 0.08
- Merged pull requests: 24
- Bot issues: 0
- Bot pull requests: 12
Past Year
- Issues: 6
- Pull requests: 12
- Average time to close issues: about 1 month
- Average time to close pull requests: about 6 hours
- Issue authors: 6
- Pull request authors: 3
- Average comments per issue: 0.83
- Average comments per pull request: 0.08
- Merged pull requests: 10
- Bot issues: 0
- Bot pull requests: 4
Top Authors
Issue Authors
- brphelps (2)
- frank-zsy (2)
- muhdaizwan (1)
- Kingdo1-stack (1)
- knadh (1)
- jkone27 (1)
- akirataguchi115 (1)
- lnielsen (1)
- Jraytagle (1)
- verhovsky (1)
- HKalbasi (1)
- danielzayas (1)
- abbydab90 (1)
- madnight (1)
- mlinksva (1)
Pull Request Authors
- dependabot[bot] (18)
- khxu (15)
- volkanutkuurl (2)
- lukerduck (1)
- mlinksva (1)
- crookscastle (1)
Dependencies
- frictionless *
- annotated-types ==0.5.0
- attrs ==23.1.0
- certifi ==2023.7.22
- chardet ==5.2.0
- charset-normalizer ==3.2.0
- click ==8.1.7
- colorama ==0.4.6
- frictionless ==5.15.10
- humanize ==4.8.0
- idna ==3.4
- isodate ==0.6.1
- jinja2 ==3.1.2
- jsonschema ==4.17.3
- markdown-it-py ==3.0.0
- marko ==2.0.0
- markupsafe ==2.1.3
- mdurl ==0.1.2
- petl ==1.7.14
- pydantic ==2.3.0
- pydantic-core ==2.6.3
- pygments ==2.16.1
- pyrsistent ==0.19.3
- python-dateutil ==2.8.2
- python-slugify ==8.0.1
- pyyaml ==6.0.1
- requests ==2.31.0
- rfc3986 ==2.0.0
- rich ==13.5.2
- shellingham ==1.5.3
- simpleeval ==0.9.13
- six ==1.16.0
- stringcase ==1.2.0
- tabulate ==0.9.0
- text-unidecode ==1.3
- typer ==0.9.0
- typing-extensions ==4.7.1
- urllib3 ==2.0.4
- validators ==0.21.2