xgi-data
Standardized higher-order datasets with corresponding datasheets
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org, zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (7.4%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: xgi-org
- License: other
- Language: Python
- Default Branch: main
- Homepage: https://zenodo.org/communities/xgi
- Size: 70.6 MB
Statistics
- Stars: 15
- Watchers: 5
- Forks: 2
- Open Issues: 9
- Releases: 0
Metadata Files
README.md
XGI-DATA
This repository contains openly available hypergraph datasets in JSON format, along with documentation that describes each dataset in more detail. The datasets are hosted in the XGI Community on Zenodo, and a table of statistics can be found on Read the Docs. There is also a rudimentary inspection script for checking that datasets are in the proper format. The documentation is loosely inspired by Datasheets for Datasets by Gebru et al.
Overview of the xgi-data format
The xgi-data format for hypergraph datasets is a JSON object with the following tags:
* hypergraph-data: This tag accesses the attributes of the entire hypergraph dataset such as the authors or dataset name.
* node-data: This tag accesses the nodes of the hypergraph and their associated properties as a dictionary where the keys are node IDs and the corresponding values are dictionaries. If a node doesn't have any properties, the associated dictionary is empty.
    * name: This tag accesses the node's name if there is one that is different from the ID specified in the hyperedges.
    * Other tags are user-specified based on the particular attributes provided by the dataset.
* edge-data: This tag accesses the hyperedges of the hypergraph and their associated attributes.
    * name: This tag accesses the edge's name if one is provided.
    * timestamp: This tag specifies the time associated with the hyperedge, if it is given. All times are stored in the ISO 8601 standard.
    * Other tags are user-specified based on the particular attributes provided by the dataset.
* edge-dict: This tag accesses the edge IDs and the corresponding nodes which participate in that hyperedge.
All IDs are strings but can be converted to other types if desired.
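As an illustration of the layout described above, here is a minimal sketch that builds a tiny, made-up dataset as a Python dictionary and reads it back. The dataset contents, the `name` attribute under `hypergraph-data`, and the helper names `edges_with_int_nodes` and `edge_timestamps` are assumptions for this example and are not part of xgi-data. It also shows one way the string IDs could be converted to integers and the ISO 8601 timestamps parsed.

```python
from datetime import datetime

# A tiny, made-up dataset that follows the four top-level tags described
# above. The contents are illustrative only.
example = {
    "hypergraph-data": {"name": "toy-example"},
    "node-data": {
        "1": {"name": "alice"},
        "2": {},  # a node without properties has an empty dictionary
        "3": {},
    },
    "edge-data": {
        "0": {"timestamp": "2020-01-01T12:00:00"},
        "1": {},
    },
    "edge-dict": {
        "0": ["1", "2"],
        "1": ["1", "2", "3"],
    },
}


def edges_with_int_nodes(data):
    """Return the hyperedges with node IDs converted from strings to ints."""
    return {
        edge_id: [int(node_id) for node_id in members]
        for edge_id, members in data["edge-dict"].items()
    }


def edge_timestamps(data):
    """Parse the ISO 8601 timestamp of every edge that has one."""
    return {
        edge_id: datetime.fromisoformat(attrs["timestamp"])
        for edge_id, attrs in data["edge-data"].items()
        if "timestamp" in attrs
    }


edges = edges_with_int_nodes(example)
times = edge_timestamps(example)
```

Converting IDs only works when every ID is numeric, so a real dataset may need a different mapping.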
Data sets available on xgi-data
Currently available data sets are:
* coauth-dblp
* coauth-mag-geology
* coauth-mag-history
* congress-bills
* contact-high-school
* contact-primary-school
* dawn
* diseasome
* disgenenet
* email-enron
* email-eu
* eventernote-events
* eventernote-places
* hospital-lyon
* house-bills
* house-committees
* hypertext-conference
* hyperbard
* invs13
* invs15
* kaggle-whats-cooking
* malawi-village
* ndc-classes
* ndc-substances
* plant-pollinator-mpl-014
* plant-pollinator-mpl-015
* plant-pollinator-mpl-016
* plant-pollinator-mpl-021
* plant-pollinator-mpl-034
* plant-pollinator-mpl-044
* plant-pollinator-mpl-046
* plant-pollinator-mpl-049
* plant-pollinator-mpl-057
* plant-pollinator-mpl-062
* science-gallery
* senate-bills
* senate-committees
* sfhh-conference
* tags-ask-ubuntu
* tags-math-sx
* tags-stack-overflow
* threads-ask-ubuntu
* threads-math-sx
* threads-stack-overflow
These datasets can be loaded with xgi using the following lines:
```python
import xgi
H = xgi.load_xgi_data("<dataset_name>")
```
where <dataset_name> is chosen from the list above.
These datasets have been taken from the following sources:
* Data! by Austin Benson
* DisGeneNet
* Gephi
* SocioPatterns
Repository Description
index.json is a dictionary of the data sets that are currently available on xgi-data and the URLs where they are hosted.
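A lookup against such an index could be sketched as follows. The exact layout of index.json is not described here, so this example assumes each dataset name maps to a dictionary with a `url` key. The sample entry, the example URL, and the helper name `dataset_url` are hypothetical.

```python
import json

# Hypothetical index contents; the real index.json may be laid out differently.
index_text = '{"toy-dataset": {"url": "https://example.org/toy-dataset.json"}}'


def dataset_url(index, name):
    """Return the hosting URL for a dataset name, or raise a descriptive KeyError."""
    if name not in index:
        raise KeyError(f"unknown dataset {name!r}; available: {sorted(index)}")
    return index[name]["url"]


index = json.loads(index_text)
url = dataset_url(index, "toy-dataset")
```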
The code folder contains the scripts used to convert hypergraph datasets into a more standard format and the JSON inspection script. This code can be adapted to convert data sets that are currently not part of xgi-data into xgi-data format.
Checking dataset format
To check if a file has the xgi-data format, run the following command:
python inspect_json.py filepath.json
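To give a rough idea of what a format check can involve, here is a minimal sketch. It is not the repository's inspect_json.py, and the checks in the actual script may differ. The function name `find_format_problems` and the specific rules (all four top-level tags present, each tag holding a dictionary, and every node ID in a hyperedge being a string) are assumptions drawn from the format overview.

```python
REQUIRED_KEYS = ("hypergraph-data", "node-data", "edge-data", "edge-dict")


def find_format_problems(data):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []

    # Every top-level tag must be present and hold a dictionary.
    for key in REQUIRED_KEYS:
        if key not in data:
            problems.append(f"missing top-level key: {key}")
        elif not isinstance(data[key], dict):
            problems.append(f"{key} is not a dictionary")
    if problems:
        return problems

    # All IDs are expected to be strings, including the members of each edge.
    for edge_id, members in data["edge-dict"].items():
        for node_id in members:
            if not isinstance(node_id, str):
                problems.append(f"edge {edge_id} has non-string node ID: {node_id!r}")
    return problems
```

A file could be checked by loading it with `json.load` and passing the result to this function.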
Funding
The XGI-DATA package has been supported by NSF Grant 2121905, "HNDS-I: Using Hypergraphs to Study Spreading Processes in Complex Social Networks".
Owner
- Name: Complex Group Interactions
- Login: xgi-org
- Kind: organization
- Repositories: 1
- Profile: https://github.com/xgi-org
CompleX Group Interactions (XGI) provides an ecosystem for the analysis and representation of complex systems with group interactions.
Citation (CITATION.cff)
# YAML 1.2
cff-version: "1.2.0"
authors:
  - email: nicholas.landry@uvm.edu
    family-names: Landry
    given-names: Nicholas W.
    orcid: "https://orcid.org/0000-0003-1270-4980"
  - family-names: Lucas
    given-names: Maxime
    orcid: "https://orcid.org/0000-0001-8087-2981"
  - family-names: Iacopini
    given-names: Iacopo
    orcid: "https://orcid.org/0000-0001-8794-6410"
  - family-names: Petri
    given-names: Giovanni
    orcid: "https://orcid.org/0000-0003-1847-5031"
  - family-names: Schwarze
    given-names: Alice
    orcid: "https://orcid.org/0000-0002-9146-8068"
  - family-names: Patania
    given-names: Alice
    orcid: "https://orcid.org/0000-0002-3047-4376"
  - family-names: Torres
    given-names: Leo
    orcid: "https://orcid.org/0000-0002-2675-2775"
contact:
  - email: nicholas.landry@uvm.edu
    family-names: Landry
    given-names: Nicholas W.
    orcid: "https://orcid.org/0000-0003-1270-4980"
doi: 10.5281/zenodo.7939055
message: If you use this software, please cite our article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
    - email: nicholas.landry@uvm.edu
      family-names: Landry
      given-names: Nicholas W.
      orcid: "https://orcid.org/0000-0003-1270-4980"
    - family-names: Lucas
      given-names: Maxime
      orcid: "https://orcid.org/0000-0001-8087-2981"
    - family-names: Iacopini
      given-names: Iacopo
      orcid: "https://orcid.org/0000-0001-8794-6410"
    - family-names: Petri
      given-names: Giovanni
      orcid: "https://orcid.org/0000-0003-1847-5031"
    - family-names: Schwarze
      given-names: Alice
      orcid: "https://orcid.org/0000-0002-9146-8068"
    - family-names: Patania
      given-names: Alice
      orcid: "https://orcid.org/0000-0002-3047-4376"
    - family-names: Torres
      given-names: Leo
      orcid: "https://orcid.org/0000-0002-2675-2775"
  date-published: 2023-05-17
  doi: 10.21105/joss.05162
  issn: 2475-9066
  issue: 85
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 5162
  title: "XGI: A Python package for higher-order interaction networks"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.05162"
  volume: 8
title: "XGI: A Python package for higher-order interaction networks"
GitHub Events
Total
- Issues event: 1
- Watch event: 7
- Delete event: 3
- Issue comment event: 2
- Push event: 11
- Pull request event: 3
- Create event: 1
Last Year
- Issues event: 1
- Watch event: 7
- Delete event: 3
- Issue comment event: 2
- Push event: 11
- Pull request event: 3
- Create event: 1
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 1
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 0.0
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- nwlandry (3)
- tlarock (2)
- maximelucas (1)
Pull Request Authors
- nwlandry (6)
- maximelucas (4)
- doabell (1)