Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (4.3%) to scientific vocabulary
Repository
Data repository for PyGOD
Basic Info
- Host: GitHub
- Owner: pygod-team
- License: mit
- Default Branch: main
- Size: 87.2 MB
Statistics
- Stars: 41
- Watchers: 2
- Forks: 3
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
Data Repository for PyGOD
Statistics of the available datasets are listed below. #Con. is the number of contextual outliers and #Strct. is the number of structural outliers. #Outliers can be slightly less than the sum of the two because some nodes are both contextual and structural outliers.
| Dataset | Type | #Nodes | #Edges | #Feat | Avg. Degree | #Con. | #Strct. | #Outliers | Outlier Ratio |
| ------------ | --------- | ------ | ------- | ------ | ----------- | ----- | ------- | --------- | ------------- |
| 'weibo' | organic | 8,405 | 407,963 | 400 | 48.5 | - | - | 868 | 10.3% |
| 'reddit' | organic | 10,984 | 168,016 | 64 | 15.3 | - | - | 366 | 3.3% |
| 'disney' | organic | 124 | 335 | 28 | 2.7 | - | - | 6 | 4.8% |
| 'books' | organic | 1,418 | 3,695 | 21 | 2.6 | - | - | 28 | 2.0% |
| 'enron' | organic | 13,533 | 176,987 | 18 | 13.1 | - | - | 5 | 0.04% |
| 'inj_cora' | injected | 2,708 | 11,060 | 1,433 | 4.1 | 70 | 70 | 138 | 5.1% |
| 'inj_amazon' | injected | 13,752 | 515,042 | 767 | 37.2 | 350 | 350 | 694 | 5.0% |
| 'inj_flickr' | injected | 89,250 | 933,804 | 500 | 10.5 | 2,240 | 2,240 | 4,414 | 4.9% |
| 'gen_time' | generated | 1,000 | 5,746 | 64 | 5.7 | 100 | 100 | 189 | 18.9% |
| 'gen_100' | generated | 100 | 618 | 64 | 6.2 | 10 | 10 | 18 | 18.0% |
| 'gen_500' | generated | 500 | 2,662 | 64 | 5.3 | 10 | 10 | 20 | 4.0% |
| 'gen_1000' | generated | 1,000 | 4,936 | 64 | 4.9 | 10 | 10 | 20 | 2.0% |
| 'gen_5000' | generated | 5,000 | 24,938 | 64 | 5.0 | 10 | 10 | 20 | 0.4% |
| 'gen_10000' | generated | 10,000 | 49,614 | 64 | 5.0 | 10 | 10 | 20 | 0.2% |
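As a rough check on how the derived columns relate to the raw counts, the sketch below recomputes a few of them from numbers copied out of the table. It assumes that Avg. Degree is #Edges divided by #Nodes and that Outlier Ratio is #Outliers divided by #Nodes; both are inferred from the figures rather than stated by the README, and the helper names are made up for illustration.

```python
# Illustrative helpers (not part of PyGOD); all inputs are copied from the table.

def avg_degree(num_nodes, num_edges):
    """Edges per node, assumed to be how the Avg. Degree column is derived."""
    return num_edges / num_nodes


def outlier_ratio(num_nodes, num_outliers):
    """Share of nodes labeled as outliers, in percent."""
    return 100.0 * num_outliers / num_nodes


def overlap(num_contextual, num_structural, num_outliers):
    """Nodes counted as both contextual and structural outliers."""
    return num_contextual + num_structural - num_outliers


# 'weibo': 8,405 nodes, 407,963 edges, 868 outliers
print(round(avg_degree(8405, 407963), 1))    # 48.5
print(round(outlier_ratio(8405, 868), 1))    # 10.3

# Cora-based injected dataset: 70 contextual + 70 structural, 138 outliers
print(overlap(70, 70, 138))                  # 2
```

The last line illustrates why #Outliers falls short of #Con. + #Strct.: two nodes in that dataset carry both kinds of outlier label.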
To use the datasets:
```python
from pygod.utils import load_data

data = load_data('weibo')  # in PyG format
```
An alternative download source is available on Baidu Disk (Chinese): https://pan.baidu.com/s/1afEZaygCRUYWJPtVbzuRYw (access code: bond)
For injected/generated datasets, the label meanings are as follows:
- 0: inlier
- 1: contextual outlier only
- 2: structural outlier only
- 3: both contextual outlier and structural outlier
Examples to convert the labels are as follows:
```python
y = data.y.bool()        # binary labels (inlier/outlier)
yc = (data.y >> 0) & 1   # contextual outliers (bit 0)
ys = (data.y >> 1) & 1   # structural outliers (bit 1)
```
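To make the bit encoding concrete, here is a minimal sketch that decodes each of the four label values with the same shift-and-mask operations. It works on plain Python integers instead of tensors so that it runs without PyGOD or PyTorch installed; `decode_label` is a hypothetical helper, not a PyGOD function.

```python
def decode_label(label):
    """Split a 0-3 label into outlier flags (hypothetical helper)."""
    return {
        "outlier": bool(label),                  # any non-zero label
        "contextual": bool((label >> 0) & 1),    # bit 0
        "structural": bool((label >> 1) & 1),    # bit 1
    }


for label in range(4):
    print(label, decode_label(label))
```

Label 3 sets both bits, which is how a node can be counted as a contextual and a structural outlier at once.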
Owner
- Name: PyGOD Team
- Login: pygod-team
- Kind: organization
- Email: dev@pygod.org
- Website: https://pygod.org
- Repositories: 2
- Profile: https://github.com/pygod-team
Maintaining A Python Library for Graph Outlier Detection (Anomaly Detection)
Citation (CITATION.cff)
cff-version: 1.2.0
message: If you use this library, please cite it as below.
title: PyGOD
authors:
  - family-names: PyGOD Team
url: https://pygod.org
preferred-citation:
  type: conference-paper
  title: "BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs"
  authors:
    - family-names: Liu
      given-names: Kay
    - family-names: Dou
      given-names: Yingtong
    - family-names: Zhao
      given-names: Yue
    - family-names: Ding
      given-names: Xueying
    - family-names: Hu
      given-names: Xiyang
    - family-names: Zhang
      given-names: Ruitong
    - family-names: Ding
      given-names: Kaize
    - family-names: Chen
      given-names: Canyu
    - family-names: Peng
      given-names: Hao
    - family-names: Shu
      given-names: Kai
    - family-names: Sun
      given-names: Lichao
    - family-names: Li
      given-names: Jundong
    - family-names: Chen
      given-names: George H
    - family-names: Jia
      given-names: Zhihao
    - family-names: Yu
      given-names: Philip S
  collection-title: Advances in Neural Information Processing Systems 35
  collection-type: proceedings
  editors:
    - family-names: Koyejo
      given-names: S.
    - family-names: Mohamed
      given-names: S.
    - family-names: Agarwal
      given-names: A.
    - family-names: Belgrave
      given-names: D.
    - family-names: Cho
      given-names: K.
    - family-names: Oh
      given-names: A.
  start: 27021
  end: 27035
  year: 2022
  publisher:
    name: Curran Associates, Inc.
  url: https://proceedings.neurips.cc/paper_files/paper/2022/file/acc1ec4a9c780006c9aafd595104816b-Paper-Datasets_and_Benchmarks.pdf
GitHub Events
Total
- Issues event: 1
- Watch event: 5
- Issue comment event: 1
- Fork event: 1
Last Year
- Issues event: 1
- Watch event: 5
- Issue comment event: 1
- Fork event: 1
Issues and Pull Requests
Last synced: 11 months ago
All Time
- Total issues: 2
- Total pull requests: 0
- Average time to close issues: 3 months
- Average time to close pull requests: N/A
- Total issue authors: 2
- Total pull request authors: 0
- Average comments per issue: 0.5
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- Djerry-h (1)
- uuice11 (1)