https://github.com/bstee615/devign
Repository
Basic Info
- Host: GitHub
- Owner: bstee615
- License: mit
- Language: Python
- Default Branch: master
- Size: 44.9 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of saikat107/Devign
Created almost 5 years ago
· Last pushed over 4 years ago
https://github.com/bstee615/Devign/blob/master/
# Devign - Implementation
This repository provides a lightweight implementation of [Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks](https://arxiv.org/pdf/1909.03496.pdf).
### Requirements
1. Python 3.6
2. PyTorch 1.4.0
3. [Deep Graph Library](https://www.dgl.ai/)
### Usage
```shell
python main.py \
--dataset \
--input_dir ;
```
### Dataset
The `input_dir` should contain three JSON files, namely:
1. `train_GGNNinput.json`
2. `valid_GGNNinput.json`
3. `test_GGNNinput.json`
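A quick pre-flight check can confirm that a directory holds all three files before training starts. The following is a minimal sketch, not part of the repository: the helper name `missing_input_files` is hypothetical, and only the three file names come from the list above.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = (
    "train_GGNNinput.json",
    "valid_GGNNinput.json",
    "test_GGNNinput.json",
)


def missing_input_files(input_dir):
    """Return the expected file names that are absent from input_dir.

    Hypothetical helper for illustration; it is not defined in this repository.
    """
    root = Path(input_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means the directory has the expected layout; any returned names are the files that still need to be generated.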
Each JSON file should contain a list of JSON objects with the following structure:
```shell
{
'node_features': ,
'graph':
'target': <0 or 1 representing the vulnerability>
}
```
* Assume the `n` nodes in the graph are indexed from `0` to `n-1`. The `node_features` list should then have length `n`, and each feature vector should have 100 elements, so `node_features` is a 2D list of shape `(n, 100)`.
* The length of the `graph` list should equal the number of edges. Each edge should be represented as a three-element tuple `[source, edge_type, destination]`, where `source` and `destination` are indices of the corresponding nodes in the `node_features` list. Edge types should range from `0` to `max_edge_types`.
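The constraints above can be checked mechanically before a file is handed to `main.py`. The sketch below builds a toy three-node example and validates it. The function `validate_example` and the constant `FEATURE_SIZE` are hypothetical names introduced here, not part of the repository. It also assumes that the upper bound `max_edge_types` is inclusive, which is one reading of the text; the implementation may treat it as exclusive.

```python
import json

FEATURE_SIZE = 100  # feature vector length stated above


def validate_example(example, max_edge_types):
    """Raise ValueError if `example` violates the layout described above.

    Hypothetical helper for illustration; not defined in this repository.
    """
    features = example["node_features"]
    n = len(features)
    for i, vector in enumerate(features):
        if len(vector) != FEATURE_SIZE:
            raise ValueError(f"node {i}: expected {FEATURE_SIZE} features, got {len(vector)}")
    for edge in example["graph"]:
        if len(edge) != 3:
            raise ValueError(f"edge {edge}: expected [source, edge_type, destination]")
        source, edge_type, destination = edge
        if not (0 <= source < n and 0 <= destination < n):
            raise ValueError(f"edge {edge}: node index outside 0..{n - 1}")
        # Assumption: the bound is inclusive; adjust if the code treats it as exclusive.
        if not (0 <= edge_type <= max_edge_types):
            raise ValueError(f"edge {edge}: edge type outside 0..{max_edge_types}")
    if example["target"] not in (0, 1):
        raise ValueError("target must be 0 or 1")


# A toy graph with three nodes and two typed edges: 0 -> 1 and 1 -> 2.
toy = {
    "node_features": [[0.0] * FEATURE_SIZE for _ in range(3)],
    "graph": [[0, 0, 1], [1, 1, 2]],
    "target": 1,
}
validate_example(toy, max_edge_types=2)

# Each input file holds a list of such objects.
serialized = json.dumps([toy])
```

Running the validator over every object in each of the three files catches shape and index errors early, before they surface as harder-to-read tensor errors during training.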
## Note
1. This implementation follows the Devign paper. However, we could **NOT** reproduce the results reported in the original paper.
## Reference
[1] Zhou, Yaqin, et al. "Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks." arXiv preprint arXiv:1909.03496 (2019).
Owner
- Name: Benjamin Steenhoek
- Login: bstee615
- Kind: user
- Website: benjijang.com
- Repositories: 12
- Profile: https://github.com/bstee615
3rd year PhD student @ ISU. Interests and research: deep learning, program analysis