https://github.com/bstee615/devign


Repository

Basic Info

Host: GitHub
Owner: bstee615
License: mit
Language: Python
Default Branch: master
Size: 44.9 KB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Fork of saikat107/Devign

Created almost 5 years ago · Last pushed over 4 years ago


# Devign - Implementation

This repository provides a lightweight implementation of [Devign: Effective Vulnerability Identification by Learning Comprehensive Program Semantics via Graph Neural Networks](https://arxiv.org/pdf/1909.03496.pdf).

### Requirements
1. Python 3.6
2. PyTorch 1.4.0
3. [Deep Graph Library](https://www.dgl.ai/)

### Usage
```shell
python main.py \
      --dataset  \
      --input_dir ;
```

### Dataset
The `input_dir` should contain three JSON files:
1. `train_GGNNinput.json`
2. `valid_GGNNinput.json`
3. `test_GGNNinput.json`
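
A quick pre-flight check can catch a missing split file before training starts. The sketch below is not part of this repository: `missing_split_files` is a hypothetical helper name, and it only checks that the three files listed above exist in the given directory.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = (
    "train_GGNNinput.json",
    "valid_GGNNinput.json",
    "test_GGNNinput.json",
)


def missing_split_files(input_dir):
    """Return the expected file names that are absent from input_dir."""
    root = Path(input_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty return value means all three files are present; the check says nothing about their contents.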

Each JSON file should contain a list of JSON objects with the following structure:
```shell
{
  'node_features': <list of n feature vectors, one per node>,
  'graph': <list of edges>,
  'target': <0 or 1 representing the vulnerability>
}
```

* Let's assume `n` nodes in the graph are indexed as `0` to `n-1`. The length of `node_features` list should be `n`. Each feature vector should be 100 elements long. Thus the `node_features` list should be a 2D list of shape `(n, 100)`.
  
* The length of the `graph` list should equal the number of edges. Each edge should be represented as a three-element list `[source, edge_type, destination]`, where `source` and `destination` are indices of the corresponding nodes in the `node_features` list. Edge types should range from `0` to `max_edge_types`.
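
The constraints above can be checked mechanically before training. The following is a minimal sketch, not code from this repository: `validate_sample` and `FEATURE_DIM` are hypothetical names, and the upper bound on edge types is treated as inclusive, which is an assumption about how "from `0` to `max_edge_types`" should be read.

```python
import json

FEATURE_DIM = 100  # feature-vector length stated above


def validate_sample(sample, max_edge_types):
    """Raise ValueError if `sample` does not match the layout described above."""
    features = sample["node_features"]
    n = len(features)
    for i, vec in enumerate(features):
        if len(vec) != FEATURE_DIM:
            raise ValueError(f"node {i}: expected {FEATURE_DIM} features, got {len(vec)}")
    for edge in sample["graph"]:
        if len(edge) != 3:
            raise ValueError(f"edge {edge}: expected [source, edge_type, destination]")
        source, edge_type, destination = edge
        if not (0 <= source < n and 0 <= destination < n):
            raise ValueError(f"edge {edge}: node index outside 0..{n - 1}")
        # Assumption: the upper bound is inclusive.
        if not 0 <= edge_type <= max_edge_types:
            raise ValueError(f"edge {edge}: edge type outside 0..{max_edge_types}")
    if sample["target"] not in (0, 1):
        raise ValueError("target must be 0 or 1")


# A toy graph: 3 nodes, 2 edges, labeled as vulnerable.
sample = {
    "node_features": [[0.0] * FEATURE_DIM for _ in range(3)],
    "graph": [[0, 0, 1], [1, 1, 2]],
    "target": 1,
}
validate_sample(sample, max_edge_types=3)

# Each input file holds a list of such objects.
serialized = json.dumps([sample])
```

`json.dumps` writes keys with double quotes, so the resulting text is valid JSON even though the structure above is shown with single quotes.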

## Note 
1. This implementation follows the Devign paper. However, we could **NOT** reproduce the results reported in the original paper.

## Reference
[1] Zhou, Yaqin, et al. "Devign: Effective vulnerability identification by learning comprehensive program semantics via graph neural networks." arXiv preprint arXiv:1909.03496 (2019).

Owner

Name: Benjamin Steenhoek
Login: bstee615
Kind: user

Website: benjijang.com
Repositories: 12
Profile: https://github.com/bstee615

3rd year PhD student @ ISU. Interests and research: deep learning, program analysis
