Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.6%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: shadab75
  • License: MIT
  • Default Branch: main
  • Size: 8.89 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 2 years ago · Last pushed over 1 year ago
Metadata Files
  • Readme
  • License
  • Citation

README.md

DSRL-APT-2023: A New Synthetic Dataset For Advanced Persistent Threats

Link to Article on Journal

Abstract

Detecting Advanced Persistent Threats (APTs) is crucial, and one effective method is to use an intrusion detection system (IDS) integrated with supervised machine learning algorithms. These algorithms need a balanced dataset with enough attack samples to learn and recognize attack patterns effectively. However, widely used APT datasets such as DAPT2020 and SCVIC-APT-2021 are imbalanced, which limits the performance of machine learning-based IDS.
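As a rough illustration of what imbalance means here, the sketch below measures the class distribution and the majority-to-minority ratio of a label column. The function names and the toy labels are assumptions made for this example; they are not taken from DAPT2020, SCVIC-APT-2021, or this repository.

```python
from collections import Counter


def class_distribution(labels):
    """Return {class: fraction of samples} for an iterable of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def imbalance_ratio(labels):
    """Ratio of the largest class count to the smallest class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Hypothetical toy labels, not drawn from any dataset named in the abstract.
toy_labels = ["benign"] * 95 + ["attack"] * 5

print(class_distribution(toy_labels))
print(imbalance_ratio(toy_labels))
```

A ratio close to 1 indicates a balanced label column; the larger the ratio, the fewer attack samples a classifier sees relative to benign traffic.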

To address this, we introduce DSRL-APT-2023, a new balanced synthetic APT dataset generated by a CTGAN model trained on the DAPT2020 dataset.
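The general idea of balancing with a tabular GAN can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' actual pipeline: the helper names, the `label_column` and `discrete_columns` parameters, and the epoch count are hypothetical, and the second function assumes the third-party `ctgan` and `pandas` packages are installed.

```python
from collections import Counter


def rows_needed_per_class(labels):
    """How many extra rows each class needs to match the largest class."""
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


def sample_balancing_rows(real_df, label_column, discrete_columns, epochs=300):
    """Fit CTGAN on real_df and draw extra rows for under-represented classes.

    Assumption: label_column is listed in discrete_columns. The third-party
    packages are imported lazily so the helper above works without them.
    """
    import pandas as pd
    from ctgan import CTGAN

    model = CTGAN(epochs=epochs)
    model.fit(real_df, discrete_columns)

    parts = []
    for cls, n in rows_needed_per_class(real_df[label_column]).items():
        if n == 0:
            continue
        # Conditioning raises the chance of sampling `cls` but does not
        # guarantee it, so only matching rows are kept. The result may
        # therefore hold fewer than n rows for a class.
        drawn = model.sample(n, condition_column=label_column, condition_value=cls)
        parts.append(drawn[drawn[label_column] == cls])

    if not parts:
        return real_df.iloc[0:0]  # already balanced: empty frame, same columns
    return pd.concat(parts, ignore_index=True)


# Hypothetical toy labels to show the bookkeeping step on its own.
print(rows_needed_per_class(["benign"] * 8 + ["attack"] * 2))
```

The first helper is pure Python and only decides how many rows to request; the second shows one way those counts could drive a generator such as CTGAN.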

Evaluation and Comparison

We evaluate and compare the performance of six common supervised machine learning algorithms:

  • Decision Tree
  • Support Vector Machine
  • K-Nearest Neighbor
  • Logistic Regression
  • Random Forest
  • Multi-Layer Perceptron

We also assess an intelligent intrusion detection system (IDS) based on tree-structured machine learning models. The evaluation measures attack detection on DSRL-APT-2023 and compares the results with those obtained on DAPT2020 and SCVIC-APT-2021.

Synthetic Dataset Generation Quality

Additionally, we compare the quality of synthetic data generated by two prominent GANs, CopulaGAN and CTGAN. CTGAN performs slightly better at generating high-quality tabular data.
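One simple way to compare how closely two generators reproduce a categorical column is the total variation distance between the real and synthetic category frequencies. The sketch below is an assumption chosen for illustration; the README does not state which quality metrics were used, and the column values shown are hypothetical.

```python
from collections import Counter


def total_variation_distance(real_values, synthetic_values):
    """Total variation distance between two categorical samples.

    0.0 means identical category frequencies, 1.0 means no overlap.
    """
    real = Counter(real_values)
    synth = Counter(synthetic_values)
    n_real = sum(real.values())
    n_synth = sum(synth.values())
    categories = set(real) | set(synth)
    # Counter returns 0 for categories missing from one of the samples.
    return 0.5 * sum(
        abs(real[c] / n_real - synth[c] / n_synth) for c in categories
    )


# Hypothetical protocol column from a real table and two synthetic tables.
real_col = ["tcp"] * 6 + ["udp"] * 4
generator_a = ["tcp"] * 5 + ["udp"] * 5
generator_b = ["tcp"] * 9 + ["udp"] * 1

# The generator with the smaller distance matches this column more closely.
print(total_variation_distance(real_col, generator_a))
print(total_variation_distance(real_col, generator_b))
```

A full comparison would aggregate such scores over all columns and add checks for numerical columns and column-pair relationships.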

Results

Our results show that both the machine learning algorithms and the intelligent IDS accurately detect attacks in the synthetic dataset, as indicated by their F1-scores.
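For reference, the F1-score is the harmonic mean of precision and recall. The sketch below computes it per class and as an unweighted (macro) average. Whether the reported results use macro, weighted, or per-class averaging is not stated here, so the averaging choice and the toy labels are assumptions.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1-score treating `positive` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Hypothetical predictions, not results from the datasets discussed above.
y_true = ["attack", "attack", "benign", "benign"]
y_pred = ["attack", "benign", "benign", "benign"]

print(f1_for_class(y_true, y_pred, "attack"))
print(macro_f1(y_true, y_pred))
```

On imbalanced data the per-class F1 for the attack class is usually more informative than accuracy, because a classifier that labels everything benign can still score high accuracy.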

Owner

  • Login: shadab75
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this dataset, please cite it as below."
authors:
- family-names: "Shadabfar"
  given-names: "Hossein"
  orcid: ""
- family-names: "Dehghan"
  given-names: "Motahareh"
  orcid: ""
- family-names: "Sadeghian"
  given-names: "Babak"
  orcid: ""
title: "DSRL-APT-2023: A New Synthetic Dataset For Advanced Persistent Threats"
version: 1.0.0
date-released: TBC
url: "https://github.com/shadab75/DSRL-APT-2023"

GitHub Events

Total
  • Watch event: 2
  • Issue comment event: 1
  • Push event: 3
  • Pull request event: 2
  • Fork event: 1
Last Year
  • Watch event: 2
  • Issue comment event: 1
  • Push event: 3
  • Pull request event: 2
  • Fork event: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 0
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: about 1 month
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 1.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: about 1 month
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 1.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • sahnaseredini (1)
Top Labels
Issue Labels
Pull Request Labels