
Research Results

APTGen: An Approach towards Generating Practical Dataset Labelled with Targeted Attack Sequences


News

  • Nov. 11, 2020: Released APTGen Tools Terms of Use.
  • Jul. 21, 2020: "APTGen: An Approach towards Generating Practical Dataset Labelled with Targeted Attack Sequences" was accepted at USENIX CSET '20.

Contents

  1. Research Description
  2. Available Datasets
  3. Acknowledgements

Research Description


Targeted cyber attacks are a constant threat and one of today's major security challenges. When an organization discovers a security breach in its network, the computer security incident response team (CSIRT) responds to the incident. The CSIRT's mission is to reveal the whole picture of the attack through an incident response cycle consisting of detection, analysis, containment, eradication, and recovery. During this process, the CSIRT investigates not only the attack methods that the attacker executed in the corporate network, but also the sequence of those methods (the attack sequence) and the attacker’s purpose. The faster the whole picture of the attack is revealed, the shorter the period between detection and containment or eradication. For this reason, security researchers have focused on revealing the purpose of attacks and have developed methods to automate or support the investigation of attack sequences.


Developing these methods requires various kinds of attack sequence data, together with network logs and endpoint logs that contain the attack traces related to each sequence. However, to the best of our knowledge, open datasets of this kind are limited. We therefore decided to build a dataset for incident-handling R&D ourselves.


We propose APTGen, an approach that generates attack sequences and executes them to build a dataset. APTGen first generates artificial attack sequences based on MITRE's ATT&CK and then executes the corresponding attacks in an experimental environment. With this approach, we can obtain the attack sequence that corresponds to each attack trace left in the logs. Our implementation of APTGen consists of an attack sequence generation tool (generation tool) and an attack sequence execution tool (execution tool).
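The generate-then-execute idea can be illustrated with a toy sketch. This is not the APTGen generation or execution tool: the tactic order, the small technique subset, and the function names `generate_sequence` and `execute_and_label` are all assumptions chosen for illustration, and the "execution" step only emits labelled records instead of running attacks.

```python
import random

# Hypothetical, hand-picked subset of ATT&CK tactics and techniques.
# The real generation tool may select and order them differently.
TACTIC_ORDER = [
    "Initial Access",
    "Execution",
    "Discovery",
    "Lateral Movement",
    "Exfiltration",
]
TECHNIQUES = {
    "Initial Access": ["T1566 Phishing", "T1078 Valid Accounts"],
    "Execution": ["T1059 Command and Scripting Interpreter", "T1053 Scheduled Task/Job"],
    "Discovery": ["T1087 Account Discovery", "T1083 File and Directory Discovery"],
    "Lateral Movement": ["T1021 Remote Services"],
    "Exfiltration": ["T1041 Exfiltration Over C2 Channel"],
}


def generate_sequence(rng):
    """Pick one technique per tactic, following TACTIC_ORDER."""
    return [(tactic, rng.choice(TECHNIQUES[tactic])) for tactic in TACTIC_ORDER]


def execute_and_label(sequence):
    """Stand-in for execution: emit one labelled record per step.

    A real execution tool would run each attack and collect network and
    endpoint logs; here we only record which step would produce which trace.
    """
    return [
        {"step": i, "tactic": tactic, "technique": technique}
        for i, (tactic, technique) in enumerate(sequence, start=1)
    ]


if __name__ == "__main__":
    rng = random.Random(2020)  # fixed seed so the toy run is repeatable
    for record in execute_and_label(generate_sequence(rng)):
        print(record)
```

Because the sequence exists before anything is executed, every record produced during execution can carry the step that caused it, which is what makes the resulting logs labelled.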


Available Datasets

  1. Attack sequences and logs
    We publish the generated attack sequence data and the log data obtained from our experimental environment. If you are interested in the dataset, you can download it from the following link.
    aptgen-dataset-v1.0.zip (zip size: 3.4GB, uncompressed size: 103GB)
    SHA256: 88827775CB8AB654FF544BC0A681F6D07F5B7AF86EF2C5F096C75C2243A7FCE5

    This zip file must be extracted with 7z and requires a password. To obtain the password for the dataset, please contact the email address below with your name and affiliation. Please use your official university/corporate email address when contacting us. Note that we may use your affiliation information as a record of the provision.


  2. APTGen implementation (generation tool and execution tool)
    Please see the terms of use for our tools. If you are interested and can accept the terms, please contact the email address below for further information. Please use your official university/corporate email address when contacting us. Note that we may use your affiliation information as a record of the provision.

    E-mail: ynugr-ylab-aptgen[at]ynu.ac.jp
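After downloading aptgen-dataset-v1.0.zip, the SHA256 value listed above can be checked before extraction. The following is a minimal sketch using Python's standard `hashlib`; the helper names `sha256_of_file` and `matches_published_digest` are ours, and the script assumes the file path is passed as the only command-line argument.

```python
import hashlib
import sys

# SHA256 value published on this page for aptgen-dataset-v1.0.zip.
EXPECTED_SHA256 = "88827775CB8AB654FF544BC0A681F6D07F5B7AF86EF2C5F096C75C2243A7FCE5"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Chunked reads keep memory use small for a multi-gigabyte archive.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path, expected=EXPECTED_SHA256):
    """Compare the file's digest with the expected one, ignoring letter case."""
    return sha256_of_file(path).lower() == expected.lower()


if __name__ == "__main__":
    if len(sys.argv) == 2:
        print("OK" if matches_published_digest(sys.argv[1]) else "MISMATCH")
```

A mismatch usually indicates an incomplete or corrupted download, in which case the file should be downloaded again before requesting support.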


Acknowledgements

This is joint work between Yokohama National University and NEC Corporation.