Codes of USENIX Security'23 paper "On the Security Risks of Knowledge Graph Reasoning"

(TO COMPLETE)

Sources

Constructing data

Cyber KG

We construct Cyber KG with recorded CVEs (link), from which we crawled vulnerability-related information such as affected vendor, product, version, vulnerability types, descriptions, relevant CWE, etc. One can refer to ./data/cyberkg/crawler.ipynb to check the information we crawled. We construct a Cybersecurity KG with queries/answers in gen_cyberkg.py.

We modify on the released codes from this repository.

Guide

We organize the structure of our files as follows:

.
├──  data/
│   └──  cyberkg/              # constructed KG may also saved in this dir by default
│       ├──  cve_url/          # collected CVE web links from 1999 to 2019
│       └──  crawler.ipynb     # crawling scripts that takes cve_url/ as inputs
├──  genkg/
│   ├──  cyberkg_backbone.py   # parse crawled data and construct a cyberkg
│   ├──  cyberkg_query.py      # generate queries and answers
│   └──  cyberkg_utils.py      # utility functions specific to generate cyberkg and QA
├──  dataloader.py              
├──  gen_cyberkg.py            # run this file to generate a cyberkg and QA, see details below
├──  main.py                   # main file for reasoning
├──  models.py                  
├──  READEME.md
└──  util.py

Run the Code

To run the code, one needs to first construct a CyberKG and its QA, then feed them to a model for downstream reasoning task.

Step1 : You may need the crawled CVE files from link, where you need to first download the ./data/cyberkg-raw/ into you local disk recorded as <raw path>.
Step2 : To construct a CyberKG with corresponding QA, one can directly run python gen_cyberkg.py --raw_path <raw path>, where the <raw path> are the load path of your saved raw CVE information. Instead of using all crawled CVE-IDs, our codes filter them with specific vendors and products: if a vendor has number of products within a threshold, and when a product has number of versions within another threshold, we will keep them and keep their related CVE-IDs. We then remove the graph edges (konwledge facts) that contains removed entities. You can adjust those two thresholds by python gen_cyberkg.py --raw_path <raw path> --pd_num_thre <low int boundary> <high int boundary> --ver_num_thre <low int boundary> <high int boundary>.
Step3 : After generating train/test queries and answers, you can use the exampled commands presented in the original repository.

We also provide a runnable demo in demo.sh for an easily use, but you have to download the crawled CVE files from link and change argparser raw_path in gen_cyberkg.py.

Cite

Please cite our paper if it is helpful:

@inproceedings{kg-attack,
  title="{On the Security Risks of Knowledge Graph Reasoning}",
  author={Xi, Zhaohan and Du, Tianyu and Pang, Ren and Li, Changjiang and Ji, Shouling and Luo, Xiapu and Xiao, Xusheng and Ma, Fenglong and Wang, Ting},
  booktitle={Proceedings of USENIX Security Symposium (SEC)},
  year={2023}
}

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
config		config
gendata		gendata
helper		helper
main		main
model		model
notebook		notebook
roar		roar
run		run
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

config

config

gendata

gendata

helper

helper

main

main

model

model

notebook

notebook

roar

roar

run

run

.gitignore

.gitignore

README.md

README.md

Repository files navigation

Codes of USENIX Security'23 paper "On the Security Risks of Knowledge Graph Reasoning"

Sources

Guide

Cite

About

Releases

Packages

Languages

zhaohan-xi/security-risk-KG-reasoning

Folders and files

Latest commit

History

Repository files navigation

Codes of USENIX Security'23 paper "On the Security Risks of Knowledge Graph Reasoning"

Sources

Guide

Cite

About

Resources

Stars

Watchers

Forks

Languages