goodskillprogramer/MalwareClassify

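Malware classification

  • Machine learning and malware classification
  • Based on API call sequences, mainly using n-gram and TF-IDF features
  • LightGBM is the machine learning tool used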

Malware classification based on API sequences

  • Uses machine learning to classify malware by type
  • Most features are extracted from API call sequences
  • N-gram and TF-IDF are used to turn the sequences into feature vectors
  • The training set can be downloaded from this website
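To make the n-gram and TF-IDF idea concrete, here is a minimal standard-library sketch. It is not the repository's code: the function names, the bigram setting, and the sample API names are assumptions chosen for illustration, and it uses plain unsmoothed TF-IDF, whereas libraries such as scikit-learn apply a smoothed variant by default.

```python
import math
from collections import Counter


def ngrams(sequence, n):
    """Return the contiguous n-grams of an API call sequence as tuples."""
    return [tuple(sequence[i:i + n]) for i in range(len(sequence) - n + 1)]


def tfidf_vectors(sequences, n=2):
    """Compute a sparse TF-IDF vector (a dict) for each API call sequence."""
    counts = [Counter(ngrams(seq, n)) for seq in sequences]
    # Document frequency: number of sequences that contain each n-gram.
    doc_freq = Counter(gram for c in counts for gram in c)
    num_docs = len(sequences)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vectors.append({
            # Term frequency times unsmoothed inverse document frequency.
            gram: (freq / total) * math.log(num_docs / doc_freq[gram])
            for gram, freq in c.items()
        })
    return vectors


# Hypothetical API call sequences for three samples (illustrative only).
samples = [
    ["LdrLoadDll", "NtOpenFile", "NtReadFile", "NtClose"],
    ["LdrLoadDll", "NtOpenFile", "NtWriteFile", "NtClose"],
    ["RegOpenKeyExW", "RegQueryValueExW", "RegCloseKey"],
]
vectors = tfidf_vectors(samples, n=2)
```

An n-gram shared by several samples gets a lower weight than one unique to a single sample, which is why rare call patterns tend to dominate the resulting vectors.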

Program overview

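  • file_split.py reads the CSV file and organizes the records by file ID
  • preprocess.py converts each file to JSON format and serializes the API calls
  • basic_feature.py extracts simple features
  • tfidf_model.py builds the TF-IDF model
  • feature.py uses the TF-IDF model to transform the training and test data
  • light_gbm_model.py tunes the model parameters
  • model_predict.py predicts the results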
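The first three steps of the list above (grouping by file ID, serializing to JSON, and simple features) might look roughly like the following sketch. The column names (`file_id`, `label`, `api`, `tid`, `index`), the ordering by thread and call index, the chosen features, and all function names are assumptions made for illustration, not the actual contents of the scripts.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical CSV content; the column names are assumptions.
RAW_CSV = """file_id,label,api,tid,index
1,0,LdrLoadDll,100,0
1,0,NtClose,100,2
1,0,NtOpenFile,100,1
2,3,RegOpenKeyExW,200,0
2,3,RegCloseKey,200,1
"""


def split_by_file(csv_text):
    """Group CSV rows by file ID."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["file_id"]].append(row)
    return groups


def to_record(file_id, rows):
    """Build a JSON-serializable record with API calls ordered by thread and index."""
    ordered = sorted(rows, key=lambda r: (int(r["tid"]), int(r["index"])))
    return {
        "file_id": file_id,
        "label": int(rows[0]["label"]),
        "apis": [r["api"] for r in ordered],
    }


def basic_features(record):
    """Compute a couple of simple count features (assumed examples)."""
    apis = record["apis"]
    return {"num_calls": len(apis), "num_unique_apis": len(set(apis))}


records = [to_record(fid, rows) for fid, rows in split_by_file(RAW_CSV).items()]
json_lines = [json.dumps(r) for r in records]
```

Each entry in `json_lines` is one file's record, so later steps can read the API sequence for a sample without parsing the original CSV again.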

Notes

About

malware classify: source code for the 3rd Alibaba Cloud Security Algorithm Challenge (第三届『阿里云安全算法挑战赛』)
