practicingman/chinese_text_cnn

TextCNN in PyTorch for Chinese Text Classification

Paper

Convolutional Neural Networks for Sentence Classification

References

Dependencies

  • python3.5
  • pytorch==1.0.0
  • torchtext==0.3.1
  • jieba==0.39

Word Vectors

https://github.com/Embedding/Chinese-Word-Vectors
(This project uses the Word2vec word vectors trained on Zhihu_QA, the Zhihu Q&A corpus.)
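
As a rough illustration of what such a vector file contains, the sketch below parses the plain-text word2vec layout. It assumes the downloaded file, once decompressed, has a header line with the vocabulary size and dimension followed by one `word v1 v2 ...` line per entry. The function name `load_word2vec_text` and the tiny in-memory sample are hypothetical and are not taken from this repository, which may load vectors differently (for example through torchtext).

```python
import io


def load_word2vec_text(handle):
    """Parse word2vec text format from an open text handle.

    Assumed layout: a "count dim" header line, then one
    "word v1 ... v_dim" line per vocabulary entry.
    """
    header = handle.readline().split()
    dim = int(header[1])  # header[0] is the declared vocabulary size
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != dim + 1:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(v) for v in parts[1:]]
    return dim, vectors


# Hypothetical two-word sample standing in for a real vector file.
sample = io.StringIO("2 3\n知乎 0.1 0.2 0.3\n问答 -0.4 0.5 0.6\n")
dim, vectors = load_word2vec_text(sample)
print(dim, len(vectors))
```

For a real file, the same function could be given a handle opened with `open(path, encoding="utf-8")`, where `path` points at the decompressed download.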

Usage

python3 main.py -h

Training

python3 main.py

Accuracy

  • CNN-rand: randomly initialized embedding
      python main.py
      Batch[1800] - loss: 0.009499  acc: 100.0000%(128/128)
      Evaluation - loss: 0.000026  acc: 94.0000%(6616/7000)
      early stop by 1000 steps, acc: 94.0000%
    
  • CNN-static: uses pre-trained word vectors, kept static
      python main.py -static=true
      Batch[1900] - loss: 0.011894  acc: 100.0000%(128/128)
      Evaluation - loss: 0.000018  acc: 95.0000%(6679/7000)
      early stop by 1000 steps, acc: 95.0000%
    
  • CNN-non-static: fine-tunes the pre-trained word vectors
      python main.py -static=true -non-static=true
      Batch[1500] - loss: 0.008823  acc: 99.0000%(127/128)
      Evaluation - loss: 0.000016  acc: 96.0000%(6729/7000)
      early stop by 1000 steps, acc: 96.0000%
    
  • CNN-multichannel: one fine-tuned channel plus one static channel
      python main.py -static=true -non-static=true -multichannel=true
      Batch[1500] - loss: 0.023020  acc: 98.0000%(126/128)
      Evaluation - loss: 0.000016  acc: 96.0000%(6744/7000)
      early stop by 1000 steps, acc: 96.0000%
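
The four variants above differ mainly in how the embedding layer is set up. The following is a minimal sketch of that idea in PyTorch, following the architecture of the paper; it is not the code in this repository's `main.py`. The class name `TextCNNSketch`, the constructor arguments, and the defaults (kernel sizes 3, 4, 5; 100 filters; dropout 0.5) are assumptions chosen for illustration, and the boolean arguments only mirror the meaning of the `-static`, `-non-static`, and `-multichannel` flags as described in the list.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TextCNNSketch(nn.Module):
    """Illustrative Kim (2014)-style text CNN; all names are hypothetical."""

    def __init__(self, vocab_size, embed_dim, num_classes, pretrained=None,
                 static=False, non_static=False, multichannel=False,
                 kernel_sizes=(3, 4, 5), num_filters=100, dropout=0.5):
        super().__init__()
        if static and pretrained is not None:
            # CNN-static / CNN-non-static: start from pre-trained vectors,
            # frozen unless non_static asks for fine-tuning.
            self.embed = nn.Embedding.from_pretrained(
                pretrained.clone(), freeze=not non_static)
        else:
            # CNN-rand: randomly initialized, trainable embedding.
            self.embed = nn.Embedding(vocab_size, embed_dim)
        self.embed_static = None
        if multichannel and pretrained is not None:
            # CNN-multichannel: a second, always-frozen copy of the vectors.
            self.embed_static = nn.Embedding.from_pretrained(
                pretrained.clone(), freeze=True)
        channels = 2 if self.embed_static is not None else 1
        # One convolution per kernel size, each spanning the full embedding width.
        self.convs = nn.ModuleList(
            [nn.Conv2d(channels, num_filters, (k, embed_dim)) for k in kernel_sizes])
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(num_filters * len(kernel_sizes), num_classes)

    def forward(self, tokens):
        # tokens: (batch, seq_len) word ids; seq_len must be >= max kernel size.
        x = self.embed(tokens).unsqueeze(1)  # (batch, 1, seq_len, embed_dim)
        if self.embed_static is not None:
            x = torch.cat([x, self.embed_static(tokens).unsqueeze(1)], dim=1)
        # Convolve, apply ReLU, then max-pool over the time dimension.
        pooled = [F.relu(conv(x)).squeeze(3).max(dim=2)[0] for conv in self.convs]
        return self.fc(self.dropout(torch.cat(pooled, dim=1)))
```

With random stand-in vectors, `TextCNNSketch(50, 8, 3, pretrained=torch.randn(50, 8), static=True, non_static=True, multichannel=True)` maps a `(batch, seq_len)` tensor of word ids to `(batch, 3)` class scores. Passing no `pretrained` tensor falls back to the CNN-rand setup.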
    
