clj-bosonnlp

Clojars Project

clj-bosonnlp is a Clojure wrapper for Boson NLP.

Boson NLP provides Chinese natural language processing services, including:

  • Sentiment analysis
  • Named entity recognition
  • Dependency parsing
  • Keyword extraction
  • News classification
  • Semantic word suggestion
  • Chinese word segmentation and part-of-speech tagging

For more details, see the official Boson documentation.

Usage

If you use Leiningen, add the following to your project.clj:

[clj-bosonnlp "0.1.2"]

If you use Maven, add the following to your pom.xml:

<dependency>
  <groupId>clj-bosonnlp</groupId>
  <artifactId>clj-bosonnlp</artifactId>
  <version>0.1.2</version>
</dependency>

You can then use the library in your project. The following example uses clj-bosonnlp from a Clojure REPL:

user=> (use '[clj-bosonnlp.core])
nil  
;; initialize with your API token
user=> (initialize "<your-api-token>")
"<your-api-token>"
;; sentiment example
user=> (sentiment ["这个世界好复杂", "计算机是科学么"])
[[0.17128982245610536 0.8287101775438946] \
[0.3028239178178842 0.6971760821821158]]
;; semantic word suggestion example: the 20 words closest to "粉丝"
user=> (suggest "粉丝" 20)
[[0.9999999999999997 "粉丝/n"] [0.4860246796131101 "脑残粉/n"] \
[0.47638025976400966 "听众/n"] [0.4574711603743687 "球迷/n"] \
[0.44279396622121586 "观众/n"] [0.4399638841304087 "喷子/n"] \
[0.4370675116868156 "乐迷/n"] [0.4365171009654033 "鳗鱼/n"] \
[0.4357353461210972 "水军/n"] [0.43320908113367257 "好友/n"] \
[0.4321432244549219 "歌迷/n"] [0.4218593870538608 "影迷/n"] \
[0.4179423555308083 "前辈/n"] [0.4142211812540118 "网民/n"] \
[0.40556773652629086 "参赛者/n"] [0.40544885221034965 "博友/n"] \
[0.3976491020591731 "公知/n"] [0.3971053944003027 "支持者/n"] \
[0.3864395283882839 "选手/n"] [0.38543008430007086 "歌手/n"]]
;; segmentation and part-of-speech tagging example
user=> (tag ["这个世界好复杂", "计算机是科学么"])
[{"tag" ["DT" "M" "NN" "AD" "VA"], "word" ["这" "个" "世界" "好" "复杂"]} \
{"tag" ["NN" "VC" "NN" "SP"], "word" ["计算机" "是" "科学" "么"]}]
;; news classify result
user=> (classify ["俄否决安理会谴责叙军战机空袭阿勒颇平民",
"邓紫棋谈男友林宥嘉:我觉得我比他唱得好", "Facebook收购印度初创公司:"])
[5 4 8]
;; dependency parsing example: accepts a single string or a vector of strings
clj-bosonnlp.core=> (depparser "这个世界好复杂")
[{"head" [1 2 4 4 -1], "role" ["M" "NMOD" "SBJ" "VMOD" "ROOT"], \
"tag" ["DT" "M" "NN" "AD" "VA"], "word" ["这" "个" "世界" "好" "复杂"]}]
clj-bosonnlp.core=> (depparser ["这个世界好复杂", "计算机是科学么"])
[{"head" [1 2 4 4 -1], "role" ["M" "NMOD" "SBJ" "VMOD" "ROOT"], \
"tag" ["DT" "M" "NN" "AD" "VA"], "word" ["这" "个" "世界" "好" "复杂"]} \
{"head" [1 -1 1 1], "role" ["SBJ" "ROOT" "VMOD" "VMOD"], \
"tag" ["NN" "VC" "NN" "SP"], "word" ["计算机" "是" "科学" "么"]}]

;; document summary api example
clj-bosonnlp.core> (def content (str "腾讯科技讯(刘亚澜)10月22日消息,"
     "前优酷土豆技术副总裁黄冬已于日前正式加盟芒果TV,出任CTO一职。"
     "资料显示,黄冬历任土豆网技术副总裁、优酷土豆集团产品技术副总裁等职务,"
     "曾主持设计、运营过优酷土豆多个大型高容量产品和系统。"
     "此番加入芒果TV或与芒果TV计划自主研发智能硬件OS有关。"
     "今年3月,芒果TV对外公布其全平台日均独立用户突破3000万,日均VV突破1亿,"
     "但挥之不去的是业内对其技术能力能否匹配发展速度的质疑,"
     "亟须招揽技术人才提升整体技术能力。"
     "芒果TV是国内互联网电视七大牌照方之一,之前采取的是“封闭模式”与硬件厂商预装合作,"
     "而现在是“开放下载”+“厂商预装”。"
     "黄冬在加盟土豆网之前曾是国内FreeBSD(开源OS)社区发起者之一,"
     "是研究并使用开源OS的技术专家,离开优酷土豆集团后其加盟果壳电子,"
     "涉足智能硬件行业,将开源OS与硬件结合,创办魔豆智能路由器。"
     "未来黄冬可能会整合其在开源OS、智能硬件上的经验,结合芒果的牌照及资源优势,"
     "在智能硬件或OS领域发力。"
     "公开信息显示,芒果TV在今年6月对外宣布完成A轮5亿人民币融资,估值70亿。"
     "据芒果TV控股方芒果传媒的消息人士透露,芒果TV即将启动B轮融资。"))
#'clj-bosonnlp.core/content
clj-bosonnlp.core> (summary {"content" content} 0.1)
"腾讯科技讯(刘亚澜)10月22日消息,前优酷土豆技术副总裁黄冬已于日前正式加盟芒果TV,出任CTO一职。"
clj-bosonnlp.core> (summary {"content" content} 0.2)
"腾讯科技讯(刘亚澜)10月22日消息,前优酷土豆技术副总裁黄冬已于日前正式加盟芒果TV,出任CTO一职。
未来黄冬可能会整合其在开源OS、智能硬件上的经验,结合芒果的牌照及资源优势,在智能硬件或OS领域发力。
据芒果TV控股方芒果传媒的消息人士透露,芒果TV即将启动B轮融资。"
clj-bosonnlp.core> (summary {"content" content} 30 true)
"此番加入芒果TV或与芒果TV计划自主研发智能硬件OS有关。"
clj-bosonnlp.core> 
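
The raw results in the session above are plain vectors and maps, so a few small helpers can make them easier to read. The following is a minimal sketch and not part of clj-bosonnlp: the names sentiment-label, news-categories, category-name, and word-tag-pairs are hypothetical. It assumes that each sentiment pair is ordered as [non-negative probability, negative probability] and that the classify ids follow the category table in the Boson documentation; check both against the official documentation before relying on them.

```clojure
;; Hypothetical helpers for reading clj-bosonnlp results; none of them call the API.

;; Assumption: a sentiment pair is [non-negative-probability negative-probability].
(defn sentiment-label
  "Returns :non-negative or :negative for one sentiment pair."
  [[non-negative negative]]
  (if (>= non-negative negative) :non-negative :negative))

;; Assumption: category ids 0 to 13 map to these labels, in this order.
(def news-categories
  ["sports" "education" "finance" "society" "entertainment" "military"
   "domestic" "technology" "internet" "real estate" "international"
   "women" "automotive" "games"])

(defn category-name
  "Looks up a label for a classify id, or :unknown when the id is out of range."
  [id]
  (get news-categories id :unknown))

(defn word-tag-pairs
  "Pairs each word with its tag in one map shaped like the tag output."
  [result]
  (mapv vector (get result "word") (get result "tag")))

(map sentiment-label [[0.17128982245610536 0.8287101775438946]
                      [0.3028239178178842 0.6971760821821158]])
;; => (:negative :negative)

(map category-name [5 4 8])
;; => ("military" "entertainment" "internet")

(word-tag-pairs {"tag" ["NN" "VC" "NN" "SP"], "word" ["计算机" "是" "科学" "么"]})
;; => [["计算机" "NN"] ["是" "VC"] ["科学" "NN"] ["么" "SP"]]
```

Keeping these helpers as pure functions over the returned data means they can be tried at the REPL without an API token.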

License

Copyright © 2015 m00nlight

Distributed under the Eclipse Public License either version 1.0 or (at your option) any later version.