udic nlp API

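A natural-language web API developed by the UDIC lab (普及資料與智慧運算實驗室) at National Chung Hsing University.

We currently provide four modes of word-relation query: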

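  • Co-occurrence Relationship: for example, given 蔡英文 (Tsai Ing-wen) as input, the API returns words that frequently appear together with it, such as 總統 (president), 台灣 (Taiwan), and 民主進步黨 (Democratic Progressive Party).

  • Similar Context Sharing Relationship: for example, given 周杰倫 (Jay Chou) as input, the API returns words used in similar contexts, such as 蔡依林 (Jolin Tsai), 王力宏 (Wang Leehom), and 張惠妹 (A-Mei).

  • Hyperonym-Hyponym Relationship (word concept inference): for example, 五月天 (Mayday) -> 樂團 (band), 香蕉 (banana) -> 水果 (fruit), 周杰倫 (Jay Chou) -> 歌手 (singer).

  • Sentiment Analysis (sentence-level): for example, 齊家治國平天下,小家給治了!國家更需要妳,加油!擇善固執莫在意全家滿意,至於她家謾駡攻許隨她去(正常情緒紓緩),革命尚未成功期盼繼續努力 -> positive sentiment.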
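
To make the co-occurrence mode more concrete, here is a minimal sketch of the general idea: count how often two words share a sentence, then rank the partners of a keyword. This is not the lab's implementation. The function names `cooccurrence_counts` and `top_cooccurring` and the toy corpus are assumptions made up for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count how often each unordered pair of distinct words shares a sentence."""
    counts = Counter()
    for words in sentences:
        # A sorted set means each pair is counted once per sentence
        # and always appears in the same order.
        for a, b in combinations(sorted(set(words)), 2):
            counts[(a, b)] += 1
    return counts


def top_cooccurring(counts, keyword, num=10):
    """Return up to `num` (word, count) pairs that co-occur most with `keyword`."""
    related = Counter()
    for (a, b), n in counts.items():
        if a == keyword:
            related[b] += n
        elif b == keyword:
            related[a] += n
    return related.most_common(num)


# Toy, pre-tokenized corpus (made up for illustration only).
corpus = [
    ["蔡英文", "總統", "台灣"],
    ["蔡英文", "總統", "民主進步黨"],
    ["台灣", "水果", "香蕉"],
]

counts = cooccurrence_counts(corpus)
print(top_cooccurring(counts, "蔡英文", num=3))
```

In this toy corpus, 總統 ranks first for 蔡英文 because the two words share two sentences. A real model would be built from a large tokenized corpus rather than three hand-written sentences.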

Install

  • lang: supported language parameter
    • zh: Chinese
    • en: English (still in progress)
    • th: Thai (still in progress)
  1. Install Docker and Docker-compose:

    1. Docker:curl -fsSL get.docker.com -o get-docker.sh; sh get-docker.sh
    2. How to install docker-compose
  2. git clone https://github.com/udicatnchu/udic-nlp-api

  3. cd udic-nlp-api

  4. Specify the port on which the API server is exposed: export OUTPUT_PORT=80

  5. docker-compose --compatibility up -d: this command creates three containers (see Config for details):

    1. Django (named as udic-nlp-api_web_1)
    2. MongoDB (named as udic-nlp-api_mongo_1)
    3. MySQL (named as udic-nlp-api_db_1)
  6. docker exec -it udic-nlp-api_db_1 bash: enter the MySQL container

    • Insert the WikiDump into MySQL: nohup download_wikisql.sh <lang> &
    • This command can run concurrently with step 7
  7. docker exec -it udic-nlp-api_web_1 bash: enter the Django container

    • Build the model: nohup bash -c 'time bash install.sh <lang>' &
      • Environment: 109G RAM, 32 cores
      • Execution time:
        real  2352m46.045s
        user  12311m52.132s
        sys   533m10.096s
  8. After the build finishes, restart the Django container: docker restart udic-nlp-api_web_1

  9. That's it!
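
After the steps above, it can help to confirm that all three containers are running. The following is a minimal sketch, assuming the container names match the ones listed in step 5. The helper names `missing_containers` and `running_container_names` are hypothetical and not part of this repository.

```python
import subprocess

# Container names as listed in step 5 of the install instructions.
EXPECTED = ("udic-nlp-api_web_1", "udic-nlp-api_mongo_1", "udic-nlp-api_db_1")


def missing_containers(ps_output, expected=EXPECTED):
    """Return the expected names that do not appear in `docker ps` name output."""
    running = {line.strip() for line in ps_output.splitlines() if line.strip()}
    return [name for name in expected if name not in running]


def running_container_names():
    """Ask Docker for the names of running containers, one name per line."""
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout


# Demo with sample output (made up) so the sketch runs without Docker.
sample_output = "udic-nlp-api_web_1\nudic-nlp-api_db_1\n"
missing = missing_containers(sample_output)
if missing:
    print("not running: " + ", ".join(missing))
else:
    print("all containers are up")
```

On a machine with Docker installed, calling `missing_containers(running_container_names())` checks the real containers instead of the sample string.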

API usage and Results

parameter

  • keyword: the word you want to query.
  • lang: supported language parameter
    • zh: Chinese
    • en: English (still in progress)
    • th: Thai (still in progress)
  • num (optional): the number of results to return (default: 10)
  • kcm, kem: used by kcem; different combinations of kcm and kem may produce entirely different output. You can adjust these two parameters as you wish.
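
The parameters above are passed as a URL query string. The sketch below shows one way to build such a URL with correct percent-encoding of Chinese keywords. The host and path `http://localhost/HYPOTHETICAL/ENDPOINT` are placeholders, not real routes of this API, and `build_query_url` is a hypothetical helper; substitute the actual URL pattern of the endpoint you want to call.

```python
from urllib.parse import urlencode


def build_query_url(base_url, keyword, lang="zh", num=None, **extra):
    """Build a query URL from the documented parameters.

    `extra` carries endpoint-specific parameters such as kcm and kem;
    their accepted values are not specified here.
    """
    params = {"keyword": keyword, "lang": lang}
    if num is not None:
        # num is optional; the server default is 10 when it is omitted.
        params["num"] = num
    params.update(extra)
    # urlencode percent-encodes non-ASCII text as UTF-8 by default.
    return base_url + "?" + urlencode(params)


# Placeholder base URL (an assumption, not a documented route).
url = build_query_url("http://localhost/HYPOTHETICAL/ENDPOINT", "周杰倫", num=5)
print(url)
```

The resulting string can be passed to any HTTP client. Leaving `num` out keeps the server-side default.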

url pattern

  1. Keyword Co-Occurrence API (Co-occurrence Relationship):

  2. Pointwise mutual information API:

  3. Word2Vec Online API:

  4. Hyperonym-Hyponym Relationship API:

  5. TF-IDF API:

  6. Chinese Sentiment Classifier API:

  7. Behavior 2 Text API:

    • coming soon

Config

Use the Docker Compose resources settings to configure resource constraints.

Also, please check these two issues: issue1, issue2.

They explain why the --compatibility flag is needed. To reallocate resources, edit docker-compose.yml.

To Do

  1. MySQL uses the test database; not sure whether this raises any security issue.
  2. Maybe we should use React Router; the current routing is implemented in a crude way.
  3. After the kcm model is built, the kcm API does not load the newly built data. docker-compose has to be restarted manually; maybe it's a MongoDB issue?

Built With

  • Django 1.10.2
  • Python 3.5

Contributors

  • Prof. 范耀中
  • 黃思穎
  • 陳聖軒
  • 鄭銘毅
  • 張泰瑋 david

License

This package uses the GPL 3.0 license.
