
Scrape danmu (bullet-comment) data from a host's Douyu_TV live stream and run statistical analysis on it with Spark and related big-data tools.


KaygoYM/Douyu-danmu-spark


Douyu-danmu-spark

Version 3.0 (final version)

Introduction

Compared with the first version of Douyu_danmu, this repository analyzes Douyu_TV danmu with Spark instead of MySQL (PyMySQL).

Environment:

Python 3.6
Modules: jieba, wordcloud
Spyder
Spark (PySpark)
Windows 10 (64-bit)

HOW TO USE

Scrape

In Anaconda Prompt or CMD, run "python Spark_danmu_scrapy.py", then enter the room ID to start scraping.
Alternatively, use the .exe app from the link in the APP section below.

Analyze

After the live broadcast ends, stop the scraping process. In Anaconda Prompt or CMD, run "python Spark_danmu_analyze.py", then enter the room ID to start the analysis.

Result

The results include hot words, a histogram of levels, the top 5 badges, and so on. Two examples are shown in 687423_03_07_2018.jpg and 156277_01_21_2018.jpg.
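To make these three statistics concrete, here is a minimal sketch in plain Python. It is not the code in Spark_danmu_analyze.py: it uses `collections.Counter` instead of Spark so it runs without a cluster, and the record fields (`text`, `level`, `badge`) and the sample data are hypothetical. It also splits text on whitespace, whereas Chinese danmu would need a segmenter such as jieba (listed under Environment).

```python
from collections import Counter

# Hypothetical records: the real fields saved by the scraper may differ.
danmu = [
    {"text": "666 nice play", "level": 12, "badge": "nvliu"},
    {"text": "nice", "level": 3, "badge": ""},
    {"text": "666 666", "level": 25, "badge": "nvliu"},
    {"text": "play again", "level": 18, "badge": "pao"},
]


def hot_words(records, top_n=3):
    """Count whitespace-separated tokens and return the most frequent ones."""
    counts = Counter()
    for record in records:
        counts.update(record["text"].split())
    return counts.most_common(top_n)


def level_histogram(records, bin_width=10):
    """Bucket levels into bins of bin_width, keyed by each bin's lower bound."""
    bins = Counter((record["level"] // bin_width) * bin_width for record in records)
    return dict(sorted(bins.items()))


def top_badges(records, top_n=5):
    """Return the most common non-empty badges."""
    badges = Counter(record["badge"] for record in records if record["badge"])
    return badges.most_common(top_n)


print(hot_words(danmu, 1))
print(level_histogram(danmu))
print(top_badges(danmu))
```

In a Spark version, each of these counts would typically become a group-by followed by a count and a sort. That mapping is an assumption about how the analysis could be written, not a description of the repository's script.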

Tips

The .psd files are the templates I use to make the daily reports.
Like nvliu66 and yjjimpaopao. Like 156277 and 687423 (ง•̀_•́)ง

APP

BAIDU CLOUD: Link: http://t.cn/R8MZkGV Password: h5ed

Further work

Monthly or yearly reports: apply k-means to help hosts improve their live streams.
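As a rough illustration of this idea, the sketch below runs a plain k-means (Lloyd's algorithm) over per-stream summaries. Everything in it is an assumption: the two features (danmu per hour, mean sender level), the sample values, and seeding with the first k points are hypothetical choices, and real features would need scaling before clustering.

```python
def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm on tuples of floats, seeded with the first k points."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical per-stream features: (danmu per hour, mean sender level).
streams = [(120.0, 8.0), (900.0, 22.0), (130.0, 9.0), (950.0, 25.0)]
labels, centroids = kmeans(streams, k=2)
print(labels)
print(centroids)
```

Streams that land in the same cluster could then be compared in a monthly report, for example quiet streams against busy ones.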
