BeiJingSubwayFlows

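This crawler is outdated and no longer maintained.
Weibo used to embed the page's HTML inside JavaScript, so the required data could be extracted directly from the page source.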
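
As a rough illustration of that older approach, here is a minimal sketch of pulling HTML out of a JavaScript call in the page source. The wrapper name `FM.view`, the `html` key, and the sample page are assumptions made for this example; they are not taken from this repository's code.

```python
import json
import re

# Hypothetical page source: the markup sits inside the JSON argument of a JS call.
SAMPLE_PAGE = (
    '<script>FM.view({"domid": "feed", '
    '"html": "<div class=\\"work_list\\"><li>item<\\/li><\\/div>"})</script>'
)


def extract_embedded_html(page_source):
    """Return the HTML fragments found inside FM.view({...}) calls."""
    fragments = []
    # Capture the JSON object passed to each call (assumed wrapper name).
    for match in re.finditer(r'FM\.view\((\{.*?\})\)</script>', page_source, re.S):
        payload = json.loads(match.group(1))
        if 'html' in payload:
            fragments.append(payload['html'])
    return fragments


print(extract_embedded_html(SAMPLE_PAGE))
```

Decoding the argument with `json.loads` rather than slicing the string by hand lets the JSON parser undo escapes such as `\"` and `\/` before the fragment is handed to an HTML parser.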

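I got curious about how Beijing subway ridership changes from day to day, so I wrote a crawler. The results are interesting: ridership follows a very regular pattern across the seven days of the week.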

Results:

https://www.ikaze.cn/sub_flows.html

Other notes:

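Data is scraped with Python 3 and charted with ECharts.
The scraped results are written straight to a file; the project is small, so no database is used.
The crawler script only fetches yesterday's data. To fetch the full history, modify the get_flow_from_html() function: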

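from bs4 import BeautifulSoup as bs  # assumed source of the `bs` alias


def get_flow_from_html(html):

    # Adjust the year to match the page being scraped
    year = 2018

    soup = bs(html, 'html.parser')
    work_list = soup.find_all('div', class_='work_list')
    data = work_list[0].find_all('li')
    for d in data:
        # Text of a single <li> entry
        s = d.get_text()
        ...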
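
The body of the loop is elided above. As a sketch of why `year` has to be set by hand, the example below assumes each entry's text carries only a month and day plus a ridership figure. The text format, the `万人次` unit suffix, the helper name `parse_flow`, and the sample string are all hypothetical; they are not taken from the original script.

```python
import re
from datetime import date

# Hypothetical entry format: "<month>月<day>日 ... <number>万人次"
FLOW_PATTERN = re.compile(r'(\d{1,2})月(\d{1,2})日.*?(\d+(?:\.\d+)?)万人次', re.S)


def parse_flow(text, year):
    """Return (date, ridership in units of 10,000), or None if the text does not match."""
    m = FLOW_PATTERN.search(text)
    if m is None:
        return None
    month, day, flow = int(m.group(1)), int(m.group(2)), float(m.group(3))
    # The text has no year, so the caller supplies it.
    return date(year, month, day), flow


sample = '3月5日地铁路网客运量为1000.5万人次'  # made-up text, not real data
print(parse_flow(sample, 2018))
```

Returning `None` for text that does not match lets the caller skip unrelated entries instead of failing on them.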

Then just run it in a loop:

page = 200
while page > 0:
    html = get_html(get_page_url(page))
    get_flow_from_html(html)
    ...
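
For reference, a self-contained sketch of such a backfill loop that also appends the results to a text file might look like the following. The function name `backfill`, its parameters, the `date,flow` line format, and the stubbed fetch and parse callables are assumptions for illustration; the original script's helpers are not reproduced here.

```python
import os
import tempfile


def backfill(first_page, fetch_page, parse_page, out_path):
    """Walk pages from first_page down to 1 and append parsed records to a text file."""
    written = 0
    with open(out_path, 'a', encoding='utf-8') as out:
        page = first_page
        while page > 0:
            # parse_page is expected to return an iterable of (day, flow) pairs.
            for day, flow in parse_page(fetch_page(page)):
                out.write(f'{day},{flow}\n')
                written += 1
            page -= 1  # move on to the next page
    return written


# Stubbed demo with made-up records instead of real network requests.
fake_pages = {2: [('2018-03-04', 900.0)], 1: [('2018-03-05', 1000.5)]}
demo_path = os.path.join(tempfile.mkdtemp(), 'flows_demo.txt')
count = backfill(2, lambda page: page, lambda html: fake_pages[html], demo_path)
print(count)
```

Passing the fetch and parse steps in as callables keeps the loop testable without touching the network.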
