
2.2 Multithreaded Programming + 2.3 Multiprocess Programming homework questions? #21

Open
maolin510 opened this issue Dec 10, 2019 · 1 comment


@maolin510

普通读取目录.py (plain directory read)

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: maolin

import threading
import time
import os

a = time.time()

def find_all_file(parentDir, tag):
    if os.path.isdir(parentDir):  # it is a directory
        dir_list = os.listdir(parentDir)
        paths = [os.path.join('%s%s' % (parentDir, tt)) for tt in dir_list]
        for path in paths:
            find_all_file(path, tag)
    else:  # not a directory, print it directly
        print('------------' + str(tag))
        print(parentDir)

# set the drive path to read
parentDir1 = 'C:'
j = 1
for i in range(1, 4):
    find_all_file(parentDir1, j)
    j = j + 1
b = time.time()
print(b - a)
```
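For reference, the same traversal can be written without manual recursion: `os.path.join('%s%s' % (parentDir, tt))` above concatenates names with no separator, which only happens to work for drive-relative paths like `'C:'`. A minimal sketch using the standard `os.walk` (the `list_files` name is mine):

```python
import os

def list_files(root_dir):
    """Collect every file path under root_dir.

    os.walk yields (dirpath, dirnames, filenames) for each directory,
    so it handles both recursion and path separators for us.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return found
```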

多线程读取目录.py (multithreaded directory read)

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: maolin

import threading
import time
import os

a = time.time()

def find_all_file(parentDir, tag):
    if os.path.isdir(parentDir):  # it is a directory
        dir_list = os.listdir(parentDir)
        paths = [os.path.join('%s%s' % (parentDir, tt)) for tt in dir_list]
        for path in paths:
            find_all_file(path, tag)
    else:  # not a directory, print it directly
        print('------------' + str(tag))
        print(parentDir)

# set the drive paths to read
parentDir1 = 'C:'
parentDir2 = 'C:'
parentDir3 = 'C:'

# create a thread
thead1 = threading.Thread(target=find_all_file(parentDir1, 1))
thead2 = threading.Thread(target=find_all_file(parentDir2, 2))
thead3 = threading.Thread(target=find_all_file(parentDir3, 3))

# start the threads just created
thead1.start()
# wait for the thread
thead1.join()
thead2.start()
thead2.join()
thead3.start()
thead3.join()
b = time.time()
print(b - a)
```
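One thing worth checking in the threaded version: `threading.Thread(target=find_all_file(parentDir1, 1))` calls `find_all_file` immediately in the main thread and passes its return value (`None`) as the target, so the threads themselves do no work; and because each thread is `join`ed before the next one starts, they could not overlap anyway. A sketch of the usual pattern, with a placeholder `work` task of my own:

```python
import threading

def work(path, tag, results):
    # placeholder task; append so the main thread can see the result
    # (list.append is atomic under CPython, so no lock is needed here)
    results.append((tag, path))

results = []
threads = [
    # pass the function itself plus an args tuple; do NOT call it here
    threading.Thread(target=work, args=('C:', tag, results))
    for tag in (1, 2, 3)
]
for t in threads:
    t.start()          # start all three before joining any,
for t in threads:      # otherwise they run strictly one after another
    t.join()
```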

多进程-pool-读取目录.py (multiprocess Pool directory read)

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: maolin

from multiprocessing import Pool
import os, time, random

'''
Pool
To start a large number of child processes, a process pool
can create them in batches.
'''

def long_time_task(parentDir):
    if os.path.isdir(parentDir):  # it is a directory
        dir_list = os.listdir(parentDir)
        paths = [os.path.join('%s%s' % (parentDir, tt)) for tt in dir_list]
        for path in paths:
            long_time_task(path)
    else:  # not a directory, print it directly
        print("file: " + str(parentDir))

if __name__ == '__main__':
    a = time.time()
    print('Parent process %s.' % os.getpid())
    parentDir1 = 'C:'
    parentDir2 = 'C:'
    parentDir3 = 'C:'
    p = Pool(3)  # run 3 processes at the same time
    for i in [parentDir1, parentDir2, parentDir3]:
        p.apply_async(long_time_task, args=(i,))  # asynchronous, non-blocking
    p.close()
    p.join()
    b = time.time()
    print(b - a)
```

多进程-queue-读取目录.py (multiprocess Queue directory read)

```python
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Author: maolin

from multiprocessing import Process, Queue
import os, time, random

# code run by the writer processes:
def long_time_task(parentDir):
    if os.path.isdir(parentDir):  # it is a directory
        dir_list = os.listdir(parentDir)
        paths = [os.path.join('%s%s' % (parentDir, tt)) for tt in dir_list]
        for path in paths:
            long_time_task(path)
    else:  # not a directory, print it directly
        print("file: " + str(parentDir))

if __name__ == '__main__':
    # the parent process creates the Queue and passes it to each child:
    a = time.time()
    parentDir1 = 'C:'
    parentDir2 = 'C:'
    parentDir3 = 'C:'
    q = Queue()
    p1 = Process(target=long_time_task(parentDir1), args=(q,))
    p2 = Process(target=long_time_task(parentDir2), args=(q,))
    p3 = Process(target=long_time_task(parentDir3), args=(q,))
    # start child process p1, write:
    p1.start()
    # wait for p1 to finish:
    p1.join()
    # start child process p2, write:
    p2.start()
    # wait for p2 to finish:
    p2.join()
    # start child process p3, write:
    p3.start()
    # wait for p3 to finish:
    p3.join()
    b = time.time()
    print(b - a)
```

Points I need help with:
1. First, is the code I wrote based on my understanding actually correct?
2. Why is there no noticeable time difference between the plain read and the multithreaded read, or between the multiprocess Pool and Queue reads?

@xuanhun
Owner

xuanhun commented Dec 12, 2019

The code looks fine at a glance.
Whether you see a time difference depends on a few factors; check your setup against these:
1) With both multiprocessing and multithreading, the number of CPU cores caps the real concurrency. With few cores there is naturally little to gain, and threads and processes carry their own overhead, so they may even be slower.
2) The task itself may finish too quickly to allow an accurate comparison.
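Both points can be checked empirically: give the workers a task long enough to dwarf thread-creation overhead, then time the sequential and threaded variants side by side. A sketch with a placeholder CPU-bound `busy` loop of mine (note that under CPython's GIL a pure-Python loop is not expected to speed up with threads, which is itself part of the answer):

```python
import time
import threading

def busy(n):
    # CPU-bound placeholder task: sum the integers below n
    total = 0
    for i in range(n):
        total += i
    return total

N = 2000000

# run the task three times sequentially
t0 = time.perf_counter()
for _ in range(3):
    busy(N)
seq = time.perf_counter() - t0

# run the same three tasks in threads, started before any join
t0 = time.perf_counter()
threads = [threading.Thread(target=busy, args=(N,)) for _ in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()
par = time.perf_counter() - t0

print('sequential %.3fs, threaded %.3fs' % (seq, par))
```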
