Python multiprocessing program keeps running but never produces a result; how can I fix it?
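Hi everyone, I am using the code below to compute the similarity between texts (the similarity function itself is not included). The corpus is very large and every text has to be compared against every other one, which is very slow, so I tried to speed it up with multiprocessing as shown below. The program reports an error, as in the screenshot.

I think the problem is in how I pass the arguments in this line:

`rl = pool.map(deal_many_data, 'tt', data_all)`

How should I change it? The format of `data_all` and of the list of texts to match, `data3`, is shown in the screenshots. Any advice on how to adjust this would be much appreciated.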
```python
import random
import time
from multiprocessing import Manager, Pool

# sentence_similarity() and load_data() are defined elsewhere (not shown)


def deal_many_data(threadName, data_all, list1):
    for key, v in data_all.items():
        sim_all = 0
        count = 0
        sim3 = 0  # check whether the first two are similar; break out if not
        if len(data_all[key]['content']) >= 10:
            newlist = random.sample(list(range(0, len(data_all[key]['content']))), 10)
        else:
            newlist = list(range(0, len(data_all[key]['content'])))
        for index in newlist:
            data = data_all[key]['title'][index] + data_all[key]['content'][index]
            sim = sentence_similarity(data, new_data)
            sim_all += sim
            count += 1
        list1[int(key)] = sim_all / count
        print("%s: %s" % (threadName, key))


if __name__ == '__main__':
    db, data_all, data2 = load_data()
    data3 = data2[:10]
    start_time = time.time()
    for item in data3:
        title = item['autn:content']['DOCUMENT']['DRETITLE']['$'].strip().replace(' ', '')
        content = item['autn:content']['DOCUMENT']['DRECONTENT']['$'].strip().replace('\n', '').replace(' ', '')
        list1 = [0] * (len(data_all) + 1)
        new_data = title + ' ' + content
        with Manager() as manager:
            # list1 = manager.list1
            pool = Pool(5)  # create a pool of 5 worker processes (assuming 4 cores, tasks are handled round-robin)
            # pass the worker function; the variable to iterate over is the dict data_all,
            # and the shared variable to be modified is list1
            rl = pool.map(deal_many_data, 'tt', data_all)
            pool.close()  # close the pool; no new tasks are accepted
            pool.join()  # the main process blocks until the workers exit
    end = time.time()
    print('finally cost time %ss' % (end - start_time))
```
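Answer:
You are passing the arguments incorrectly. `Pool.map` accepts only the function, a single iterable, and an optional `chunksize`, so in your call `'tt'` is taken as the iterable and `data_all` as `chunksize`: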
```python
def map(self, func, iterable, chunksize=None):
    '''
    Apply `func` to each element in `iterable`, collecting the results
    in a list that is returned.
    '''
    return self._map_async(func, iterable, mapstar, chunksize).get()
```
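Given that signature, one possible way to restructure the call is to bind the fixed argument with `functools.partial`, let `pool.map` iterate over `data_all.items()`, and return each score from the worker instead of writing into a shared list. This is only a sketch under assumptions: `score_one` and `score_all` are hypothetical names, `sentence_similarity` is a stand-in placeholder (the real one was not posted), and the sample `data_all` only mimics the structure implied by the code in the question.

```python
import random
from functools import partial
from multiprocessing import Pool


def sentence_similarity(a, b):
    """Placeholder (assumption): Jaccard overlap of the two character sets."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def score_one(item, new_data, sample_size=10):
    """Hypothetical worker: score one (key, value) entry of data_all."""
    key, value = item
    n = len(value['content'])
    if n == 0:
        return key, 0.0
    # Sample at most `sample_size` documents, as the original code does.
    indices = random.sample(range(n), min(n, sample_size))
    total = sum(
        sentence_similarity(value['title'][i] + value['content'][i], new_data)
        for i in indices
    )
    return key, total / len(indices)


def score_all(data_all, new_data, processes=5):
    """Run score_one over every entry in a pool; returns {key: mean score}."""
    # partial fixes new_data, so pool.map only has to supply `item`.
    worker = partial(score_one, new_data=new_data)
    with Pool(processes) as pool:
        return dict(pool.map(worker, list(data_all.items())))


if __name__ == '__main__':
    # Made-up sample data for illustration only.
    data_all = {
        '1': {'title': ['abc'], 'content': ['def']},
        '2': {'title': ['xyz'], 'content': ['uvw']},
    }
    print(score_all(data_all, 'abcdef'))  # prints {'1': 1.0, '2': 0.0}
```

Passing `new_data` explicitly also avoids relying on a global that is assigned under the `if __name__ == '__main__':` guard, which worker processes may not see when they are started with the `spawn` method. Returning `(key, score)` pairs removes the need for a `Manager` list. If each task needs several varying arguments, `pool.starmap` with an iterable of argument tuples is another option.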