Python multiprocessing program keeps running but never produces a result; how can I fix it?

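Hi everyone. I use the code below to compute the similarity between texts (the similarity function itself is not included). The corpus is very large and every text has to be compared against the others one by one, which is very slow, so I tried to speed it up with multiprocessing as shown below. The program keeps running but never produces a result.

[Screenshot: program output]

I think the problem is in how I pass arguments in this line:

`rl = pool.map(deal_many_data, 'tt', data_all)`

How should this be changed? The format of data_all is shown here:

[Screenshot: format of data_all]

The format of the list of texts to match, data3, is shown here:

[Screenshot: format of data3]

How should I adjust this? Many thanks.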

```python
import random
import time
from multiprocessing import Process, Manager
from multiprocessing import Pool


def deal_many_data(threadName, data_all, list1):
    for key, v in data_all.items():
        sim_all = 0
        count = 0
        sim3 = 0  # check whether the first two are similar; break out if not
        if len(data_all[key]['content']) >= 10:
            newlist = random.sample(list(range(0, len(data_all[key]['content']))), 10)
        else:
            newlist = list(range(0, len(data_all[key]['content'])))
        for index in newlist:
            data = data_all[key]['title'][index] + data_all[key]['content'][index]
            sim = sentence_similarity(data, new_data)
            sim_all += sim
            count += 1
        list1[int(key)] = sim_all / count
        print("%s: %s" % (threadName, key))


if __name__ == '__main__':
    db, data_all, data2 = load_data()
    data3 = data2[:10]
    start_time = time.time()
    for item in data3:
        title = item['autn:content']['DOCUMENT']['DRETITLE']['$'].strip().replace(' ', '')
        content = item['autn:content']['DOCUMENT']['DRECONTENT']['$'].strip().replace('\n', '').replace(' ', '')
        list1 = [0] * (len(data_all) + 1)
        new_data = title + ' ' + content
        with Manager() as manager:
            # list1 = manager.list1
            pool = Pool(5)  # create a pool of 5 processes; assuming 4 cores, 4 are handled round-robin
            # pass the main worker function; the variable to loop over is the dict data_all,
            # and the shared variable to modify is list1
            rl = pool.map(deal_many_data, 'tt', data_all)
            pool.close()  # close the pool; no new tasks are accepted
            pool.join()  # main process blocks until the child processes exit
    end = time.time()
    print('finally cost time %ss' % (end - start_time))
```


Answer:

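Your arguments are passed incorrectly. `Pool.map` accepts one function and one iterable; a third positional argument is taken as `chunksize`. In your call, `'tt'` is therefore treated as the iterable and `data_all` as `chunksize`:

```python
def map(self, func, iterable, chunksize=None):
    '''
    Apply `func` to each element in `iterable`, collecting the results
    in a list that is returned.
    '''
    return self._map_async(func, iterable, mapstar, chunksize).get()
```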
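Given that signature, one possible fix is to let each worker call handle a single `(key, record)` pair from `data_all` and to bind the extra argument with `functools.partial`. The sketch below is only an illustration under several assumptions: `score_one_key` and `score_all` are hypothetical names, `sentence_similarity` is a stand-in built on `difflib.SequenceMatcher` because the real function was not posted, and the shape of `data_all` is inferred from how the question's loop indexes it. Results are returned from the workers rather than written into `list1`, since a plain list in the parent process is not shared with the pool's workers.

```python
import random
from difflib import SequenceMatcher
from functools import partial
from multiprocessing import Pool


def sentence_similarity(a, b):
    # Placeholder only: the original similarity function was not posted.
    return SequenceMatcher(None, a, b).ratio()


def score_one_key(item, new_data, sample_size=10):
    """Score one (key, record) pair from data_all against new_data."""
    key, record = item
    n = len(record['content'])
    if n == 0:
        return key, 0.0
    # Sample at most sample_size entries, as the question's code does.
    indexes = random.sample(range(n), sample_size) if n >= sample_size else range(n)
    total = sum(
        sentence_similarity(record['title'][i] + record['content'][i], new_data)
        for i in indexes
    )
    return key, total / len(indexes)


def score_all(data_all, new_data, processes=5):
    """Run score_one_key over every item of data_all in a worker pool."""
    # partial() fixes new_data, so map() still receives exactly one iterable.
    worker = partial(score_one_key, new_data=new_data)
    with Pool(processes) as pool:
        return dict(pool.map(worker, data_all.items()))


if __name__ == '__main__':
    # Made-up data in the shape the question's loop implies (an assumption).
    data_all = {
        '1': {'title': ['abc'], 'content': ['def']},
        '2': {'title': ['xyz'], 'content': ['uvw']},
    }
    print(score_all(data_all, 'abcdef', processes=2))
```

If the worker needs several positional arguments per task, `pool.starmap(func, [(a1, b1), (a2, b2)])` is an alternative to `partial`; it unpacks each tuple into the call.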

Source link: utcz.com/p/937667.html
