Why does my code take more time when I use multiprocessing?




The code I use without multiprocessing is below; it takes 0:00:03.044280:


from datetime import datetime

def execute_it():
    number = 10000000
    listing_1 = range(number)
    listing_2 = range(number)
    listing_3 = range(number)
    start = datetime.now()
    task(listing_1, listing_2, listing_3)
    print datetime.now() - start

def task(listing_1, listing_2, listing_3):
    # Add the three values at each position; the result is discarded.
    for l1, l2, l3 in zip(listing_1, listing_2, listing_3):
        l1 + l2 + l3



I want to use multiprocessing to make it faster. The code I tried is below:


import multiprocessing as mp
from datetime import datetime

def execute_it():
    number = 10000000
    listing_1 = list(range(number))
    listing_2 = list(range(number))
    listing_3 = list(range(number))

    # In Python 2, zip builds the full list of tuples before the pool starts.
    params = zip(listing_1, listing_2, listing_3)

    start = datetime.now()
    pool = mp.Pool(processes=5)
    pool.map(task, params)
    pool.close()
    print datetime.now() - start

def task(params):
    params[0] + params[1] + params[2]



It takes 0:00:15.654919!



What is wrong with my code? I am sure both versions do the same thing.





You can profile it to confirm, but I think params = zip(listing_1, listing_2, listing_3) plus creating 5 processes is what makes it take longer than the single-process version. Try itertools.izip.
– khachik
Aug 21 at 1:36







@khachik I tried it; it took even more time: 0:00:18.134695
– Yuchen Huang
Aug 21 at 1:39





Please see my answer for the details.
– khachik
Aug 21 at 1:45





@khachik Many thanks for your help. It does work when task is a function, but when I use it in a class (especially when task is in the class), the code is just ignored.
– Yuchen Huang
Aug 21 at 2:01




1 Answer



The multiprocessing version takes longer because it does effectively the same work as the single-process version plus extra work: creating the worker processes and running map, which has to pass every argument tuple to the workers.
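One way to shrink that overhead, sketched here in Python 3 syntax rather than the Python 2 of the question, is to hand each worker one large range of indices instead of one tuple per call, so only a handful of small objects are sent between processes. The helpers `make_chunks` and `sum_chunk`, the returned totals, and the smaller `number` are assumptions made for illustration and are not part of the original code.

```python
import multiprocessing as mp
from datetime import datetime


def make_chunks(number, parts):
    """Split range(number) into at most `parts` contiguous (start, stop) pairs."""
    step = -(-number // parts)  # ceiling division
    return [(start, min(start + step, number)) for start in range(0, number, step)]


def sum_chunk(bounds):
    """Do the l1 + l2 + l3 work for every index of one chunk inside a worker."""
    start, stop = bounds
    total = 0
    for i in range(start, stop):
        total += i + i + i  # the three lists in the question hold the same values
    return total


if __name__ == "__main__":
    number = 1000000  # assumption: smaller than the question's 10000000
    start = datetime.now()
    with mp.Pool(processes=5) as pool:
        # Only 5 small tuples cross the process boundary, not `number` of them.
        totals = pool.map(sum_chunk, make_chunks(number, 5))
    print(sum(totals), datetime.now() - start)
```

Whether this beats the single-process loop still depends on how much work each chunk contains; for an operation as cheap as adding three integers the gain may be small.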



You can replace zip with itertools.izip and pool.map with pool.imap to get the expected parallelism; otherwise all the heavy processing will happen in the main process.




from itertools import izip
...

def execute_it():
    number = 10000000
    listing_1 = list(range(number))
    listing_2 = list(range(number))
    listing_3 = list(range(number))

    # izip yields the tuples lazily instead of building the whole list first.
    params = izip(listing_1, listing_2, listing_3)

    start = datetime.now()
    pool = mp.Pool(processes=5)
    pool.imap(task, params)
    pool.close()
    print datetime.now() - start
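
Keep in mind that imap returns a lazy iterator right away and pool.close() does not wait for the workers, so a timing taken this way may stop before the tasks have finished. A minimal sketch, written in Python 3 syntax, that consumes every result before stopping the clock follows; the helper `timed_imap`, the `chunksize` value, the returned sum, and the smaller `number` are assumptions for illustration only.

```python
import multiprocessing as mp
from datetime import datetime


def task(params):
    """Add the three numbers of one tuple; the sum is returned so it can be checked."""
    return params[0] + params[1] + params[2]


def timed_imap(number, processes=5, chunksize=10000):
    """Run task over the zipped ranges with imap and wait for every result."""
    params = zip(range(number), range(number), range(number))  # lazy in Python 3
    start = datetime.now()
    with mp.Pool(processes=processes) as pool:
        # Iterating over imap's result blocks until each task has finished.
        total = sum(pool.imap(task, params, chunksize=chunksize))
    return total, datetime.now() - start


if __name__ == "__main__":
    # assumption: 1000000 items instead of the question's 10000000
    total, elapsed = timed_imap(1000000)
    print(total, elapsed)
```

Measured this way, the elapsed time includes the work done in the workers, which makes it comparable to the single-process timing.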





It works for me and takes only 0:00:01.126610. I have one more question: can imap be used with a function in a class?
– Yuchen Huang
Aug 21 at 1:52





@YuchenHuang not sure I understand the question - if you mean classmethod then yes, method scopes are irrelevant here.
– khachik
Aug 21 at 2:12





I mean that when I use self.task in the class, this code doesn't work.
– Yuchen Huang
Aug 21 at 2:24





@YuchenHuang your original post does not contain anything about your classes and methods and it is hard to tell what is going on without seeing the code. You can ask a separate question since it is completely unrelated to mp performance.
– khachik
Aug 21 at 14:43





