The multiprocessing package supports spawning processes using an API similar to the threading module, and it offers both local and remote concurrency. This tutorial introduces multiprocessing in Python and shows how to communicate between processes, synchronize them, and log from them.
Introduction to Multiprocessing
Multiprocessing works by creating a Process object and then calling its start() method, as shown below.
    from multiprocessing import Process

    def greeting():
        print('hello world')

    if __name__ == '__main__':
        p = Process(target=greeting)  # function to run in the child
        p.start()                     # launch the child process
        p.join()                      # wait for it to finish
In the example code above, we first import the Process class and then instantiate a Process object with the greeting function as its target.
We then tell the process to begin using the start() method, and finally wait for it to finish with the join() method.
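To make the effect of start() and join() visible, the sketch below prints the process state before and after. It uses the is_alive() method and the exitcode attribute of Process, which this tutorial does not otherwise cover; treat it as an illustration rather than a required pattern.

```python
from multiprocessing import Process


def greeting():
    print('hello world')


if __name__ == '__main__':
    p = Process(target=greeting)
    # Before start(): no child process exists yet.
    print(p.is_alive(), p.exitcode)   # False None
    p.start()
    p.join()                          # blocks until the child exits
    # After join(): the child has finished; exit code 0 means success.
    print(p.is_alive(), p.exitcode)   # False 0
```

An exit code of None means the process has not terminated yet, which is why it is only meaningful to read it after join().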
You can also pass arguments to the function through the args keyword argument, like so:
    from multiprocessing import Process

    def greeting(name):
        print('hello ' + name)

    if __name__ == '__main__':
        # args must be a tuple, hence the trailing comma
        p = Process(target=greeting, args=('world',))
        p.start()
        p.join()
Example
Let's look at a more detailed example that covers all the concepts we have discussed above.
In this example, we are going to create a process that calculates the square of numbers and prints the results to the console.
    from multiprocessing import Process

    def square(numbers):
        for x in numbers:
            print('%s squared is %s' % (x, x**2))

    if __name__ == '__main__':
        numbers = [43, 50, 5, 98, 34, 35]
        # pass the list itself to the child process
        p = Process(target=square, args=(numbers,))
        p.start()
        p.join()  # wait for the child before printing "Done"
        print("Done")

    # result
    43 squared is 1849
    50 squared is 2500
    5 squared is 25
    98 squared is 9604
    34 squared is 1156
    35 squared is 1225
    Done
You can also create more than one process at the same time, as shown in the example below, in which process p1 gets the results of numbers squared, while the second process p2 checks if the given numbers are even.
    from multiprocessing import Process

    def square(numbers):
        for x in numbers:
            print('%s squared is %s' % (x, x**2))

    def is_even(numbers):
        for x in numbers:
            if x % 2 == 0:
                print('%s is an even number' % x)

    if __name__ == '__main__':
        numbers = [43, 50, 5, 98, 34, 35]
        p1 = Process(target=square, args=(numbers,))
        p2 = Process(target=is_even, args=(numbers,))
        # start both before joining so they run concurrently
        p1.start()
        p2.start()
        p1.join()
        p2.join()
        print("Done")

    # result (lines from the two processes may interleave)
    43 squared is 1849
    50 squared is 2500
    5 squared is 25
    98 squared is 9604
    34 squared is 1156
    35 squared is 1225
    50 is an even number
    98 is an even number
    34 is an even number
    Done
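When there are more than two processes, it is usually easier to keep them in a list. The sketch below is one way to do that: it starts one process per number, and only joins after all of them have been started. The helper names describe and report are made up for this illustration.

```python
from multiprocessing import Process


def describe(n):
    """Build the message for one number (hypothetical helper)."""
    parity = 'even' if n % 2 == 0 else 'odd'
    return '%s squared is %s (%s)' % (n, n ** 2, parity)


def report(n):
    # Runs in the child process.
    print(describe(n))


if __name__ == '__main__':
    numbers = [43, 50, 5]
    workers = [Process(target=report, args=(n,)) for n in numbers]
    for w in workers:
        w.start()   # start every process first...
    for w in workers:
        w.join()    # ...then wait for all of them
    print('Done')
```

Starting every process before joining any of them matters: joining inside the first loop would run the processes one after another instead of side by side. The order of the printed lines is not guaranteed.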
Communication Between Processes
Multiprocessing supports two types of communication channels between processes:
- Pipes
- Queues
Queues
Queue objects are used to pass data between processes. They can store any picklable Python object, and you can use them as shown in the example below:
    import multiprocessing

    def is_even(numbers, q):
        for n in numbers:
            if n % 2 == 0:
                q.put(n)  # add each even number to the queue

    if __name__ == "__main__":
        q = multiprocessing.Queue()
        p = multiprocessing.Process(target=is_even, args=(range(20), q))
        p.start()
        p.join()
        # a Queue object is always truthy, so test empty() instead
        while not q.empty():
            print(q.get())
In the above example, we first create a function that checks whether each number is even and puts the even ones at the end of the queue. We then instantiate a queue object and a process object and begin the process.
Finally, once the process has finished, we loop while the queue is not empty, taking values from the front of the queue and printing them to the console.
We have shown how to share data between two processes using a queue, and the result is as shown below.
    # result
    0
    2
    4
    6
    8
    10
    12
    14
    16
    18
It's also important to note that Python has a separate queue module in the standard library (named Queue in Python 2), which is used to share data between the threads of a single process. A multiprocessing Queue, by contrast, pickles each object and sends it through a pipe, so it can be used to share data between processes.
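The queue example above works because only a handful of small integers are queued. With larger amounts of data it is generally safer to read from the queue before joining the producer, and to mark the end of the data explicitly. The sketch below shows one common way to do that with a sentinel value; SENTINEL, put_evens, and drain are names invented for this illustration.

```python
import multiprocessing

SENTINEL = None  # hypothetical end-of-data marker


def put_evens(numbers, q):
    for n in numbers:
        if n % 2 == 0:
            q.put(n)
    q.put(SENTINEL)  # tell the consumer nothing more is coming


def drain(q):
    """Collect items from the queue until the sentinel arrives."""
    return list(iter(q.get, SENTINEL))


if __name__ == '__main__':
    q = multiprocessing.Queue()
    p = multiprocessing.Process(target=put_evens, args=(range(20), q))
    p.start()
    evens = drain(q)  # read first...
    p.join()          # ...then join
    print(evens)      # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

Because get() blocks until an item is available, the consumer does not need to poll empty(); it simply stops when it sees the sentinel.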
Pipes
Pipes in multiprocessing are primarily used for communication between processes. Usage is as simple as:
    from multiprocessing import Process, Pipe

    def f(conn):
        conn.send(['hello world'])  # send a picklable object to the parent
        conn.close()

    if __name__ == '__main__':
        parent_conn, child_conn = Pipe()
        p = Process(target=f, args=(child_conn,))
        p.start()
        print(parent_conn.recv())   # blocks until the child sends
        p.join()
Pipe() returns two connection objects which represent the two ends of the pipe. Each connection object has send() and recv() methods. Here the child process sends a list containing the string hello world through its end of the pipe, and the parent process receives it and prints it.
Result
    # result
    ['hello world']
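By default, Pipe() is duplex, so both ends can send and receive. The sketch below uses that for a small request and reply exchange: the parent sends numbers, and the child answers with their squares. The function name squarer and the use of None as a stop signal are choices made for this illustration.

```python
from multiprocessing import Process, Pipe


def squarer(conn):
    """Reply to each number with its square; stop when None arrives."""
    for n in iter(conn.recv, None):
        conn.send(n * n)
    conn.close()


if __name__ == '__main__':
    parent_conn, child_conn = Pipe()   # duplex by default
    p = Process(target=squarer, args=(child_conn,))
    p.start()
    for n in (3, 4):
        parent_conn.send(n)
        print(parent_conn.recv())      # 9, then 16
    parent_conn.send(None)             # hypothetical stop signal
    p.join()
```

A pipe connects exactly two endpoints, so it suits this kind of one-to-one conversation; a queue is usually the better fit when several processes produce or consume.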
Locks
A Lock ensures that only one process at a time executes the code it guards. A process calls acquire() before entering that code; any other process that tries to acquire the same lock blocks until the holder calls release().
The example below shows a straightforward use of the Lock class.
    from multiprocessing import Process, Lock

    def greeting(lock, name):
        lock.acquire()
        try:
            print('hello', name)
        finally:
            lock.release()  # release even if print raises

    if __name__ == '__main__':
        lock = Lock()
        names = ['Alex', 'sam', 'Bernard', 'Patrick', 'Jude', 'Williams']
        for name in names:
            Process(target=greeting, args=(lock, name)).start()

    # result (order may vary)
    hello Alex
    hello sam
    hello Bernard
    hello Patrick
    hello Jude
    hello Williams
In this code, we first import the Lock class and create one lock in the parent. Each process then acquires the lock, executes the print function, and releases it.
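A lock matters most when processes update shared state. The sketch below is an assumption-laden illustration: it uses multiprocessing.Value, which this tutorial does not cover, purely to have something worth protecting, and it uses the lock as a context manager instead of calling acquire() and release() by hand. The function name add_many is hypothetical.

```python
from multiprocessing import Process, Lock, Value


def add_many(counter, lock, times):
    for _ in range(times):
        with lock:               # acquire() on entry, release() on exit
            counter.value += 1   # read-modify-write, not atomic by itself


if __name__ == '__main__':
    lock = Lock()
    counter = Value('i', 0)      # shared C int, starts at 0
    workers = [Process(target=add_many, args=(counter, lock, 1000))
               for _ in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print(counter.value)         # 4000
```

Without the lock, two processes could read the same old value and both write back the same new one, so some increments would be lost.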
Logging
The multiprocessing module also provides support for logging. Note, however, that the logging package does not use process-shared locks, so messages from different processes might end up mixed together.
Usage of logging is as simple as:
    import logging
    import multiprocessing

    logger = multiprocessing.log_to_stderr()
    logger.setLevel(logging.INFO)
    logger.warning('Error has occurred')
Here we first import the logging and multiprocessing modules. We then call multiprocessing.log_to_stderr(), which performs a call to get_logger() and adds a handler that sends output to sys.stderr. Finally, we set the logger level and log the message we want to convey.
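The logger is most useful when child processes write to it as well. In the sketch below, the parent configures the logger and a child fetches the same logger with multiprocessing.get_logger(). The worker function and the process name squarer are hypothetical.

```python
import logging
import multiprocessing


def worker(n):
    # get_logger() returns the logger used by the multiprocessing package.
    logger = multiprocessing.get_logger()
    logger.info('squaring %s gives %s', n, n * n)


if __name__ == '__main__':
    logger = multiprocessing.log_to_stderr()
    logger.setLevel(logging.INFO)
    p = multiprocessing.Process(target=worker, args=(7,), name='squarer')
    p.start()
    p.join()
```

At the INFO level, multiprocessing also logs some of its own internal events, so expect a few extra lines alongside the worker's message. Each line carries the log level and the process name, which helps tell the processes apart.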
Conclusion
This tutorial has covered what is necessary to get started with multiprocessing in Python. Multiprocessing sidesteps the GIL (Global Interpreter Lock) because it uses separate processes instead of threads, and each process has its own interpreter and its own GIL.
There is much more in the Python documentation that isn’t covered in this tutorial, so feel free to visit the Python multiprocessing docs and utilize the full power of this module.