Python multiprocessing

The multiprocessing module allows the programmer to fully leverage multiple processors on a given machine. The API used is similar to the classic threading module. It offers both local and remote concurrency.

The multiprocessing module avoids the limitations of the Global Interpreter Lock (GIL) by using subprocesses instead of threads. Multiprocessed code does not execute in the same order as serial code; there is no guarantee that the first process to be created will be the first to complete.

Python GIL

A global interpreter lock (GIL) is a mechanism used in the Python interpreter to synchronize the execution of threads so that only one native thread can execute at a time, even when run on a multi-core processor.

C extensions, such as numpy, can manually release the GIL to speed up computations. The GIL is also released before potentially blocking I/O operations.
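As a rough illustration of the GIL's effect, the following sketch (not from the original text; the function name and loop size are arbitrary) times a pure-Python CPU-bound loop run twice sequentially and then in two threads. On CPython, the threaded run shows little or no speedup because only one thread holds the GIL at a time.

```python
import threading
import time

def countdown(n):
    # pure-Python CPU-bound loop; the GIL is held while it runs
    while n > 0:
        n -= 1

N = 5_000_000

# run the work twice, one call after the other
start = time.perf_counter()
countdown(N)
countdown(N)
serial = time.perf_counter() - start

# run the same work in two threads
start = time.perf_counter()
t1 = threading.Thread(target=countdown, args=(N,))
t2 = threading.Thread(target=countdown, args=(N,))
t1.start()
t2.start()
t1.join()
t2.join()
threaded = time.perf_counter() - start

print(f'serial: {serial:.2f} s, threaded: {threaded:.2f} s')
```

On a GIL-free implementation, or when the heavy lifting happens inside a C extension that releases the GIL, the threaded version could approach a twofold speedup instead.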

Note that both Jython and IronPython do not have the GIL.

Concurrency means that two or more calculations happen within the same time frame. Parallelism means that two or more calculations happen at the same moment. Parallelism is therefore a specific case of concurrency. It requires multiple CPU units or cores.

True parallelism in Python is achieved by creating multiple processes, each having a Python interpreter with its own separate GIL.

Python has three modules for concurrency: multiprocessing, threading, and asyncio. When the tasks are CPU intensive, we should consider the multiprocessing module. When the tasks are I/O bound and require lots of connections, the asyncio module is recommended. For other types of tasks and when libraries cannot cooperate with asyncio, the threading module can be considered.

Embarrassingly parallel

The term embarrassingly parallel describes a problem or workload that can be easily run in parallel. It is important to realize that not all workloads can be divided into subtasks and run in parallel; for instance, those that need lots of communication among their subtasks.

Examples of embarrassingly parallel computations include:

  • Monte Carlo analysis
  • numerical integration
  • rendering of computer graphics
  • brute force searches in cryptography
  • genetic algorithms

Another situation where parallel computations can be applied is when we run several different computations, that is, we don't divide a problem into subtasks. For instance, we could run calculations of π using different algorithms in parallel.

Both processes and threads are independent sequences of execution. The following table summarizes the differences between a process and a thread:

| Process | Thread |
| --- | --- |
| processes run in separate memory (process isolation) | threads share memory |
| uses more memory | uses less memory |
| children can become zombies | no zombies possible |
| more overhead | less overhead |
| slower to create and destroy | faster to create and destroy |
| easier to code and debug | can become harder to code and debug |

Process

The Process object represents an activity that is run in a separate process. The Process class has equivalents of all the methods of threading.Thread. The Process constructor should always be called with keyword arguments.

The target argument of the constructor is the callable object to be invoked by the run method. The name argument is the process name. The start method starts the process's activity. The join method blocks until the process whose join method is called terminates; if the timeout option is provided, it blocks at most timeout seconds. The is_alive method returns a boolean value indicating whether the process is alive. The terminate method terminates the process.
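The following minimal sketch (the worker function and the timings are illustrative choices, not an example from the text) shows join with a timeout, is_alive, and terminate working together:

```python
#!/usr/bin/python

from multiprocessing import Process
import time

def fun():

    # simulate a long-running task
    time.sleep(10)

def main():

    p = Process(target=fun)
    p.start()

    # wait at most one second for the process to finish
    p.join(timeout=1)
    print(f'alive after timeout: {p.is_alive()}')

    # forcibly terminate the still-running process
    p.terminate()
    p.join()
    print(f'alive after terminate: {p.is_alive()}')


if __name__ == '__main__':
    main()
```

Note that terminate kills the child abruptly; any exit handlers or finally clauses in the child are not run, so it should be reserved for workers that hold no important state.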


The Python multiprocessing style guide recommends placing the multiprocessing code inside the if __name__ == '__main__' idiom. This is due to the way processes are created on Windows. The guard prevents an endless loop of process creation.

Simple process example

The following is a simple program that uses multiprocessing.

#!/usr/bin/python

from multiprocessing import Process


def fun(name):
    print(f'hello {name}')

def main():

    p = Process(target=fun, args=('Peter',))
    p.start()


if __name__ == '__main__':
    main()

We create a new process and pass a value to it.

def fun(name):
    print(f'hello {name}')

The function prints the passed parameter.

def main():

    p = Process(target=fun, args=('Peter',))
    p.start()

A new process is created. The target option provides the callable that is run in the new process and args provides the data to be passed to it. The multiprocessing code is placed inside the main guard. The process is started with the start method.

if __name__ == '__main__':
    main()

The code is placed inside the if __name__ == '__main__' idiom.

Python multiprocessing join

The join method blocks the execution of the main process until the process whose join method is called terminates. Without the join method, the main process won't wait until the child process terminates.

#!/usr/bin/python

from multiprocessing import Process
import time

def fun():

    print('starting fun')
    time.sleep(2)
    print('finishing fun')

def main():

    p = Process(target=fun)
    p.start()
    p.join()


if __name__ == '__main__':

    print('starting main')
    main()
    print('finishing main')

The example calls the join method on the newly created process.

$ ./joining.py
starting main
starting fun
finishing fun
finishing main

The finishing main message is printed after the child process has finished.

$ ./joining.py
starting main
finishing main
starting fun
finishing fun

When we comment out the join method, the main process finishes before the child process.

It is important to call the join methods after the start methods.

#!/usr/bin/python

from multiprocessing import Process
import time

def fun(val):

    print(f'starting fun with {val} s')
    time.sleep(val)
    print(f'finishing fun with {val} s')


def main():

    p1 = Process(target=fun, args=(3, ))
    p1.start()
    # p1.join()

    p2 = Process(target=fun, args=(2, ))
    p2.start()
    # p2.join()

    p3 = Process(target=fun, args=(1, ))
    p3.start()
    # p3.join()

    p1.join()
    p2.join()
    p3.join()

    print('finished main')

if __name__ == '__main__':

    main()

If we call the join methods incorrectly, then we in fact run the processes sequentially. (The incorrect way is commented out.)

The is_alive method determines if the process is running.

#!/usr/bin/python

from multiprocessing import Process
import time

def fun():

    print('calling fun')
    time.sleep(2)

def main():

    print('main fun')

    p = Process(target=fun)
    p.start()
    p.join()

    print(f'Process p is alive: {p.is_alive()}')


if __name__ == '__main__':
    main()

When we wait for the child process to finish with the join method, the process is already dead by the time we check it. If we comment out the join, the process is still alive.

The os.getpid function returns the current process id, while os.getppid returns the parent's process id.

#!/usr/bin/python

from multiprocessing import Process
import os

def fun():

    print('--------------------------')

    print('calling fun')
    print('parent process id:', os.getppid())
    print('process id:', os.getpid())

def main():

    print('main fun')
    print('process id:', os.getpid())

    p1 = Process(target=fun)
    p1.start()
    p1.join()

    p2 = Process(target=fun)
    p2.start()
    p2.join()


if __name__ == '__main__':
    main()

The example runs two child processes. It prints their Id and their parent's Id.


The parent id is the same; the process ids are different for each child process.

With the name property of the Process, we can give the worker a specific name. Otherwise, the module creates its own name.

#!/usr/bin/python

from multiprocessing import Process, current_process
import time

def worker():

    # current_process gives access to the running process object
    name = current_process().name
    print(f'{name} starting')
    time.sleep(2)
    print(f'{name} finishing')

def main():

    p1 = Process(name='worker 1', target=worker)
    p2 = Process(name='worker 2', target=worker)
    p3 = Process(target=worker)  # gets a default name

    p1.start()
    p2.start()
    p3.start()

    p1.join()
    p2.join()
    p3.join()


if __name__ == '__main__':
    main()

In the example, we create three processes; two of them are given a custom name.


Subclassing Process

When we subclass the Process class, we override the run method.

#!/usr/bin/python

from multiprocessing import Process
import time

class Worker(Process):

    # the run method contains the worker's code
    def run(self):

        print(f'In {self.name}')
        time.sleep(2)

def main():

    worker = Worker()
    worker.start()
    worker.join()


if __name__ == '__main__':
    main()

We create a Worker class which inherits from the Process. In the run method, we write the worker's code.

The management of the worker processes can be simplified with the Pool object. It controls a pool of worker processes to which jobs can be submitted. The pool's map method chops the given iterable into a number of chunks which it submits to the process pool as separate tasks. The pool's map is a parallel equivalent of the built-in map function. The map blocks the main execution until all computations finish.

The Pool can take the number of processes as a parameter. It is a value with which we can experiment. If we do not provide any value, then the number returned by os.cpu_count is used.

#!/usr/bin/python

import time
from timeit import default_timer as timer
from multiprocessing import Pool, cpu_count

def square(n):

    time.sleep(2)
    return n * n

def main():

    start = timer()

    print(f'starting computations on {cpu_count()} cores')

    values = (2, 4, 6, 8)

    with Pool() as pool:
        res = pool.map(square, values)
        print(res)

    end = timer()
    print(f'elapsed time: {end - start}')


if __name__ == '__main__':
    main()

In the example, we create a pool of processes and apply values on the square function. The number of cores is determined with the cpu_count function.


On a computer with four cores it took slightly more than 2 seconds to finish four computations, each lasting two seconds.


When we add an additional value to be computed, the time increases to over four seconds.

Multiple arguments

To pass multiple arguments to a worker function, we can use the starmap method. The elements of the iterable are expected to be iterables that are unpacked as arguments.

#!/usr/bin/python

from multiprocessing import Pool

def power(x, n):

    return x ** n

def main():

    with Pool() as pool:

        # each tuple is unpacked into the arguments of power
        values = ((2, 2), (3, 2), (4, 2), (5, 3))
        res = pool.starmap(power, values)
        print(res)


if __name__ == '__main__':
    main()

In this example, we pass two values to the power function: the value and the exponent.


Multiple functions

The following example shows how to run multiple functions in a pool.

#!/usr/bin/python

from multiprocessing import Pool
import functools

def inc(x):
    return x + 1

def dec(x):
    return x - 1

def add(x, y):
    return x + y

def main():

    # bind the arguments to the functions ahead of time
    f_inc = functools.partial(inc, 4)
    f_dec = functools.partial(dec, 2)
    f_add = functools.partial(add, 9, 8)

    with Pool() as pool:

        res_list = [pool.apply_async(f) for f in (f_inc, f_dec, f_add)]
        results = [res.get() for res in res_list]
        print(results)


if __name__ == '__main__':
    main()

We have three functions, which are run independently in a pool. We use the functools.partial to prepare the functions and their parameters before they are executed.


Python multiprocessing π calculation

π is the ratio of the circumference of any circle to the diameter of the circle. It is an irrational number whose decimal form neither ends nor becomes repetitive. It is approximately equal to 3.14159. There are several formulas to calculate π.

Calculating approximations of π can take a long time, so we can leverage parallel computations. We use the Bailey–Borwein–Plouffe formula to calculate π.
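For reference, the Bailey–Borwein–Plouffe series is:

```latex
\pi = \sum_{k=0}^{\infty} \frac{1}{16^k}
      \left( \frac{4}{8k+1} - \frac{2}{8k+4}
           - \frac{1}{8k+5} - \frac{1}{8k+6} \right)
```

Each term contributes roughly 1.2 decimal digits, so the number of iterations can be tied to the desired precision.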

#!/usr/bin/python

from decimal import Decimal, getcontext
from timeit import default_timer as timer

def calc_pi(precision):

    # Bailey-Borwein-Plouffe series with the decimal module
    getcontext().prec = precision

    pi = Decimal(0)

    for k in range(precision):
        pi += (Decimal(1) / 16 ** k) * (
            Decimal(4) / (8 * k + 1) - Decimal(2) / (8 * k + 4)
            - Decimal(1) / (8 * k + 5) - Decimal(1) / (8 * k + 6))

    return pi

def main():

    start = timer()

    values = (1000, 1500, 2000)

    for value in values:
        print(calc_pi(value))

    end = timer()
    print(f'sequentially: {end - start}')


if __name__ == '__main__':
    main()

First, we calculate three approximations sequentially. The precision is the number of digits of the computed π.


On our machine, it took 0.57381 seconds to compute the three approximations.

In the following example, we use a pool of processes to calculate the three approximations.

#!/usr/bin/python

from decimal import Decimal, getcontext
from timeit import default_timer as timer
from multiprocessing import Pool

def calc_pi(precision):

    # Bailey-Borwein-Plouffe series with the decimal module
    getcontext().prec = precision

    pi = Decimal(0)

    for k in range(precision):
        pi += (Decimal(1) / 16 ** k) * (
            Decimal(4) / (8 * k + 1) - Decimal(2) / (8 * k + 4)
            - Decimal(1) / (8 * k + 5) - Decimal(1) / (8 * k + 6))

    return pi

def main():

    start = timer()

    values = (1000, 1500, 2000)

    with Pool(3) as pool:

        for result in pool.map(calc_pi, values):
            print(result)

    end = timer()
    print(f'in parallel: {end - start}')


if __name__ == '__main__':
    main()

We run the calculations in a pool of three processes and we gain some small increase in efficiency.


When we run the calculations in parallel, it took 0.38216479 seconds.

In multiprocessing, each worker has its own memory. The memory is not shared like in threading.

#!/usr/bin/python

from multiprocessing import Process

data = [1, 2]

def worker(data):

    # the child process modifies its own copy of the list
    data.extend((3, 4, 5))
    print('worker data:', data)

def main():

    p = Process(target=worker, args=(data,))
    p.start()
    p.join()

    print('main data:', data)


if __name__ == '__main__':
    main()

We create a worker to which we pass the global data list. We add additional values to the list in the worker, but the original list in the main process is not modified.


As we can see from the output, the two lists are separate.


Sharing state between processes

Data can be stored in a shared memory using Value or Array.

Note: It is best to avoid sharing data between processes. Message passing is preferred.

#!/usr/bin/python

from multiprocessing import Process, Value
import time

def counting(counter):

    for i in range(30):

        time.sleep(0.01)

        # the lock must be held while updating the shared value
        with counter.get_lock():
            counter.value += 1

def main():

    counter = Value('i', 0)

    processes = [Process(target=counting, args=(counter,)) for _ in range(4)]

    for p in processes:
        p.start()

    for p in processes:
        p.join()

    print(counter.value)


if __name__ == '__main__':
    main()

The example creates a counter object which is shared among processes. Each of the processes increases the counter.

        with counter.get_lock():
            counter.value += 1

Each process must acquire a lock for itself.

The message passing is the preferred way of communication among processes. Message passing avoids having to use synchronization primitives such as locks, which are difficult to use and error prone in complex situations.

To pass messages, we can utilize the pipe for the connection between two processes. The queue allows multiple producers and consumers.
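A minimal sketch of the pipe-based approach (the worker function and the messages are illustrative): Pipe returns two connection objects, and each end can both send and receive.

```python
#!/usr/bin/python

from multiprocessing import Process, Pipe

def worker(conn):

    # receive a message from the parent and send a reply back
    msg = conn.recv()
    conn.send(f'echo: {msg}')
    conn.close()

def main():

    parent_conn, child_conn = Pipe()

    p = Process(target=worker, args=(child_conn,))
    p.start()

    parent_conn.send('hello')
    print(parent_conn.recv())

    p.join()


if __name__ == '__main__':
    main()
```

A pipe connects exactly two endpoints; when more producers or consumers are involved, the queue shown below is the safer choice.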

#!/usr/bin/python

from multiprocessing import Process, Queue
import random

def rand_val(queue):

    num = random.random()
    queue.put(num)

def main():

    queue = Queue()

    processes = [Process(target=rand_val, args=(queue,)) for _ in range(4)]

    for p in processes:
        p.start()

    for p in processes:
        p.join()

    results = [queue.get() for _ in processes]
    print(results)


if __name__ == '__main__':
    main()

In the example, we create four processes. Each process generates a random value and puts it into the queue. After all processes finish, we get all values from the queue.

    processes = [Process(target=rand_val, args=(queue,)) for _ in range(4)]

The queue is passed as an argument to the process.

    results = [queue.get() for _ in processes]

The get method removes and returns the item from the queue.


The example generates a list of four random values.

In the following example, we put words in a queue. The created processes read the words from the queue.

#!/usr/bin/python

from multiprocessing import Queue, Process, current_process

def worker(queue):

    name = current_process().name
    print(f'{name} read: {queue.get()}')

def main():

    queue = Queue()
    words = ('sky', 'wood', 'forest', 'rock')

    for word in words:
        queue.put(word)

    processes = [Process(target=worker, args=(queue,)) for _ in range(4)]

    for p in processes:
        p.start()

    for p in processes:
        p.join()


if __name__ == '__main__':
    main()

Four processes are created; each of them reads a word from the queue and prints it.


Queue order

In multiprocessing, there is no guarantee that the processes finish in a certain order.

#!/usr/bin/python

from multiprocessing import Process, Queue
import random
import time

def square(idx, x, queue):

    # a random delay makes the completion order unpredictable
    time.sleep(random.randint(1, 3))
    queue.put((idx, x * x))

def main():

    queue = Queue()
    values = (2, 4, 6, 3, 5, 8, 9, 7)

    processes = [Process(target=square, args=(i, v, queue))
                 for i, v in enumerate(values)]

    for p in processes:
        p.start()

    for p in processes:
        p.join()

    unsorted_result = [queue.get() for _ in processes]

    result = [val[1] for val in sorted(unsorted_result)]
    print(result)


if __name__ == '__main__':
    main()

We have processes that calculate the square of a value. The input data is in certain order and we need to maintain this order. To deal with this, we keep an extra index for each input value.

def square(idx, x, queue):

    time.sleep(random.randint(1, 3))
    queue.put((idx, x * x))

To illustrate variation, we randomly slow down the calculation with the sleep method. We place an index into the queue together with the calculated square.

    unsorted_result = [queue.get() for _ in processes]

We get the results. At this moment, the tuples are in random order.

    result = [val[1] for val in sorted(unsorted_result)]

We sort the result data by their index values.


We get the square values that correspond to the initial data.

Monte Carlo methods are a broad class of computational algorithms that rely on repeated random sampling to obtain numerical results. The underlying concept is to use randomness to solve problems that might be deterministic in principle.

The following formula is used to calculate the approximation of π:

π/4 ≈ M/N

The M is the number of generated points that fall inside the circle and N is the total number of generated points.

While this method of π calculation is interesting and perfect for school examples, it is not very accurate. There are far better algorithms to get π.

#!/usr/bin/python

import random
from timeit import default_timer as timer

def pi(n):

    count = 0

    for i in range(n):

        # a random point in the unit square
        x, y = random.random(), random.random()

        # count the points that fall inside the quarter circle
        if x * x + y * y <= 1:
            count += 1

    return 4 * count / n

def main():

    start = timer()

    print(pi(100_000_000))

    end = timer()
    print(f'elapsed time: {end - start}')


if __name__ == '__main__':
    main()

In the example, we calculate the approximation of the π value using one hundred million generated random points.


It took 44.78 seconds to calculate the approximation of π.


Now we divide the whole task of π computation into subtasks.

#!/usr/bin/python

import random
from timeit import default_timer as timer
from multiprocessing import Pool, cpu_count

def count_inside(n):

    count = 0

    for i in range(n):

        x, y = random.random(), random.random()

        if x * x + y * y <= 1:
            count += 1

    return count

def main():

    start = timer()

    np = cpu_count()
    print(f'You have {np} cores')

    n = 100_000_000

    # divide the sampling evenly among the cores
    part_count = [n // np for _ in range(np)]

    with Pool(np) as pool:
        count = pool.map(count_inside, part_count)

    pi_est = 4 * sum(count) / n
    print(pi_est)

    end = timer()
    print(f'elapsed time: {end - start}')


if __name__ == '__main__':
    main()

In the example, we find out the number of cores and divide the random sampling into subtasks. Each task will compute the random values independently.

    part_count = [n // np for _ in range(np)]

Instead of calculating 100_000_000 in one go, each subtask will calculate a portion of it.

    with Pool(np) as pool:
        count = pool.map(count_inside, part_count)

    pi_est = 4 * sum(count) / n

The partial calculations are passed to the count variable and their sum is then used in the final formula.