PyMOTW: os (Part 4)
Module: os (Part 4)
Description:
This week I am wrapping up coverage of the os module (saving os.path for a future post of its own) by discussing functions useful for working with multiple processes. I covered the use of pipes in part 2, so this week we will look at system(), fork(), exec*(), and related functions.
Disclaimer
Many of these functions have limited portability. For a more consistent way to work with processes in a platform independent manner, see the subprocess module instead.
Running External Commands
The simplest way to run a separate command, without interacting with it at all, is os.system(). It takes a single string which is the command line to be executed by a sub-process running a shell.
import os
# Simple command
os.system('ls -l')
$ python os_system_example.py
total 168
-rw-r--r-- 1 dhellman dhellman 0 May 27 06:58 __init__.py
-rw-r--r-- 1 dhellman dhellman 1391 Jun 10 09:36 os_access.py
-rw-r--r-- 1 dhellman dhellman 1383 May 27 09:23 os_cwd_example.py
-rw-r--r-- 1 dhellman dhellman 1535 Jun 10 09:36 os_directories.py
-rw-r--r-- 1 dhellman dhellman 1613 May 27 09:23 os_environ_example.py
-rw-r--r-- 1 dhellman dhellman 2816 Jun 3 08:34 os_popen_examples.py
-rw-r--r-- 1 dhellman dhellman 1438 May 27 09:23 os_process_id_example.py
-rw-r--r-- 1 dhellman dhellman 1887 May 27 09:23 os_process_user_example.py
-rw-r--r-- 1 dhellman dhellman 1545 Jun 10 09:36 os_stat.py
-rw-r--r-- 1 dhellman dhellman 1638 Jun 10 09:36 os_stat_chmod.py
-rw-r--r-- 1 dhellman dhellman 1452 Jun 10 09:36 os_symlinks.py
-rw-r--r-- 1 dhellman dhellman 1279 Jun 17 12:17 os_system_example.py
-rw-r--r-- 1 dhellman dhellman 1672 Jun 10 09:36 os_walk.py
Since the command is passed directly to the shell for processing, it can even include shell syntax such as globbing or environment variables:
# Command with shell expansion
os.system('ls -l $HOME')
total 40
-rwx------ 1 dhellman dhellman 1328 Dec 13 2005 %backup%~
drwx------ 11 dhellman dhellman 374 Jun 17 12:11 Desktop
drwxr-xr-x 15 dhellman dhellman 510 May 27 07:50 Devel
drwx------ 29 dhellman dhellman 986 May 31 17:01 Documents
drwxr-xr-x 45 dhellman dhellman 1530 Jun 17 12:12 DownloadedApps
drwx------ 55 dhellman dhellman 1870 May 22 14:53 Library
drwx------ 8 dhellman dhellman 272 Mar 4 2006 Movies
drwx------ 10 dhellman dhellman 340 Feb 14 10:54 Music
drwx------ 12 dhellman dhellman 408 Jun 17 01:00 Pictures
drwxr-xr-x 5 dhellman dhellman 170 Oct 1 2006 Public
drwxr-xr-x 15 dhellman dhellman 510 May 12 15:19 Sites
drwxr-xr-x 4 dhellman dhellman 136 Jan 23 2006 iPod
-rw-r--r-- 1 dhellman dhellman 105 Mar 7 11:48 pgadmin.log
drwxr-xr-x 3 dhellman dhellman 102 Apr 29 16:32 tmp
Unless you explicitly run the command in the background, the call to os.system() blocks until it is complete. Standard input, output, and error from the child process are tied to the appropriate streams owned by the caller by default, but can be redirected using shell syntax.
import os
import time
print 'Calling...'
os.system('date; (sleep 3; date) &')
print 'Sleeping...'
time.sleep(5)
This is getting into shell trickery, though, and there are better ways to accomplish the same thing.
$ python os_system_background.py
Calling...
Sun Jun 17 12:27:20 EDT 2007
Sleeping...
Sun Jun 17 12:27:23 EDT 2007
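One of those better ways is the subprocess module mentioned in the disclaimer. Here is a minimal sketch, assuming Python 3; the helper name run_in_background() is my own invention for illustration, and the current interpreter stands in for whatever long-running command you need to start. No shell is involved, so there is no quoting or "&" trickery to get wrong.

```python
import subprocess
import sys


def run_in_background(args):
    """Start a command without a shell and return immediately.

    `args` is a list such as ['sleep', '3']; no shell parsing takes place.
    (Hypothetical helper, not part of the original post.)
    """
    return subprocess.Popen(args)


if __name__ == '__main__':
    print('Calling...')
    # Use the current interpreter as a stand-in for any long-running command.
    child = run_in_background(
        [sys.executable, '-c', 'import time; time.sleep(1)'])
    print('Child started with pid %d' % child.pid)
    # wait() blocks until the child exits and returns its exit code.
    print('Exit code: %d' % child.wait())
```

The Popen object comes back right away, so the caller decides when (or whether) to block by calling wait().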
Creating Processes with os.fork()
The POSIX functions fork() and exec*() (available under Mac OS X, Linux, and other UNIX variants) are available through the os module. Entire books have been written about reliably using these functions, so check your library or bookstore for more details than I will present here.
To create a new process as a clone of the current process, use os.fork():
import os

pid = os.fork()
if pid:
    print 'Child process id:', pid
else:
    print 'I am the child'
Your output will vary based on the state of your system each time you run the example, but it should look something like:
$ python os_fork_example.py
Child process id: 5883
I am the child
After the fork, you end up with two processes running the same code. To tell which one you are in, check the return value. If it is 0, you are inside the child process. If it is not 0, you are in the parent process and the return value is the process id of the child process.
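To make the two return values concrete, here is a small sketch, assuming a POSIX system and Python 3. The function name run_in_child() is hypothetical. The child leaves right away with os._exit(), and the parent collects it with os.waitpid(), which is covered later in this post.

```python
import os


def run_in_child(exit_code):
    """Fork, have the child exit with exit_code, and report what the parent sees.

    (Hypothetical helper for illustration only.)
    """
    pid = os.fork()
    if pid == 0:
        # Child: fork() returned 0. Exit immediately, skipping cleanup
        # handlers that belong to the parent.
        os._exit(exit_code)
    # Parent: fork() returned the child's process id; wait for it to finish.
    waited_pid, status = os.waitpid(pid, 0)
    return pid, waited_pid, os.WEXITSTATUS(status)


if __name__ == '__main__':
    pid, waited_pid, code = run_in_child(7)
    print('forked %d, waited for %d, exit code %d' % (pid, waited_pid, code))
```

The child uses os._exit() rather than sys.exit() so that it never falls back into code that was meant to run only once, in the parent.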
From the parent process, it is possible to send the child signals. This is a bit more complicated to set up, and uses the signal module, so let's walk through the code. First we can define a signal handler to be invoked when the signal is received.
import os
import signal
import time
def signal_usr1(signum, frame):
    pid = os.getpid()
    print 'Received USR1 in process %s' % pid
Then we fork, and in the parent pause a short amount of time before sending a USR1 signal using os.kill(). The short pause gives the child process time to set up the signal handler.
print 'Forking...'
child_pid = os.fork()
if child_pid:
    print 'PARENT: Pausing before sending signal...'
    time.sleep(1)
    print 'PARENT: Signaling %s' % child_pid
    os.kill(child_pid, signal.SIGUSR1)
In the child, we set up the signal handler and go to sleep for a while to give the parent time to send us the signal:
else:
    print 'CHILD: Setting up signal handler'
    signal.signal(signal.SIGUSR1, signal_usr1)
    print 'CHILD: Pausing to wait for signal'
    time.sleep(5)
In a real app, you probably wouldn't need to (or want to) call sleep, of course.
$ python os_kill_example.py
Forking...
PARENT: Pausing before sending signal...
CHILD: Setting up signal handler
CHILD: Pausing to wait for signal
PARENT: Signaling 6053
Received USR1 in process 6053
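One way to drop the sleep calls is to swap the signal handler for a blocked signal plus signal.sigwait(). This is a different technique from the handler shown above, and only a sketch: it assumes Python 3.3 or later on a POSIX system, and the function name signal_child_without_sleep() is made up for the example. Because the signal is blocked before the fork, the child inherits the mask and the signal simply stays pending until the child asks for it, so it does not matter which process runs first.

```python
import os
import signal


def signal_child_without_sleep():
    """Fork a child, send it SIGUSR1, and return the child's exit code.

    The child exits with 0 if it received SIGUSR1 and 1 otherwise.
    (Hypothetical helper for illustration only.)
    """
    # Block SIGUSR1 before forking. The child inherits the signal mask, so a
    # signal sent early stays pending instead of being lost or killing it.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGUSR1})
    try:
        child_pid = os.fork()
        if child_pid == 0:
            exit_code = 1
            try:
                # sigwait() returns as soon as one of the listed blocked
                # signals is pending.
                received = signal.sigwait({signal.SIGUSR1})
                if received == signal.SIGUSR1:
                    exit_code = 0
            finally:
                os._exit(exit_code)
        # Parent: no pause is needed before signaling.
        os.kill(child_pid, signal.SIGUSR1)
        _, status = os.waitpid(child_pid, 0)
        return os.WEXITSTATUS(status)
    finally:
        # Restore the parent's original signal mask.
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == '__main__':
    print('Child exit code: %d' % signal_child_without_sleep())
```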
As you see, a simple way to handle separate behavior in the child process is to check the return value of fork() and branch. For more complex behavior, you may want more code separation than a simple branch. In other cases, you may have an existing program you have to wrap. For both of these situations, you can use the os.exec*() series of functions to run another program. When you "exec" a program, the code from that program replaces the code from your existing process.
import os

child_pid = os.fork()
if child_pid:
    os.waitpid(child_pid, 0)
else:
    os.execlp('ls', 'ls', '-l', '/tmp/')
$ python os_exec_example.py
total 40
drwxr-xr-x 2 dhellman wheel 68 Jun 17 14:35 527
prw------- 1 root wheel 0 Jun 15 19:24 afpserver_PIPE
drwx------ 3 dhellman wheel 102 Jun 17 12:13 emacs527
drwxr-xr-x 2 dhellman wheel 68 Jun 16 05:01 hsperfdata_dhellmann
-rw------- 1 nobody wheel 12 Jun 17 13:55 objc_sharing_ppc_4294967294
-rw------- 1 dhellman wheel 144 Jun 17 14:32 objc_sharing_ppc_527
-rw------- 1 security wheel 24 Jun 17 07:09 objc_sharing_ppc_92
drwxr-xr-x 4 dhellman dhellman 136 Jun 8 03:16 var_backups
There are many variations of exec*(), depending on what form you might have the arguments in, whether you want the path and environment of the parent process to be copied to the child, etc. Have a look at the library documentation for details.
For all variations, the first argument is a path or filename and the remaining arguments control how that program runs. They are either passed as command line arguments or override the process "environment" (see os.environ and os.getenv).
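As an illustration of a variant that overrides the environment, here is a sketch using os.execve(), assuming Python 3 on a POSIX system. The function name exec_with_env() and the variable DEMO_EXIT_CODE are invented for the example. The exec'd program is just another Python interpreter that reads the variable and uses it as its exit code, which gives the parent an easy way to confirm the environment arrived.

```python
import os
import sys


def exec_with_env(value):
    """Fork, exec a new interpreter with a modified environment, and return
    the exit code of that program. (Hypothetical helper for illustration.)
    """
    child_pid = os.fork()
    if child_pid == 0:
        # "v" variants take the arguments as a list; a trailing "e" adds an
        # explicit environment mapping for the new program.
        program = 'import os, sys; sys.exit(int(os.environ["DEMO_EXIT_CODE"]))'
        env = dict(os.environ, DEMO_EXIT_CODE=str(value))
        try:
            os.execve(sys.executable, [sys.executable, '-c', program], env)
        finally:
            # Only reached if exec fails; a successful exec never returns.
            os._exit(127)
    _, status = os.waitpid(child_pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == '__main__':
    print('Program exited with %d' % exec_with_env(5))
```

Note that the first element of the argument list is the program name the new process sees as argv[0], which is why the examples in this post repeat 'ls' twice.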
Waiting for a Child
Suppose you are using multiple processes to work around the threading limitations of Python and the Global Interpreter Lock. If you start several processes to run separate tasks, you will want to wait for one or more of them to finish before starting new ones, to avoid overloading the server. There are a few different ways to do that using wait() and related functions.
If you don't care, or don't know, which child process might exit first, os.wait() will return as soon as any one of them exits:
import os
import sys
import time
for i in range(3):
    print 'PARENT: Forking %s' % i
    worker_pid = os.fork()
    if not worker_pid:
        print 'WORKER %s: Starting' % i
        time.sleep(2 + i)
        print 'WORKER %s: Finishing' % i
        sys.exit(i)

for i in range(3):
    print 'PARENT: Waiting for %s' % i
    done = os.wait()
    print 'PARENT:', done
Notice that the return value from os.wait() is a tuple containing the process id and exit status ("a 16-bit number, whose low byte is the signal number that killed the process, and whose high byte is the exit status").
$ python os_wait_example.py
PARENT: Forking 0
PARENT: Forking 1
PARENT: Forking 2
PARENT: Waiting for 0
WORKER 0: Starting
WORKER 1: Starting
WORKER 2: Starting
WORKER 0: Finishing
PARENT: (6501, 0)
PARENT: Waiting for 1
WORKER 1: Finishing
PARENT: (6502, 256)
PARENT: Waiting for 2
WORKER 2: Finishing
PARENT: (6503, 512)
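The status values 0, 256, and 512 in the output above are the workers' exit codes 0, 1, and 2 shifted into the high byte. Rather than shifting bits by hand, you can let the os module decode them. A small sketch, assuming Python 3 on a POSIX system; the function name describe_status() is hypothetical:

```python
import os


def describe_status(status):
    """Translate a status value from os.wait() into a (kind, number) pair.

    (Hypothetical helper for illustration only.)
    """
    if os.WIFEXITED(status):
        # Normal exit: the number is the exit code from the high byte.
        return ('exited', os.WEXITSTATUS(status))
    if os.WIFSIGNALED(status):
        # Killed by a signal: the number is the signal from the low byte.
        return ('signaled', os.WTERMSIG(status))
    return ('other', status)


if __name__ == '__main__':
    for status in (0, 256, 512):
        print(status, describe_status(status))
```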
If you want a specific process, use os.waitpid().
import os
import sys
import time
workers = []
for i in range(3):
    print 'PARENT: Forking %s' % i
    worker_pid = os.fork()
    if not worker_pid:
        print 'WORKER %s: Starting' % i
        time.sleep(2 + i)
        print 'WORKER %s: Finishing' % i
        sys.exit(i)
    workers.append(worker_pid)

for pid in workers:
    print 'PARENT: Waiting for %s' % pid
    done = os.waitpid(pid, 0)
    print 'PARENT:', done
$ python os_waitpid_example.py
PARENT: Forking 0
WORKER 0: Starting
PARENT: Forking 1
WORKER 1: Starting
PARENT: Forking 2
WORKER 2: Starting
PARENT: Waiting for 6547
WORKER 0: Finishing
PARENT: (6547, 0)
PARENT: Waiting for 6548
WORKER 1: Finishing
PARENT: (6548, 256)
PARENT: Waiting for 6549
WORKER 2: Finishing
PARENT: (6549, 512)
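The second argument to os.waitpid() holds options. Passing os.WNOHANG instead of 0 makes the call return immediately, so the parent can check on a child without blocking. A sketch, assuming Python 3 on a POSIX system; poll_child() is a name I made up for the example:

```python
import os
import time


def poll_child(delay):
    """Fork a child that sleeps for `delay` seconds and poll it until it exits.

    Returns the number of polls that found the child still running, and the
    child's exit code. (Hypothetical helper for illustration only.)
    """
    pid = os.fork()
    if pid == 0:
        time.sleep(delay)
        os._exit(0)
    still_running = 0
    while True:
        # With WNOHANG, waitpid() returns (0, 0) if the child has not exited.
        done_pid, status = os.waitpid(pid, os.WNOHANG)
        if done_pid == pid:
            return still_running, os.WEXITSTATUS(status)
        still_running += 1
        time.sleep(0.05)  # the parent could do useful work here instead


if __name__ == '__main__':
    polls, code = poll_child(0.3)
    print('Polled %d times before exit code %d' % (polls, code))
```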
wait3() and wait4() work in a similar manner, but return more detailed information about the child process with the pid, exit status, and resource usage.
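For example, os.wait4() takes the same arguments as os.waitpid() and adds a resource usage structure to the result. A sketch, assuming Python 3 on a POSIX system; the function name child_resource_usage() and the busy loop are only there to give the child something to measure:

```python
import os


def child_resource_usage(iterations):
    """Fork a child that burns some CPU, then collect its status and rusage.

    (Hypothetical helper for illustration only.)
    """
    pid = os.fork()
    if pid == 0:
        total = 0
        for i in range(iterations):
            total += i * i
        os._exit(0)
    # wait4() takes a pid and options like waitpid(), and returns a third
    # value describing the resources the child used.
    done_pid, status, usage = os.wait4(pid, 0)
    return done_pid == pid, os.WEXITSTATUS(status), usage


if __name__ == '__main__':
    same, code, usage = child_resource_usage(100000)
    print('exit code %d' % code)
    print('user time %.4f, system time %.4f' % (usage.ru_utime, usage.ru_stime))
```

The fields of the usage value are the same ones described for the resource module's getrusage() function.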
Spawn
As a convenience, the os.spawn*() family of functions handles the fork() and exec*() calls for you in one statement:
import os

os.spawnlp(os.P_WAIT, 'ls', 'ls', '-l', '/tmp/')
$ python os_exec_example.py
total 40
drwxr-xr-x 2 dhellman wheel 68 Jun 17 14:35 527
prw------- 1 root wheel 0 Jun 15 19:24 afpserver_PIPE
drwx------ 3 dhellman wheel 102 Jun 17 12:13 emacs527
drwxr-xr-x 2 dhellman wheel 68 Jun 16 05:01 hsperfdata_dhellmann
-rw------- 1 nobody wheel 12 Jun 17 13:55 objc_sharing_ppc_4294967294
-rw------- 1 dhellman wheel 144 Jun 17 14:32 objc_sharing_ppc_527
-rw------- 1 security wheel 24 Jun 17 07:09 objc_sharing_ppc_92
drwxr-xr-x 4 dhellman dhellman 136 Jun 8 03:16 var_backups
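The first argument to the spawn*() functions is a mode. With os.P_WAIT the call blocks and returns the exit code; with os.P_NOWAIT it returns the new process id right away and leaves the waiting to you. A sketch comparing the two, assuming Python 3 on a POSIX system; spawn_and_wait() is a hypothetical name, and the current interpreter stands in for a real external program:

```python
import os
import sys


def spawn_and_wait(exit_code):
    """Run a short Python program with spawnv() in both modes.

    Returns the exit code seen through P_WAIT and through P_NOWAIT.
    (Hypothetical helper for illustration only.)
    """
    args = [sys.executable, '-c', 'import sys; sys.exit(%d)' % exit_code]
    # P_WAIT: blocks until the program ends and returns its exit code.
    waited = os.spawnv(os.P_WAIT, sys.executable, args)
    # P_NOWAIT: returns the new process id immediately; reap it ourselves.
    pid = os.spawnv(os.P_NOWAIT, sys.executable, args)
    _, status = os.waitpid(pid, 0)
    return waited, os.WEXITSTATUS(status)


if __name__ == '__main__':
    print(spawn_and_wait(3))
```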
Conclusion
There are a lot of other considerations to be taken into account when working with multiple processes, such as handling signals, closing duplicated file descriptors, etc. All of these topics are covered in reference books such as Advanced Programming in the UNIX(R) Environment.
Next week, I'll pick a module that won't take 4 weeks to write about. :-) Suggestions are welcome, as usual.
References:
Python Module of the Week
Sample Code
Delve into UNIX process creation
Advanced Programming in the UNIX(R) Environment
Updated 9/5/2007 with minor formatting changes.
2 comments:
I love this series! Keep up the good work. Can I request the subprocess module for next week?
If I don't cover subprocess this week, it will definitely be soon.
Thanks for the feedback!
Doug