Sunday, May 27, 2007

PyMOTW: os

Module: os
Purpose: Portable access to operating system specific features.
Python Version: 1.4 (or earlier)

Description:

The os module provides a wrapper for platform specific modules such as posix, nt, and mac. The API for functions available on all platforms should be the same, so using the os module offers some measure of portability. Not all functions are available on all platforms, however. Many of the process management functions described in this summary are not available for Windows.

The Python documentation for the os module is subtitled "Miscellaneous operating system interfaces". The module includes mostly functions for creating and managing running processes or filesystem content (files and directories), with a few other random bits of functionality thrown in besides. In this session, we'll cover the features for learning about and changing process parameters.

A quick warning: Some of the example code below will only work on Unix-like operating systems.

Process Owner

The first set of functions I'll cover are used for determining and changing the process owner ids. These are mostly useful to authors of daemons or special system programs which need to change permission level rather than running as root. I won't try to explain all of the intricate details of Unix security, process owners, etc. in this brief post. See the References list below for more details.

Let's start with a script to show the real and effective user and group information for a process, and then change the effective values. This is similar to what a daemon would need to do when it starts as root during a system boot, to lower the privilege level and run as a different user. If you download the examples to try them out, you should change the TEST_GID and TEST_UID values to match your user.

import os

TEST_GID=501
TEST_UID=527

def show_user_info():
    print 'Effective User :', os.geteuid()
    print 'Effective Group :', os.getegid()
    print 'Actual User :', os.getuid(), os.getlogin()
    print 'Actual Group :', os.getgid()
    print 'Actual Groups :', os.getgroups()
    return

print 'BEFORE CHANGE:'
show_user_info()
print

try:
    os.setegid(TEST_GID)
except OSError:
    print 'ERROR: Could not change effective group. Re-run as root.'
else:
    print 'CHANGED GROUP:'
    show_user_info()
    print

try:
    os.seteuid(TEST_UID)
except OSError:
    print 'ERROR: Could not change effective user. Re-run as root.'
else:
    print 'CHANGE USER:'
    show_user_info()
    print


When run as myself (527, 501) on OS X, I see this output:

$ python os_process_user_example.py
BEFORE CHANGE:
Effective User : 527
Effective Group : 501
Actual User : 527 dhellmann
Actual Group : 501
Actual Groups : [501, 81, 79, 80]

CHANGED GROUP:
Effective User : 527
Effective Group : 501
Actual User : 527 dhellmann
Actual Group : 501
Actual Groups : [501, 81, 79, 80]

CHANGE USER:
Effective User : 527
Effective Group : 501
Actual User : 527 dhellmann
Actual Group : 501
Actual Groups : [501, 81, 79, 80]


Notice that the values do not change. Since I am not running as root, processes I start cannot change their effective owner values. If I do try to set the effective user id or group id to anything other than my own, an OSError is raised.

Now let's look at what happens when we run the same script using sudo to start out with root privileges:

$ sudo python os_process_user_example.py
Password:
BEFORE CHANGE:
Effective User : 0
Effective Group : 0
Actual User : 0 dhellmann
Actual Group : 0
Actual Groups : [0, 262, 1, 2, 3, 31, 4, 29, 5, 80, 20]

CHANGED GROUP:
Effective User : 0
Effective Group : 501
Actual User : 0 dhellmann
Actual Group : 0
Actual Groups : [501, 262, 1, 2, 3, 31, 4, 29, 5, 80, 20]

CHANGE USER:
Effective User : 527
Effective Group : 501
Actual User : 0 dhellmann
Actual Group : 0
Actual Groups : [501, 262, 1, 2, 3, 31, 4, 29, 5, 80, 20]


In this case, since we start as root, we can change the effective user and group for the process. Once we change the effective UID, the process is limited to the permissions of that user. Since non-root users cannot change their effective group, we need to change the group first then the user.
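The ordering rule can be captured in a small helper. This is only a sketch, written for Python 3; the name drop_privileges is made up for illustration, and it uses the same os.setegid() and os.seteuid() calls as the script above. It only succeeds when the process starts with root privileges on a Unix-like system.

```python
import os


def drop_privileges(uid, gid):
    """Lower the effective group, then the effective user (Unix only).

    Hypothetical helper for illustration; raises OSError if the
    process is not allowed to make the change.
    """
    # The group must change first: once the effective user is no longer
    # root, the process is not allowed to change its effective group.
    os.setegid(gid)
    os.seteuid(uid)
    return os.geteuid(), os.getegid()
```

A daemon started as root might call drop_privileges(TEST_UID, TEST_GID) right after it finishes the work that needs root. Because only the effective ids change, a process that must give up root permanently would probably want os.setgid() and os.setuid() instead.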

Besides finding and changing the process owner, there are functions for determining the current and parent process id, finding and changing the process group and session ids, as well as finding the controlling terminal id. These can be useful for sending signals between processes or for complex applications such as writing your own command line shell.
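As a rough illustration of those id functions, here is a short Python 3 sketch. The describe_process() helper is a name invented for this example; the os calls themselves are real, but getpgrp() and getsid() exist only on Unix-like systems, so the sketch checks for them first.

```python
import os


def describe_process():
    """Collect the ids that identify this process (hypothetical helper)."""
    info = {
        'pid': os.getpid(),          # id of this process
        'parent pid': os.getppid(),  # id of the process that started it
    }
    # The process group and session ids are Unix-only.
    if hasattr(os, 'getpgrp'):
        info['process group'] = os.getpgrp()
    if hasattr(os, 'getsid'):
        info['session'] = os.getsid(0)  # 0 means "the calling process"
    return info


for name, value in sorted(describe_process().items()):
    print('%-14s: %s' % (name, value))
```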

Process Environment

Another feature of the operating system exposed to your program through the os module is the environment. Variables set in the environment are visible as strings which can be read through os.environ or os.getenv(). Environment variables are commonly used for configuration values such as search paths, file locations, and debug flags. Let's look at an example of retrieving an environment variable, and passing a value through to a child process.

import os

print 'Initial value:', os.environ.get('TESTVAR', None)
print 'Child process:'
os.system('echo $TESTVAR')

os.environ['TESTVAR'] = 'THIS VALUE WAS CHANGED'

print
print 'Changed value:', os.environ['TESTVAR']
print 'Child process:'
os.system('echo $TESTVAR')

del os.environ['TESTVAR']

print
print 'Removed value:', os.environ.get('TESTVAR', None)
print 'Child process:'
os.system('echo $TESTVAR')


The os.environ object follows the standard Python mapping API for retrieving and setting values. Changes to os.environ are exported for child processes.

$ python os_environ_example.py
Initial value: None
Child process:


Changed value: THIS VALUE WAS CHANGED
Child process:
THIS VALUE WAS CHANGED

Removed value: None
Child process:
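If you want a child process to see a value without changing the environment of your own process, one alternative is to hand the child a modified copy. The sketch below is written for Python 3 and swaps os.system() for the subprocess module; run_child_with() is a hypothetical helper, and the child is simply another Python interpreter that prints TESTVAR.

```python
import os
import subprocess
import sys


def run_child_with(extra_env):
    """Run a child that prints TESTVAR, using a modified copy of os.environ."""
    child_env = dict(os.environ)   # copy, so the parent is left untouched
    child_env.update(extra_env)
    code = "import os; print(os.environ.get('TESTVAR', ''))"
    result = subprocess.run(
        [sys.executable, '-c', code],
        env=child_env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


print('Child sees :', run_child_with({'TESTVAR': 'ONLY FOR THE CHILD'}))
print('Parent sees:', os.environ.get('TESTVAR'))
```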


Process Working Directory

A concept from operating systems with hierarchical filesystems is the notion of the "current working directory". This is the directory on the filesystem the process uses as the default location when files are accessed with relative paths.

import os

print 'Starting:', os.getcwd()
print os.listdir(os.curdir)

print 'Moving up one:', os.pardir
os.chdir(os.pardir)

print 'After move:', os.getcwd()
print os.listdir(os.curdir)


Note the use of os.curdir and os.pardir to refer to the current and parent directories in a portable manner. The output should not be surprising:

Starting: /Users/dhellmann/Documents/PyMOTW/PyMOTW/os
['.svn', '__init__.py', 'os_cwd_example.py', 'os_environ_example.py',
'os_process_id_example.py', 'os_process_user_example.py']
Moving up one: ..
After move: /Users/dhellmann/Documents/PyMOTW/PyMOTW
['.svn', '__init__.py', 'bisect', 'ConfigParser', 'fileinput', 'linecache',
'locale', 'logging', 'os', 'Queue', 'StringIO', 'textwrap']
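Since the working directory is shared by the whole process, it is easy to forget to change it back. One way to handle that, sketched here for Python 3, is a small context manager built on os.getcwd() and os.chdir(); the name working_directory is made up for this example.

```python
import contextlib
import os


@contextlib.contextmanager
def working_directory(path):
    """Change to path for the duration of a with block, then change back."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Restore the original directory even if the block raised.
        os.chdir(previous)


with working_directory(os.pardir):
    print('Inside :', os.getcwd())
print('Outside:', os.getcwd())
```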


To be continued...

Today I've covered the functions in the os module for finding and changing process parameters. Next time, I will continue with the portion of the os module dedicated to managing filesystem objects.

References:

Python Module of the Week
Example Source
Python Reference Manual, Process Parameters
Unix Manual Page introduction (definitions of real and effective ids, etc.)
Speaking UNIX, Part 8: UNIX processes
geteuid
getsid
setpgrp

Updated 9/5/2007 with minor formatting changes.



Sunday, May 20, 2007

PyMOTW: locale

Module: locale
Purpose: POSIX cultural localization API
Python Version: 1.5, with extensions through 2.5 (this discussion assumes 2.5)

Description:

The locale module is part of Python's internationalization and localization support library. It provides a standard way to handle operations that may depend on the language or location of your users. For example, formatting numbers as currency, comparing strings for sorting, and working with dates. It does not cover translation (see the gettext module) or Unicode encoding.

Changing the locale can have application-wide ramifications, so the recommended practice is to avoid changing the value in a library and to let the application set it one time. In the examples below, I will change the locale several times for illustration purposes. It is far more likely that your application will set the locale once at startup and not change it.
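If code really does need a different locale for a short time, it can at least put the old value back when it is done. The following Python 3 sketch shows one way to do that; temporary_locale is a hypothetical helper, and it relies on the fact that calling locale.setlocale() with only a category returns the current setting. The locale is still process-wide, so this is not safe to use from several threads at once.

```python
import contextlib
import locale


@contextlib.contextmanager
def temporary_locale(category, name):
    """Switch one locale category inside a with block, then restore it."""
    saved = locale.setlocale(category)   # no second argument: just query
    locale.setlocale(category, name)
    try:
        yield
    finally:
        locale.setlocale(category, saved)


# The 'C' locale is always available, so it makes a safe demonstration.
with temporary_locale(locale.LC_NUMERIC, 'C'):
    print('decimal point:', locale.localeconv()['decimal_point'])
```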

Example:

The most common way to let the user change the locale settings for an application is through an environment variable (LC_ALL, LC_CTYPE, LANG, or LANGUAGE, depending on your platform). The application then calls locale.setlocale() without a hard-coded value, and the environment value is used.

import locale
import os
import pprint

print 'Environment settings:'
for env_name in [ 'LC_ALL', 'LC_CTYPE', 'LANG', 'LANGUAGE' ]:
    print '\t%s = %s' % (env_name, os.environ.get(env_name, ''))

# What is the default locale?
print
print 'Default locale:', locale.getdefaultlocale()

# Default settings based on the user's environment.
locale.setlocale(locale.LC_ALL, '')

# Show the locale that was selected from the environment.
print 'From environment:', locale.getlocale()

pprint.pprint(locale.localeconv())


On my Mac, this produces output like:

$ python locale_env_example.py
Environment settings:
LC_ALL =
LC_CTYPE =
LANG =
LANGUAGE =

Default locale: (None, 'mac-roman')
From environment: (None, None)
{'currency_symbol': '',
'decimal_point': '.',
'frac_digits': 127,
'grouping': [127],
'int_curr_symbol': '',
'int_frac_digits': 127,
'mon_decimal_point': '',
'mon_grouping': [127],
'mon_thousands_sep': '',
'n_cs_precedes': 127,
'n_sep_by_space': 127,
'n_sign_posn': 127,
'negative_sign': '',
'p_cs_precedes': 127,
'p_sep_by_space': 127,
'p_sign_posn': 127,
'positive_sign': '',
'thousands_sep': ''}


Now if we run the same script with the LANG variable set, you can see that the locale and default encoding change accordingly:

France:

$ LANG=fr_FR python locale_env_example.py
Environment settings:
LC_ALL =
LC_CTYPE =
LANG = fr_FR
LANGUAGE =

Default locale: (None, 'mac-roman')
From environment: ('fr_FR', 'ISO8859-1')
{'currency_symbol': 'Eu',
'decimal_point': ',',
'frac_digits': 2,
'grouping': [127],
'int_curr_symbol': 'EUR ',
'int_frac_digits': 2,
'mon_decimal_point': ',',
'mon_grouping': [3, 3, 0],
'mon_thousands_sep': ' ',
'n_cs_precedes': 0,
'n_sep_by_space': 1,
'n_sign_posn': 2,
'negative_sign': '-',
'p_cs_precedes': 0,
'p_sep_by_space': 1,
'p_sign_posn': 1,
'positive_sign': '',
'thousands_sep': ''}


Spain:

$ LANG=es_ES python locale_env_example.py
Environment settings:
LC_ALL =
LC_CTYPE =
LANG = es_ES
LANGUAGE =

Default locale: (None, 'mac-roman')
From environment: ('es_ES', 'ISO8859-1')
{'currency_symbol': 'Eu',
'decimal_point': ',',
'frac_digits': 2,
'grouping': [127],
'int_curr_symbol': 'EUR ',
'int_frac_digits': 2,
'mon_decimal_point': ',',
'mon_grouping': [3, 3, 0],
'mon_thousands_sep': '.',
'n_cs_precedes': 1,
'n_sep_by_space': 1,
'n_sign_posn': 1,
'negative_sign': '-',
'p_cs_precedes': 1,
'p_sep_by_space': 1,
'p_sign_posn': 1,
'positive_sign': '',
'thousands_sep': ''}


Portugal:

$ LANG=pt_PT python locale_env_example.py
Environment settings:
LC_ALL =
LC_CTYPE =
LANG = pt_PT
LANGUAGE =

Default locale: (None, 'mac-roman')
From environment: ('pt_PT', 'ISO8859-1')
{'currency_symbol': 'Eu',
'decimal_point': ',',
'frac_digits': 2,
'grouping': [127],
'int_curr_symbol': 'EUR ',
'int_frac_digits': 2,
'mon_decimal_point': '.',
'mon_grouping': [3, 3, 0],
'mon_thousands_sep': '.',
'n_cs_precedes': 0,
'n_sep_by_space': 1,
'n_sign_posn': 1,
'negative_sign': '-',
'p_cs_precedes': 0,
'p_sep_by_space': 1,
'p_sign_posn': 1,
'positive_sign': '',
'thousands_sep': ' '}


Poland:

$ LANG=pl_PL python locale_env_example.py
Environment settings:
LC_ALL =
LC_CTYPE =
LANG = pl_PL
LANGUAGE =

Default locale: (None, 'mac-roman')
From environment: ('pl_PL', 'ISO8859-2')
{'currency_symbol': 'z?\x82',
'decimal_point': ',',
'frac_digits': 2,
'grouping': [3, 3, 0],
'int_curr_symbol': 'PLN ',
'int_frac_digits': 2,
'mon_decimal_point': ',',
'mon_grouping': [3, 3, 0],
'mon_thousands_sep': ' ',
'n_cs_precedes': 1,
'n_sep_by_space': 2,
'n_sign_posn': 4,
'negative_sign': '-',
'p_cs_precedes': 1,
'p_sep_by_space': 2,
'p_sign_posn': 4,
'positive_sign': '',
'thousands_sep': ' '}


As you can see, the currency symbol, the character separating whole numbers from decimal fractions, and other settings change with the locale. Now let's print the same value formatted for each of these locales (US dollars, euros, and Polish złoty):

import locale

sample_locales = [ ('USA', 'en_US'),
                   ('France', 'fr_FR'),
                   ('Spain', 'es_ES'),
                   ('Portugal', 'pt_PT'),
                   ('Poland', 'pl_PL'),
                   ]

for name, loc in sample_locales:
    locale.setlocale(locale.LC_ALL, loc)
    print '%20s: %s' % (name, locale.currency(1234.56))


The output is this small table:

$ python locale_currency_example.py
USA: $1234.56
France: 1234,56 Eu
Spain: Eu 1234,56
Portugal: 1234.56 Eu
Poland: zł 1234,56


Besides generating output in different formats, the locale module helps with parsing input. Different cultures use different conventions for formatting numbers (as illustrated above). The locale module provides atoi() and atof() functions for converting the strings to integer and floating point values respectively.

import locale

sample_data = [ ('USA', 'en_US', '1234.56'),
                ('France', 'fr_FR', '1234,56'),
                ('Spain', 'es_ES', '1234,56'),
                ('Portugal', 'pt_PT', '1234.56'),
                ('Poland', 'pl_PL', '1234,56'),
                ]

for name, loc, a in sample_data:
    locale.setlocale(locale.LC_ALL, loc)
    f = locale.atof(a)
    locale.setlocale(locale.LC_ALL, 'en_US')
    print '%20s: %7s => %f' % (name, a, f)


$ python locale_atof_example.py
USA: 1234.56 => 1234.560000
France: 1234,56 => 1234.560000
Spain: 1234,56 => 1234.560000
Portugal: 1234.56 => 1234.560000
Poland: 1234,56 => 1234.560000


Another important aspect of localization is date and time formatting:

import locale
import time

sample_locales = [ ('USA', 'en_US'),
                   ('France', 'fr_FR'),
                   ('Spain', 'es_ES'),
                   ('Portugal', 'pt_PT'),
                   ('Poland', 'pl_PL'),
                   ]

for name, loc in sample_locales:
    locale.setlocale(locale.LC_ALL, loc)
    print '%20s: %s' % (name, time.strftime(locale.nl_langinfo(locale.D_T_FMT)))


$ python locale_date_example.py
USA: Sun May 20 10:19:54 2007
France: Dim 20 mai 10:19:54 2007
Spain: dom 20 may 10:19:54 2007
Portugal: Dom 20 Mai 10:19:54 2007
Poland: ndz 20 maj 10:19:54 2007


This week I have only covered some of the high-level functions in the locale module. There are others which are lower level (format_string) or which relate to managing the locale for your application (resetlocale). As usual, you will want to refer to the Python library documentation for more details.
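To give a feel for the lower-level format_string() function, here is a short Python 3 sketch. It formats a number under the always-available 'C' locale, then tries the en_US locale used in the examples above. Whether en_US is installed under that exact name depends on the system, so the sketch treats it as optional and makes no claim about what it prints there.

```python
import locale

SAMPLE_LOCALE = 'en_US'   # same name as in the examples above; may not exist

# The 'C' locale defines no thousands separator, so grouping has no effect.
locale.setlocale(locale.LC_ALL, 'C')
plain = locale.format_string('%.2f', 1234.56, grouping=True)
print('C locale:', plain)

try:
    locale.setlocale(locale.LC_ALL, SAMPLE_LOCALE)
except locale.Error:
    print('%s is not installed on this system' % SAMPLE_LOCALE)
else:
    print('%s: %s' % (SAMPLE_LOCALE,
                      locale.format_string('%.2f', 1234567.891, grouping=True)))
```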

I am still learning about internationalization and localization myself, so if you have feedback on this summary (or if you spot a mistake), please post a comment on the blog to let me know.

References:

Example code

Locale - Wikipedia
Internationalization and localization - Wikipedia
OpenI18N.org - The Free standards Group Open Internationalisation Initiative
MSDN - National Language Support Constants
Internationalizing Python - Martin von Löwis (from 1997)
Python Module of the Week

Updated 9/5/2007 with minor formatting changes.



Thursday, May 17, 2007

Telecommuting

I recently came across a few articles by Esther Schindler on telecommuting which struck a chord with me.

The first, "Getting Clueful: Seven Things the CIO Should Know About Telecommuting", is directed at managers of telecommuters. It covers the benefits to the company of having telecommuters (cost savings, productivity, etc.), potential pitfalls (not everyone can manage themselves well enough to work remotely), and how to cope with them. She places a heavy emphasis on building trust in the manager/employee relationship, especially through communication.

The second article is directed at the telecommuting employee. In "Telecommuters Need to Develop Special Skills", Schindler goes a bit beyond the basic advice normally found in articles like this. Unsurprisingly, most of that advice centers around communication issues such as status reporting and "visibility". One key item regards conference calls and telling people sitting around a speaker phone to speak up - we have that problem frequently at my company.

Both articles include specific advice from experienced managers and telecommuters. I'm happy to say that my company gets most of this right. We are based in Atlanta, and unless you live in the same building as your company there is no easy commute in Atlanta. From the very beginning, my manager told all of us he would rather have us working than sitting in traffic and we have been expected to figure out how to do as much of our work from home as possible. This went beyond the grudging acceptance of telecommuting at my previous employer to active encouragement. He saw the immediate cost benefit in office space and productivity benefit in gaining as much as 2 hours per day of extra work.

As far as remote communication goes, we follow the technology chart laid out by Schindler pretty closely. We don't have formalized rules; it just seemed to work out that way naturally. One category of tools she does not mention for status/discussion are online collaboration tools such as wikis and issue tracking systems. We rely on both, with the ticket tracking system used for asynchronous design discussions and the wiki for historical documentation, how-tos, etc.

After an initial settling in period (2-3 months) where I got to know my new co-workers and learned about the development environment, I have been working at home as much as possible. These days that usually means 4 of 5 days in a week, sometimes 9 of 10 over 2 weeks. There have been stretches where I didn't go to the office for a few months at a time, but those are rare. One day in the office per week works out well, since most of the developers seem to talk better with a whiteboard in front of us. I've looked into online whiteboard tools in the past, but haven't found anything to replace the nuance of the in-person meeting. Perhaps if we installed some web cams...

We're a small group, so we try to keep everyone informed of all design issues. The code is usually implemented by just one or two developers, but everyone has an opportunity to be involved in signing off on design and implementation before anything goes into the svn trunk. We've been doing this for 5-6 years now, so we've settled into a pattern. For small designs, an individual developer may do the work and write up the "approach", explaining the design and tricky implementation details. The approach (a wiki page in trac) is then submitted for review along with the completed changeset. This works well for small changes or bug fixes, but for larger projects we usually have some sort of conversation before development begins. It might be a simple sanity check by one other developer via IM, phone, or email. If a more formal review is needed, we typically write a preliminary approach, without any code, and submit it for comments. If we can't agree on the approach it is time to schedule a meeting, either as a conference call or in person.

Of course, we also make heavy use of instant messaging on our own Jabber server. Most of our quick communication has moved off of email to IM, and we use IM for "presence" notification. If I step away from my desk to take care of something around the house, run an errand, or get lunch, I use the IM status message to tell people when I expect to come back. Email tends to be reserved for progress updates, issues that don't need immediate resolution, and sometimes scheduling times for more direct means of communication.

A drawback to working remotely so often is that I frequently end up the last to know about things like schedule or priority changes. If an informal discussion in the hall results in a major decision, an email still might not go out right away or the eventual message might assume more knowledge of details than I have. This is a group communication problem, and we're working on it, but it can be frustrating at times.

Schindler does not offer tips for handling physical space for telecommuters in the office. She mentions the opportunity to save office space, but doesn't get into what to do when all of those remote workers do show up at the office. For a couple of years I had my own cube, even though I was hardly ever there. We've recently grown big enough that my desk was needed by someone who is in the office more frequently than I am, so I gave it up. When I cleaned out the drawers, the only things I kept were a handful of business cards and a couple of pocket reference books. There wasn't anything else in the drawers that belonged to me!

Now I sit in the "hotel" room when I'm in the office for meetings; that's typically the only reason I go in, any more. Even if I don't plan specific meetings, I usually end up spending the day in informal discussions rather than writing code. We converted a small conference room by replacing the central table with folding tables along the walls to serve as desks and by adding a few chairs. There are internet connections (or wireless), a phone, and the old whiteboard from when the room was actually a conference room. I used to spend most of my time at the office in a conference room anyway, so this works out fine. :-)

And of course when I'm at home, I work at my desk here most of the time. I do try to mix it up a bit, though. On nice days, I start on the patio after breakfast. I catch up on email, read news, listen to podcasts, etc. If it is cold or wet, I usually start out at the dining room table, or on the sofa with a cat in my lap. When I am ready to do some coding, I move to my home office, where I have a door to close and space on the desk to spread out notes or reference manuals. I also have a second monitor for my laptop there, which turns out to be a lot more useful than I ever expected. Occasionally I take some reading or documentation work to a local coffee shop, but their chairs tend to be less comfortable and the people traffic can be distracting, so I don't usually do any coding while I'm there.

Our application runs on Linux, and my desktop is a PowerBook, so I rely heavily on remote access tools such as ssh and VNC. I have a development box at home, because the lag time to the office is intolerable sometimes. Most of the collaboration tools we use, such as trac and Jabber, can be tunneled, so my ssh configuration includes a lot of port-forwarding. The end result is that I can access any of my work systems or tools remotely, even from the coffee shop if need be.

I've been working remotely for several years now, and I have a hard time imagining going back to the office every day. I do like being in the office, but the commute is killer. If the job were closer, or we lived somewhere else, it might not be as big of a concern. But this is a good situation for me now, and I've had no trouble getting used to it.




Sunday, May 13, 2007

PyMOTW: Example code

All of the example code for the Python Module of the Week series is available for download. You can download it directly, or use easy_install to grab it from PyPI.

Updated 5/20/2007 with technorati tags.



blogs and preservation

Ms. PyMOTW sent me a link to the survey the UNC-Chapel Hill School of Information & Library Science is conducting called "Blogger Perceptions on Digital Preservation". If you blog, you might want to go participate. They ask thoughtful questions, and it only takes 5-10 minutes.

PyMOTW: logging

Module: logging
Purpose: Provide a standard interface for Python modules to report status, error, and informational messages.
Python Version: 2.3

Description:

The logging module defines a standard API for reporting errors and status information from all of your modules. The key benefit of having the logging API provided by a standard library module is that all python modules can participate in logging, so your application log can include messages from third-party modules.

It is, of course, possible to log messages with different verbosity levels or to different destinations. Writing log messages to files, HTTP GET/POST locations, email via SMTP, generic sockets, or OS-specific logging mechanisms is supported by the standard module. You can also create your own log destination class if you have special requirements not met by any of the built-in classes.
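As a sketch of what a custom destination might look like, the Python 3 example below subclasses logging.Handler and overrides emit() to keep messages in a list. ListHandler and the logger name are invented for this illustration; a real destination would probably write to whatever special system you need to reach.

```python
import logging


class ListHandler(logging.Handler):
    """Hypothetical destination that stores formatted messages in memory."""

    def __init__(self, level=logging.NOTSET):
        logging.Handler.__init__(self, level)
        self.messages = []

    def emit(self, record):
        # format() applies the handler's formatter, or the default one.
        self.messages.append(self.format(record))


logger = logging.getLogger('list_handler_example')
logger.setLevel(logging.DEBUG)
logger.propagate = False          # keep the demo away from the root logger
handler = ListHandler()
logger.addHandler(handler)

logger.debug('stored, not written')
print(handler.messages)
```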

Example:

Most applications are probably going to want to log to a file, so let's start with that case. Using the basicConfig() function, we can set up the default handler so that debug messages are written to a file.

import logging
LOG_FILENAME = '/tmp/logging_example.out'
logging.basicConfig(filename=LOG_FILENAME,
                    level=logging.DEBUG,
                    )

logging.debug('This message should go to the log file')


And now if we open the file and look at what we have, we should find the log message:

f = open(LOG_FILENAME, 'rt')
try:
    body = f.read()
finally:
    f.close()
print 'FILE:'
print body
print


FILE:
DEBUG:root:This message should go to the log file


If you run the script repeatedly, the additional log messages are appended to the file. To create a new file each time, you can pass a filemode argument to basicConfig() with a value of 'w'. Rather than managing the file size yourself, though, it is simpler to use a RotatingFileHandler:

import glob
import logging
import logging.handlers

LOG_FILENAME = '/tmp/logging_rotatingfile_example.out'

# Set up a specific logger with our desired output level

my_logger = logging.getLogger('MyLogger')
my_logger.setLevel(logging.DEBUG)

# Add the log message handler to the logger
handler = logging.handlers.RotatingFileHandler(LOG_FILENAME, maxBytes=20, backupCount=5)

my_logger.addHandler(handler)

# Log some messages
for i in range(20):
    my_logger.debug('i = %d' % i)

# See what files are created
logfiles = glob.glob('%s*' % LOG_FILENAME)

for filename in logfiles:
    print filename


The result should be 6 separate files, each with part of the log history for the application:

/tmp/logging_rotatingfile_example.out
/tmp/logging_rotatingfile_example.out.1
/tmp/logging_rotatingfile_example.out.2
/tmp/logging_rotatingfile_example.out.3
/tmp/logging_rotatingfile_example.out.4
/tmp/logging_rotatingfile_example.out.5


The most current file is always /tmp/logging_rotatingfile_example.out, and each time it reaches the size limit it is renamed with the suffix .1. Each of the existing backup files is renamed to increment the suffix (.1 becomes .2, etc.) and the .5 file is erased.

This example sets the log length far too small in order to make the rotation visible. In practice, you would set maxBytes to a value appropriate for your application.
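For comparison, a setup with a more realistic limit might look like the Python 3 sketch below. The helper name make_rotating_logger, the default of roughly one megabyte, and the temporary directory are all assumptions made for the example, not recommendations.

```python
import logging
import logging.handlers
import os
import tempfile


def make_rotating_logger(name, path, max_bytes=1000000, backup_count=5):
    """Return a logger and the RotatingFileHandler attached to it."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count)
    logger.addHandler(handler)
    return logger, handler


log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, 'app.log')
app_logger, app_handler = make_rotating_logger('rotating_example', log_path)
app_logger.debug('one message, far below the size limit')
app_handler.close()

# With a sensible limit, a single message does not trigger a rotation.
print(sorted(os.listdir(log_dir)))
```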

Another useful feature of the logging API is the ability to produce different messages at different log levels. This allows you to instrument your code with debug messages, for example, but set the log level so that those debug messages are not written on your production system.

CRITICAL  50
ERROR     40
WARNING   30
INFO      20
DEBUG     10
NOTSET     0


The logger, handler, and log message call each specify a level. A log message is emitted only if its level is at or above the levels configured for both the logger and the handler. For example, if a message is CRITICAL and the logger is set to ERROR, the message is emitted. If a message is a WARNING and the logger is set to produce only ERRORs, the message is not emitted.
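The interaction between the two thresholds can be seen in a small Python 3 sketch. The logger name and the in-memory stream are choices made for this example only; the logger is set to INFO and the handler to ERROR, so a message has to clear both.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setLevel(logging.ERROR)
handler.setFormatter(logging.Formatter('%(levelname)s:%(message)s'))

logger = logging.getLogger('level_example')
logger.setLevel(logging.INFO)
logger.propagate = False          # keep the demo away from the root logger
logger.addHandler(handler)

logger.debug('dropped by the logger')
logger.warning('passed by the logger, dropped by the handler')
logger.critical('passed by both')

# Only the CRITICAL message reaches the stream.
print(stream.getvalue())
```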

import logging
import sys

LEVELS = { 'debug':logging.DEBUG,
           'info':logging.INFO,
           'warning':logging.WARNING,
           'error':logging.ERROR,
           'critical':logging.CRITICAL,
           }

if len(sys.argv) > 1:
    level_name = sys.argv[1]
    level = LEVELS.get(level_name, logging.NOTSET)
    logging.basicConfig(level=level)

logging.debug('This is a debug message')
logging.info('This is an info message')
logging.warning('This is a warning message')
logging.error('This is an error message')
logging.critical('This is a critical error message')


Run the script with an argument like 'debug' or 'warning' to see which messages show up at different levels:

$ python logging_level_example.py debug
DEBUG:root:This is a debug message
INFO:root:This is an info message
WARNING:root:This is a warning message
ERROR:root:This is an error message
CRITICAL:root:This is a critical error message

$ python logging_level_example.py info
INFO:root:This is an info message
WARNING:root:This is a warning message
ERROR:root:This is an error message
CRITICAL:root:This is a critical error message


You will notice that these log messages all have 'root' embedded in them. The logging module supports a hierarchy of loggers with different names. An easy way to tell where a specific log message comes from is to use a separate logger object for each of your modules. Each new logger "inherits" the configuration of its parent, and log messages sent to a logger include the name of that logger. Optionally, each logger can be configured differently, so that messages from different modules are handled in different ways. Let's look at a simple example of how to log from different modules so it is easy to trace the source of the message:

import logging

logging.basicConfig(level=logging.WARNING)

logger1 = logging.getLogger('package1.module1')
logger2 = logging.getLogger('package2.module2')

logger1.warning('This message comes from one module')
logger2.warning('And this message comes from another module')


And the output:

$ python logging_modules_example.py
WARNING:package1.module1:This message comes from one module
WARNING:package2.module2:And this message comes from another module
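The "inherits" part is easiest to see by asking a logger for its effective level. In this Python 3 sketch, which reuses the logger names from the example above, only the package1 logger is given a level of its own; its child picks that up, while the unrelated package2.module2 logger falls back to whatever the root logger uses.

```python
import logging

parent = logging.getLogger('package1')
child = logging.getLogger('package1.module1')
other = logging.getLogger('package2.module2')

# Configure only the parent; the child has no level of its own.
parent.setLevel(logging.DEBUG)

print('child :', logging.getLevelName(child.getEffectiveLevel()))
print('other :', logging.getLevelName(other.getEffectiveLevel()))
print('child.parent is parent:', child.parent is parent)
```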


There are many more options for configuring logging, including different log message formatting options, having messages delivered to multiple destinations, and changing the configuration of a long-running application on the fly using a socket interface. All of these options are covered in depth in the library module documentation.
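As a taste of two of those options, the Python 3 sketch below attaches two handlers with different formatters to one logger, so a single call produces a brief and a detailed version of the same message. The logger name, the format strings, and the in-memory streams are all assumptions made for the example.

```python
import io
import logging

brief = io.StringIO()
detailed = io.StringIO()

brief_handler = logging.StreamHandler(brief)
brief_handler.setFormatter(logging.Formatter('%(message)s'))

detailed_handler = logging.StreamHandler(detailed)
detailed_handler.setFormatter(
    logging.Formatter('%(name)s %(levelname)-8s %(message)s'))

logger = logging.getLogger('multi_destination_example')
logger.setLevel(logging.INFO)
logger.propagate = False          # keep the demo away from the root logger
logger.addHandler(brief_handler)
logger.addHandler(detailed_handler)

# One call, two destinations, two formats.
logger.info('cache rebuilt')

print(brief.getvalue(), end='')
print(detailed.getvalue(), end='')
```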

References:

Example Code for logging

Collected examples

PEP 282

Python Standard Logging by Jeremy Jones

Python Module of the Week


Updated to correct download link for example code.
Updated 5/20/2007 with technorati tags.
Updated 9/5/2007 with minor formatting changes.



MarsEdit Test

This is a test publishing through MarsEdit.

Thursday, May 10, 2007

CherryPy Essentials

A little over a week ago I received a review copy of Sylvain Hellegouarch's new book, "CherryPy Essentials", published through Packt Publishing. The timing couldn't have been better, since we have begun investigating Python web application frameworks at work for a new project. From previous work I have done with TurboGears, I knew that CherryPy was a contender, so I was definitely interested to see what version 3 had to offer. Sylvain's book is a good starting point for the information I wanted.

Review

"CherryPy Essentials" is a fairly short book (251 pages), especially given the breadth of topics it covers. It could easily have been 2-3 times as long, if the author was wordy or repetitive, but the concise writing style means that there is a lot of good information packed into this short volume.

The outline is fairly typical for tech books:

  • What is this thing and why do I care? - chapter 1
  • How do I get it? - chapter 2
  • What can it do? - chapters 3-4
  • Build an example app - chapters 5-7
  • Advanced topics - chapters 8-10


The real substance of the book begins with chapter 3, which gives an overview of CherryPy. It includes a moderately sized introductory application which lets different authors post notes on a page. The sample code includes embedded comments and is used as a basis for a brief description of how CherryPy decides what to do when an HTTP request is received.

CherryPy comes with a host of modules to make building your application easier. The coverage given these modules in the "Library" section of chapter 3 probably does not do them justice. The author clearly states that his intent is not to create a reference guide, but I would have liked to see this section pulled out into its own chapter and expanded, possibly combined with the Tools list in chapter 4.

The more in-depth discussion of CherryPy in chapter 4 includes instructions for running multiple HTTP servers; various mechanisms for dispatching URLs to Python functions; serving static content; hooking into the core to add your own middleware (or "Tools" in CherryPy-parlance); and WSGI support. The same chapter also includes a description and concise example of how to use each Tool provided in the CherryPy core distribution, following a format not unlike the one I use for my Python Module of the Week series.

Chapters 5-7 discuss the design and development of a photo blog application, which is small enough for the reader to follow but large enough to delve into the details of how to build a real application with CherryPy. The presentation begins with the data model, then covers web services and user interface topics.

Some of chapter 5, which discusses working with databases, is unfocused and includes sections on topics such as background on database types and object-relational mapping libraries not actually used in the example applications. This material could have been eliminated without serious loss. It is interesting, but it detracts a bit from the overall coherence of the book. Once he selects the Dejavu ORM, the discussion refocuses and covers the mechanics of using that library to store and retrieve data.

Chapter 6 provides an excellent discussion of web services, REST, URL design, and the Atom publishing protocol. These were perhaps the most interesting sections in the book. The author clearly has a great deal of experience to share on these topics. I hope there is another book coming soon with a greater examination of these topics.

The coverage of presentation layer topics in chapter 7 begins with a brief history of HTML, up to the development of DHTML. Then the Kid templating language is introduced (thankfully without a comprehensive survey of all available Python templating languages :-). The UI for the example application is fairly simple, so only basic Kid features are really covered, but that is enough. The more complex DHTML work is handled by Mochikit, introduced here and used more extensively in chapter 8, which is devoted to Ajax.

The last 2 chapters of the book cover topics too frequently left out of other books: Testing and Deployment. The chapter on testing presents several tools which integrate well with CherryPy for automated testing of different aspects of your application, including webtest for unit testing, FunkLoad for load testing, and Selenium for UI testing. I had never seen these tools before, and the descriptions and examples were enough to make me add them to my "to be researched" list.

The final chapter covers deployment options for moving a CherryPy app into production. Options for using Apache, lighttpd, WSGI, and SSL are covered, but no definitive "best practice" is suggested. I suppose the final choice depends on more variables than could really be covered, but I would have liked to see clear guidelines for deciding which configuration to use in different situations.

Third-party Tools

Any good, modern, open source project does not stand alone, and CherryPy is no exception. "CherryPy Essentials" makes it clear that integrating with third-party tools is an important part of the design for CherryPy. Tools covered include:



Other tools are also mentioned, but covered in less detail.

Conclusion

"CherryPy Essentials" packs a surprising amount of information into a small space. The coverage is not exceptionally deep in any one area, but is fairly complete. The sample code is consistent and easy to read. The book is full of useful and interesting information and, while it occasionally suffers from disjoint flow, I can definitely recommend it to any Python programmer interested in the future of CherryPy development, and web technologies in general.


Special thanks to Ms. PyMOTW for proofreading this post for me.

Wednesday, May 9, 2007

unfluence

Back in January I described an idea for a site to build network graphs of people in public life. It looks like unfluence is doing something like what I envisioned, automatically. The data is based on state campaign donations from the National Institute on Money in Politics.

Oh, and it sounds like they are planning to release the source in some form, too.

Monday, May 7, 2007

clip-to-blog

I've noticed Jeremy Wagstaff using this service for a while, so I thought I would give it a try, too.
clipped from addons.mozilla.org

With Clipmarks, you can clip the best parts of web pages. Whether it's a paragraph, sentence, image or video, you can capture just the pieces you want without having to bookmark the entire page.



Updated to fix the embedded HTML which Clipmarks apparently doesn't like.

Sunday, May 6, 2007

ironic enough?


I can't decide if I should order one of these t-shirts from Alexa or not.

PyMOTW: bisect

Module: bisect
Purpose: Maintains a list in sorted order without having to call sort each time an item is added to the list.
Python Version: 1.4

Description:

The bisect module implements an algorithm for inserting elements into a list while maintaining the list in sorted order. This can be much more efficient than repeatedly sorting a list, or explicitly sorting a large list after it is constructed.

Example:

Let's look at a simple example using bisect.insort(), which inserts items into a list in sorted order.

import bisect
import random

# Use a constant seed to ensure that we see
# the same pseudo-random numbers each time
# we run the loop.
random.seed(1)

# Generate 19 random numbers (range(1, 20)
# yields 1 through 19) and insert them into
# a list in sorted order.
l = []
for i in range(1, 20):
    r = random.randint(1, 100)
    # Record where the value will land before inserting it.
    position = bisect.bisect(l, r)
    bisect.insort(l, r)
    print '%2d %2d' % (r, position), l


The output for that script is:

14  0 [14]
85  1 [14, 85]
77  1 [14, 77, 85]
26  1 [14, 26, 77, 85]
50  2 [14, 26, 50, 77, 85]
45  2 [14, 26, 45, 50, 77, 85]
66  4 [14, 26, 45, 50, 66, 77, 85]
79  6 [14, 26, 45, 50, 66, 77, 79, 85]
10  0 [10, 14, 26, 45, 50, 66, 77, 79, 85]
 3  0 [3, 10, 14, 26, 45, 50, 66, 77, 79, 85]
84  9 [3, 10, 14, 26, 45, 50, 66, 77, 79, 84, 85]
44  4 [3, 10, 14, 26, 44, 45, 50, 66, 77, 79, 84, 85]
77  9 [3, 10, 14, 26, 44, 45, 50, 66, 77, 77, 79, 84, 85]
 1  0 [1, 3, 10, 14, 26, 44, 45, 50, 66, 77, 77, 79, 84, 85]
45  7 [1, 3, 10, 14, 26, 44, 45, 45, 50, 66, 77, 77, 79, 84, 85]
73 10 [1, 3, 10, 14, 26, 44, 45, 45, 50, 66, 73, 77, 77, 79, 84, 85]
23  4 [1, 3, 10, 14, 23, 26, 44, 45, 45, 50, 66, 73, 77, 77, 79, 84, 85]
95 17 [1, 3, 10, 14, 23, 26, 44, 45, 45, 50, 66, 73, 77, 77, 79, 84, 85, 95]
91 17 [1, 3, 10, 14, 23, 26, 44, 45, 45, 50, 66, 73, 77, 77, 79, 84, 85, 91, 95]


The first column shows the new random number. The second column shows the position where the number will be inserted into the list. The remainder of each line is the current sorted list.

This is a simple example, and for the amount of data we are manipulating it might be faster to simply build the list and then sort it once. But for long lists, significant time and memory savings can be achieved using an insertion sort algorithm such as this.
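To make the comparison concrete, here is a minimal sketch of the two approaches side by side. It is not part of the original example script: the list size of 1000 and the names values, incremental, and batch are assumptions chosen for illustration, and it uses the print() function form so it runs on current Python releases. It only shows that both approaches produce the same sorted list; which one is faster depends on the size of the data and on whether the list needs to be in order between insertions.

```python
import bisect
import random

random.seed(1)
# Hypothetical data set, larger than the one in the example above.
values = [random.randint(1, 100) for _ in range(1000)]

# Approach 1: keep the list sorted after every single insertion.
incremental = []
for v in values:
    bisect.insort(incremental, v)

# Approach 2: collect everything first, then sort once at the end.
batch = list(values)
batch.sort()

# Both approaches end with the same sorted list.
print(incremental == batch)
```

The incremental version is the one to reach for when the list has to be usable in sorted order while it is still being built.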

You probably noticed that the result set above includes a few repeated values (45 and 77). The bisect module provides 2 ways to handle repeats. New values can be inserted to the left of existing values, or to the right. The insort() function is actually an alias for insort_right(), which inserts after the existing value. The corresponding function insort_left() inserts before the existing value.

If we manipulate the same data using bisect_left() and insort_left(), we end up with the same sorted list, but the insert positions are different for the duplicate values.

import bisect
import random

# Reset the seed so we see the same numbers as before.
random.seed(1)

# Use bisect_left and insort_left.
l = []
for i in range(1, 20):
    r = random.randint(1, 100)
    position = bisect.bisect_left(l, r)
    bisect.insort_left(l, r)
    print '%2d %2d' % (r, position), l


14  0 [14]
85 1 [14, 85]
77 1 [14, 77, 85]
26 1 [14, 26, 77, 85]
50 2 [14, 26, 50, 77, 85]
45 2 [14, 26, 45, 50, 77, 85]
66 4 [14, 26, 45, 50, 66, 77, 85]
79 6 [14, 26, 45, 50, 66, 77, 79, 85]
10 0 [10, 14, 26, 45, 50, 66, 77, 79, 85]
3 0 [3, 10, 14, 26, 45, 50, 66, 77, 79, 85]
84 9 [3, 10, 14, 26, 45, 50, 66, 77, 79, 84, 85]
44 4 [3, 10, 14, 26, 44, 45, 50, 66, 77, 79, 84, 85]
77 8 [3, 10, 14, 26, 44, 45, 50, 66, 77, 77, 79, 84, 85]
1 0 [1, 3, 10, 14, 26, 44, 45, 50, 66, 77, 77, 79, 84, 85]
45 6 [1, 3, 10, 14, 26, 44, 45, 45, 50, 66, 77, 77, 79, 84, 85]
73 10 [1, 3, 10, 14, 26, 44, 45, 45, 50, 66, 73, 77, 77, 79, 84, 85]
23 4 [1, 3, 10, 14, 23, 26, 44, 45, 45, 50, 66, 73, 77, 77, 79, 84, 85]
95 17 [1, 3, 10, 14, 23, 26, 44, 45, 45, 50, 66, 73, 77, 77, 79, 84, 85, 95]
91 17 [1, 3, 10, 14, 23, 26, 44, 45, 45, 50, 66, 73, 77, 77, 79, 84, 85, 91, 95]
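With plain integers the two resulting lists look identical, so the difference between left and right is only visible in the position column. As a small illustrative sketch (the sample lists and the names data, right, and left are made up for this purpose, and it uses the print() function form of current Python), inserting the float 1.0 next to the integer 1 makes the ordering visible, because the two values compare equal but print differently:

```python
import bisect

# Positions reported for a value that is already in the list.
data = [10, 20, 20, 30]
print(bisect.bisect_left(data, 20))   # 1, before the existing 20s
print(bisect.bisect_right(data, 20))  # 3, after the existing 20s

# 1 and 1.0 compare equal, so they count as duplicates,
# but their printed forms let us see which came first.
right = [1]
bisect.insort_right(right, 1.0)  # new value goes after the existing 1

left = [1]
bisect.insort_left(left, 1.0)    # new value goes before the existing 1

print(right)  # [1, 1.0]
print(left)   # [1.0, 1]
```

This ordering matters mostly when the items carry more information than the value used for comparison, and you care whether newer entries land before or after older ones.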


In addition to the Python implementation, there is a faster C implementation available. If the C version is present, that implementation overrides the pure Python implementation automatically when you import the bisect module.
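If you are curious which version you ended up with, one possible check is sketched below. It assumes a current CPython interpreter, where functions implemented in C are reported as built-ins by the inspect module; other interpreters may answer differently, so treat the result as a hint rather than a guarantee. The variable name uses_c_version is made up for this example.

```python
import bisect
import inspect

# Functions implemented in C show up as built-ins on CPython,
# while pure Python functions do not.
uses_c_version = inspect.isbuiltin(bisect.insort)

print("C implementation in use:", uses_c_version)
```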


Updated 5/20/2007 with technorati tags.
Updated 9/5/2007 with minor formatting changes.
