Monday, September 21, 2009

Updating Python 3.x docs for GHOP

GHOP planning is under way. The first step, as before, is to come up with the list of tasks that would let junior high and high school students contribute to Python.

We had good success last time with documentation-related tasks, such as writing examples, proofreading, or updating the standard library docs. This year Titus had the bright idea to run a survey to see what the community felt needed work.

This is your chance to vote for library modules that need help in the form of examples or HOWTOs, so head over to the Google Moderator page and vote today!

Sunday, September 20, 2009

PyMOTW: resource - System resource management

resource – System resource management

Purpose: Manage the system resource limits for a Unix program.
Python Version: 1.5.2

The functions in resource help you probe the current resources consumed by a process, and place limits on them to control how much load your program places on a system.

Current Usage

Use getrusage() to probe the resources used by the current process and/or its children. The return value is a data structure containing several resource metrics based on the current state of the system.

Note

Not all of the resource values gathered are displayed here. Refer to the stdlib docs for a more complete list.

import resource
import time

usage = resource.getrusage(resource.RUSAGE_SELF)

for name, desc in [
    ('ru_utime', 'User time'),
    ('ru_stime', 'System time'),
    ('ru_maxrss', 'Max. Resident Set Size'),
    ('ru_ixrss', 'Shared Memory Size'),
    ('ru_idrss', 'Unshared Memory Size'),
    ('ru_isrss', 'Stack Size'),
    ('ru_inblock', 'Block inputs'),
    ('ru_oublock', 'Block outputs'),
    ]:
    print '%-25s (%-10s) = %s' % (desc, name, getattr(usage, name))

Because the test program is extremely simple, the results aren’t that interesting:

$ python resource_getrusage.py
User time                 (ru_utime  ) = 0.015466
System time               (ru_stime  ) = 0.013189
Max. Resident Set Size    (ru_maxrss ) = 0
Shared Memory Size        (ru_ixrss  ) = 0
Unshared Memory Size      (ru_idrss  ) = 0
Stack Size                (ru_isrss  ) = 0
Block inputs              (ru_inblock) = 0
Block outputs             (ru_oublock) = 1
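The numbers become more useful when sampled before and after a piece of work. The following is a minimal sketch of that idea, not part of the original example: `cpu_seconds()` and `busy_work()` are hypothetical helper names, and the code uses `print()` calls so it also runs on Python 3.

```python
import resource


def cpu_seconds():
    """Return (user, system) CPU seconds used so far by this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_utime, usage.ru_stime


def busy_work(n):
    """Hypothetical workload: sum the squares of the first n integers."""
    return sum(i * i for i in range(n))


# Sample the counters on either side of the workload.
user_before, sys_before = cpu_seconds()
result = busy_work(200000)
user_after, sys_after = cpu_seconds()

print('result      =', result)
print('user time   = %.6f' % (user_after - user_before))
print('system time = %.6f' % (sys_after - sys_before))
```

The differences are small for a workload this size, and the exact values will vary from run to run and from system to system.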

Resource Limits

Separate from the current actual usage, it is possible to check the limits imposed on the application, and then change them.

import resource

for name, desc in [
    ('RLIMIT_CORE', 'core file size'),
    ('RLIMIT_CPU', 'CPU time'),
    ('RLIMIT_FSIZE', 'file size'),
    ('RLIMIT_DATA', 'heap size'),
    ('RLIMIT_STACK', 'stack size'),
    ('RLIMIT_RSS', 'resident set size'),
    ('RLIMIT_NPROC', 'number of processes'),
    ('RLIMIT_NOFILE', 'number of open files'),
    ('RLIMIT_MEMLOCK', 'lockable memory address'),
    ]:
    limit_num = getattr(resource, name)
    soft, hard = resource.getrlimit(limit_num)
    print 'Maximum %-25s (%-15s) : %20s %20s' % (desc, name, soft, hard)

The return value for each limit is a tuple containing the soft limit imposed by the current configuration and the hard limit imposed by the operating system.

$ python resource_getrlimit.py
Maximum core file size            (RLIMIT_CORE    ) :                    0  9223372036854775807
Maximum CPU time                  (RLIMIT_CPU     ) :  9223372036854775807  9223372036854775807
Maximum file size                 (RLIMIT_FSIZE   ) :  9223372036854775807  9223372036854775807
Maximum heap size                 (RLIMIT_DATA    ) :              6291456  9223372036854775807
Maximum stack size                (RLIMIT_STACK   ) :              8388608             67104768
Maximum resident set size         (RLIMIT_RSS     ) :  9223372036854775807  9223372036854775807
Maximum number of processes       (RLIMIT_NPROC   ) :                  266                  532
Maximum number of open files      (RLIMIT_NOFILE  ) :                  256  9223372036854775807
Maximum lockable memory address   (RLIMIT_MEMLOCK ) :  9223372036854775807  9223372036854775807
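The very large value repeated in this output is most likely the platform's "no limit" marker, which the module exposes as resource.RLIM_INFINITY. Its numeric value differs between systems, so it is safer to compare against the constant than against a literal number. A small sketch, where `describe_limit()` is a hypothetical helper name:

```python
import resource


def describe_limit(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    if value == resource.RLIM_INFINITY:
        return 'unlimited'
    return str(value)


for name in ['RLIMIT_CORE', 'RLIMIT_CPU', 'RLIMIT_NOFILE']:
    soft, hard = resource.getrlimit(getattr(resource, name))
    print('%-15s soft=%-12s hard=%s' % (
        name, describe_limit(soft), describe_limit(hard)))
```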

The limits can be changed with setrlimit(). For example, to control the number of files a process can open, set RLIMIT_NOFILE to a smaller soft limit value.

import resource
import os

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print 'Soft limit starts as :', soft

resource.setrlimit(resource.RLIMIT_NOFILE, (4, hard))

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print 'Soft limit changed to :', soft

random = open('/dev/random', 'r')
print 'random has fd =', random.fileno()
try:
    null = open('/dev/null', 'w')
except IOError, err:
    print err
else:
    print 'null has fd =', null.fileno()

$ python resource_setrlimit_nofile.py
Soft limit starts as : 256
Soft limit changed to : 4
random has fd = 3
[Errno 24] Too many open files: '/dev/null'
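Because an unprivileged process can raise its soft limit again as long as it stays at or below the hard limit, a lowered limit does not have to be permanent. The sketch below wraps the change in a context manager so the previous soft limit is restored afterwards. The name `soft_limit()` and the value 64 are illustrative assumptions, not part of the original example.

```python
import contextlib
import resource


@contextlib.contextmanager
def soft_limit(limit_num, new_soft):
    """Temporarily set the soft limit, restoring the old one on exit."""
    old_soft, hard = resource.getrlimit(limit_num)
    # The soft limit may not exceed the hard limit.
    if hard != resource.RLIM_INFINITY and new_soft > hard:
        new_soft = hard
    resource.setrlimit(limit_num, (new_soft, hard))
    try:
        yield new_soft
    finally:
        resource.setrlimit(limit_num, (old_soft, hard))


before = resource.getrlimit(resource.RLIMIT_NOFILE)[0]
with soft_limit(resource.RLIMIT_NOFILE, 64) as applied:
    during = resource.getrlimit(resource.RLIMIT_NOFILE)[0]
after = resource.getrlimit(resource.RLIMIT_NOFILE)[0]

print('before =', before)
print('during =', during)
print('after  =', after)
```

The hard limit is passed through unchanged on purpose: lowering it cannot be undone without elevated privileges.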

It can also be useful to limit the amount of CPU time a process should consume, to avoid eating up too much time. When the process runs past the allotted amount of time, it is sent a SIGXCPU signal.

import resource
import sys
import signal
import time

# Set up a signal handler to notify us
# when we run out of time.
def time_expired(n, stack):
    print 'EXPIRED :', time.ctime()
    raise SystemExit('(time ran out)')

signal.signal(signal.SIGXCPU, time_expired)

# Adjust the CPU time limit
soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
print 'Soft limit starts as :', soft

resource.setrlimit(resource.RLIMIT_CPU, (1, hard))

soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
print 'Soft limit changed to :', soft
print

# Consume some CPU time in a pointless exercise
print 'Starting:', time.ctime()
for i in range(200000):
    for i in range(200000):
        v = i * i

# We should never make it this far
print 'Exiting :', time.ctime()

Normally the signal handler should flush all open files and close them, but in this case we just print a message and exit.

$ python resource_setrlimit_cpu.py
Soft limit starts as : 9223372036854775807
Soft limit changed to : 1

Starting: Sun Sep 20 12:11:39 2009
EXPIRED : Sun Sep 20 12:11:40 2009
(time ran out)
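The same limit can be applied to a child process instead of the current one. This is a sketch under a few assumptions rather than part of the original example: it relies on the `preexec_fn` argument of subprocess.Popen (Unix only, and best avoided in threaded programs), the busy-loop child is a hypothetical workload, and the 30 second timeout is only a safety net in case the limit is not enforced.

```python
import resource
import signal
import subprocess
import sys


def limit_cpu():
    """Runs in the child just before exec: cap CPU time at 1 second."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CPU)
    resource.setrlimit(resource.RLIMIT_CPU, (1, hard))


# Hypothetical workload: a child interpreter stuck in a busy loop.
child = subprocess.Popen(
    [sys.executable, '-c', 'while True: pass'],
    preexec_fn=limit_cpu,
)
try:
    returncode = child.wait(timeout=30)
except subprocess.TimeoutExpired:
    # Safety net only: the limit was not enforced, so stop the child.
    child.kill()
    returncode = child.wait()

# A negative return code means the child was killed by that signal.
if returncode == -signal.SIGXCPU:
    print('child stopped by SIGXCPU')
else:
    print('child exited with', returncode)
```

The child installs no handler for SIGXCPU, so the default action terminates it, and the parent's own limits are left untouched.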

See also

resource
The standard library documentation for this module.
signal
For details on registering signal handlers.

PyMOTW Home

The canonical version of this article

Saturday, September 19, 2009

Book Review: The Success of Open Source



For the past few weeks I've been wrapped up reading Steven Weber's The Success of Open Source. Published in 2004, it is a look at what the open source movement is and how it works, from the perspective of a political scientist. This is no trite look at why people would choose to give away the fruits of their labor. His analysis is serious and well considered. He stresses several times that his goal is to ask questions rather than answer them, but he does offer some observations about the open source movement as a larger social movement and how it might spread to other parts of the culture.

Weber starts out by explaining his goal for the book: to study the political and economic foundations of open source communities and processes. He makes two assertions, around which the rest of the book is framed:

1. The open source phenomenon is an important "puzzle" for social scientists who study cooperation.

2. OSS communities have been fundamentally impacted by the internet.

Early History:

The second chapter covers the basic facts of the early history of open source, well before it was called that. From the PACT compiler project for IBM mainframes, through the failure of Multics, and the unintended consequence of the AT&T consent decree that led to the original licensing terms for Unix, he covers some details that aren't a part of the usual story that includes DARPA, BSD, the fragmentation of the Unix market, and FSF and the GNU project. The writing is engaging, and I could recommend the book on this history section alone.

How Does OSS Work?:

Chapter 3 tries to answer the question, "What is Open Source and How Does It Work?". It covers some essential software project characteristics such as the division of labor between "analyst" and "programmer" and how that historically led to problems because the designer of software was too far removed from the end-user.

The essence of software design, like the writing of poetry, is a creative process. The role of technology and organization is to liberate that creativity to the greatest extent possible and to facilitate its translation into working code. Neither new technology nor a "better" division of labor can replace the creative essence that drives the project.


Weber builds on Brooks's Law to say that the success of a project isn't just about getting more people involved, but also about how they are organized. He points out that open source is much more about the process than the resulting product, which is an artifact of the organization and creative energies of the participants. He identifies four fundamental organization schemes that repeat in open source projects:

1. A hierarchy, where patches flow up to a more or less central maintainer, as with Linux.

2. The concentric circles used by the BSD project, in which maintainers closer to the center have more rights and privileges, but within a circle they are essentially equal.

3. The pumpkin holder or token-based system used by the developers of Perl.

4. A democratic voting system, such as used to approve changes in Apache.

One assertion Weber makes relates to the different cultures that evolve around BSD vs. GPL-licensed projects. His claim is that core developers in BSD-licensed projects do not depend as much on submissions from the end user as GPL projects do. His evidence for this is the various BSD operating systems and Linux. I think his sample size is too small, though. I'm not convinced that the license has much to do with "dependence" on contributions. I think the attitude of the core developers, and their willingness to accept patches, is more important.

Evolution of Open Source:

Chapter four talks about the "maturation" of three major projects (Linux, BSD, and Apache) as they evolved in the 1990s, the "golden age" of open source. He covers several pivotal events during that period and how the community identities gelled as a result of passing through critical times like the fracturing of BSD and other Unixes, flame wars and other crises among the Linux maintainers, and the conflict caused by the "ideological passion" of Richard Stallman and the FSF. This chapter was an interesting retrospective and it really pulled together a cohesive picture of what happened that brought us to where we are today.

Motivation and Organization:

Chapter five examines the microfoundations of open source made up of the motivations of individual contributors. For example, he says that open source developers self-select as a way to boost their egos by using acceptance of their code as a "signal" of its quality to developers who are not necessarily skilled enough to recognize quality on their own.

It is clearly the best programmers who have the strongest incentive to show others just how good they are. If you are mediocre, the last thing you want is for people to see your source code.


Ego boosting is one of six motivating factors he discusses, and is not necessarily the most important for most developers.

Chapter six looks at how individual developers come together to form groups and focus their creative energies with constructive contributions. He studies the social and economic pressures for and against forking a project, and comes to an interesting conclusion: The leader of a project needs the fellow contributors more than they need him. When a fork is created, the new leader has to convince potential followers that the new project will be better or more popular than the old one. So while forking may give the leader more visibility, that only works if he is successful at attracting followers, in which case he is just as likely to be a successful contributor to the original project.

Business Models and Legal Questions:

No examination of open source software would be complete without a discussion of intellectual property law and how open source licenses work with various business models. Weber covers the way OSS subverts the traditional business model of vendor lock-in and leads to new models.

He starts with the models identified by Frank Hecker and Robert Young: support sellers (IBM); loss leaders (hardware vendors); sell it, free it (Netscape); accessorizing with books and training (O'Reilly); and service enabler (HP). Then he moves on to "less pure" models including BitKeeper's delivery of "commercially crippled" versions; VA Linux's web sites and conferences; and Red Hat's packaging and enterprise support model. He also covers companies that build commercial software on top of open source, such as Sun and Apple's use of BSD in their operating systems.

The Code That Changed The World:

Weber begins his final chapter by comparing the impact of OSS to the Japanese manufacturing innovations described in The Machine That Changed the World: The Story of Lean Production and re-emphasizing the importance of process over product.

The Toyota "system" was not a car, and it was not uniquely Japanese. ... Open source is not a piece of software, and it is not unique to a group of hackers.


This leads into the rest of the conclusion, where he brings together observations on intellectual property rights law, the limiting factors for specialization and division of labor and how they impact organizational structures, and the challenges of relating hierarchical versus network organizations. He also offers some observations about how open source techniques and attitudes can be applied directly in other fields such as family practice medicine and genomics.

Recommendations:

Weber covers a lot of material, and his writing is clear, for the most part (especially for an academic :-). I enjoyed reading the first seven chapters, but got a little bogged down in chapter eight. I was disappointed at his reluctance to draw more definite conclusions in a few cases, but by remaining neutral he was able to focus on framing several thought-provoking social and economic questions about the open source movement.

Saturday, September 5, 2009

PyMOTW: fractions - Rational Numbers

fractions – Rational Numbers

Purpose: Implements a class for working with rational numbers.
Python Version: 2.6 and later

The Fraction class implements numerical operations for rational numbers based on the API defined by Rational in numbers.

Creating Fraction Instances

As with decimal, new values can be created in several ways. One easy way is to create them from separate numerator and denominator values:

import fractions

for n, d in [ (1, 2), (2, 4), (3, 6) ]:
    f = fractions.Fraction(n, d)
    print '%s/%s = %s' % (n, d, f)

The numerator and denominator are reduced to lowest terms as new values are computed.

$ python fractions_create_integers.py
1/2 = 1/2
2/4 = 1/2
3/6 = 1/2
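The reduced parts are available through the numerator and denominator attributes. A short sketch, written with print() calls so it also runs on Python 3; the value 6/-8 is just an illustrative choice:

```python
import fractions

f = fractions.Fraction(6, -8)

print(f)              # the sign is carried by the numerator
print(f.numerator)
print(f.denominator)

# Fractions that reduce to the same value compare equal.
print(fractions.Fraction(2, 4) == fractions.Fraction(1, 2))
```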

Another way to create a Fraction is using a string representation of <numerator> / <denominator>:

import fractions

for s in [ '1/2', '2/4', '3/6' ]:
    f = fractions.Fraction(s)
    print '%s = %s' % (s, f)

$ python fractions_create_strings.py
1/2 = 1/2
2/4 = 1/2
3/6 = 1/2

Strings can also use the more usual decimal or floating point notation of [<digits>].[<digits>].

import fractions

for s in [ '0.5', '1.5', '2.0' ]:
    f = fractions.Fraction(s)
    print '%s = %s' % (s, f)

$ python fractions_create_strings_floats.py
0.5 = 1/2
1.5 = 3/2
2.0 = 2

There are class methods for creating Fraction instances directly from other representations of rational values such as float or decimal.

import fractions

for v in [ 0.1, 0.5, 1.5, 2.0 ]:
    print '%s = %s' % (v, fractions.Fraction.from_float(v))

Notice that for floating point values that cannot be expressed exactly in binary, the rational representation may yield unexpected results.

$ python fractions_from_float.py
0.1 = 3602879701896397/36028797018963968
0.5 = 1/2
1.5 = 3/2
2.0 = 2
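The odd-looking result for 0.1 is the exact value of the binary float closest to 0.1, which is why its denominator is a power of two. A brief sketch that checks this, using float.as_integer_ratio() for comparison:

```python
import fractions

f = fractions.Fraction.from_float(0.1)

# The denominator is an exact power of two.
print(f.denominator == 2 ** 55)

# The float itself reports the same numerator/denominator pair.
print((0.1).as_integer_ratio())

# So the result is close to, but not equal to, one tenth.
print(f == fractions.Fraction(1, 10))
```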

Using decimal representations of the values gives the expected results.

import decimal
import fractions

for v in [ decimal.Decimal('0.1'),
           decimal.Decimal('0.5'),
           decimal.Decimal('1.5'),
           decimal.Decimal('2.0'),
           ]:
    print '%s = %s' % (v, fractions.Fraction.from_decimal(v))

$ python fractions_from_decimal.py
0.1 = 1/10
0.5 = 1/2
1.5 = 3/2
2.0 = 2
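On newer interpreters (Python 2.7, and 3.2 or later) the Fraction constructor also accepts float and Decimal instances directly, so the class methods are not strictly needed there. A sketch assuming one of those versions:

```python
import decimal
import fractions

from_dec = fractions.Fraction(decimal.Decimal('0.1'))
from_flt = fractions.Fraction(0.5)

print(from_dec)
print(from_flt)

# Passing a float gives the same exact value as from_float().
print(fractions.Fraction(0.1) == fractions.Fraction.from_float(0.1))
```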

Arithmetic

Once the fractions are instantiated, they can be used in mathematical expressions as you would expect.

import fractions

f1 = fractions.Fraction(1, 2)
f2 = fractions.Fraction(3, 4)

print '%s + %s = %s' % (f1, f2, f1 + f2)
print '%s - %s = %s' % (f1, f2, f1 - f2)
print '%s * %s = %s' % (f1, f2, f1 * f2)
print '%s / %s = %s' % (f1, f2, f1 / f2)

$ python fractions_arithmetic.py
1/2 + 3/4 = 5/4
1/2 - 3/4 = -1/4
1/2 * 3/4 = 3/8
1/2 / 3/4 = 2/3
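Fractions can also be mixed with other numeric types. As a rough sketch of the behavior: combining a Fraction with an int stays exact, while combining it with a float produces a float. The variable names here are illustrative only.

```python
import fractions

half = fractions.Fraction(1, 2)

with_int = half + 1          # Fraction and int give a Fraction
with_float = half + 0.25     # Fraction and float give a float
squared = half ** 2          # integer exponents stay exact

print(repr(with_int))
print(repr(with_float))
print(repr(squared))
print(half < fractions.Fraction(3, 4))
```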

Approximating Values

A useful feature of Fraction is the ability to convert a floating point number to an approximate rational value by limiting the size of the denominator.

import fractions
import math

print 'PI =', math.pi

f_pi = fractions.Fraction(str(math.pi))
print 'No limit =', f_pi

for i in range(1, 100, 5):
    limited = f_pi.limit_denominator(i)
    print '{0:8} = {1}'.format(i, limited)

$ python fractions_limit_denominator.py
PI = 3.14159265359
No limit = 314159265359/100000000000
       1 = 3
       6 = 19/6
      11 = 22/7
      16 = 22/7
      21 = 22/7
      26 = 22/7
      31 = 22/7
      36 = 22/7
      41 = 22/7
      46 = 22/7
      51 = 22/7
      56 = 22/7
      61 = 179/57
      66 = 201/64
      71 = 223/71
      76 = 223/71
      81 = 245/78
      86 = 267/85
      91 = 267/85
      96 = 289/92
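To see what a larger denominator buys, the approximation can be compared against the original float. This sketch prints the absolute error for a few limits; the limits 10, 100, and 1000 are arbitrary choices for illustration.

```python
import fractions
import math

f_pi = fractions.Fraction(str(math.pi))

for max_denominator in [10, 100, 1000]:
    approx = f_pi.limit_denominator(max_denominator)
    error = abs(float(approx) - math.pi)
    print('%5d  %-8s  error=%.2e' % (max_denominator, approx, error))
```

The limits of 10 and 1000 yield the familiar approximations 22/7 and 355/113.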

See also

fractions
The standard library documentation for this module.
decimal
The decimal module provides an API for fixed and floating point math.
numbers
Numeric abstract base classes.
