Sunday, March 30, 2008

PyMOTW: urllib

The urllib module provides a simple interface for network resource access.

Module: urllib
Purpose: Accessing remote resources that don't need authentication, cookies, etc.
Python Version: 1.4 and later

Although urllib can be used with gopher and ftp, these examples all use http.

HTTP GET:

The test server for these examples is in BaseHTTPServer_GET.py, from the PyMOTW examples for the BaseHTTPServer module. Start the server in one terminal window, then run these examples in another.

An HTTP GET operation is the simplest use of urllib. Simply pass the URL to urlopen() to get a "file-like" handle to the remote data.

import urllib

response = urllib.urlopen('http://localhost:8080/')
print 'RESPONSE:', response
print 'URL :', response.geturl()

headers = response.info()
print 'DATE :', headers['date']
print 'HEADERS :'
print '---------'
print headers

data = response.read()
print 'LENGTH :', len(data)
print 'DATA :'
print '---------'
print data


The example server takes the incoming values and formats a plain text response to send back. The return value from urlopen() gives access to the headers from the HTTP server through the info() method, and the data for the remote resource via methods like read() and readlines().


$ python urllib_urlopen.py
RESPONSE: <addinfourl at 10180248 whose fp = <socket._fileobject object at 0x935c30>>
URL : http://localhost:8080/
DATE : Sun, 30 Mar 2008 16:27:10 GMT
HEADERS :
---------
Server: BaseHTTP/0.3 Python/2.5.1
Date: Sun, 30 Mar 2008 16:27:10 GMT

LENGTH : 221
DATA :
---------
CLIENT VALUES:
client_address=('127.0.0.1', 54354) (localhost)
command=GET
path=/
real path=/
query=
request_version=HTTP/1.0

SERVER VALUES:
server_version=BaseHTTP/0.3
sys_version=Python/2.5.1
protocol_version=HTTP/1.0



The file-like object is also iterable:

import urllib

response = urllib.urlopen('http://localhost:8080/')
for line in response:
    print line.rstrip()


Since the lines are returned with newlines and carriage returns intact, this example strips them before printing the output.


$ python urllib_urlopen_iterator.py
CLIENT VALUES:
client_address=('127.0.0.1', 54380) (localhost)
command=GET
path=/
real path=/
query=
request_version=HTTP/1.0

SERVER VALUES:
server_version=BaseHTTP/0.3
sys_version=Python/2.5.1
protocol_version=HTTP/1.0


Encoding Arguments:

Arguments can be passed to the server by encoding them and appending them to the URL.

import urllib

query_args = { 'q':'query string', 'foo':'bar' }
encoded_args = urllib.urlencode(query_args)
print 'Encoded:', encoded_args

url = 'http://localhost:8080/?' + encoded_args
print urllib.urlopen(url).read()


Notice that the query, in the list of client values, contains the encoded query arguments.


$ python urllib_urlencode.py
Encoded: q=query+string&foo=bar
CLIENT VALUES:
client_address=('127.0.0.1', 54415) (localhost)
command=GET
path=/?q=query+string&foo=bar
real path=/
query=q=query+string&foo=bar
request_version=HTTP/1.0

SERVER VALUES:
server_version=BaseHTTP/0.3
sys_version=Python/2.5.1
protocol_version=HTTP/1.0



To pass a sequence of values using separate occurrences of the variable in the query string, pass doseq=True to urlencode().

import urllib

query_args = { 'foo':['foo1', 'foo2'] }
print 'Single :', urllib.urlencode(query_args)
print 'Sequence:', urllib.urlencode(query_args, doseq=True)



$ python urllib_urlencode_doseq.py
Single : foo=%5B%27foo1%27%2C+%27foo2%27%5D
Sequence: foo=foo1&foo=foo2


To decode the query string, see the FieldStorage class from the cgi module.
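As a rough sketch of the decoding step, the query string can also be split back into a dictionary with parse_qs(). The import fallback is an assumption about the interpreter in use: parse_qs() lives in urlparse on Python 2.6 and later (cgi.parse_qs() in older releases) and in urllib.parse on Python 3. The sample arguments are made up for illustration, and print() is called with a single argument so the sketch runs under either major version.

```python
try:
    # Python 3 location
    from urllib.parse import urlencode, parse_qs
except ImportError:
    # Python 2.6+ location
    from urllib import urlencode
    from urlparse import parse_qs

# A list of pairs keeps the argument order predictable and allows repeats
encoded = urlencode([('q', 'query string'), ('foo', 'foo1'), ('foo', 'foo2')])
decoded = parse_qs(encoded)

print(encoded)
print(sorted(decoded.items()))
```

Every value comes back as a list, because a name such as foo may appear more than once in the query string.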

Special characters within the query arguments that might cause parse problems with the URL on the server side are "quoted" when passed to urlencode(). To quote them locally to make safe versions of the strings, you can use the quote() or quote_plus() functions directly.

import urllib

url = 'http://localhost:8080/~dhellmann/'
print 'urlencode() :', urllib.urlencode({'url':url})
print 'quote() :', urllib.quote(url)
print 'quote_plus():', urllib.quote_plus(url)


Notice that quote_plus() is more aggressive about the characters it replaces.


$ python urllib_quote.py
urlencode() : url=http%3A%2F%2Flocalhost%3A8080%2F%7Edhellmann%2F
quote() : http%3A//localhost%3A8080/%7Edhellmann/
quote_plus(): http%3A%2F%2Flocalhost%3A8080%2F%7Edhellmann%2F


To reverse the quote operations, use unquote() or unquote_plus(), as appropriate.

import urllib

print urllib.unquote('http%3A//localhost%3A8080/%7Edhellmann/')
print urllib.unquote_plus('http%3A%2F%2Flocalhost%3A8080%2F%7Edhellmann%2F')



$ python urllib_unquote.py
http://localhost:8080/~dhellmann/
http://localhost:8080/~dhellmann/


HTTP POST:

The test server for these examples is in BaseHTTPServer_POST.py, from the PyMOTW examples for the BaseHTTPServer module. Start the server in one terminal window, then run these examples in another.

To POST data to the remote server, instead of using GET, simply pass the encoded query arguments as data to urlopen().

import urllib

query_args = { 'q':'query string', 'foo':'bar' }
encoded_args = urllib.urlencode(query_args)
url = 'http://localhost:8080/'
print urllib.urlopen(url, encoded_args).read()



$ python urllib_urlopen_post.py
Client: ('127.0.0.1', 54545)
Path: /
Form data:
q=query string
foo=bar



You can send any byte-string as data, if the server expects something other than url-encoded form arguments in the posted data.

Paths vs. URLs:

Some operating systems use different values for separating the components of paths in local files than URLs. To make your code portable, you should use the functions pathname2url() and url2pathname() to convert back and forth. The versions of the functions exported by urllib give you the correct defaults for your platform, so normally that is all you need. Since I am working on a Mac, I have to explicitly import the Windows versions of the functions from nturl2path to demonstrate how they behave.

import os

from urllib import pathname2url, url2pathname

print '== Default =='
path = '/a/b/c'
print 'Original:', path
print 'URL :', pathname2url(path)
print 'Path :', url2pathname('/d/e/f')
print

from nturl2path import pathname2url, url2pathname

print '== Windows, without drive letter =='
path = path.replace('/', '\\')
print 'Original:', path
print 'URL :', pathname2url(path)
print 'Path :', url2pathname('/d/e/f')
print

print '== Windows, with drive letter =='
path = 'C:\\' + path.replace('/', '\\')
print 'Original:', path
print 'URL :', pathname2url(path)
print 'Path :', url2pathname('/d/e/f')


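There are two Windows examples, with and without the drive letter at the start of the path. In the second one the path already begins with a backslash when the 'C:\' prefix is added, which is why the original value is printed with a doubled backslash.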


$ python urllib_pathnames.py
== Default ==
Original: /a/b/c
URL : /a/b/c
Path : /d/e/f

== Windows, without drive letter ==
Original: \a\b\c
URL : /a/b/c
Path : \d\e\f

== Windows, with drive letter ==
Original: C:\\a\b\c
URL : ///C|/a/b/c
Path : \d\e\f
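A minimal sketch of the portable round trip, assuming a relative path with a space in it (the file name is made up). The import fallback is needed because the two functions are exported by urllib in Python 2 and by urllib.request in Python 3.

```python
try:
    from urllib.request import pathname2url, url2pathname  # Python 3
except ImportError:
    from urllib import pathname2url, url2pathname  # Python 2

import os

# Build the path with os.path.join() so the separators match the platform
local_path = os.path.join('docs', 'my file.txt')
url_path = pathname2url(local_path)
round_trip = url2pathname(url_path)

print(url_path)
print(round_trip == local_path)
```

The URL form always uses forward slashes and quotes the space, while the round trip returns the platform's own separators.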


Simple Retrieval with Cache:

Retrieving data is a common operation, and urllib includes the urlretrieve() function so you don't have to write your own. urlretrieve() takes arguments for the URL, a temporary file to hold the data, a function to report on download progress, and data to pass if the URL refers to a form where data should be POSTed. If no filename is given, urlretrieve() creates a temporary file. You can delete the file yourself, or treat the file as a cache and use urlcleanup() to remove it.

This example uses GET to retrieve some data from a web server:

import urllib
import os

def reporthook(blocks_read, block_size, total_size):
    if not blocks_read:
        print 'Connection opened'
        return
    if total_size < 0:
        # Unknown size
        print 'Read %d blocks' % blocks_read
    else:
        amount_read = blocks_read * block_size
        print 'Read %d blocks, or %d/%d' % (blocks_read, amount_read, total_size)
    return

try:
    filename, msg = urllib.urlretrieve('http://blog.doughellmann.com/', reporthook=reporthook)
    print
    print 'File:', filename
    print 'Headers:'
    print msg
    print 'File exists before cleanup:', os.path.exists(filename)

finally:
    urllib.urlcleanup()

    print 'File still exists:', os.path.exists(filename)


Since the server does not return a Content-length header, urlretrieve() does not know how big the data should be, and passes -1 as the total_size argument to reporthook().


$ python urllib_urlretrieve.py
Connection opened
Read 1 blocks
Read 2 blocks
Read 3 blocks
Read 4 blocks
Read 5 blocks
Read 6 blocks
Read 7 blocks
Read 8 blocks
Read 9 blocks
Read 10 blocks
Read 11 blocks
Read 12 blocks
Read 13 blocks
Read 14 blocks
Read 15 blocks
Read 16 blocks
Read 17 blocks
Read 18 blocks
Read 19 blocks

File: /var/folders/9R/9R1t+tR02Raxzk+F71Q50U+++Uw/-Tmp-/tmp3HRpZP
Headers:
Content-Type: text/html; charset=UTF-8
Last-Modified: Tue, 25 Mar 2008 23:09:10 GMT
Cache-Control: max-age=0 private
ETag: "904b02e0-c7ff-47f6-9f35-cc6de5d2a2e5"
Server: GFE/1.3
Date: Sun, 30 Mar 2008 17:36:48 GMT
Connection: Close

File exists before cleanup: True
File still exists: False
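When the server does send a size, the three reporthook() arguments are enough to work out a percentage. The helper below is a hypothetical sketch (percent_done() is not part of urllib), and the block and file sizes in the calls are invented sample values.

```python
def percent_done(blocks_read, block_size, total_size):
    """Return progress as a percentage, or None if the size is unknown."""
    if total_size <= 0:
        # urlretrieve() passes -1 when the size is unknown; treating 0 the
        # same way avoids dividing by zero
        return None
    amount_read = blocks_read * block_size
    # The last block is usually only partly filled, so cap the result
    return min(100.0, 100.0 * amount_read / total_size)

print(percent_done(0, 8192, 20000))
print(percent_done(1, 8192, 20000))
print(percent_done(3, 8192, 20000))
print(percent_done(5, 8192, -1))
```

The cap matters because blocks_read * block_size counts whole blocks, so the product can exceed total_size on the final call.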


URLopener:

urllib provides a URLopener base class, and FancyURLopener with default handling for the supported protocols. If you find yourself needing to change their behavior, you are probably better off looking at the urllib2 module, added in Python 2.1 (to be covered in a future PyMOTW).

References:

RFC 2616 - HTTP Specification
cgi - For decoding query arguments.
PyMOTW: BaseHTTPServer
urllib2 - For more complex URL access needs
Python Module of the Week Home
Download Sample Code




Sunday, March 23, 2008

PyMOTW: collections

The collections module includes container data types beyond the builtin types list and dict.

Module: collections
Purpose: Container data types.
Python Version: 2.4 and later

Deque:

A double-ended queue, or "deque", supports adding and removing elements from either end. The more commonly used stacks and queues are degenerate forms of deques, where the inputs and outputs are restricted to a single end.

Since deques are a type of sequence container, they support some of the same operations that lists support, such as examining the contents with __getitem__(), determining length, and removing elements from the middle by matching their value.

import collections

d = collections.deque('abcdefg')
print 'Deque:', d
print 'Length:', len(d)
print 'Left end:', d[0]
print 'Right end:', d[-1]

d.remove('c')
print 'remove(c):', d



$ python collections_deque.py
Deque: deque(['a', 'b', 'c', 'd', 'e', 'f', 'g'])
Length: 7
Left end: a
Right end: g
remove(c): deque(['a', 'b', 'd', 'e', 'f', 'g'])


A deque can be populated from either end, termed "left" and "right" in the Python implementation.

import collections

# Add to the right
d = collections.deque()
d.extend('abcdefg')
print 'extend :', d
d.append('h')
print 'append :', d

# Add to the left
d = collections.deque()
d.extendleft('abcdefg')
print 'extendleft:', d
d.appendleft('h')
print 'appendleft:', d


Notice that extendleft() iterates over its input and performs the equivalent of an appendleft() for each item. The end result is that the deque contains the input sequence in reverse order.


$ python collections_deque_populating.py
extend : deque(['a', 'b', 'c', 'd', 'e', 'f', 'g'])
append : deque(['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h'])
extendleft: deque(['g', 'f', 'e', 'd', 'c', 'b', 'a'])
appendleft: deque(['h', 'g', 'f', 'e', 'd', 'c', 'b', 'a'])


Similarly, the elements of the deque can be consumed from both or either end, depending on the algorithm you're applying.

import collections

print 'From the right:'
d = collections.deque('abcdefg')
while True:
    try:
        print d.pop()
    except IndexError:
        break

print 'From the left:'
d = collections.deque('abcdefg')
while True:
    try:
        print d.popleft()
    except IndexError:
        break



$ python collections_deque_consuming.py
From the right:
g
f
e
d
c
b
a
From the left:
a
b
c
d
e
f
g
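The earlier remark that stacks and queues are restricted forms of a deque can be sketched with the same methods. This is only an illustration, with print() called on a single argument so it runs under Python 2 or 3.

```python
import collections

# Stack: add and remove at the same (right) end, last in, first out
stack = collections.deque()
for item in 'abc':
    stack.append(item)
stack_order = [stack.pop() for _ in range(3)]

# Queue: add at the right, remove from the left, first in, first out
queue = collections.deque()
for item in 'abc':
    queue.append(item)
queue_order = [queue.popleft() for _ in range(3)]

print(stack_order)
print(queue_order)
```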


Since deques are thread-safe, you can even consume the contents from both ends at the same time in separate threads.

import collections
import threading
import time

candle = collections.deque(xrange(11))

def burn(direction, nextSource):
    while True:
        try:
            next = nextSource()
        except IndexError:
            break
        else:
            print '%8s: %s' % (direction, next)
            time.sleep(0.1)
    print '%8s done' % direction
    return

left = threading.Thread(target=burn, args=('Left', candle.popleft))
right = threading.Thread(target=burn, args=('Right', candle.pop))

left.start()
right.start()

left.join()
right.join()



$ python collections_deque_both_ends.py
Left: 0
Right: 10
Left: 1
Right: 9
Left: 2
Right: 8
Left: 3
Right: 7
Left: 4
Right: 6
Left: 5
Right done
Left done


Another useful capability of the deque is to rotate it in either direction, to skip over some item(s).

import collections

d = collections.deque(xrange(10))
print 'Normal :', d

d = collections.deque(xrange(10))
d.rotate(2)
print 'Right rotation:', d

d = collections.deque(xrange(10))
d.rotate(-2)
print 'Left rotation :', d


Rotating the deque to the right (using a positive rotation) takes items from the right end and moves them to the left end. Rotating to the left (with a negative value) takes items from the left end and moves them to the right end. It may help to visualize the items in the deque as being engraved along the edge of a dial.


$ python collections_deque_rotate.py
Normal : deque([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
Right rotation: deque([8, 9, 0, 1, 2, 3, 4, 5, 6, 7])
Left rotation : deque([2, 3, 4, 5, 6, 7, 8, 9, 0, 1])
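One way to convince yourself of what rotate() does is to write it by hand with the pop and append methods. The rotate_by_hand() function below is a hypothetical helper for illustration only, and it assumes a non-empty deque.

```python
import collections

def rotate_by_hand(d, n):
    """Rotate deque d in place by n steps using only pop/append calls."""
    for _ in range(abs(n)):
        if n > 0:
            # Right rotation: move one item from the right end to the left
            d.appendleft(d.pop())
        else:
            # Left rotation: move one item from the left end to the right
            d.append(d.popleft())

for steps in (2, -2):
    manual = collections.deque(range(10))
    rotate_by_hand(manual, steps)
    builtin = collections.deque(range(10))
    builtin.rotate(steps)
    print('%2d %s %s' % (steps, list(manual), manual == builtin))
```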


defaultdict:

The standard dictionary includes the method setdefault() for retrieving a value and establishing a default if the value does not exist. By contrast, defaultdict lets you specify the default up front when it is initialized.

import collections

def default_factory():
    return 'default value'

d = collections.defaultdict(default_factory, foo='bar')
print d
print d['foo']
print d['bar']



$ python collections_defaultdict.py
defaultdict(<function default_factory at 0x7ca70>, {'foo': 'bar'})
bar
default value


This works well as long as it is appropriate for all keys to use that same default. It can be especially useful if the default is a type used for aggregating or accumulating values, such as a list, set, or even integer. The standard library documentation includes several examples of using defaultdict this way.
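As a small sketch of the aggregating use, here are list and int used as default factories. The word list is invented sample data.

```python
import collections

words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry', 'apple']

# list as the factory: a missing key starts out as an empty list
by_letter = collections.defaultdict(list)
# int as the factory: a missing key starts out as 0, ready to increment
counts = collections.defaultdict(int)

for word in words:
    by_letter[word[0]].append(word)
    counts[word] += 1

print(sorted(by_letter.items()))
print(sorted(counts.items()))
```

Neither loop body needs to check whether the key already exists, which is the part setdefault() would otherwise handle on every access.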

References:

Wikipedia: Deque
Deque Recipes
defaultdict examples
James Tauber: Evolution of Default Dictionaries in Python
Python Module of the Week Home
Download Sample Code




Monday, March 17, 2008

Which module should I write about next?

I had some great feedback about the PyMOTW series from several of you at PyCon this weekend. Unfortunately, when I put you on the spot, no one had suggestions for what to write about next. I've been going through the library more or less randomly, and in the absence of a better idea I can continue with that plan. On the other hand, if there's a topic that you would really like more details on, let me know and I'll try to bump it up in the queue. My regular work schedule is pretty slammed right now, so smaller (or simpler) modules will be given higher priority than anything like re or socket, both of which have entire books written about them.

So, speak up and let me know what to write about next week.

Sunday, March 16, 2008

PyMOTW: datetime

The datetime module includes functions and classes for doing date parsing, formatting, and arithmetic.

Module: datetime
Purpose: Date/time value manipulation.
Python Version: 2.3 and later

Times:

Time values are represented with the time class. Times have attributes for hour, minute, second, and microsecond. They also, optionally, include time zone information. The arguments to initialize a time instance are optional, but the default of 0 is unlikely to be what you want.

import datetime

t = datetime.time(1, 2, 3)
print t
print 'hour :', t.hour
print 'minute:', t.minute
print 'second:', t.second
print 'microsecond:', t.microsecond
print 'tzinfo:', t.tzinfo



$ python datetime_time.py
01:02:03
hour : 1
minute: 2
second: 3
microsecond: 0
tzinfo: None


A time instance only holds values of time, and not dates.

import datetime

print 'Earliest :', datetime.time.min
print 'Latest :', datetime.time.max
print 'Resolution:', datetime.time.resolution


The min and max class attributes reflect the valid range of times in a single day.


$ python datetime_time_minmax.py
Earliest : 00:00:00
Latest : 23:59:59.999999
Resolution: 0:00:00.000001


The resolution for time is limited to microseconds. More precise values are truncated.

import datetime

for m in [ 1, 0, 0.1, 0.6 ]:
    print '%02.1f :' % m, datetime.time(0, 0, 0, microsecond=m)


In fact, using floating point numbers for the microsecond argument generates a DeprecationWarning.


$ python datetime_time_resolution.py
/Users/dhellmann/Documents/PyMOTW/in_progress/datetime/datetime_time_resolution.py:14: DeprecationWarning: integer argument expected, got float
print '%02.1f :' % m, datetime.time(0, 0, 0, microsecond=m)
1.0 : 00:00:00.000001
0.0 : 00:00:00
0.1 : 00:00:00
0.6 : 00:00:00


Dates:

Basic date values are represented with the date class. Instances have attributes for year, month, and day. It is easy to create a date representing today's date using the today() class method.

import datetime

today = datetime.date.today()
print today
print 'ctime:', today.ctime()
print 'tuple:', today.timetuple()
print 'ordinal:', today.toordinal()
print 'Year:', today.year
print 'Mon :', today.month
print 'Day :', today.day


This example prints today's date in several formats:


$ python datetime_date.py
2008-03-13
ctime: Thu Mar 13 00:00:00 2008
tuple: (2008, 3, 13, 0, 0, 0, 3, 73, -1)
ordinal: 733114
Year: 2008
Mon : 3
Day : 13


There are also class methods for creating instances from integers (using proleptic Gregorian ordinal values, which start counting from Jan. 1 of the year 1) or POSIX timestamp values.

import datetime
import time

o = 733114
print 'o:', o
print 'fromordinal(o):', datetime.date.fromordinal(o)
t = time.time()
print 't:', t
print 'fromtimestamp(t):', datetime.date.fromtimestamp(t)


This example illustrates the different value types used by fromordinal() and fromtimestamp().


$ python datetime_date_fromordinal.py
o: 733114
fromordinal(o): 2008-03-13
t: 1205436039.53
fromtimestamp(t): 2008-03-13


The range of date values supported can be determined using the min and max attributes.

import datetime

print 'Earliest :', datetime.date.min
print 'Latest :', datetime.date.max
print 'Resolution:', datetime.date.resolution


The resolution for dates is whole days.


$ python datetime_date_minmax.py
Earliest : 0001-01-01
Latest : 9999-12-31
Resolution: 1 day, 0:00:00


Another way to create new date instances uses the replace() method of an existing date. For example, you can change the year, leaving the day and month alone.

import datetime

d1 = datetime.date(2008, 3, 12)
print 'd1:', d1

d2 = d1.replace(year=2009)
print 'd2:', d2



$ python datetime_date_replace.py
d1: 2008-03-12
d2: 2009-03-12
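One thing to watch for with replace() is that the result still has to be a valid date. A minimal sketch, assuming Python 2.6 or later for the "except ... as" syntax:

```python
import datetime

leap_day = datetime.date(2008, 2, 29)

try:
    next_year = leap_day.replace(year=2009)
except ValueError as err:
    # 2009 is not a leap year, so there is no February 29
    next_year = None
    print('ValueError: %s' % err)

# 2012 is a leap year, so this replacement succeeds
print(leap_day.replace(year=2012))
```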


timedeltas:

Using replace() is not the only way to calculate future/past dates. You can use datetime to perform basic arithmetic on date values via the timedelta class. Subtracting dates produces a timedelta, and a timedelta can be added to or subtracted from a date to produce another date. The internal values for timedeltas are stored in days, seconds, and microseconds.

import datetime

print "microseconds:", datetime.timedelta(microseconds=1)
print "milliseconds:", datetime.timedelta(milliseconds=1)
print "seconds :", datetime.timedelta(seconds=1)
print "minutes :", datetime.timedelta(minutes=1)
print "hours :", datetime.timedelta(hours=1)
print "days :", datetime.timedelta(days=1)
print "weeks :", datetime.timedelta(weeks=1)


Intermediate level values passed to the constructor are converted into days, seconds, and microseconds.


$ python datetime_timedelta.py
microseconds: 0:00:00.000001
milliseconds: 0:00:00.001000
seconds : 0:00:01
minutes : 0:01:00
hours : 1:00:00
days : 1 day, 0:00:00
weeks : 7 days, 0:00:00
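The normalization is easier to see with values that overflow their own unit. This sketch uses arbitrary sample arguments and reads back the three stored attributes.

```python
import datetime

# 25 hours is 1 day plus 3600 seconds; 1500 ms is 1 second plus 500000 us
delta = datetime.timedelta(hours=25, milliseconds=1500)

print(delta)
print('days        : %d' % delta.days)
print('seconds     : %d' % delta.seconds)
print('microseconds: %d' % delta.microseconds)
```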


Arithmetic:

Date math uses the standard arithmetic operators. This example with date objects illustrates using timedeltas to compute new dates, and subtracting date instances to produce timedeltas (including a negative delta value).

import datetime

today = datetime.date.today()
print 'Today :', today

one_day = datetime.timedelta(days=1)
print 'One day :', one_day

yesterday = today - one_day
print 'Yesterday:', yesterday

tomorrow = today + one_day
print 'Tomorrow :', tomorrow

print 'tomorrow - yesterday:', tomorrow - yesterday
print 'yesterday - tomorrow:', yesterday - tomorrow



$ python datetime_date_math.py
Today : 2008-03-13
One day : 1 day, 0:00:00
Yesterday: 2008-03-12
Tomorrow : 2008-03-14
tomorrow - yesterday: 2 days, 0:00:00
yesterday - tomorrow: -2 days, 0:00:00


Comparing Values:

Both date and time values can be compared using the standard operators to determine which is earlier or later.

import datetime
import time

print 'Times:'
t1 = datetime.time(12, 55, 0)
print '\tt1:', t1
t2 = datetime.time(13, 5, 0)
print '\tt2:', t2
print '\tt1 < t2:', t1 < t2

print 'Dates:'
d1 = datetime.date.today()
print '\td1:', d1
d2 = datetime.date.today() + datetime.timedelta(days=1)
print '\td2:', d2
print '\td1 > d2:', d1 > d2



$ python datetime_comparing.py
Times:
t1: 12:55:00
t2: 13:05:00
t1 < t2: True
Dates:
d1: 2008-03-13
d2: 2008-03-14
d1 > d2: False


Combining Dates and Times:

You should use the datetime class to hold values consisting of both date and time components. As with date, there are several convenient class methods to make creating datetime objects from other common values easier.

import datetime

print 'Now :', datetime.datetime.now()
print 'Today :', datetime.datetime.today()
print 'UTC Now:', datetime.datetime.utcnow()

d = datetime.datetime.now()
for attr in [ 'year', 'month', 'day', 'hour', 'minute', 'second', 'microsecond']:
    print attr, ':', getattr(d, attr)


As you might expect, the datetime instance has all of the attributes of a date and time object.


$ python datetime_datetime.py
Now : 2008-03-15 22:58:14.770074
Today : 2008-03-15 22:58:14.779804
UTC Now: 2008-03-16 03:58:14.779858
year : 2008
month : 3
day : 15
hour : 22
minute : 58
second : 14
microsecond : 780399


Just as with date, the datetime class provides convenient class methods for creating new instances. Of course it includes fromordinal() and fromtimestamp(). In addition, combine() can be useful if you already have a date instance and time instance and want to create a datetime.

import datetime

t = datetime.time(1, 2, 3)
print 't :', t

d = datetime.date.today()
print 'd :', d

dt = datetime.datetime.combine(d, t)
print 'dt:', dt



$ python datetime_datetime_combine.py
t : 01:02:03
d : 2008-03-16
dt: 2008-03-16 01:02:03


Formatting and Parsing:

The default string representation of a datetime object is based on the ISO 8601 format (YYYY-MM-DDTHH:MM:SS.mmmmmm), except that str() separates the date and time with a space instead of "T"; isoformat() produces the "T" form. Alternate formats can be generated using strftime(). Similarly, if your input data includes timestamp values parsable with time.strptime(), then datetime.strptime() is a convenient way to convert them to datetime instances.

import datetime

format = "%a %b %d %H:%M:%S %Y"

today = datetime.datetime.today()
print 'ISO :', today

s = today.strftime(format)
print 'strftime:', s

d = datetime.datetime.strptime(s, format)
print 'strptime:', d.strftime(format)



$ python datetime_datetime_strptime.py
ISO : 2008-03-16 08:08:16.275134
strftime: Sun Mar 16 08:08:16 2008
strptime: Sun Mar 16 08:08:16 2008
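For input that is already close to ISO 8601, an explicit format string does the parsing. A short sketch with a made-up timestamp, which also shows the difference between str() and isoformat():

```python
import datetime

d = datetime.datetime.strptime('2008-03-16 08:08:16', '%Y-%m-%d %H:%M:%S')

print(str(d))         # date and time separated by a space
print(d.isoformat())  # date and time separated by 'T'
```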


Time Zones:

Within datetime, time zones are represented by subclasses of datetime.tzinfo. Since tzinfo is an abstract base class, you need to define a subclass and provide appropriate implementations for a few methods to make it useful. Unfortunately, datetime does not include any actual implementations ready to be used. Ironically, the documentation does provide a few sample implementations. Refer to the tzinfo page for examples using fixed offsets as well as a DST-aware class and more details about creating your own class.
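To give a feel for what such a subclass involves, here is a minimal sketch of a fixed-offset zone. The FixedOffset class and its constructor arguments are hypothetical names chosen for this illustration (it is not the class from the library documentation), and it deliberately ignores daylight saving time.

```python
import datetime

class FixedOffset(datetime.tzinfo):
    """Hypothetical tzinfo with a constant UTC offset and no DST."""

    def __init__(self, minutes, name):
        self._offset = datetime.timedelta(minutes=minutes)
        self._name = name

    def utcoffset(self, dt):
        return self._offset

    def dst(self, dt):
        # A fixed offset never observes daylight saving time
        return datetime.timedelta(0)

    def tzname(self, dt):
        return self._name

utc = FixedOffset(0, 'UTC')
eastern_standard = FixedOffset(-5 * 60, 'EST')

local = datetime.datetime(2008, 3, 16, 12, 0, tzinfo=eastern_standard)
print(local)
print(local.astimezone(utc))
```

With a tzinfo attached, astimezone() can convert between zones. Real-world zones need the DST-aware approach described in the documentation.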

References:

PLEAC - Dates and Times
Wikipedia: Proleptic Gregorian calendar
PyMOTW: calendar
PyMOTW: time
Python Module of the Week Home
Download Sample Code

Updated 18 Mar with PLEAC link in references.




Sunday, March 9, 2008

PyMOTW: time

The time module provides functions for working with dates and times.

Module: time
Purpose: Functions for manipulating times.
Python Version: 1.4 or earlier

Description:

The time module exposes C library functions for manipulating dates and times. Since it is tied to the underlying C implementation, some details (such as the start of the epoch and maximum date value supported) are platform-specific. Refer to the library documentation for complete details.

Wall Clock Time:

One of the core functions of the time module is time.time(), which returns the number of seconds since the start of the epoch as a floating point value.

import time

print 'The time is:', time.time()


Although the value is always a float, actual precision is platform-dependent.


$ python time_time.py
The time is: 1205079300.54


The float representation is useful when storing or comparing dates, but not as useful for producing human readable representations. For logging or printing the time, time.ctime() can be more useful.

import time

print 'The time is :', time.ctime()
later = time.time() + 15
print '15 secs from now :', time.ctime(later)


Here the second output line shows how to use ctime() to format a time value other than the current time.


$ python time_ctime.py
The time is : Sun Mar 9 12:18:02 2008
15 secs from now : Sun Mar 9 12:18:17 2008


Processor Clock Time:

While time() returns a wall clock time, clock() returns processor clock time. The values returned from clock() should be used for performance testing, benchmarking, etc. since they reflect the actual time used by the program, and can be more precise than the values from time().

import hashlib
import time

# Data to use to calculate sha1 checksums
data = open(__file__, 'rt').read()

for i in range(5):
    h = hashlib.sha1()
    print time.ctime(), ': %0.3f %0.3f' % (time.time(), time.clock())
    for i in range(100000):
        h.update(data)
    cksum = h.digest()


In this example, the formatted ctime() is printed along with the floating point values from time(), and clock() for each iteration through the loop. If you want to run the example on your system, you may have to add more cycles to the inner loop or work with a larger amount of data to actually see a difference.


$ python time_clock.py
Sun Mar 9 12:41:53 2008 : 1205080913.260 0.030
Sun Mar 9 12:41:53 2008 : 1205080913.682 0.440
Sun Mar 9 12:41:54 2008 : 1205080914.103 0.860
Sun Mar 9 12:41:54 2008 : 1205080914.518 1.270
Sun Mar 9 12:41:54 2008 : 1205080914.932 1.680


Typically, the processor clock doesn't tick if your program isn't doing anything.

import time

for i in range(6, 1, -1):
    print '%s %0.2f %0.2f' % (time.ctime(), time.time(), time.clock())
    print 'Sleeping', i
    time.sleep(i)


In this example, the loop does very little work by going to sleep after each iteration. The time.time() value increases even while the app is asleep, but the time.clock() value does not.


$ python time_clock_sleep.py
Sun Mar 9 12:46:36 2008 1205081196.20 0.02
Sleeping 6
Sun Mar 9 12:46:42 2008 1205081202.20 0.02
Sleeping 5
Sun Mar 9 12:46:47 2008 1205081207.20 0.02
Sleeping 4
Sun Mar 9 12:46:51 2008 1205081211.20 0.02
Sleeping 3
Sun Mar 9 12:46:54 2008 1205081214.21 0.02
Sleeping 2


Calling time.sleep() yields control from the current thread and asks it to wait for the system to wake it back up. If your program has only one thread, this effectively blocks the app and it does no work.

struct_time:

Storing times as elapsed seconds is useful in some situations, but there are times when you need to have access to the individual fields of a date (year, month, etc.). The time module defines struct_time for holding date and time values with components broken out so they are easy to access. There are several functions that work with struct_time values instead of floats.

import time

print 'gmtime :', time.gmtime()
print 'localtime:', time.localtime()
print 'mktime :', time.mktime(time.localtime())

print
t = time.localtime()
print 'Day of month:', t.tm_mday
print ' Day of week:', t.tm_wday
print ' Day of year:', t.tm_yday


gmtime() returns the current time in UTC. localtime() returns the current time with the current time zone applied. mktime() takes a struct_time and converts it to the floating point representation.


$ python time_struct.py
gmtime : (2008, 3, 9, 16, 58, 19, 6, 69, 0)
localtime: (2008, 3, 9, 12, 58, 19, 6, 69, 1)
mktime : 1205081899.0

Day of month: 9
Day of week: 6
Day of year: 69
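The conversions can be checked by going in both directions. In this sketch the timestamp is the mktime value from the output above. Since the time module has no inverse for gmtime(), the example assumes calendar.timegm() from the standard library's calendar module for the UTC direction.

```python
import calendar
import time

seconds = 1205081899.0  # sample timestamp, whole seconds

# localtime() and mktime() undo each other for local time
local_round_trip = time.mktime(time.localtime(seconds))

# gmtime() has no counterpart in time; calendar.timegm() fills that role
utc_round_trip = calendar.timegm(time.gmtime(seconds))

print(local_round_trip == seconds)
print(utc_round_trip == seconds)
```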


Parsing and Formatting Times:

The two functions strptime() and strftime() convert between struct_time and string representations of time values. There is a long list of formatting instructions available to support input and output in different styles. The complete list is documented in the library documentation for the time module.

This example converts the current time from a string, to a struct_time instance, and back to a string.

import time

now = time.ctime()
print now
parsed = time.strptime(now)
print parsed
print time.strftime("%a %b %d %H:%M:%S %Y", parsed)


The output string is not exactly like the input, since the day of the month is prefixed with a zero.


$ python time_strptime.py
Sun Mar 9 13:01:19 2008
(2008, 3, 9, 13, 1, 19, 6, 69, -1)
Sun Mar 09 13:01:19 2008
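When the input is not in the ctime() layout, pass strptime() a format string of your own. A minimal sketch with a made-up input string:

```python
import time

parsed = time.strptime('2008-03-09 13:01:19', '%Y-%m-%d %H:%M:%S')

# strptime() fills in the derived fields as well as the parsed ones
print('weekday : %d' % parsed.tm_wday)   # Monday is 0
print('year day: %d' % parsed.tm_yday)
print(time.strftime('%d/%m/%Y %H:%M', parsed))
```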


Working with Time Zones:

The functions for determining the current time depend on having the time zone set, either by your program or by using a default time zone set for the system. Changing the time zone does not change the actual time, just the way it is represented.

To change the time zone, set the environment variable TZ, then call tzset(). Using TZ, you can specify the time zone with a lot of detail, right down to the start and stop times for daylight saving time. It is usually easier to use the time zone name and let the underlying libraries derive the other information, though.

This example program changes the time zone to a few different values and shows how the changes affect other settings in the time module.

import time
import os

def show_zone_info():
    print '\tTZ :', os.environ.get('TZ', '(not set)')
    print '\ttzname:', time.tzname
    print '\tZone : %d (%d)' % (time.timezone, (time.timezone / 3600))
    print '\tDST :', time.daylight
    print '\tTime :', time.ctime()
    print

print 'Default :'
show_zone_info()

for zone in [ 'US/Eastern', 'US/Pacific', 'GMT', 'Europe/Amsterdam' ]:
    os.environ['TZ'] = zone
    time.tzset()
    print zone, ':'
    show_zone_info()


My default time zone is US/Eastern, so setting TZ to that has no effect. The other zones used change the tzname, daylight flag, and timezone offset value.


$ python time_timezone.py
Default :
TZ : (not set)
tzname: ('EST', 'EDT')
Zone : 18000 (5)
DST : 1
Time : Sun Mar 9 13:06:53 2008

US/Eastern :
TZ : US/Eastern
tzname: ('EST', 'EDT')
Zone : 18000 (5)
DST : 1
Time : Sun Mar 9 13:06:53 2008

US/Pacific :
TZ : US/Pacific
tzname: ('PST', 'PDT')
Zone : 28800 (8)
DST : 1
Time : Sun Mar 9 10:06:53 2008

GMT :
TZ : GMT
tzname: ('GMT', 'GMT')
Zone : 0 (0)
DST : 0
Time : Sun Mar 9 17:06:53 2008

Europe/Amsterdam :
TZ : Europe/Amsterdam
tzname: ('CET', 'CEST')
Zone : -3600 (-1)
DST : 1
Time : Sun Mar 9 18:06:53 2008
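Because changing TZ affects the whole process, it can be worth restoring the old value afterwards. The ctime_in_zone() helper below is a hypothetical sketch of that pattern. It assumes a Unix platform (tzset() is not available elsewhere) and uses POSIX-style TZ strings such as 'UTC0' and 'EST5', which spell out the offset rather than relying on the zone database.

```python
import os
import time

def ctime_in_zone(zone, seconds):
    """Format seconds with time.ctime() as seen from the given TZ value."""
    if not hasattr(time, 'tzset'):
        raise RuntimeError('time.tzset() is only available on Unix')
    saved = os.environ.get('TZ')
    os.environ['TZ'] = zone
    time.tzset()
    try:
        return time.ctime(seconds)
    finally:
        # Put the original setting back so later code is unaffected
        if saved is None:
            del os.environ['TZ']
        else:
            os.environ['TZ'] = saved
        time.tzset()

if hasattr(time, 'tzset'):
    print(ctime_in_zone('UTC0', 0))
    print(ctime_in_zone('EST5', 0))
```

The same instant (the start of the epoch) is reported five hours apart, which illustrates that the zone changes the representation and not the time itself.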



References:

datetime module
locale module
calendar module
PyMOTW: calendar

Python Module of the Week Home
Download Sample Code




Sunday, March 2, 2008

PyMOTW: EasyDialogs

Use EasyDialogs to include Mac OS-native dialogs in your Python scripts.

Module: EasyDialogs
Purpose: Provides simple interfaces to Carbon dialogs from Python.
Python Version: At least 2.0, Macintosh-only (see References below for a Windows implementation)

Description:

The EasyDialogs module includes classes and functions for working with simple message and prompt dialogs, as well as stock dialogs for querying the user for file or directory names. The dialogs use the Carbon API. See Apple's Navigation Services Reference for more details about some of the options not covered in detail here.

Messages:

The simple Message function displays a modal dialog containing a text message for the user.

import EasyDialogs

EasyDialogs.Message('This is a Message dialog')


MessageDialog.png


It is easy to change the label of the "OK" button using the ok argument.

import EasyDialogs

EasyDialogs.Message('The button label has changed', ok='Continue')


MessageDialog_continue.png


ProgressBar:

The ProgressBar class manages a modeless dialog with a progress meter. It can operate in determinate (when you know how much work there is to be done) or indeterminate (when you want to show that your app is working, but do not know how much work needs to be done) modes. The constructor takes arguments for the dialog title, the maximum value, and a label to describe the current phase of operation.

In determinate mode, set the maxval argument to the number of steps, amount of data to download, etc. Then use the inc() method to step the progress from 0 to maxval.

import EasyDialogs
import time

meter = EasyDialogs.ProgressBar('Making progress...',
                                maxval=10,
                                label='Starting',
                                )
for i in xrange(1, 11):
    phase = 'Phase %d' % i
    print phase
    meter.label(phase)
    meter.inc()
    time.sleep(1)
print 'Done with loop'
time.sleep(1)

del meter
print 'The dialog should be gone now'

time.sleep(1)


$ python EasyDialogs_ProgressBar.py 
Phase 1
Phase 2
Phase 3
Phase 4


ProgressBar_partial.png


Phase 5
Phase 6
Phase 7
Phase 8
Phase 9
Phase 10
Done with loop


ProgressBar_complete.png


The dialog should be gone now


Explicitly deleting the ProgressBar instance using del removes it from the screen.

If you are measuring progress in uneven steps, you can use set() to change the progress meter instead of inc().

import EasyDialogs
import time

meter = EasyDialogs.ProgressBar('Making progress...',
                                maxval=1000,
                                label='Starting',
                                )
for i in xrange(1, 1001, 123):
    msg = 'Bytes: %d' % i
    meter.label(msg)
    meter.set(i)
    time.sleep(1)


ProgressBar_set_partial.png
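
To make the difference between relative and absolute updates concrete without needing a Mac, here is a rough text-only sketch. TextProgress is a hypothetical stand-in written for this illustration. It is not part of EasyDialogs, and clamping values to the 0..maxval range is an assumption of the sketch rather than documented behavior of the real dialog.

```python
class TextProgress(object):
    """Hypothetical text-mode meter that mirrors the inc()/set() idea."""

    def __init__(self, maxval):
        self.maxval = maxval
        self.curval = 0

    def inc(self, n=1):
        # Relative update: move forward by n steps.
        self.set(self.curval + n)

    def set(self, value):
        # Absolute update: jump straight to value (clamped; an assumption).
        self.curval = max(0, min(value, self.maxval))

    def render(self, width=20):
        # Draw the meter as a fixed-width bar plus a numeric readout.
        filled = self.curval * width // self.maxval
        bar = '#' * filled + '.' * (width - filled)
        return '[%s] %d/%d' % (bar, self.curval, self.maxval)


meter = TextProgress(maxval=1000)
for i in range(1, 1001, 123):
    meter.set(i)  # uneven progress: report the absolute position
    print(meter.render())
```

With even steps, inc() saves you from tracking the position yourself. With uneven steps, such as byte counts from a download, set() lets you report the running total directly.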


Simple Prompts:

EasyDialogs also lets you ask the user for information. Use AskString to display a modal dialog to prompt the user for a simple string.

import EasyDialogs

response = EasyDialogs.AskString('What is your favorite color?', default='blue')
print 'RESPONSE:', response


AskString.png


The return value depends on the user's response. It is either the text they enter:


$ python EasyDialogs_AskString.py
RESPONSE: blue


or None if they press the Cancel button.


$ python EasyDialogs_AskString.py
RESPONSE: None
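
Because a cancelled dialog yields None rather than a string, calling code usually needs to check for it before using the value. Here is a small sketch of one way to do that. The helper name ask_with_fallback is hypothetical, and the prompt function is passed in as an argument so the logic can be tried without EasyDialogs installed.

```python
def ask_with_fallback(ask, prompt, fallback):
    """Call ask(prompt) and return fallback if the user cancelled.

    ask is any callable that returns a string, or None on Cancel,
    for example EasyDialogs.AskString.
    """
    response = ask(prompt)
    # Compare against None explicitly: an empty string is a real answer.
    if response is None:
        return fallback
    return response


# Stand-in prompt functions that simulate the two outcomes.
print(ask_with_fallback(lambda prompt: 'green', 'Favorite color?', 'blue'))
print(ask_with_fallback(lambda prompt: None, 'Favorite color?', 'blue'))
```

On a Mac with EasyDialogs available, the call would look like ask_with_fallback(EasyDialogs.AskString, 'What is your favorite color?', 'blue').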


The string response has a length limit of 254 characters. If the value entered is longer than that, it is truncated.

import EasyDialogs
import string

default = string.ascii_letters * 10
print 'len(default)=', len(default)
response = EasyDialogs.AskString('Enter a long string', default=default)
print 'len(response)=', len(response)


AskString_too_long.png



$ python EasyDialogs_AskString_too_long.py
len(default)= 520
len(response)= 254


Passwords:

Use AskPassword to prompt the user for secret values that should not be echoed back to the screen in clear-text.

import EasyDialogs

response = EasyDialogs.AskPassword('Password:', default='s3cr3t')
print 'Shh!:', response


AskPassword.png



$ python EasyDialogs_AskPassword.py
Shh!: s3cr3t


The OK/Cancel behavior of AskPassword is the same as for AskString.

Files and Directories:

There are special functions for requesting file or directory names. These use the native file selector dialogs, so the user does not have to type in the paths. For example, to ask the user which file to open, use AskFileForOpen.

import EasyDialogs
import os

filename = EasyDialogs.AskFileForOpen(
    message='Select a Python source file',
    defaultLocation=os.getcwd(),
    wanted=unicode,
    )

print 'Selected:', filename


The wanted=unicode argument tells AskFileForOpen to return the name of the file as a unicode string. The other possible return types include ASCII strings and some Apple data structures for working with file references.

By specifying defaultLocation, this example initializes the dialog to the current working directory. The user is still free to navigate around the filesystem, of course.

Other options to AskFileForOpen let you filter the values displayed, control the type codes of files visible to the user, and interact with the dialog through callbacks. Refer to the module documentation and Apple's reference guide for more details.

AskForFileOpen.png



$ python EasyDialogs_AskFileForOpen.py
Selected: /Users/dhellmann/Documents/PyMOTW/in_progress/EasyDialogs/EasyDialogs_AskFileForOpen.py


To prompt the user to provide a new filename when saving a file, use AskFileForSave.

import EasyDialogs
import