Sunday, October 25, 2009

PyMOTW: sys, Part 3: Memory Management and Limits

Memory Management and Limits

sys includes several functions for understanding and controlling memory usage.

Reference Counts

Python manages memory for you with reference counting, backed by a garbage collector. An object is automatically marked for collection when its reference count drops to zero. To examine the reference count of an existing object, use getrefcount().

import sys

one = []
print 'At start :', sys.getrefcount(one)

two = one

print 'Second reference :', sys.getrefcount(one)

del two

print 'After del :', sys.getrefcount(one)

Notice that the count is actually one higher than expected because there is a temporary reference to the object held by getrefcount() itself.

$ python sys_getrefcount.py
At start : 2
Second reference : 3
After del : 2
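Containers hold references too. The following is a minimal sketch, written for Python 3 and assuming CPython's reference counting; the names item and holder are made up for illustration. It compares counts against a baseline rather than printing absolute values, since the absolute numbers include the temporary reference described above.

```python
import sys

item = object()
baseline = sys.getrefcount(item)

# A list that contains the object twice holds two references to it.
holder = [item, item]
in_list = sys.getrefcount(item) - baseline

# Deleting the list releases both references at once.
del holder
after_del = sys.getrefcount(item) - baseline

print('Extra references while in list:', in_list)
print('Extra references after del    :', after_del)
```

Working with differences keeps the example independent of how many references the interpreter itself happens to hold at the time of the call.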

See also

gc
Control the garbage collector via the functions exposed in gc.

Object Size

Knowing how many references an object has may help you figure out where you have a cycle or a memory leak, but it isn’t enough to determine which objects are consuming the most memory. For that, you also need to know how big objects are.

import sys

class OldStyle:
    pass

class NewStyle(object):
    pass

for obj in [ [], (), {}, 'c', 'string', 1, 2.3,
             OldStyle, OldStyle(), NewStyle, NewStyle(),
             ]:
    print '%10s : %s' % (type(obj).__name__, sys.getsizeof(obj))

The size is reported in bytes.

$ python sys_getsizeof.py
list : 36
tuple : 28
dict : 140
str : 25
str : 30
int : 12
float : 16
classobj : 48
instance : 36
type : 452
NewStyle : 32

For a more accurate estimate of the space used by a class, you can provide a __sizeof__() method to compute the value by aggregating the sizes of attributes of an object.

import sys

class MyClass(object):
    def __init__(self):
        self.a = 'a'
        self.b = 'b'
        return
    def __sizeof__(self):
        return object.__sizeof__(self) + \
            sum(sys.getsizeof(v) for v in self.__dict__.values())

my_inst = MyClass()
print sys.getsizeof(my_inst)

$ python sys_getsizeof_custom.py
82
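The reason a custom __sizeof__() helps is that getsizeof() is shallow: it measures the object itself, not the objects it refers to. Here is a rough sketch for Python 3 on CPython that shows this with two one-item lists. The helper total_size() is hypothetical and only looks one level deep.

```python
import sys

small = ['x']
big = ['x' * 10000]

# Each list stores a single pointer, so the shallow sizes match even
# though the strings they point to differ greatly in size.
same_shallow_size = sys.getsizeof(small) == sys.getsizeof(big)
print('Same shallow size:', same_shallow_size)


def total_size(container):
    """Hypothetical helper: shallow size plus the sizes of direct items."""
    return sys.getsizeof(container) + sum(sys.getsizeof(i) for i in container)


print('small total:', total_size(small))
print('big total  :', total_size(big))
```

A helper like this still undercounts nested containers and double counts shared objects, so treat the result as an estimate.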

Recursion

Allowing infinite recursion in a Python application can cause a stack overflow in the interpreter itself, leading to a crash. To avoid this situation, the interpreter lets you control the maximum recursion depth using setrecursionlimit() and getrecursionlimit().

import sys

print 'Initial limit:', sys.getrecursionlimit()

sys.setrecursionlimit(10)

print 'Modified limit:', sys.getrecursionlimit()

def generate_recursion_error(i):
    print 'generate_recursion_error(%s)' % i
    generate_recursion_error(i+1)

try:
    generate_recursion_error(1)
except RuntimeError, err:
    print 'Caught exception:', err

Once the recursion limit is reached, the interpreter raises a RuntimeError exception so your program has an opportunity to handle the situation.

$ python sys_recursionlimit.py
Initial limit: 1000
Modified limit: 10
generate_recursion_error(1)
generate_recursion_error(2)
generate_recursion_error(3)
generate_recursion_error(4)
generate_recursion_error(5)
generate_recursion_error(6)
generate_recursion_error(7)
generate_recursion_error(8)
Caught exception: maximum recursion depth exceeded while getting the str of an object
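Because the limit is a process-wide setting, it is usually safer to restore the old value when you are done. The sketch below is for Python 3, where the exception raised is RecursionError (a subclass of RuntimeError). The helper depth_reached() is a made-up name, and the limit of 500 is an arbitrary value chosen to stay above the depth of whatever code is already running.

```python
import sys


def depth_reached(limit):
    """Hypothetical helper: recurse under a temporary limit, return call count."""
    original = sys.getrecursionlimit()
    calls = 0

    def recurse():
        nonlocal calls
        calls += 1
        recurse()

    sys.setrecursionlimit(limit)
    try:
        recurse()
    except RecursionError:
        pass  # expected: the temporary limit was reached
    finally:
        # Always put the previous limit back, even on unexpected errors.
        sys.setrecursionlimit(original)
    return calls


before = sys.getrecursionlimit()
calls = depth_reached(500)
print('Calls before the limit was hit:', calls)
print('Limit restored:', sys.getrecursionlimit() == before)
```

The number of calls is smaller than the limit because the frames already on the stack count toward it.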

Maximum Values

Along with the runtime configurable values, sys includes variables defining the maximum values for types that vary from system to system.

import sys

print 'maxint :', sys.maxint
print 'maxsize :', sys.maxsize
print 'maxunicode:', sys.maxunicode

maxint is the largest representable regular integer. maxsize is the maximum size of a list, dictionary, string, or other data structure dictated by the C interpreter’s size type. maxunicode is the largest integer Unicode code point supported by the interpreter as currently configured.

$ python sys_maximums.py
maxint : 2147483647
maxsize : 2147483647
maxunicode: 65535
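These values look different under Python 3, which removed sys.maxint because integers have arbitrary precision there. A short sketch, assuming Python 3.3 or later (where maxunicode no longer depends on build options); the variable names are illustrative only.

```python
import sys

# sys.maxint is gone in Python 3, so test for it instead of assuming it.
has_maxint = hasattr(sys, 'maxint')
print('has maxint:', has_maxint)

# maxsize still exists; going past it simply yields a bigger int.
beyond = sys.maxsize + 1
print('maxsize   :', sys.maxsize)
print('maxsize+1 :', beyond)

# The bit width of the C size type can be derived from maxsize.
bits = sys.maxsize.bit_length() + 1
print('size bits :', bits)

# maxunicode is the highest code point that chr() accepts.
top = chr(sys.maxunicode)
print('maxunicode:', sys.maxunicode, hex(sys.maxunicode))
```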

Floating Point Values

The structure float_info contains information about the floating point type representation used by the interpreter, based on the underlying system’s float implementation.

import sys

print 'Smallest difference (epsilon):', sys.float_info.epsilon
print
print 'Digits (dig) :', sys.float_info.dig
print 'Mantissa digits (mant_dig):', sys.float_info.mant_dig
print
print 'Maximum (max):', sys.float_info.max
print 'Minimum (min):', sys.float_info.min
print
print 'Radix of exponents (radix):', sys.float_info.radix
print
print 'Maximum exponent for radix (max_exp):', sys.float_info.max_exp
print 'Minimum exponent for radix (min_exp):', sys.float_info.min_exp
print
print 'Maximum exponent for power of 10 (max_10_exp):', sys.float_info.max_10_exp
print 'Minimum exponent for power of 10 (min_10_exp):', sys.float_info.min_10_exp
print
print 'Rounding for addition (rounds):', sys.float_info.rounds

Note

These values depend on the compiler and underlying system, so you may have different results. These examples were produced on OS X 10.5.8.

$ python sys_float_info.py
Smallest difference (epsilon): 2.22044604925e-16

Digits (dig) : 15
Mantissa digits (mant_dig): 53

Maximum (max): 1.79769313486e+308
Minimum (min): 2.22507385851e-308

Radix of exponents (radix): 2

Maximum exponent for radix (max_exp): 1024
Minimum exponent for radix (min_exp): -1021

Maximum exponent for power of 10 (max_10_exp): 308
Minimum exponent for power of 10 (min_10_exp): -307

Rounding for addition (rounds): 1
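To make a few of these numbers concrete, here is a small sketch for Python 3. It assumes the platform uses IEEE 754 double precision floats, which matches the output above but is not guaranteed everywhere.

```python
import sys

info = sys.float_info
eps = info.epsilon

# epsilon is the gap between 1.0 and the next representable float.
eps_visible = 1.0 + eps > 1.0
# Half of that gap is lost to rounding when added to 1.0.
half_eps_lost = 1.0 + eps / 2 == 1.0
# epsilon follows from the radix and the number of mantissa digits.
eps_from_mantissa = float(info.radix) ** (1 - info.mant_dig)

# Multiplying past max overflows to infinity.
overflow = info.max * 2
# min is the smallest normalized float; smaller subnormal values exist.
subnormal = info.min / 2

print('1.0 + eps > 1.0    :', eps_visible)
print('1.0 + eps/2 == 1.0 :', half_eps_lost)
print('eps from mant_dig  :', eps_from_mantissa == eps)
print('max * 2            :', overflow)
print('min / 2 > 0        :', subnormal > 0)
```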

See also

Your system’s float.h contains more details about these settings.


6 comments:

cybergrind said...

and what about size of __dict__?
'__dict__' in obj.__dict__ :: False

Doug Hellmann said...

Since self.__dict__ is part of the implementation of classes in Python, I expect its size to be included in the object's __sizeof__ computation.

cybergrind said...

#!/usr/bin/env python
import sys
from itertools import imap

#script from: http://code.activestate.com/recipes/546530/
import getsize


s = sys.getsizeof

class A(object):
    def getsize(self):
        m = 0
        for i in imap(s, iter(self.__dict__)):
            m += i
        return m

a = A()

for i in xrange(1, 10000):
setattr(a, 'a'*i, i)

print 'Size of a: %s'%s(a)
print 'Size of a items: %s'%a.getsize()
print 'Size of a.__dict__: %s'%s(a.__dict__)
print '"__dict__" in a.__dict__: %s'%('__dict__' in a.__dict__)
print 'All sized summed: %s'%(a.getsize() + s(a) + s(a.__dict__))
print 'Size asizeof: %s'%getsize.asizeof(a)



out:
Size of a: 64
Size of a items: 50394960
Size of a.__dict__: 786712
"__dict__" in a.__dict__: False
All sized summed: 51181736
Size asizeof: 51456712


It seems that __dict__ isn't included.
Besides the script from ActiveState, you can use the guppy/heapy module for memory profiling, but the standard sys.getsizeof is a strange thing...

tartley.com said...

Hi. Thanks for yet another great post.

> For a more accurate estimate of
> the space used by a class, you
> can provide a __sizeof__() method

Does this imply that sys.getsizeof() is sometimes not accurate? When and why and by how much?

Hugs,

Doug Hellmann said...

@tartley.com - getsizeof() reports the memory used to represent the object handed to it, but not any of that object's contents. I've included an updated example on http://www.doughellmann.com/PyMOTW/sys/limits.html#object-size illustrating how the attribute sizes are not included unless you provide your own __sizeof__() method.

Doug Hellmann said...

@cybergrind - Your example seems to be adding the numbers to 10000 rather than the sizes of the attributes.