I've been working on a tool named "gdb-heap" [1] which extends gdb with
commands to analyse the dynamic memory usage of user-space processes.
It has lots of heuristics for dealing with CPython.
As an experiment, I tried attaching it to a "yum update" process.
This is on a 64-bit Fedora 13 machine.
The yum process is showing up in "top" as:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7815 root      20   0  402m  98m 8556 t  0.0  1.3   0:03.00 yum
I believe I attached to it during depsolving: yum was printing messages
of the form:
---> Package python-twisted-core-zsh.x86_64 0:8.2.0-6.fc13 set to be updated
---> Package python-webhelpers.noarch 0:1.0-0.2.b7.fc13 set to be updated
etc.
Method:
Using git HEAD of gdb-heap (0c6b3506098e21f42c43ccac6796e847e86033ea),
with:
yum-3.2.27-4.fc13.noarch
rpm-4.8.0-14.fc13.x86_64
python-2.6.5-9.fc14.x86_64 (local build, but nothing special)
$ sudo yum update
$ sudo PYTHONPATH=$(pwd) gdb --eval-command="python import gdbheap" attach $(pidof -x yum)
(gdb) heap
It took a couple of minutes for gdb-heap to scrape data from the yum
process and analyse it.
The top entries in the result were:
Domain         Kind                          Detail                                Count  Allocated size
-------------  ----------------------------  ----------------------------------  -------  --------------
python         tuple                                                              216,816      17,845,416
uncategorized  1296 bytes                                                          11,999      15,550,704
python         str                                                                163,082       9,619,016
cpython        PyDictEntry table                                                      822       7,938,032
rpm            Header blob                                                             97       7,722,368
python         list                                                                73,138       5,265,936
cpython        PyListObject ob_item table                                          66,186       3,633,520
uncategorized  1778224 bytes                                                            2       3,556,448
python         dict                                                                10,844       3,124,832
cpython        PyDictEntry table             YumAvailablePackageSqlite.__dict__     3,082       2,450,896
uncategorized  1781760 bytes                                                             1       1,781,760
cpython        PyDictEntry table             RPMInstalledPackage.__dict__             511       1,577,984
cpython        PySetObject setentry table                                             137       1,236,672
cpython        PyDictEntry table             TransactionMember.__dict__               343       1,059,184
python         dict                          YumAvailablePackageSqlite.__dict__     3,082         887,984
cpython        PyDictEntry table             interned                                   1         790,528
python         str                           bytecode                               4,169         708,520
python         code                                                                 4,169         500,280
python         function                                                             4,027         483,240
pyarena        pool_header overhead                                                 8,642         414,816
uncategorized  52016 bytes                                                              6         312,096
pyarena        alignment wastage                                                      138         293,936
python         type                                                                   262         253,184
cpython        PyUnicodeObject buffer                                               2,938         238,712
python         YumAvailablePackageSqlite                                            3,082         197,248
...which isn't necessarily very illuminating: 17MB of tuples, 15MB of
1296-byte blocks of unknown type (these could actually be 1280-byte
mallocs, with 16-byte padding), 9MB of strings.
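As a sanity check on that padding theory: glibc's malloc on 64-bit adds
an 8-byte chunk header to each request and rounds the total up to
16-byte alignment, so a 1280-byte request does land in a 1296-byte
chunk. A quick back-of-envelope helper (mine, not part of gdb-heap; it
omits glibc's minimum-chunk-size check):

def chunk_size(request, size_sz=8, alignment=16):
    # approximates glibc's request2size: add SIZE_SZ bytes of chunk
    # header, then round up to MALLOC_ALIGNMENT (64-bit values assumed)
    return (request + size_sz + alignment - 1) & ~(alignment - 1)

print chunk_size(1280)   # -> 1296

and 11,999 such chunks at 1,296 bytes each accounts for the 15,550,704
bytes in the table above.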
I notice 97 rpm "Header blobs" taking up 7MB of RAM; does yum open up
rpm files? This is data referenced by rpm.hdr python objects. Do these
get cleaned up immediately after use, or do they need to stick around
for a while? I'm not sure if savings are possible here.
Moving down the profile, locating the 1778224-byte allocations (I've
implemented a mini query language for locating blocks by criteria):
(gdb) heap select size == 1778224
Start               End                 Domain         Kind           Detail  Hexdump
------------------  ------------------  -------------  -------------  ------  -------
0x000000000360a810  0x00000000037bca3f  uncategorized  1778224 bytes          00 00 00 43 00 00 86 60 00 00 00 3f 00 00 00 07 00 00 80 fc |...C...`...?........|
0x00000000068596c0  0x0000000006a0b8ef  uncategorized  1778224 bytes          00 00 00 43 00 00 86 60 00 00 00 3f 00 00 00 07 00 00 80 fc |...C...`...?........|
Using gdb-heap's "hexdump" command on those blocks shows a buffer that
starts with some uint32 values, followed by what looks like rpm
metadata; perhaps it's all rpm metadata? Berkeley DB? I'm not sure.
(gdb) hexdump 0x000000000360a810
0x000000000360a810 -> 0x000000000360a82f 00 00 00 43 00 00 86 60 00 00 00 3f 00 00 00 07 00 00 80 fc 00 00 00 10 00 00 00 64 00 00 00 08 |...C...`...?...............d....|
0x000000000360a830 -> 0x000000000360a84f 00 00 00 00 00 00 00 01 00 00 03 e8 00 00 00 06 00 00 00 02 00 00 00 01 00 00 03 e9 00 00 00 06 |................................|
0x000000000360a850 -> 0x000000000360a86f 00 00 00 10 00 00 00 01 00 00 03 ea 00 00 00 06 00 00 00 16 00 00 00 01 00 00 03 ec 00 00 00 09 |................................|
(snip)
0x000000000360ac50 -> 0x000000000360ac6f 2d 70 79 6c 6f 6e 73 00 30 2e 39 2e 37 00 32 2e 66 63 31 32 00 50 79 6c 6f 6e 73 20 77 65 62 20 |-pylons.0.9.7.2.fc12.Pylons web |
0x000000000360ac70 -> 0x000000000360ac8f 66 72 61 6d 65 77 6f 72 6b 00 54 68 65 20 50 79 6c 6f 6e 73 20 77 65 62 20 66 72 61 6d 65 77 6f |framework.The Pylons web framewo|
0x000000000360ac90 -> 0x000000000360acaf 72 6b 20 69 73 20 61 69 6d 65 64 20 61 74 20 6d 61 6b 69 6e 67 20 77 65 62 61 70 70 73 20 61 6e |rk is aimed at making webapps an|
0x000000000360acb0 -> 0x000000000360accf 64 20 6c 61 72 67 65 20 70 72 6f 67 72 61 6d 6d 61 74 69 63 0a 77 65 62 73 69 74 65 20 64 65 76 |d large programmatic.website dev|
0x000000000360acd0 -> 0x000000000360acef 65 6c 6f 70 6d 65 6e 74 20 69 6e 20 50 79 74 68 6f 6e 20 65 61 73 79 2e 20 53 65 76 65 72 61 6c |elopment in Python easy. Several|
0x000000000360acf0 -> 0x000000000360ad0f 20 6b 65 79 20 70 6f 69 6e 74 73 3a 0a 0a 2a 20 41 20 66 72 61 6d 65 77 6f 72 6b 20 74 6f 20 6d | key points:..* A framework to m|
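Those leading uint32 values are consistent with the layout rpm uses for
header blobs: a big-endian count of index entries, then the size of the
data region, followed by 16-byte (tag, type, offset, count) index
entries. A rough sketch of decoding the start of such a block (my code,
not part of gdb-heap; read_mem is a placeholder for however you read
target memory from within gdb):

import struct

def decode_header_blob(addr, read_mem):
    # read_mem(addr, size) -> byte string of target memory
    il, dl = struct.unpack('>II', read_mem(addr, 8))
    print 'index entries: %i, data bytes: %i' % (il, dl)
    index = read_mem(addr + 8, 16 * il)
    for i in range(min(il, 4)):  # just the first few entries
        tag, type_, offset, count = struct.unpack_from('>IIII', index, 16 * i)
        print '  tag=%i type=%i offset=%i count=%i' % (tag, type_, offset, count)

Decoding the dump above by hand this way gives 0x43 = 67 index entries,
0x8660 = 34400 data bytes, and initial tags 63 (HEADERIMMUTABLE), 100
(HEADERI18NTABLE), 1000 (NAME) and 1001 (VERSION), so it certainly
smells like an rpm header. That only accounts for about 35KB, though,
so if the whole 1.7MB block is rpm metadata it presumably holds many
such headers (or something else as well).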
I was a little disappointed with this result; I was hoping that gdb-heap
might suggest some substantial memory-saving wins, but nothing jumps out
from the top entries in the memory profile.
Having said that, I notice about 5.7MB of PyDictEntry tables for
YumAvailablePackageSqlite.__dict__ etc. Have you considered using the
"__slots__" optimization here? [2] You provide something like:
class Foo(object):
    __slots__ = ('name', 'version', 'epoch', 'etc')
and this locks down the attributes of instances to just the ones listed.
This typically saves about half of the memory, since the table of
attribute-name pointers can be stored once per class, rather than on the
left-hand side of every instance dictionary. This assumes that the set
of attributes is limited. Unfortunately, it's only a 2-3% saving in this
case.
(There's also a saving from not storing the PyDictObjects themselves,
but this is much smaller: about 288 bytes per object on 64-bit, or 1.7MB
in this case.)
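As a toy illustration of the difference (the class and attribute names
here are made up, and the exact numbers vary by Python build; this is in
the spirit of Python 2.6 on 64-bit):

import sys

class PkgWithDict(object):
    def __init__(self):
        self.name, self.version, self.epoch = 'pylons', '0.9.7', '0'

class PkgWithSlots(object):
    # no per-instance __dict__ (and hence no PyDictEntry table) gets
    # allocated; attribute values live in fixed slots on the instance
    __slots__ = ('name', 'version', 'epoch')
    def __init__(self):
        self.name, self.version, self.epoch = 'pylons', '0.9.7', '0'

d, s = PkgWithDict(), PkgWithSlots()
print sys.getsizeof(d) + sys.getsizeof(d.__dict__)  # instance plus its dict
print sys.getsizeof(s)                               # instance only

(sys.getsizeof doesn't count the attribute values themselves, just the
per-instance overhead.)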
I'm attaching the full report on heap usage by category from the
process.
There's clearly a bug in the final lines:
uncategorized -559571704885180256 bytes 1 -,559,571,704,885,180,256
TOTAL 605,352 -,559,571,704,793,262,616
which I'm looking into; I believe the correct lines should have read:
TOTAL 605,352 91,917,640
(i.e. roughly 6*10^5 allocated blocks of memory, occupying 91*10^6
bytes; it doesn't currently report on fragmentation within glibc's
heap, which may be one reason for the discrepancy relative to the 98MB
figure given by "top").
Unfortunately "gdb-heap" consumes a large amount of memory itself (about
6GB to do the above analysis). I'm looking at fixing this (pointing it
at itself, naturally).
Hope this is helpful,
Dave
[1] https://fedorahosted.org/gdb-heap/
[2] http://docs.python.org/reference/datamodel.html#slots