Python How to use Guppy/Heapy for tracking down Memory Usage

Heapy is powerful but has a learning curve.

This article's main purpose is to show you how to ask heapy various kinds of questions. Along the way it may also show a few cases where pkgcore uses more memory than it should.

First, get an x86 machine. Heapy currently does not work well on 64-bit architectures.

Install it (on Gentoo, emerge it):

emerge guppy

Open up an interactive python prompt:

>>> from guppy import hpy
>>> from pkgcore.config import load_config
>>> c = load_config()
>>> hp = hpy()

Just to show what heapy's internal tricks look like:

>>> dir(hp)
['__doc__', '__getattr__', '__init__', '__module__', '__setattr__', '_hiding_tag_', '_import', '_name', '_owner', '_share']
>>> help(hp)
Help on class _GLUECLAMP_ in module guppy.etc.Glue:

_GLUECLAMP_ = <guppy.heapy.Use interface at 0x-484b8554>

This object is your "starting point", but as you can see, the underlying machinery does not give away any useful usage instructions.

First do everything that allocates memory but is not part of the problem you are tracking down. Then call:

>>> hp.setrelheap()

Everything allocated before this call will not be in the data sets you get later.
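
The same baseline idea can be written as a small script. This is only a sketch: it assumes a guppy build that imports on your interpreter (for Python 3 that would be the guppy3 package, which is an assumption, not something this article covers), and the helper name measure_after_baseline and the toy workload are hypothetical. If guppy is missing, the helper simply returns None.

```python
# Sketch: report only what is allocated after a baseline.
# Assumption: guppy (or guppy3 on Python 3) is installed; otherwise we skip.
try:
    from guppy import hpy
except ImportError:  # guppy is not available in this environment
    hpy = None


def measure_after_baseline(build):
    """Return (object count, total bytes) for what build() keeps alive.

    Hypothetical helper, not part of guppy. Returns None without guppy.
    """
    if hpy is None:
        return None
    hp = hpy()
    hp.setrelheap()        # everything allocated so far is excluded
    kept = build()         # the allocation we actually want to inspect
    snapshot = hp.heap()   # objects reachable now, minus the baseline
    result = (snapshot.count, snapshot.size)
    del kept               # drop the workload once it has been measured
    return result


# Toy workload standing in for the pkgcore repository scan.
stats = measure_after_baseline(lambda: [str(i) * 10 for i in range(10000)])
print(stats)
```

Calling setrelheap() as late as possible keeps imports and setup out of the numbers, so the summary describes only the workload.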

Now do your memory-intensive thing:

>>> l = list(x for x in c.repo["gentoo"] if x.data)

Keep an eye on system memory consumption. For meaningful statistics you want to use a lot of your system RAM, but not all of it.
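
If you would rather watch memory from inside the process than from a system monitor, the standard library can report the peak resident set size. The sketch below is independent of heapy, works on Unix only, and uses a hypothetical helper name (peak_rss_mb) and a toy workload.

```python
import resource
import sys


def peak_rss_mb():
    """Peak resident set size of this process in MB (Unix only)."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


before = peak_rss_mb()
data = [str(i) for i in range(200000)]  # toy stand-in for the real workload
after = peak_rss_mb()
print("peak RSS before: %.1f MB, after: %.1f MB" % (before, after))
```

Because this is a peak value it never goes down, so it is mostly useful for checking how close a run came to exhausting RAM.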

>>> h = hp.heap()

This object is basically a snapshot of what's reachable in RAM (minus whatever setrelheap excluded earlier), and you can do various fun tricks with it. Its str() is a summary:

>>> h
Partition of a set of 1449133 objects. Total size = 102766644 bytes.
 Index  Count   %     Size   % Cumulative   % Kind (class / dict of class)
     0 985931  68 46300932  45   46300932  45 str
     1  24681   2 22311624  22   68612556  67 dict of pkgcore.ebuild.ebuild_src.package
     2  49391   3 21311864  21   89924420  88 dict (no owner)
     3 115974   8  3776948   4   93701368  91 tuple
     4 152181  11  3043616   3   96744984  94 long
     5  36009   2  1584396   2   98329380  96 weakref.KeyedRef
     6  11328   1  1540608   1   99869988  97 dict of pkgcore.ebuild.ebuild_src.ThrowAwayNameSpace
     7  24702   2   889272   1  100759260  98 types.MethodType
     8  11424   1   851840   1  101611100  99 list
     9  24681   2   691068   1  102302168 100 pkgcore.ebuild.ebuild_src.package
<54 more rows. Type e.g. '_.more' to view.>

(Keep watching RAM usage: heapy made the process grow by another dozen MB here. It gets slow once it starts swapping, so if that happens, reduce your data set.)
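
To read the summary columns: each "%" next to Size is that row's share of the total size, and Cumulative is the running sum of Size from the first row down. The short sketch below recomputes both for the first four rows, using the numbers copied from the output above. Rounding to the nearest percent is an assumption about how heapy prints the values; it does reproduce these rows.

```python
total_size = 102766644  # "Total size" from the summary header

# (kind, size in bytes) for the first four rows of the summary
rows = [
    ("str", 46300932),
    ("dict of pkgcore.ebuild.ebuild_src.package", 22311624),
    ("dict (no owner)", 21311864),
    ("tuple", 3776948),
]

cumulative = 0
report = []
for kind, size in rows:
    cumulative += size  # running total, as in the Cumulative column
    size_pct = round(100 * size / total_size)
    cumulative_pct = round(100 * cumulative / total_size)
    report.append((kind, size_pct, cumulative, cumulative_pct))

for kind, size_pct, cum, cum_pct in report:
    print("%-45s %3d%% %10d %3d%%" % (kind, size_pct, cum, cum_pct))
```

Read this way, the first three rows (strings plus the two kinds of dicts) already account for 88% of the measured memory.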

So here we can see that we have a ton of strings in memory. We also have various kinds of dicts. Dicts are treated a bit specially: "dict of pkgcore.ebuild.ebuild_src.package" simply means "all the dicts that are __dict__ attributes of instances of that class", while "dict (no owner)" covers all the dicts that are not used as a __dict__ attribute.
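
The owner distinction can be illustrated without heapy. The sketch below is not how heapy does it internally. It only mimics the labelling on a hand-built list of objects, and the Package class and classify_dicts helper are hypothetical stand-ins.

```python
from collections import Counter


class Package:
    """Hypothetical stand-in for pkgcore.ebuild.ebuild_src.package."""

    def __init__(self, name):
        self.name = name


def classify_dicts(objects):
    """Label each dict as 'dict of <class>' or 'dict (no owner)'."""
    # Map the id of every instance __dict__ to the name of its owning class.
    owners = {}
    for obj in objects:
        if hasattr(obj, "__dict__"):
            owners[id(vars(obj))] = type(obj).__name__

    kinds = Counter()
    for obj in objects:
        if isinstance(obj, dict):
            if id(obj) in owners:
                kinds["dict of " + owners[id(obj)]] += 1
            else:
                kinds["dict (no owner)"] += 1
    return kinds


packages = [Package("a"), Package("b")]
plain = [{"x": 1}, {}, {"y": 2}]
# The instance dicts are listed explicitly, as a heap walk would find them.
objects = packages + [vars(p) for p in packages] + plain
kinds = classify_dicts(objects)
print(kinds)
```

Here the two instance dicts are counted under "dict of Package" and the three free-standing dicts under "dict (no owner)", which is the split the summary above reports.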


In the next article we will discuss how to get the current memory usage of a program using Guppy.


Happy Pythoning..!!

