Introduction
Deep Memory Profiler (DMP) is a 1) whole-process, 2) timeline-based and 3) post-mortem memory profiler for Chromium. See the Design Doc if interested.
Memory bloat has been a serious issue in Chromium for years. Bloat is harder to fix than leaks and errors. We have memory checkers like Valgrind and Address Sanitizer for leaks and errors, but no handy tool for bloat. DMP is an easy-to-use tool for finding out “who is the memory eater”. (It can also help with leaks.)
Announcement
Future announcements (e.g. command-line changes) will be made on dmprof@chromium.org. Subscribe to it if you're interested.
How to Use
DMP profiles memory usage post-mortem. You need to 1) get memory dumps while Chrome is running, and then 2) analyze the dumps. The analyzing script dmprof is available in src/tools/deep_memory_profiler of the Chromium source tree, or can be retrieved by downloading and running download.sh.
Phase 1: Get dumps
You have three options to get dumps.
Get dumps (A): Standalone Chromium
- Build or download the latest Chromium Debug static build. (Shared build doesn't work for now.)
- A GYP flag 'disable_debugallocation=1' is recommended. (See http://codereview.chromium.org/11794029 and http://codereview.chromium.org/11266019.)
- Run the Chromium Debug build with the command-line option --no-sandbox and the following environment variables:
- HEAPPROFILE=/path/to/prefix
- HEAP_PROFILE_MMAP=1
- HEAP_PROFILE_TIME_INTERVAL=<interval seconds between dumping>
- DEEP_HEAP_PROFILE=1
- Note the process ID of your target renderer. (cf. about:memory.)
- Find the dumps /path/to/prefix.*.heap which are dumped every HEAP_PROFILE_TIME_INTERVAL seconds.
- Names of the dumps include Process IDs. For example, prefix.12345.0002.heap for pid = 12345. '0002' is a sequence number of the dump.
$ HEAPPROFILE=$HOME/prof/prefix HEAP_PROFILE_TIME_INTERVAL=20 HEAP_PROFILE_MMAP=1 DEEP_HEAP_PROFILE=1 out/Debug/chrome --no-sandbox
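Because the dump names encode the process ID and a sequence number, the dumps can be grouped per process before analysis. The following is a minimal sketch, assuming the file names follow exactly the prefix.PID.SEQUENCE.heap pattern shown in the example above; the helper name group_dumps_by_pid is hypothetical and not part of DMP.

```python
import re
from collections import defaultdict

# Matches names such as "prefix.12345.0002.heap". The pattern is inferred
# from the single example on this page; other layouts are not handled.
DUMP_NAME = re.compile(r"^(?P<prefix>.+)\.(?P<pid>\d+)\.(?P<seq>\d+)\.heap$")


def group_dumps_by_pid(filenames):
    """Return {pid: [file names ordered by sequence number]}."""
    groups = defaultdict(list)
    for name in filenames:
        match = DUMP_NAME.match(name)
        if match is None:
            continue  # Not a dump file; skip it.
        groups[int(match.group("pid"))].append((int(match.group("seq")), name))
    return {pid: [name for _, name in sorted(items)]
            for pid, items in groups.items()}


if __name__ == "__main__":
    names = ["prefix.12345.0002.heap", "prefix.12345.0001.heap",
             "prefix.678.0001.heap", "notes.txt"]
    for pid, dumps in sorted(group_dumps_by_pid(names).items()):
        print(pid, dumps)
```

The first entry of each list is the lowest-numbered dump found for that process.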
Get dumps (B): WebDriver (ChromeDriver)
A WebDriver (ChromeDriver) API function is available for taking dumps. Call dump_heap_profile with the environment variables HEAPPROFILE, HEAP_PROFILE_MMAP and DEEP_HEAP_PROFILE set. Usage:
dump_heap_profile(reason='Why you want a dump.')
Note that:
- You need to set environment variables HEAPPROFILE, HEAP_PROFILE_MMAP and DEEP_HEAP_PROFILE before starting Chrome in your WebDriver tests.
- You may want to copy the dumps from your remote machine if you run ChromeDriver remotely.
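The requirement that the variables are set before Chrome starts can be handled in the test's own setup code. The sketch below assumes that Chrome inherits its environment from the test process, which may not hold for remote setups; deep_profiler_env is a hypothetical helper name, and the driver calls are shown only as comments because they depend on how your tests start ChromeDriver. The same preparation applies to PyAuto tests.

```python
import os


def deep_profiler_env(prefix, base_env=None):
    """Return a copy of base_env with the DMP variables from this page added."""
    env = dict(os.environ if base_env is None else base_env)
    env["HEAPPROFILE"] = prefix      # Dump path prefix, e.g. /path/to/prefix
    env["HEAP_PROFILE_MMAP"] = "1"
    env["DEEP_HEAP_PROFILE"] = "1"
    return env


if __name__ == "__main__":
    # Update the test process environment first, and only then start Chrome.
    os.environ.update(deep_profiler_env("/tmp/prof/prefix"))
    # driver = create_driver()  # hypothetical: however your test starts Chrome
    # driver.dump_heap_profile(reason="after page load")
    for key in ("HEAPPROFILE", "HEAP_PROFILE_MMAP", "DEEP_HEAP_PROFILE"):
        print(key, os.environ[key])
```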
Get dumps (C): PyAuto
A PyAuto API function is available. Call PyUITest.HeapProfilerDump with the environment variables HEAPPROFILE, HEAP_PROFILE_MMAP and DEEP_HEAP_PROFILE set. Usage:
HeapProfilerDump(process_type='renderer' or 'browser',
reason='Why you want a dump.',
tab_index,
windex)
Note that you need to set environment variables HEAPPROFILE, HEAP_PROFILE_MMAP and DEEP_HEAP_PROFILE before starting Chrome in your PyAuto tests. See perf_endure.py as an example.
Phase 2: Analyze the dumps
- Run the analyzing script dmprof, and redirect its stdout to a CSV file. You can choose a policy with the -p option. The policies are described below.
dmprof [-p POLICY] csv /path/to/first-one-of-the-dumps.heap > result.csv
- Copy the CSV into a spreadsheet application such as OpenOffice Calc or Google Spreadsheet.
- Draw a (stacked) line chart on the spreadsheet for the columns from FROM_HERE_FOR_TOTAL to UNTIL_HERE_FOR_TOTAL. (See the example.)
$ tools/deep_memory_profiler/dmprof csv ~/profile/00-test.12345.0002.heap > ~/profile/00-test.12345.result.csv
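If you prefer to prepare the chart data with a script rather than by hand, the column selection can be sketched as follows. This assumes that FROM_HERE_FOR_TOTAL and UNTIL_HERE_FOR_TOTAL appear as marker names in the CSV header row and that the components to chart lie strictly between them; the sample header and values are made up for illustration and are not real dmprof output.

```python
import csv
import io


def component_columns(header):
    """Return the column names strictly between the two marker columns."""
    start = header.index("FROM_HERE_FOR_TOTAL")
    end = header.index("UNTIL_HERE_FOR_TOTAL")
    return header[start + 1:end]


def rows_for_chart(csv_text):
    """Yield one {component: value-as-string} dict per data row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    wanted = component_columns(header)
    for row in reader:
        record = dict(zip(header, row))
        yield {name: record[name] for name in wanted}


if __name__ == "__main__":
    # Made-up sample; the real column set depends on the chosen policy.
    sample = ("second,FROM_HERE_FOR_TOTAL,mmap-v8,tc-used-all,"
              "UNTIL_HERE_FOR_TOTAL,total\n"
              "0,,10,20,,30\n"
              "20,,12,25,,37\n")
    for row in rows_for_chart(sample):
        print(row)
```

Values are kept as strings so that the sketch makes no assumption about units or number formatting.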
How to read the graph
Graphs are typical stacked charts that classify all memory usage into components. Four kinds of graphs can be drawn from each set of dumps, based on the four built-in "policies": l0, l1, l2 and t0.
We're preparing more breakdown policies. If you'd like to add a criterion for your component, 1) modify the policy.*.json files, 2) modify policies.json and add a policy file, or 3) tell dmikurube@.
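The policy file format is not described on this page. Purely as an illustration of why each policy below ends with "catch-all" components, here is a sketch of an ordered, first-match classification. The rule list, the regular expressions and the idea of matching against a textual stack description are all assumptions made for this example, not the contents or schema of policy.*.json.

```python
import re

# Hypothetical rules, checked in order; the first matching pattern wins.
# Neither the rule format nor the patterns come from the real policy files.
RULES = [
    ("tc-skia", re.compile(r"Sk[A-Z]")),
    ("tc-v8", re.compile(r"v8::")),
    ("tc-catch-all", re.compile(r".*")),  # Must stay last.
]


def classify(stack_description, rules=RULES):
    """Return the first component whose pattern matches the description."""
    for component, pattern in rules:
        if pattern.search(stack_description):
            return component
    raise ValueError("no rule matched; a catch-all rule should come last")


if __name__ == "__main__":
    for frame in ("SkBitmap::allocPixels", "v8::internal::Heap", "std::vector"):
        print(frame, "->", classify(frame))
```

With this kind of scheme, a more specific rule placed before a catch-all rule takes blocks away from the catch-all bucket, so the order of the rules matters.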
Common components
| mustbezero & unhooked-absent | Should be zero. |
| unhooked-anonymous | VMA is mapped, but no information is recorded. No label is given in /proc/.../maps. |
| unhooked-file-exec | VMA is mapped, but no information is recorded. An executable file is mapped. |
| unhooked-file-nonexec | VMA is mapped, but no information is recorded. A non-executable file is mapped. |
| unhooked-file-stack | VMA is mapped, but no information is recorded. Used as a stack. |
| unhooked-other | VMA is mapped, but no information is recorded. Used for other purposes. |
| no-bucket | Should be small. Not recorded because the blocks are small enough to ignore. |
| tc-unused | Reserved by TCMalloc, but not used by malloc(). Headers, fragmentation and free-list. |
Policy "l0"
This policy applies the coarsest classification.
| mmap-v8 | mmap'ed for V8. It includes JavaScript heaps and JIT compiled code. |
| mmap-catch-all | mmap'ed for other purposes. |
| tc-used-all | All memory blocks allocated by malloc(). |
Policy "l1"
It breaks down memory usage into relatively specific components.
| mmap-v8-heap-newspace | JavaScript new (nursery) heap for younger objects. |
| mmap-v8-heap-coderange | Code produced at runtime including JIT-compiled JavaScript code. |
| mmap-v8-heap-pagedspace | JavaScript old heap and many other object spaces. |
| mmap-v8-other | Other regions mmap'ed by V8. |
| mmap-catch-all | Any other mmap'ed regions. |
| tc-v8 | Blocks allocated from V8. |
| tc-skia | Blocks allocated from Skia. |
| tc-webkit-catch-all | Blocks allocated from WebKit. |
| tc-unknown-string | Blocks which are related to std::string. |
| tc-catch-all | Any other blocks allocated by malloc(). |
Policy "l2"
It tries to break down memory usage into specific components. See policy.l2.json for details, and tell dmikurube@ to add more components.
| mmap-v8-heap-newspace | JavaScript new (nursery) heap for younger objects. |
| mmap-v8-heap-coderange | Code produced at runtime including JIT-compiled JavaScript code. |
| mmap-v8-heap-pagedspace | JavaScript old heap and many other spaces. |
| mmap-v8-other | Other regions mmap'ed by V8. |
| mmap-catch-all | Any other mmap'ed regions. |
| tc-webcore-fontcache | Blocks used for FontCache. |
| tc-skia | Blocks used for Skia. |
| tc-renderobject | Blocks used for RenderObject. |
| tc-renderstyle | Blocks used for RenderStyle. |
| tc-webcore-sharedbuf | Blocks used for WebCore's SharedBuffer. |
| tc-webcore-XHRcreate | Blocks used for WebCore's XMLHttpRequest (create). |
| tc-webcore-XHRreceived | Blocks used for WebCore's XMLHttpRequest (received). |
| tc-webcore-docwriter-add | Blocks used for WebCore's DocumentWriter. |
| tc-webcore-node-and-doc | Blocks used for WebCore's HTMLElement, Text, and other Node objects. |
| tc-webcore-node-factory | Blocks created by WebCore's HTML*Factory. |
| tc-webcore-element-wrapper | Blocks created by WebCore's createHTML*ElementWrapper. |
| tc-webcore-stylepropertyset | Blocks used for WebCore's StylePropertySet (CSS). |
| tc-webcore-style-createsheet | Blocks created by WebCore's StyleElement::createSheet. |
| tc-webcore-cachedresource | Blocks used for WebCore's CachedResource. |
| tc-webcore-script-execute | Blocks created by WebCore's ScriptElement::execute. |
| tc-webcore-events-related | Blocks related to WebCore's events (EventListener and so on). |
| tc-webcore-document-write | Blocks created by WebCore's Document::write. |
| tc-webcore-node-create-renderer | Blocks created by WebCore's Node::createRendererIfNeeded. |
| tc-webcore-render-catch-all | Any other blocks related to WebCore's Render. |
| tc-webcore-setInnerHTML-except-node | Blocks created by setInnerHTML. |
| tc-wtf-StringImpl-user-catch-all | Blocks used for WTF::StringImpl. |
| tc-wtf-HashTable-user-catch-all | Blocks used for WTF::HashTable. |
| tc-webcore-everything-create | Blocks created by any of WebCore's create() methods. |
| tc-webkit-from-v8-catch-all | Blocks created by V8 via WebKit functions. |
| tc-webkit-catch-all | Any other blocks created by WebKit. |
| tc-v8-catch-all | Any other blocks created in V8. |
| tc-toplevel-string | All std::string objects created at the top-level. |
| tc-catch-all | Any other blocks allocated by malloc(). |
Policy "t0"
It classifies memory blocks based on their type_info.
| mmap-v8 | mmap'ed for V8. It includes JavaScript heaps and JIT compiled code. |
| mmap-catch-all | mmap'ed for other purposes. |
| tc-std-string | std::string objects. |
| tc-WTF-String | WTF::String objects. |
| tc-no-typeinfo-StringImpl | No type_info (not allocated by 'new'), but allocated for StringImpl. |
| tc-Skia | Skia objects. |
| tc-WebCore-Style | WebCore's style objects. |
| tc-no-typeinfo-other | Any other blocks without type_info. |
| tc-other | All objects with other type_info. |