
The Chromium Projects

Windows Binary Sizes

Chrome binaries sometimes increase in size for unclear reasons and investigating these regressions can be quite tricky. There are a few tools that help with investigating these regressions or just looking for wasted bytes in DLLs or EXEs. The tools are:

ShowGlobals.exe
pdb_compare_globals.py
pe_summarize.py
linker_verbose_tracking.py

Details of how and when to use these tools are shown below.

If looking for size saving opportunities then “ShowGlobals.exe file.pdb” can be used to find duplicated or large global variables that may be unnecessary. Typical (shortened for this document) results look like this - the first set of entries are duplicated globals, the second set of entries are large globals:

#Dups  DupSize  Size   Section  Symbol-name
 805      805                    std::piecewise_construct
 3        204                    rgb_red
 3        204                    rgb_green
 3        204                    rgb_blue
 187      187                    WTF::in_place
 4        160                    extensions::api::g_factory
 ...
                 122784    2     kBrotliDictionary
                 65536     2     jpeg_nbits_table
                 57080     2     propsVectorsTrie_index
                 53064     3     unigram_table
                 50364     2     kNetworkingPrivate
                 47152     3     device::UsbIds::vendors_
 ...

The actual output is tab separated and can be most easily visualized by pasting into a spreadsheet to ensure that the columns line up.
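If you want to process the report programmatically instead, a few lines of Python will split it into the two halves. This is a sketch that assumes the column layout shown in the sample above (#Dups, DupSize, Size, Section, Symbol-name, tab separated); the real ShowGlobals.exe output may have additional rows or columns.

```python
# Sketch: split tab-separated ShowGlobals.exe output into the two report
# halves - duplicated globals and large globals. The five-column layout
# is an assumption based on the sample output shown above.

def parse_showglobals(text):
    dups, large = [], []
    for line in text.splitlines():
        fields = line.split('\t')
        # Skip short lines and the header row.
        if len(fields) < 5 or fields[4] in ('', 'Symbol-name'):
            continue
        num_dups, dup_size, size, section, name = fields[:5]
        if num_dups.strip():
            # First half of the report: duplicated globals.
            dups.append((int(num_dups), int(dup_size), name.strip()))
        elif size.strip():
            # Second half of the report: large globals.
            large.append((int(size), int(section), name.strip()))
    return dups, large

sample = ('805\t805\t\t\tstd::piecewise_construct\n'
          '\t\t122784\t2\tkBrotliDictionary')
dups, large = parse_showglobals(sample)
```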

The most egregious problems shown by these reports have been fixed, but some issues remain.

The large kBrotliDictionary and jpeg_nbits_table arrays can be seen, but those are used and are in the read-only section, so there is nothing to be done.

When investigating a regression, pdb_compare_globals.py can be used to find out what large or duplicated global variables have shown up between two builds. Just pass both PDBs and a summary of the changes will be printed. This uses ShowGlobals.exe to generate the list of interesting global variables and then prints a diff.
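The comparison step can be sketched in a few lines. This is only a conceptual stand-in: it uses {symbol: size} maps in place of the ShowGlobals.exe reports from the two PDBs, and the real pdb_compare_globals.py output format will differ.

```python
# Sketch of the diff that pdb_compare_globals.py performs conceptually,
# using hypothetical {symbol: size} maps as stand-ins for the
# ShowGlobals.exe reports from the old and new PDB.

def diff_globals(before, after):
    """Return (name, old_size, new_size) for globals that appeared or grew."""
    changes = []
    for name, size in sorted(after.items()):
        old = before.get(name, 0)
        if size > old:
            changes.append((name, old, size))
    return changes
```

For instance, `diff_globals({'kFoo': 100}, {'kFoo': 150, 'kBar': 50})` reports both the new `kBar` and the growth of `kFoo`.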

When investigating a regression (or testing a fix) it can be useful to use pe_summarize.py to print the sizes of all of the sections within a PE file, or to compare two PE files (two versions of chrome.dll, for instance). This is the ultimate measure of success for a change - has it made the binary smaller or larger, and if so where?
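The comparison mode boils down to a per-section subtraction. This sketch reproduces the delta report in the style shown at the end of this page; the section sizes used in the example are hypothetical, and the real pe_summarize.py parses the sizes out of the PE headers itself.

```python
# Sketch of pe_summarize.py's two-file comparison mode: given per-section
# sizes (in bytes) for two builds, print per-section and total deltas.
# The numbers passed in are hypothetical stand-ins for real PE sections.

def compare_sections(old, new):
    total = 0
    for name in old:
        delta = new.get(name, 0) - old[name]
        if delta:
            print('%10s: %d bytes change' % (name, delta))
            total += delta
    print('Total change: %d bytes' % total)
    return total
```

For example, `compare_sections({'.text': 1000, '.rdata': 500}, {'.text': 900, '.rdata': 500})` prints a -100 byte change for .text and a total change of -100 bytes.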

If an unwanted global variable is being linked in then linker_verbose_tracking.py can be used to help answer the question of “why?” First you need to find out what object file defines the variable. For instance, ff_cos_131072 and the other ff_* globals are defined in rdft.c. When rdft.obj is pulled in then Chrome gets significantly bigger. Some of this is discussed in comments #25 to #27 here: https://bugs.chromium.org/p/chromium/issues/detail?id=624274#c27.

In order to get verbose linker output you need to modify the appropriate BUILD.gn file to add the /verbose linker flag. For chrome.dll I make the following modification:

diff --git a/chrome/BUILD.gn b/chrome/BUILD.gn
index 58586fc..c15d463 100644
--- a/chrome/BUILD.gn
+++ b/chrome/BUILD.gn
@@ -354,6 +354,7 @@ if (is_win) {
   "/DELAYLOAD:winspool.drv",
   "/DELAYLOAD:ws2_32.dll",
   "/DELAYLOAD:wsock32.dll",
+  "/verbose",
 ]
 if (!is_component_build) {

Then build chrome.dll, redirecting the verbose output to a text file:

> ninja -C out\release chrome.dll >verbose.txt

Alternately you can use the techniques discussed in The Chromium Chronicle: Preprocessing Source to get the linker command and then manually re-run that command with /verbose appended, redirecting to a text file.

Then linker_verbose_tracking.py is used to find why a particular object file is being pulled in, in this case mime_util.obj:

> python linker_verbose_tracking.py verbose.txt mime_util.obj

Because there are multiple object files called mime_util.obj the script will search for all of them, as shown in the first line of output:

> python linker_verbose_tracking.py verbose.txt mime_util.obj
 Searching for [u'net.lib(mime_util.obj)', u'base.lib(mime_util.obj)']

You can specify which version you want to search for by including the .lib name in your command-line search parameter, which is just used for sub-string matching:

> python linker_verbose_tracking.py verbose.txt base.lib(mime_util.obj)
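Conceptually, the script builds a reverse map from each pulled-in object file to the object that referenced the symbol that caused it to be loaded, then walks that map backwards to a command-line object. This is a sketch using a heavily simplified stand-in for the /verbose log format; real MSVC linker output is far more verbose and the real script handles many more cases.

```python
# Sketch of the idea behind linker_verbose_tracking.py: parse simplified
# "Referenced in <obj>" / "Loaded <lib>(<obj>)" pairs out of a /verbose
# log, then walk the reference chain backwards. The log format here is a
# simplified assumption, not the exact MSVC output.

import re

def obj_name(s):
    # "common.lib(foo.obj)" -> "foo.obj"; a bare "foo.obj" is returned as-is.
    m = re.search(r'\(([^)]+)\)', s)
    return m.group(1) if m else s

def build_xrefs(verbose_log):
    xrefs = {}          # pulled-in obj -> obj that referenced the symbol
    referencer = None
    for line in verbose_log.splitlines():
        m = re.search(r'Referenced in (\S+)', line)
        if m:
            referencer = obj_name(m.group(1))
            continue
        m = re.search(r'Loaded (\S+)', line)
        if m and referencer:
            xrefs[obj_name(m.group(1))] = referencer
    return xrefs

def why_pulled_in(xrefs, obj):
    # Walk the chain from obj back towards a root (command-line) obj.
    chain = [obj]
    while obj in xrefs and xrefs[obj] not in chain:
        obj = xrefs[obj]
        chain.append(obj)
    return chain
```

With a log in which url_loader.mojom.obj references a symbol loaded from content_message_generator.obj, which in turn references a symbol loaded from drop_data.obj, `why_pulled_in` recovers the chain drop_data.obj → content_message_generator.obj → url_loader.mojom.obj, mirroring the output shown below.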

Typical output looks like this:

> python tools\win\linker_verbose_tracking.py verbose08.txt drop_data.obj
 Database loaded - 3844 xrefs found
 Searching for common_sources.lib(drop_data.obj)
 common_sources.lib(drop_data.obj).obj pulled in for symbol Metadata::Metadata...
     common.lib(content_message_generator.obj)

 common.lib(content_message_generator.obj).obj pulled in for symbol ...
     Command-line obj file: url_loader.mojom.obj

In this case this tells us that drop_data.obj is being pulled in indirectly through a chain of references that starts with url_loader.mojom.obj. url_loader.mojom.obj is in a source_set which means that it is on the command-line. I then change the source_set to a static_library and redo the steps. If all goes well then the unwanted .obj file will eventually stop being pulled in and I can use pe_summarize.py to measure the savings, and ShowGlobals to look for the next large global variable.

Sometimes this technique - of changing source_set to static_library - doesn't help. In those cases it can be important to follow the chain of symbols and see if it can be broken. In one case (crrev.com/2559063002) SkGeometry.obj (skia) was pulling in PeriodicWave.obj (Blink) because of log2f. Investigation showed that SkGeometry.obj referenced the math.h function log2f, while PeriodicWave.obj defined it as an inline function. The chain was broken by not defining log2f as an inline function (letting math.h do that) and a significant amount of code size was saved.

Note that some size regressions only happen with certain build configurations. Ideally all testing would be done with PGO builds, but that is unwieldy, so release (non-component) builds are probably the best starting point. If a size regression repros on this configuration then investigate there and fix it. But, it is advisable to verify the fix on a full official build - with both is_official_build and full_wpo_on_official set to true. Failing to do this test can lead to a fix that doesn't actually help on the builds that we ship to customers, and this can go unnoticed for months.

See crrev.com/2556603002 for an example of using this technique. In this case it was sufficient to change a single source_set to a static_library. The size savings of 900 KB were verified using:

> python pe_summarize.py out\release\chrome.dll
Size of out\release\chrome.dll is 42.127872 MB
      name:   mem size  ,  disk size
     .text: 33.900375 MB
    .rdata:  6.325718 MB
     .data:  0.718696 MB,  0.274944 MB
      .tls:  0.000025 MB
  CPADinfo:  0.000036 MB
   .rodata:  0.003216 MB
  .crthunk:  0.000064 MB
    .gfids:  0.001052 MB
    _RDATA:  0.000288 MB
     .rsrc:  0.175088 MB
    .reloc:  1.443124 MB

Size of size_reduction\chrome.dll is 41.211392 MB
      name:   mem size  ,  disk size
     .text: 33.188599 MB
    .rdata:  6.164966 MB
     .data:  0.707848 MB,  0.264704 MB
      .tls:  0.000025 MB
  CPADinfo:  0.000036 MB
   .rodata:  0.003216 MB
  .crthunk:  0.000064 MB
    .gfids:  0.001052 MB
    _RDATA:  0.000288 MB
     .rsrc:  0.175088 MB
    .reloc:  1.409388 MB

Change from out\release\chrome.dll to size_reduction\chrome.dll
     .text: -711776 bytes change
    .rdata: -160752 bytes change
     .data: -10848 bytes change
    .reloc: -33736 bytes change
Total change: -917112 bytes