This document describes the proposal to add aligned instruction bundle support ("bundling" for short) to LLVM, and its implementation in the MC module.

Specification

To support the Software Fault Isolation (SFI) mechanisms required by Native Client, the following directives are added to the LLVM assembler:
With the following semantics: When aligned instruction bundle mode ("bundling" in short) is enabled ( Furthermore, the

For example, consider the following:

    .bundle_align_mode 4
    mov1
    mov2
    mov3

Assuming that each of the mov instructions is 7 bytes long and mov1 is aligned to a 16-byte boundary, two bytes of NOP padding will be inserted between mov2 and mov3 to make sure that mov3 does not cross a 16-byte bundle boundary.

A slightly modified example:

    .bundle_align_mode 4
    mov1
    .bundle_lock
    mov2
    mov3
    .bundle_unlock

Here, since the bundle-locked sequence mov2 mov3 cannot cross a bundle boundary, 9 bytes of NOP padding will be inserted between mov1 and mov2.

An example to demonstrate the align_to_end option:

    .bundle_align_mode 4
    mov1
    mov2
    .bundle_lock align_to_end
    mov3
    mov4
    .bundle_unlock

Normally, only two bytes of NOP padding would be required between mov2 and mov3 to ensure that the bundle-locked sequence does not cross a bundle boundary. However, since align_to_end was provided, an additional two bytes of NOP padding will be inserted so that the sequence ends at a boundary.

For information on how this ability is used for software fault isolation by Native Client, see the following resources:
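The padding arithmetic in the three examples above can be checked with a small sketch. This is only an illustration, not the code used in LLVM: the helper name bundle_padding and its parameters are hypothetical, and it assumes (as the examples suggest) that the argument of .bundle_align_mode is the base-2 logarithm of the bundle size, so that 4 means 16-byte bundles.

```python
def bundle_padding(offset, size, align_log2, align_to_end=False):
    """Return the number of NOP bytes to insert before an instruction
    (or a bundle-locked group) of `size` bytes placed at `offset`.

    Hypothetical helper for illustration; the names are assumptions,
    not part of the LLVM API.
    """
    bundle_size = 1 << align_log2          # assumed: 4 -> 16-byte bundles
    if size > bundle_size:
        raise ValueError("instruction or locked group is larger than a bundle")

    offset_in_bundle = offset % bundle_size
    end_in_bundle = offset_in_bundle + size

    if align_to_end:
        if end_in_bundle == bundle_size:
            return 0                                # already ends on a boundary
        if end_in_bundle < bundle_size:
            return bundle_size - end_in_bundle      # push forward to end on the boundary
        return 2 * bundle_size - end_in_bundle      # move to the next bundle, end on its boundary

    if end_in_bundle > bundle_size:
        return bundle_size - offset_in_bundle       # would cross: start at the next boundary
    return 0


# Example 1: mov1 and mov2 occupy bytes 0..13, mov3 is 7 bytes long.
print(bundle_padding(14, 7, 4))                      # 2
# Example 2: mov1 occupies bytes 0..6, the locked group mov2 mov3 is 14 bytes long.
print(bundle_padding(7, 14, 4))                      # 9
# Example 3: the locked group mov3 mov4 starts at 14 and must end on a boundary.
print(bundle_padding(14, 14, 4, align_to_end=True))  # 4
```

The third result is the sum of the two bytes needed to avoid crossing the boundary and the additional two bytes required by align_to_end.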
Implementation in LLVM MC

As proposed, bundling is a feature of the assembler. Therefore, it is implemented in the MC module of LLVM. Specifically, the following parts are affected:
The following description will focus on the path: Text assembly -> ELF object streamer -> Assembly -> Object file emission. This path can be roughly divided into three stages:
Parsing and emitting sections and fragments

In order to implement bundling, we use the existing assembly parsing facilities in MC, adding support for the new directives. The existing section and fragment abstractions are used, with some flags added to keep the state of the bundling directives encountered. Specifically:
When bundling mode is turned on in the assembler (BundleAlignSize > 0), the following rules apply to emitting fragments:
Laying out fragments

Bundling blends well into the existing layout mechanism in the MC assembler, since its effects are somewhat similar to relaxation. Some fragments may need to grow due to padding, which may require re-layout of subsequent fragments and recomputation of fixups. Therefore, the MC assembler employs an iterative layout algorithm. The following diagram will help explain the layout of fragments w.r.t. bundling:
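As a rough illustration of how padding propagates through layout, here is a minimal sketch of a single layout pass over fixed-size fragments. It is not LLVM's implementation: the Fragment class, its fields, and the layout function are hypothetical, relaxation and fixups are ignored, and the sketch assumes that each fragment's offset refers to its first instruction byte, with the padding placed immediately before it.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """Hypothetical stand-in for an encoded fragment (one instruction
    or one bundle-locked group); not the LLVM class."""
    size: int
    align_to_end: bool = False
    bundle_padding: int = 0   # filled in by layout()
    offset: int = 0           # filled in by layout()


def layout(fragments, bundle_size):
    """Assign offsets and padding in order; return the total section size."""
    offset = 0
    for frag in fragments:
        start_in_bundle = offset % bundle_size
        end_in_bundle = start_in_bundle + frag.size
        if frag.align_to_end:
            if end_in_bundle <= bundle_size:
                # Pad so the fragment ends exactly on the boundary.
                pad = (bundle_size - end_in_bundle) % bundle_size
            else:
                # Skip to the next bundle and end on its boundary.
                pad = 2 * bundle_size - end_in_bundle
        elif end_in_bundle > bundle_size:
            # The fragment would cross a boundary: start it at the next one.
            pad = bundle_size - start_in_bundle
        else:
            pad = 0
        frag.bundle_padding = pad
        frag.offset = offset + pad
        # Padding shifts every later fragment, which is why growth in one
        # fragment forces re-layout of the ones that follow it.
        offset = frag.offset + frag.size
    return offset


frags = [Fragment(7), Fragment(7), Fragment(7)]
total = layout(frags, 16)
print([(f.offset, f.bundle_padding) for f in frags], total)
```

In a real assembler a pass like this would be repeated whenever relaxation changes a fragment's size; the sketch runs it once because all sizes are fixed.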
Note that since we create a single data fragment for a bundle-locked group, the above applies to such groups as well as single instructions.

Writing fragments to the object file

Note: the write target of the assembler is abstracted into a stream object, which can also write into memory. "Object file" here implies this stream.

In the last step, the assembler writes the list of fragments into sections according to their layout order. In this step, the BundlePadding field created during layout is used to add NOP padding (by calling writeNopData on the target-specific assembler backend) of appropriate size before fragments that require it.

Performance impacts

It is worth studying how adding the bundling feature affects the normal operation of MC's assembler (that is, without actual bundling directives).

Memory consumption of fragment objects

Number of bytes needed for the various MCFragment objects on x86-64:
Explanation: the single-byte BundlePadding field in MCEncodedFragment was placed in space previously reserved for alignment. The same was done for the boolean HasInstructions in MCDataFragment.

Assembler runtime

llvm-mc was run on a large assembly file (produced by compiling gcc 3.5 into a single assembly file) as follows:

    sudo nice -n -20 perf stat -r 10 llvm-mc -filetype=obj gcc.s -o gcc.o

There was no noticeable difference in the runtime of llvm-mc with and without the bundling patch.

