This document describes the standards to which bitcode files must conform to be compatible with PNaCl. The accepted format is a subset of LLVM bitcode, along with PNaCl-specific metadata. The LLVM bitcode reference is available at http://llvm.org/docs/LangRef.html.
NOTE: This document is a draft, and is subject to change while PNaCl remains under development.
File Types
PNaCl recognizes three kinds of bitcode files:
Object files (.po)
Object files represent application code which has not yet been linked together. These files may contain symbols which are declared, but not defined.
Shared object files (.pso)
Shared object files represent application code which is intended to be shared between multiple applications. Normally, shared object files are produced by a linker which adds additional metadata to the bitcode file (see PNaCl-specific Metadata for details).
Executable files (.pexe)
Executable files represent a complete application. They are produced by a linker which adds additional metadata to the bitcode file (see PNaCl-specific Metadata for details). Executable files should have no undefined symbols, except those which are defined in an explicit external library dependency.
NOTE: The filename extensions are not actually required, but they are useful as shorthand for each file type. You may continue to use ".o" instead of ".po" in Makefiles for object files, even though they are bitcode.
Module Properties
Target Datalayout
Only the fields that identify PNaCl as little-endian and ILP32 are set, for the purposes of bitcode optimization.
target datalayout = "e-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-p:32:32:32-v128:128:128"
Target Triple
target triple = "le32-unknown-nacl"
PNaCl-specific Metadata for Shared Linking (NOTE: not part of V1 ABI)
OutputFormat
object, shared, or executable. Stored using named metadata.
deplibs
List of library dependencies. Present for "shared" and "executable" files. Stored using existing field.
SOName
SOName of this library. Present only for "shared" files. Stored using named metadata.
Versioning
TODO: After this is implemented, document it.
Assembly
No assembly is permitted, including both module level and inline assembly.
Supported Intrinsic Variables, Instructions, and Functions
Bitcode may use any architecture-neutral LLVM intrinsics.
Compiler intrinsics
- llvm.used
- llvm.compiler.used
TEST=any test (we use this for various things)
- llvm.global_ctors
- llvm.global_dtors
TEST=native_client/tests/toolchain/initfini* (indirectly)
Variable argument intrinsics (see below for more information)
- llvm.va_start
- llvm.va_end
- llvm.va_copy
TEST=see any printf test, or native_client/tests/callingconv (indirectly)
LibC intrinsics
- llvm.memcpy.* (TODO: document alignment restrictions for memcpy, memmove, memset)
- llvm.memmove.*
- llvm.memset.*
- llvm.sqrt.* (This is only valid for numbers >= -0.0)
- llvm.powi.*
- llvm.sin.*
- llvm.cos.*
- llvm.pow.*
- llvm.exp.*
- llvm.log.*
- llvm.fma.* (can we really support this, or can we only support llvm.fmuladd.*?)
- llvm.fabs.*
- llvm.floor.*
TEST=TODO -- partly covered by native_client/tests/math/float_math (indirectly)
Bit Manipulation Intrinsics
- llvm.bswap.*
- llvm.ctpop.*
- llvm.ctlz.*
- llvm.cttz.*
TEST=native_client/tests/toolchain/llvm_bitmanip_intrinsics.c
Arithmetic with Overflow Intrinsics
- llvm.sadd.with.overflow.*
TEST=TODO
Specialized Arithmetic Intrinsics
- llvm.fmuladd.* (similar to llvm.fma.*, but does not guarantee a fused multiply-add)
TEST=TODO
Debugger Intrinsics
Supported for transient bitcode files during development/debug sessions. The format is close to DWARF but may change, so it is not guaranteed to be supported. Run "pnacl-strip --strip-debug" before shipping any bitcode.
TEST=TODO
Exception Handling Intrinsics
- llvm.eh.*
TEST=tests/toolchain/eh_* (indirectly)
General Intrinsics
- llvm.trap
- llvm.expect
- llvm.donothing
TEST=TODO
Half Precision Floating Point Intrinsics
- llvm.convert.to.fp16
- llvm.convert.from.fp16
Trampoline Intrinsics
- llvm.init.trampoline
- llvm.adjust.trampoline
Used to support nested functions (see nest parameter attribute). llvm.init.trampoline currently expects a target specific trampoline size and alignment. Perhaps a "large enough" trampoline size will suffice, but this has not been tested.
General Intrinsics
- llvm.stackprotector
Data Types
Endianness
Bitcode will assume that the machine is little-endian.
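As an illustration of what the little-endian assumption means for data in memory, the following C sketch stores and loads a 32-bit value one byte at a time, least significant byte first, so that the result is the same on any host. The helper names store_le32 and load_le32 are hypothetical and are not part of PNaCl.

```c
#include <stdint.h>

/* Hypothetical helper: write v into out[0..3], least significant byte first. */
void store_le32(uint8_t out[4], uint32_t v) {
    out[0] = (uint8_t)(v & 0xFF);
    out[1] = (uint8_t)((v >> 8) & 0xFF);
    out[2] = (uint8_t)((v >> 16) & 0xFF);
    out[3] = (uint8_t)((v >> 24) & 0xFF);
}

/* Hypothetical helper: read a little-endian 32-bit value from in[0..3]. */
uint32_t load_le32(const uint8_t in[4]) {
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

For the value 0x12345678, store_le32 places 0x78 at the lowest address and 0x12 at the highest, which is the in-memory layout that bitcode may rely on.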
Primitive Types
Supported bitcode types: i8, i16, i32, i64, float, double, pointers, arrays, void, metadata. No other primitive types (like x86_fp80, fp128, ppc_fp128) are yet supported.
Derived Types
- Aggregate types (arrays, structures, and opaque structures). Vectors may not make it into V1 (TODO) for ABI reasons.
- Pointers to other types.
- Functions.
C Data Types
| C Type | Size (bytes) | Alignment (bytes) | Bitcode type |
| void | undefined | undefined | void |
| char | 1 | 1 | i8 |
| short | 2 | 2 | i16 |
| int | 4 | 4 | i32 |
| long | 4 | 4 | i32 |
| long long | 8 | 8 | i64 |
| float | 4 | 4 | float |
| double | 8 | 8 | double |
| long double | 8 | 8 | double |
| void* | 4 | 4 | i8* |
| function pointer | 4 | 4 | function pointer type (specific to function signature) |
| va_list | 16 | 4 | 4 x i32 |
| struct | - | - | Use bitcode struct type. See the LLVM struct reference. |
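The sizes and alignments in the table determine how a non-packed structure is laid out. The following C sketch computes field offsets under the assumption that each field is placed at the next multiple of its alignment and that the total size is rounded up to the largest field alignment, which is how LLVM normally lays out structures for a given target datalayout. The type struct field_type, the PNACL_* constants, and the functions align_up and layout_struct are hypothetical names introduced only for this illustration.

```c
#include <stddef.h>

/* Hypothetical description of one row of the C data type table. */
struct field_type {
    const char *name;
    size_t size;   /* size in bytes */
    size_t align;  /* alignment in bytes (a power of two) */
};

/* A few rows copied from the table above. */
const struct field_type PNACL_CHAR      = {"char", 1, 1};
const struct field_type PNACL_INT       = {"int", 4, 4};
const struct field_type PNACL_LONG      = {"long", 4, 4};
const struct field_type PNACL_LONG_LONG = {"long long", 8, 8};

/* Round offset up to the next multiple of align. */
size_t align_up(size_t offset, size_t align) {
    return (offset + align - 1) & ~(align - 1);
}

/* Assumed layout rule: fields are placed in order at their natural
 * alignment. Writes each field's offset into offsets[] and returns the
 * total structure size, padded to the largest field alignment. */
size_t layout_struct(const struct field_type *fields, size_t n, size_t *offsets) {
    size_t offset = 0;
    size_t max_align = 1;
    for (size_t i = 0; i < n; i++) {
        offset = align_up(offset, fields[i].align);
        offsets[i] = offset;
        offset += fields[i].size;
        if (fields[i].align > max_align) {
            max_align = fields[i].align;
        }
    }
    return align_up(offset, max_align);
}
```

Under these assumptions, a structure containing a char, a long long, and a long has its fields at offsets 0, 8, and 16 and a total size of 24 bytes. Note that long contributes only 4 bytes, unlike on typical LP64 hosts.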
Bitfields
Bitfields assume a little-endian layout (TODO: double check).
Attributes
Supported function attributes
- alwaysinline
- inlinehint
- naked
- noinline
- noreturn
- nounwind
- optsize
- readnone
- readonly
- returns_twice
Unsupported function attributes
- alignstack(<n>)
- is_nsdialect
- nonlazybind
- ssp
- sspreq
Supported function parameter attributes
- zeroext
- signext
- inreg (supported, but not guaranteed to have any effect. At most the first 2 arguments may be labeled inreg, and only if they are of integral type; alternatively, a single 64-bit integer may be labeled inreg. The first non-integral type in the argument sequence consumes the remaining available registers.)
- byval
- sret
- noalias
- nocapture
Unsupported function parameter attributes
- nest (see trampoline intrinsics) -- this is used for nested functions, but the trampoline area size and alignment are currently target-specific, so this needs investigation.
Supported linkage types
private, linker_private, linker_private_weak, linker_private_weak_auto, internal, available_externally, linkonce, weak, common, appending, extern_weak, linkonce_odr, weak_odr, externally_visible
Supported calling conventions
ccc, fastcc, coldcc
Calling Conventions
Stack Variables
Bitcode must not make any assumption about stack direction, alignment, or layout. Always use alloca to allocate space on the stack.
Function Arguments
When invoking a function, do not lower or expand function arguments. Always use correctly typed arguments.
For example, here is the correct signature of strncmp:
declare i32 @strncmp(i8*, i8*, i32) nounwind readonly
This C code:
long long example(long long x, double y);
Should produce:
declare i64 @example(i64 %x, double %y) nounwind
Passing Structures by Value
Structures passed as arguments (by value) must be represented as a single argument to the function.
%struct.foo = type { i32, i8 }
declare void @example(%struct.foo* nocapture byval %arg) nounwind
How the backend for a particular architecture actually lowers byval is backend-specific. This may or may not match the System V ABI.
Returning Structures
declare void @example(%struct.foo* sret %result) nounwind
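For orientation, here is a C sketch of source code that would be expected to give rise to declarations of the two shapes above: one function taking a structure by value and one returning a structure. The function names sum_foo and make_foo are hypothetical, and the exact bitcode emitted depends on the front end, so this is an assumption about typical behavior and not a guarantee.

```c
/* C counterpart of %struct.foo = type { i32, i8 }. */
struct foo {
    int a;
    char b;
};

/* Takes the structure by value. In bitcode, the argument is expected to
 * appear as a single %struct.foo* parameter marked byval. */
int sum_foo(struct foo arg) {
    return arg.a + arg.b;
}

/* Returns the structure by value. In bitcode, the result is expected to
 * be written through a hidden %struct.foo* parameter marked sret. */
struct foo make_foo(int a, char b) {
    struct foo result = {a, b};
    return result;
}
```

In both cases the structure remains a single argument in the bitcode signature; it is not split into its individual fields.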
Variadic Functions
Use "..." in bitcode to denote functions with variable arguments.
declare i32 @printf(i8*, ...) nounwind
Variadic functions should use the built-in LLVM intrinsics for accessing variable arguments.
declare void @llvm.va_start(i8*) nounwind
declare void @llvm.va_copy(i8*, i8*) nounwind
declare void @llvm.va_end(i8*) nounwind
%0 = va_arg i32* %ap, i32
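At the C level, these intrinsics and the va_arg instruction correspond to the standard <stdarg.h> macros. The following minimal sketch shows a variadic function that sums its integer arguments; the function name sum_ints is hypothetical and is used only for illustration.

```c
#include <stdarg.h>

/* Sum `count` int arguments passed after the fixed parameter. */
int sum_ints(int count, ...) {
    va_list ap;
    int total = 0;

    va_start(ap, count);            /* corresponds to llvm.va_start */
    for (int i = 0; i < count; i++) {
        total += va_arg(ap, int);   /* corresponds to the va_arg instruction */
    }
    va_end(ap);                     /* corresponds to llvm.va_end */

    return total;
}
```

For example, sum_ints(3, 1, 2, 3) returns 6. Because the argument list is only ever accessed through these macros, the code makes no assumption about how a particular backend lays out variable arguments.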
setjmp and longjmp are provided as functions (TODO: do we want to use intrinsics instead?).
struct jmp_buf is currently padded to be a very large buffer (to handle exotic architectures).
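Since setjmp and longjmp are ordinary functions here, portable C code uses them in the usual way. The sketch below is a minimal example using only the standard <setjmp.h> interface; the names recovery_point, fail_now, and run_guarded are hypothetical.

```c
#include <setjmp.h>

/* Saved execution context. Its size is opaque to the program, so the
 * padding mentioned above is invisible at this level. */
jmp_buf recovery_point;

/* Abandon the current computation and jump back to the saved context. */
void fail_now(void) {
    longjmp(recovery_point, 1);
}

/* Returns 0 if the work completed normally, or 1 if it was abandoned
 * through longjmp. */
int run_guarded(int should_fail) {
    if (setjmp(recovery_point) != 0) {
        return 1;  /* reached a second time, via longjmp */
    }
    if (should_fail) {
        fail_now();
    }
    return 0;
}
```

The program never inspects the contents of the buffer, which is what allows its size to differ between architectures.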
C++
Name Mangling
We use the Itanium C++ mangling scheme.
Initialization
Constructors and destructors for static objects are listed in the bitcode arrays @llvm.global_ctors and @llvm.global_dtors.
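The same arrays are also used for C functions marked with the GCC/Clang constructor and destructor attributes. The sketch below assumes a Clang-style front end, which generally records such functions in @llvm.global_ctors and @llvm.global_dtors; the attribute is a compiler extension and not standard C, and the names init_count, module_init, and module_fini are hypothetical.

```c
/* Incremented before main runs, decremented after main returns. */
int init_count = 0;

/* Expected to be listed in @llvm.global_ctors by a Clang-style front end. */
__attribute__((constructor))
void module_init(void) {
    init_count++;
}

/* Expected to be listed in @llvm.global_dtors by a Clang-style front end. */
__attribute__((destructor))
void module_fini(void) {
    init_count--;
}
```

By the time main starts executing, module_init has already run, so init_count is 1.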
Exception tables / Unwind Information
Unwind information (e.g., DWARF) is not directly exposed to bitcode. It is only generated into the .o file.
C++ should invoke LLVM intrinsics and functions from libgcc_eh to perform exception handling.
Bitcode can assume that the functions defined in libgcc_eh.a (or libgcc_s.so) are present.
LLVM instructions and intrinsics supporting unwinding:
- invoke
- landingpad
- resume
TEST=native_client/tests/toolchain/eh_*
- llvm.returnaddress
TEST=native_client/tests/toolchain/return_address.c
NOTE: more work is needed to make the "struct _Unwind_Exception" truly platform neutral
Debugging Information
LLVM debug metadata is not part of the stable PNaCl bitcode ABI. For debugging, we recommend keeping debug information only in nexes and not in pexes; that is, run pnacl-translate to convert pexes to nexes before running a debugger. Before shipping, run pnacl-finalize to strip debug info from the bitcode.
Documentation on LLVM debug intrinsics: http://llvm.org/docs/SourceLevelDebugging.html#format_common_intrinsics
Operating System Interface
Startup
Bitcode may not define _start. Instead, bitcode applications should define main.
Main
main has the following signature:
define i32 @main(i32 %argc, i8** %argv, i8** %envp) nounwind