LLVM 9.0.0 Release Notes¶
- Introduction
- Known Issues
- Non-comprehensive list of changes in this release
- Noteworthy optimizations
- Changes to the LLVM IR
- Changes to building LLVM
- Changes to the AArch64 Backend
- Changes to the ARM Backend
- Changes to the MIPS Target
- Changes to the PowerPC Target
- Changes to the SystemZ Target
- Changes to the X86 Target
- Changes to the AMDGPU Target
- Changes to the RISCV Target
- Changes to LLDB
- External Open Source Projects Using LLVM 9
- Additional Information
Introduction¶
This document contains the release notes for the LLVM Compiler Infrastructure, release 9.0.0. Here we describe the status of LLVM, including major improvements from the previous release, improvements in various subprojects of LLVM, and some of the current users of the code. All LLVM releases may be downloaded from the LLVM releases web site.
For more information about LLVM, including information about the latest release, please check out the main LLVM web site. If you have questions or comments, the LLVM Developer’s Mailing List is a good place to send them.
Known Issues¶
These are issues that couldn’t be fixed before the release. See the bug reports for the latest status.
- PR40547 Clang gets miscompiled by GCC 9.
Non-comprehensive list of changes in this release¶
- Two new extension points, namely
EP_FullLinkTimeOptimizationEarly
andEP_FullLinkTimeOptimizationLast
are available for plugins to specialize the legacy pass manager full LTO pipeline. llvm-objcopy/llvm-strip
got support for COFF object files/executables, supporting the most common copying/stripping options.- The CMake parameter
CLANG_ANALYZER_ENABLE_Z3_SOLVER
has been replaced byLLVM_ENABLE_Z3_SOLVER
. - The RISCV target is no longer “experimental” (see Changes to the RISCV Target below for more details).
- The ORCv1 JIT API has been deprecated. Please see Transitioning from ORCv1 to ORCv2.
- Support for target-independent hardware loops in IR has been added, with PowerPC and Arm implementations.
Noteworthy optimizations¶
LLVM will now remove stores to constant memory (since this is a contradiction) under the assumption the code in question must be dead. This has proven to be problematic for some C/C++ code bases which expect to be able to cast away ‘const’. This is (and has always been) undefined behavior, but up until now had not been actively utilized for optimization purposes in this exact way. For more information, please see: bug 42763 and post commit discussion.
The optimizer will now convert calls to
memcmp
into a calls tobcmp
in some circumstances. Users who are building freestanding code (not depending on the platform’s libc) without specifying-ffreestanding
may need to either pass-fno-builtin-bcmp
, or provide abcmp
function.LLVM will now pattern match wide scalar values stored by a succession of narrow stores. For example, Clang will compile the following function that writes a 32-bit value in big-endian order in a portable manner:
void write32be(unsigned char *dst, uint32_t x) { dst[0] = x >> 24; dst[1] = x >> 16; dst[2] = x >> 8; dst[3] = x >> 0; }
into the x86_64 code below:
write32be: bswap esi mov dword ptr [rdi], esi ret
(The corresponding read patterns have been matched since LLVM 5.)
LLVM will now omit range checks for jump tables when lowering switches with unreachable default destination. For example, the switch dispatch in the C++ code below
int g(int); enum e { A, B, C, D, E }; int f(e x, int y, int z) { switch(x) { case A: return g(y); case B: return g(z); case C: return g(y+z); case D: return g(x-z); case E: return g(x+z); } }
will result in the following x86_64 machine code when compiled with Clang. This is because falling off the end of a non-void function is undefined behaviour in C++, and the end of the function therefore being treated as unreachable:
_Z1f1eii: mov eax, edi jmp qword ptr [8*rax + .LJTI0_0]
LLVM can now sink similar instructions to a common successor block also when the instructions have no uses, such as calls to void functions. This allows code such as
void g(int); enum e { A, B, C, D }; void f(e x, int y, int z) { switch(x) { case A: g(6); break; case B: g(3); break; case C: g(9); break; case D: g(2); break; } }
to be optimized to a single call to
g
, with the argument loaded from a lookup table.
Changes to the LLVM IR¶
- Added
immarg
parameter attribute. This indicates an intrinsic parameter is required to be a simple constant. This annotation must be accurate to avoid possible miscompiles. - The 2-field form of global variables
@llvm.global_ctors
and@llvm.global_dtors
has been deleted. The third field of their element type is now mandatory. Specify i8* null to migrate from the obsoleted 2-field form. - The
byval
attribute can now take a type parameter:byval(<ty>)
. If present it must be identical to the argument’s pointee type. In the next release we intend to make this parameter mandatory in preparation for opaque pointer types. atomicrmw xchg
now allows floating point typesatomicrmw
now supportsfadd
andfsub
Changes to building LLVM¶
- Building LLVM with Visual Studio now requires version 2017 or later.
Changes to the AArch64 Backend¶
- Assembly-level support was added for: Scalable Vector Extension 2 (SVE2) and Memory Tagging Extensions (MTE).
Changes to the ARM Backend¶
- Assembly-level support was added for the Armv8.1-M architecture, including the M-Profile Vector Extension (MVE).
- A pipeline model was added for Cortex-M4. This pipeline model is also used to tune for cores where this gives a benefit too: Cortex-M3, SC300, Cortex-M33 and Cortex-M35P.
- Code generation support for M-profile low-overhead loops.
Changes to the MIPS Target¶
- Support for
.cplocal
assembler directive. - Support for
sge
,sgeu
,sgt
,sgtu
pseudo instructions. - Support for
o
inline asm constraint. - Improved support of GlobalISel instruction selection framework. This feature is still in experimental state for MIPS targets though.
- Various code-gen improvements, related to improved and fixed instruction selection and encoding and floating-point registers allocation.
- Complete P5600 scheduling model.
Changes to the PowerPC Target¶
- Improved handling of TOC pointer spills for indirect calls
- Improve precision of square root reciprocal estimate
- Enabled MachinePipeliner support for P9 with
-ppc-enable-pipeliner
. - MMX/SSE/SSE2 intrinsics headers have been ported to PowerPC using Altivec.
- Machine verification failures cleaned, EXPENSIVE_CHECKS will run MachineVerification by default now.
- PowerPC scheduling enhancements, with customized PPC specific scheduler strategy.
- Inner most loop now always align to 32 bytes.
- Enhancements of hardware loops interaction with LSR.
- New builtins added, eg:
__builtin_setrnd
. - Various codegen improvements for both scalar and vector code
- Various new exploitations and bug fixes, e.g: exploited P9
maddld
.
Changes to the SystemZ Target¶
- Support for the arch13 architecture has been added. When using the
-march=arch13
option, the compiler will generate code making use of new instructions introduced with the vector enhancement facility 2 and the miscellaneous instruction extension facility 2. The-mtune=arch13
option enables arch13 specific instruction scheduling and tuning without making use of new instructions. - Builtins for the new vector instructions have been added and can be
enabled using the
-mzvector
option. Support for these builtins is indicated by the compiler predefining the__VEC__
macro to the value10303
. - The compiler now supports and automatically generates alignment hints on vector load and store instructions.
- Various code-gen improvements, in particular related to improved instruction selection and register allocation.
Changes to the X86 Target¶
- Fixed a bug in generating DWARF unwind information for 32 bit MinGW
Changes to the AMDGPU Target¶
- Function call support is now enabled by default
- Improved support for 96-bit loads and stores
- DPP combiner pass is now enabled by default
- Support for gfx10
Changes to the RISCV Target¶
The RISCV target is no longer “experimental”! It’s now built by default,
rather than needing to be enabled with LLVM_EXPERIMENTAL_TARGETS_TO_BUILD
.
The backend has full codegen support for the RV32I and RV64I base RISC-V instruction set variants, with the MAFDC standard extensions. We support the hard and soft-float ABIs for these targets. Testing has been performed with both Linux and bare-metal targets, including the compilation of a large corpus of Linux applications (through buildroot).
Changes to LLDB¶
- Backtraces are now color highlighting in the terminal.
- DWARF4 (debug_types) and DWARF5 (debug_info) type units are now supported.
- This release will be the last where
lldb-mi
is shipped as part of LLDB. The tool will still be available in a downstream repository on GitHub.
External Open Source Projects Using LLVM 9¶
Mull - Mutation Testing tool for C and C++¶
Mull is an LLVM-based tool for mutation testing with a strong focus on C and C++ languages.
Portable Computing Language (pocl)¶
In addition to producing an easily portable open source OpenCL implementation, another major goal of pocl is improving performance portability of OpenCL programs with compiler optimizations, reducing the need for target-dependent manual optimizations. An important part of pocl is a set of LLVM passes used to statically parallelize multiple work-items with the kernel compiler, even in the presence of work-group barriers. This enables static parallelization of the fine-grained static concurrency in the work groups in multiple ways.
TTA-based Co-design Environment (TCE)¶
TCE is an open source toolset for designing customized processors based on the Transport Triggered Architecture (TTA). The toolset provides a complete co-design flow from C/C++ programs down to synthesizable VHDL/Verilog and parallel program binaries. Processor customization points include register files, function units, supported operations, and the interconnection network.
TCE uses Clang and LLVM for C/C++/OpenCL C language support, target independent optimizations and also for parts of code generation. It generates new LLVM-based code generators “on the fly” for the designed TTA processors and loads them in to the compiler backend as runtime libraries to avoid per-target recompilation of larger parts of the compiler chain.
Zig Programming Language¶
Zig is a system programming language intended to be an alternative to C. It provides high level features such as generics, compile time function execution, and partial evaluation, while exposing low level LLVM IR features such as aliases and intrinsics. Zig uses Clang to provide automatic import of .h symbols, including inline functions and simple macros. Zig uses LLD combined with lazily building compiler-rt to provide out-of-the-box cross-compiling for all supported targets.
Additional Information¶
A wide variety of additional information is available on the LLVM web page, in particular in the documentation section. The web page also contains versions of the
API documentation which is up-to-date with the Subversion version of the source
code. You can access versions of these documents specific to this release by
going into the llvm/docs/
directory in the LLVM tree.
If you have any questions or comments about LLVM, please feel free to contact us via the mailing lists.