There is lots to report this month...
* GCC now has experimental support for offloading.
Offloading is the ability for the compiler to separate out portions of the program to be compiled by a second, different compiler. Normally this second compiler targets a different architecture that can be accessed from the primary one, for example a CPU offloading work onto the GPU of a graphics card.
Currently only the Intel MIC architecture is supported. See here for more information: https://gcc.gnu.org/wiki/Offloading
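As a rough illustration of what offloadable code can look like, here is a minimal sketch that assumes the region to offload is marked with the OpenMP "target" construct. The function name scale_vector and its parameters are made up for this example. If the compiler is not configured for offloading, or OpenMP is not enabled, the loop simply runs on the host.

```c
#include <stddef.h>

/* Hypothetical example: scale every element of a vector.
   The pragmas ask the compiler to run the loop on an offload
   target if one is available; otherwise it runs on the host. */
void scale_vector(float *v, size_t n, float factor)
{
    /* map(tofrom: ...) copies the array to the device before the
       region starts and back to the host when it finishes. */
    #pragma omp target map(tofrom: v[0:n])
    #pragma omp parallel for
    for (size_t i = 0; i < n; i++)
        v[i] *= factor;
}
```

Building with -fopenmp is what makes the pragmas take effect; without it they are ignored and the function is an ordinary loop.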
* The strings program from the binutils package now defaults to using the --all option to scan the entire file. Previously the default was --data, which only scanned the data sections of the file.
The reason for the change is that the --data option uses the BFD library to locate data sections within the binary, which exposes the strings program to any flaws in that library. Since security researchers often use strings to examine potential viruses, this meant that these flaws could affect them.
* GCC now has built-in pointer boundary checking: -fcheck-pointer-bounds
This adds pointer bounds checking instrumentation to the generated code. Warning messages about memory access errors may also be produced at compile time unless disabled by -Wno-chkp. Additional options can be used to disable bounds checking in certain situations, e.g. on reads or on writes. It is also possible to use attributes to disable bounds checking on specific functions and structures.
* GCC now has some built-in functions to perform integer arithmetic with overflow checking. For example:
bool __builtin_sadd_overflow (int a, int b, int *res)
bool __builtin_ssubl_overflow (long int a, long int b, long int *res)
bool __builtin_umul_overflow (unsigned int a, unsigned int b, unsigned int *res)
These built-in functions promote the first two operands into an infinite precision signed type and perform addition (or subtraction or multiplication) on those promoted operands. The result is then cast to the type that the third pointer argument points to and stored there. If the stored result is equal to the infinite precision result, the built-in functions return false, otherwise they return true.
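Since the return value is true on overflow, it is easy to get the sense of the test backwards. The sketch below wraps two of the built-ins in helpers that return true on success instead. The helper names checked_add and checked_mul_u are invented for this example and are not part of GCC.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: add two ints.
   Returns true and stores the sum if it fits in an int,
   returns false if the addition overflowed. */
bool checked_add(int a, int b, int *sum)
{
    /* The built-in returns true when overflow occurred. */
    return !__builtin_sadd_overflow(a, b, sum);
}

/* Hypothetical helper: multiply two unsigned ints.
   Returns false if the product does not fit. */
bool checked_mul_u(unsigned int a, unsigned int b, unsigned int *product)
{
    return !__builtin_umul_overflow(a, b, product);
}
```

A caller would then write something like "if (!checked_add(x, y, &total)) handle_overflow();", where handle_overflow stands in for whatever error handling the program uses.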
* GCC now has experimental support for NVIDIA's NVPTX architecture. Currently only compilation is supported. Assembly and linking are not yet available.
* Two new options to disable warnings have been introduced to GCC:
-Wno-shift-count-negative
Stops warnings about shifts by a negative amount.
-Wno-shift-count-overflow
Stops warnings when the shift amount is greater than or equal to the width of the type being shifted.
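Both situations covered by these warnings are undefined behaviour in C, so silencing the warning does not make the shift safe. When the shift count is only known at run time, one option is to validate it first. The following is a minimal sketch; the helper name shift_left_checked is made up for this example.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: shift value left by count bits.
   Returns false, leaving *out untouched, when count is negative
   or not smaller than the width of unsigned int. */
bool shift_left_checked(unsigned int value, int count, unsigned int *out)
{
    const int width = (int)(sizeof(unsigned int) * CHAR_BIT);

    if (count < 0 || count >= width)
        return false;

    *out = value << count;
    return true;
}
```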
* A new optimization has been added to GCC: -flra-remat
This enables "rematerialization" during the local register allocation pass (LRA). Instead of keeping a value in a register, the optimization chooses to recalculate the value when it is needed, thus freeing up the register for other purposes. This is only done when the optimization calculates that it will be worth it. This new optimization is enabled automatically at -O2, -O3 and -Os.
* A new profiling option has been added to GCC: -fauto-profile[=<file>]
This enables sampling based feedback directed optimizations, and optimizations generally profitable only with profile feedback available. If <file> is specified, GCC looks in <file> to find the profile feedback data files.
In order to collect the profile data you need to have:
1. A Linux system with Linux perf support.
2. (optional) An Intel processor with last branch record (LBR) support. This guarantees an accurate instruction-level profile, which is important for AutoFDO performance.
To collect the profile, first use Linux perf to collect a raw profile (see https://perf.wiki.kernel.org/). For example:
perf record -e br_inst_retired:near_taken -b -o perf.data -- <your_program>
Then use the create_gcov tool, which takes the raw profile and the unstripped binary and generates an AutoFDO profile that can be used by GCC (see https://github.com/google/autofdo):
create_gcov --binary=your_program.unstripped --profile=perf.data --gcov=profile.afdo
* A new optimization has been added to GCC: -fschedule-fusion
This performs a target-dependent pass over the instruction stream to schedule instructions of the same type together, because the target machine can execute them more efficiently if they are adjacent to each other in the instruction flow.
Enabled by default at levels -O2, -O3, -Os.
* The ARM backend to GCC now supports a new option: -masm-syntax-unified
This tells the backend that it should assume that any inline assembler is using unified asm syntax. This matters for targets which only support Thumb1, as by default they assume that divided syntax is being used.
* The MIPS backend to GCC now supports two additional variants of the o32 ABI. These are intended to enable a transition from 32-bit to 64-bit registers. These are FPXX (-mfpxx) and FP64A (-mfp64 -mno-odd-spreg).
The FPXX extension mandates that all code must execute correctly when run using 32-bit or 64-bit registers. The code can be interlinked with either FP32 or FP64, but not both.
The FP64A extension is similar to the FP64 extension but forbids the use of odd-numbered single-precision registers. This can be used in conjunction with the FRE mode of FPUs in MIPS32R5 processors and allows both FP32 and FP64A code to interlink and run in the same process without changing FPU modes.
* The linker supports a new command line option: --fix-cortex-a53-835769
This enables a link-time workaround for erratum 835769 present on certain early revisions of Cortex-A53 processors. The workaround is disabled by default.
* The linker supports a new command line option: --print-sysroot
This will display the sysroot that was configured into the linker when it was built. If the linker was configured without sysroot support, nothing will be printed.