CT_00_029 -- Improve constant synthesis
When the Zbkb extension is enabled, an arbitrary 64-bit constant can be loaded in at most 5 instructions: a lui+addi pair for each of the upper and lower 32 bits, and a pack to merge them. High-performance uarchs will execute this in 2c, and even a simplistic uarch would probably take at most 5c. Contrast that with what we do now, where we push complex constants into the constant pool; that is probably 3 instructions and 5c on most uarchs. Naturally we would like to see constant synthesis improved when Zbkb is enabled. The basic support for Zbkb is development-complete, so the effort here is really just to use it in constant synthesis.
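As a rough illustration of the 5-instruction sequence, the sketch below models lui, addi and pack on 64-bit register values in Python. It is not GCC code: the helper names (`split_lui_addi`, `synthesize_with_pack`, `synthesize_repeating`) are hypothetical, and the instruction models only cover the RV64 behaviour needed here.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def sext(value, bits):
    """Interpret the low `bits` bits of value as a signed integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def split_lui_addi(half):
    """Split a 32-bit value into (hi20, lo12) immediates for lui+addi.

    addi sign-extends its 12-bit immediate, so hi20 is adjusted to
    compensate when lo12 turns out negative.
    """
    lo12 = sext(half, 12)
    hi20 = ((half - lo12) >> 12) & 0xFFFFF
    return hi20, lo12


def lui(imm20):
    """RV64 lui: place imm20 in bits 31:12, sign-extend from bit 31."""
    return sext(imm20 << 12, 32) & MASK64


def addi(rs1, imm12):
    """RV64 addi with an already sign-extended 12-bit immediate."""
    return (rs1 + imm12) & MASK64


def pack(rs1, rs2):
    """Zbkb pack: low 32 bits of rs1 in the low half, of rs2 in the high half."""
    return ((rs2 & MASK32) << 32) | (rs1 & MASK32)


def synthesize_with_pack(c):
    """Model lui+addi for each half, then pack (5 instructions)."""
    lo_hi20, lo_lo12 = split_lui_addi(c & MASK32)
    hi_hi20, hi_lo12 = split_lui_addi((c >> 32) & MASK32)
    low = addi(lui(lo_hi20), lo_lo12)
    high = addi(lui(hi_hi20), hi_lo12)
    return pack(low, high)


def synthesize_repeating(c):
    """Model lui+addi once, then pack t,t for equal halves (3 instructions)."""
    half = c & MASK32
    if half != (c >> 32) & MASK32:
        raise ValueError("high and low halves differ")
    hi20, lo12 = split_lui_addi(half)
    t = addi(lui(hi20), lo12)
    return pack(t, t)
```

Because pack only reads the low 32 bits of each source, the sign extension that lui and addi introduce in the upper bits of the temporaries does not affect the result. When both halves are equal, a single lui+addi pair followed by pack of the register with itself is enough.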
Known areas to improve:
- Use bseti when profitable. Many cases of poor constant synthesis are failures to use bseti.
- Use "uw" variants of shift/arithmetic instructions when the constant has bit 0x80000000 set.
- Use shNadd for constants evenly divisible by 9, 5 or 3.
- Use bclri in conjunction with lui or addi to clear a small number of bits, particularly in the high 33 bits of a 64-bit word.
- Adjust the constant so the low 13 bits are 0x1800, recursively re-synthesize, and restore the low bits with a trailing addi.
- Synthesize C' from C using bit inversion: synthesize C', then invert the result. If that is better than synthesizing C directly, use the inversion sequence.
- Use pack for repeating constants.
  - Depends on reassociation of constants in logical ops with shifts.
  - Depends on basic Zbkb support.
- Use pack as synthesis of last resort.
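The arithmetic behind a few of these items can be checked with a small Python model. This is only a sketch: the instruction models follow the Zba/Zbs definitions of shNadd, bseti and bclri on 64-bit values, while `shnadd_factor` and `invert` are hypothetical helper names and say nothing about how GCC decides which sequence is cheaper.

```python
MASK64 = (1 << 64) - 1


def shnadd(n, rs1, rs2):
    """Zba shNadd (n in 1..3): rs2 + (rs1 << n)."""
    return (rs2 + (rs1 << n)) & MASK64


def bseti(rs1, shamt):
    """Zbs bseti: set a single bit."""
    return (rs1 | (1 << shamt)) & MASK64


def bclri(rs1, shamt):
    """Zbs bclri: clear a single bit."""
    return rs1 & ~(1 << shamt) & MASK64


def invert(rs1):
    """Bitwise inversion, as done by xori rd, rs1, -1."""
    return ~rs1 & MASK64


def shnadd_factor(c):
    """Return (n, q) with c == q * (2**n + 1), trying 9, then 5, then 3.

    Returns None when c is divisible by none of them.
    """
    for n in (3, 2, 1):
        factor = (1 << n) + 1
        if c % factor == 0:
            return n, c // factor
    return None


# shNadd: if C == 9 * q, synthesize q and finish with sh3add t, t, t.
q = 0x12345000                      # loadable with a single lui
c = 9 * q
n, quotient = shnadd_factor(c)
assert shnadd(n, quotient, quotient) == c

# bseti: a small value plus one distant set bit.
assert bseti(0x234, 63) == 0x8000000000000234

# bclri: all ones (addi t, zero, -1) with one bit cleared.
assert bclri(MASK64, 32) == 0xFFFFFFFEFFFFFFFF

# Inversion: synthesize C' = ~C, then invert the result to get C.
target = 0xFFFFFFFFEDCBAFFF
assert invert(invert(target)) == target
assert invert(0x12345000) == target
```

In each case the synthesizer would still need to compare the cost of the candidate sequence against the direct one before choosing it, as the inversion item above notes.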
Stakeholders/Partners
RISE:
Ventana: Jeff Law – general oversight / guidance and implementation work.
External:
RAU: Sevak Sargsyan & Lyut Nersisyan – Lyut did the initial Zbkb work under contract to Ventana.
Dependencies
Status
Updates
- Project done.
- Zbkb as synthesis of last resort submitted upstream for CI testing
- Additional ideas for improvements in constant synthesis pushed to 2H2024.
- Pack for repeating constants is integrated.
- Lyut's patterns to allow generation of pack, packw, packh automatically have been integrated.
- Reassociation of constants in logical ops with shifts integrated upstream
- Using pack to handle constants with equal high/low halves implemented
- Synthesizing the inverted constant, then inverting the result integrated upstream
- Adjusting the constant to have 0s in the low bits, synthesizing, then restoring low bits with addi integrated upstream.
- bclr method for generating constants integrated upstream
- Adjustment of the low 13 bits to make the constant easier to synthesize submitted upstream
- Updated list of various deficiencies in GCC's constant synthesis
- Note that some of these are recently fixed (first week of May)
- Hoping to have this done within the next week or so.
- Basic Zbkb patterns written. Haven't started on synthesis changes though.
- Noted as a 1H2024 work item.