When the Zbkb extension is enabled, an arbitrary 64-bit constant can be loaded in at most 5 instructions: a lui+addi pair for each of the upper and lower 32 bits, and a pack to merge them. High-performance uarchs will execute this in 2 cycles, and even a simplistic uarch would probably take at most 5 cycles. Contrast that with what we do now, where we push complex constants into the constant pool. That's probably 3 instructions and 5 cycles on most uarchs. Naturally, we would like to see constant synthesis improved when Zbkb is enabled. The basic support for Zbkb is development-complete, so the effort here is really just to use it in constant synthesis.
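The five-instruction sequence can be sketched as a small Python model. This is only an illustration of the instruction semantics on RV64, not GCC's implementation; the helper names (`sext`, `split32`, `synth_pack`) are hypothetical.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def sext(value, bits):
    """Sign-extend the low `bits` bits of value to a Python int."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

def lui(imm20):
    # RV64 lui: place imm20 in bits 31:12, sign-extend the 32-bit result.
    return sext((imm20 & 0xFFFFF) << 12, 32) & MASK64

def addi(rs, imm12):
    # addi takes a signed 12-bit immediate.
    assert -2048 <= imm12 <= 2047
    return (rs + imm12) & MASK64

def pack(rs1, rs2):
    # Zbkb pack on RV64: low 32 bits of rs1 in the low half,
    # low 32 bits of rs2 in the high half.
    return ((rs2 & MASK32) << 32) | (rs1 & MASK32)

def split32(v):
    """Split a 32-bit value into (hi20, lo12) operands for lui+addi."""
    lo12 = sext(v, 12)
    hi20 = ((v - lo12) >> 12) & 0xFFFFF
    return hi20, lo12

def synth_pack(c):
    """Model lui+addi, lui+addi, pack for a 64-bit constant c."""
    lo_hi20, lo_lo12 = split32(c & MASK32)
    hi_hi20, hi_lo12 = split32((c >> 32) & MASK32)
    t0 = addi(lui(lo_hi20), lo_lo12)   # low 32 bits
    t1 = addi(lui(hi_hi20), hi_lo12)   # high 32 bits
    return pack(t0, t1)
```

The two lui+addi pairs have no dependency on each other, which is why a wide uarch could issue them in parallel and leave only the pack on the critical path.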
Known areas to improve:
- Use bseti when profitable. Many cases of poor constant synthesis come from a failure to use bseti.
- Use "uw" variants of shifts/arithmetic instructions when the constant has bit 0x80000000 set.
- Use shNadd for constants evenly divisible by 9, 5 or 3.
- Use bclri in conjunction with lui or addi to clear a small number of bits, particularly in the high 33 bits of a 64-bit word.
- Adjust the constant so the low 13 bits are 0x1800, recursively re-synthesize, and restore the low bits with a trailing addi.
- Explore whether we are using NOT effectively, i.e., synthesize the inverted constant, then NOT the result to get the desired constant.
- Use pack for repeating constants
- Use pack as synthesis of last resort
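Two of the items above lend themselves to a quick illustration: the shNadd factoring and the pack trick for repeating constants. The following is a minimal sketch of the checks involved, assuming unsigned 64-bit values; the function names are hypothetical and this is not how GCC structures its synthesis code.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def shnadd(n, rs1, rs2):
    # Zba shNadd: rd = (rs1 << n) + rs2, for n in {1, 2, 3}.
    return ((rs1 << n) + rs2) & MASK64

def shnadd_factor(c):
    """If c is divisible by 9, 5 or 3, return (n, t) such that
    shNadd rd, t, t yields c; otherwise return None."""
    for n, factor in ((3, 9), (2, 5), (1, 3)):
        if c % factor == 0:
            return n, c // factor
    return None

def is_repeating(c):
    """True when both 32-bit halves match, so one 32-bit synthesis
    followed by pack rd, t, t would produce the constant."""
    return (c & MASK32) == ((c >> 32) & MASK32)
```

In the shNadd case the smaller value `t` still has to be synthesized, so the factoring is only profitable when `t` is cheaper to build than the original constant.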
There are also cases that could be improved for designs without Zbkb but which do have Zbs. Consider a constant with just 4 bits set: say, two non-consecutive bits in the high 32-bit part of a 64-bit word, plus two bits in the low 12 bits. Such constants tend to end up in the constant pool, but this could be implemented with two bseti instructions and an addi. That is going to be the same size as a constant pool reference and almost certainly faster.
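The Zbs case can be sketched the same way. This model assumes the low part fits in 11 bits so that it is a non-negative addi immediate (bit 11 would be sign-extended by addi, so it is handled as a bseti bit here); the function name `synth_sparse` and the register name in the emitted strings are illustrative only.

```python
MASK64 = (1 << 64) - 1

def bseti(rs, shamt):
    # Zbs bseti: set a single bit of rs (shamt is 0..63 on RV64).
    assert 0 <= shamt <= 63
    return (rs | (1 << shamt)) & MASK64

def synth_sparse(c, max_bseti=2):
    """Try to build c as addi (low 11 bits) plus up to max_bseti bseti.
    Returns (value, instruction list), or None if too many bits are set."""
    low = c & 0x7FF
    high_bits = [i for i in range(11, 64) if (c >> i) & 1]
    if len(high_bits) > max_bseti:
        return None
    value = low                      # addi a0, zero, low
    insns = [f"addi a0, zero, {low}"]
    for bit in high_bits:
        value = bseti(value, bit)
        insns.append(f"bseti a0, a0, {bit}")
    return value, insns
```

For example, a constant with bits 50, 40, 2 and 0 set comes out as one addi followed by two bseti instructions, with no memory access involved.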
Stakeholders/Partners
RISE:
Ventana: Jeff Law – general oversight / guidance and implementation work.
External:
RAU: Sevak Sargsyan & Lyut Nersisyan – Lyut did the initial Zbkb work under contract to Ventana.
Dependencies
Status
Updates
- bclr method for generating constants going through upstream CI now
- Development done on adjusting the low 13 bits to make the constant easier to synthesize
- Updated list of various deficiencies in GCC's constant synthesis
- Note that some of these are recently fixed (first week of May)
- Hoping to have this done within the next week or so.
- Basic Zbkb patterns written. Haven't started on synthesis changes though.
- Noted as a 1H2024 work item.