GCC appears to make poor use of the Zbs extension, particularly for 32-bit objects on rv64 targets. The core issue is that if a Zbs instruction modifies bit 31, the compiler likely needs to emit a sign extension to satisfy various architecture/ABI requirements.
However, there are several cases where the compiler can know it is safe to avoid the extension. For example, for (1 << N), if we know N can never be 31, then we know bit 31 will not be modified and thus no explicit extension is needed.
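As a minimal sketch of the case described above, the hypothetical function below masks the shift count so that it can never be 31. The function name and the mask value are assumptions made for illustration; whether GCC emits a lone bset here depends on the compiler version and options.

```c
#include <stdint.h>

/* Hypothetical example: the shift count is masked to 0..15, so the
   compiler can prove that bit 31 is never touched.  On rv64 the 32-bit
   value held in the register therefore stays correctly sign-extended
   after the bit is set, and no explicit extension should be needed.  */
uint32_t set_low_bit(uint32_t x, unsigned n)
{
    return x | (1u << (n & 15));
}
```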
We have also seen cases where multiple variable bit manipulation instructions modify the same bit, for example a bclr followed by a bset. This happens in perlbench's bit manipulation routines. With a bit of work, the redundant bit operation can be removed.
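The redundancy can be sketched in C as follows. Both function names are hypothetical and the code is not taken from perlbench; it only illustrates that clearing a bit immediately before setting the same bit has no effect on the result.

```c
#include <stdint.h>

/* Hypothetical source shape: clear bit n, then set the same bit.
   Compiled naively, this could become a bclr followed by a bset.  */
uint64_t force_bit(uint64_t x, unsigned n)
{
    uint64_t mask = (uint64_t)1 << (n & 63);
    x &= ~mask;   /* candidate for bclr */
    x |= mask;    /* candidate for bset */
    return x;
}

/* Equivalent form: (x & ~mask) | mask == x | mask, so the clear is
   redundant and a single bset would be enough.  */
uint64_t force_bit_simplified(uint64_t x, unsigned n)
{
    uint64_t mask = (uint64_t)1 << (n & 63);
    return x | mask;
}
```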
GCC will sometimes generate unexpected RTL for setting/clearing a bit. Usually these operations are represented as logical operations; however, in some cases they can be represented as a bit extract/deposit (zero_extract in GCC's terminology). These show up in the gcc benchmark within SPEC. Patterns for these cases should be added to the compiler.
For an explicit zero-extension of 1 << N from 32 to 64 bits, we can generate bset directly. We know this will never set bits 32..63 and thus the result is already zero-extended.
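A small C sketch of the source shape in question follows. The function name is hypothetical, and masking the shift count to 0..31 is an assumption added to keep the C well defined; the point is only that the result can never have bits 32..63 set.

```c
#include <stdint.h>

/* Hypothetical example: a 32-bit (1 << n) explicitly zero-extended to
   64 bits.  Since only one of bits 0..31 can be set, the 64-bit result
   is the same as setting bit n in a zero register, which is the job a
   single bset could do.  */
uint64_t one_bit_zext(unsigned n)
{
    uint32_t bit = 1u << (n & 31);
    return (uint64_t)bit;
}
```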
To invert the result of a bext, we can use bext+seqz.
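For illustration, the kind of C expression this applies to might look like the sketch below. The function name is hypothetical; the comment describes the instruction pairing named above, not a guaranteed code generation result.

```c
#include <stdint.h>

/* Hypothetical example: test whether bit n of x is clear.  The bit
   extraction corresponds to bext, and comparing the extracted bit
   against zero corresponds to seqz.  */
int bit_is_clear(uint64_t x, unsigned n)
{
    return ((x >> (n & 63)) & 1) == 0;
}
```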
xalancbmk's bitset implementation has a redundant bit clear before setting the same bit. This can be fixed in a generic way with an additional logical simplification pattern.
bext can be used to extract a single bit, storing the result into an SImode object, even on rv64, since bits 1..63 will be zeroed by the (&1) operation in the bext specification.
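The logical simplification mentioned above is listed in the updates as (X | Y) & ~Y → X & ~Y. A minimal C sketch of that identity, using hypothetical function names, is:

```c
#include <stdint.h>

/* Hypothetical example: set the bits in y, then clear the same bits.
   The OR contributes nothing that survives the AND with ~y.  */
uint64_t set_then_clear(uint64_t x, uint64_t y)
{
    return (x | y) & ~y;
}

/* Simplified form of the same computation: only the clear remains.  */
uint64_t clear_only(uint64_t x, uint64_t y)
{
    return x & ~y;
}
```

The two functions return the same value for every input, which is what lets a generic simplification drop the redundant operation without knowing anything about the bitset code that produced it.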
Stakeholders/Partners
RISE:
Ventana: Jeff Law – general oversight / guidance & implementation
Ventana: Raphael Zinsly – implementations
External:
Dependencies
Status
Updates
- Raphael's code for using bext to extract a single bit, storing the result in an SImode object for rv64 has been submitted
- (X | Y) & ~Y → X & ~Y simplification added to logical simplifications, eliminating xalancbmk's redundancy in its bitset code
- There are probably about a dozen identified issues with patches that are ready or nearly ready for upstreaming. The first patch is going through the upstream process right now.
- Ventana has discovered (and fixed internally) roughly a dozen cases where GCC was failing to utilize the Zbs extension as well as it could/should
- Performance testing of those changes should start shortly
- Plan is to start upstreaming them as soon as gcc-15 is open for development.
- Noted as a 1H2024 work item.