
GCC seems to be poor at utilizing the Zbs extension, particularly for 32-bit objects on rv64 targets.  The core issue is that if a Zbs instruction modifies bit 31, then the compiler likely needs to emit a sign extension afterwards to satisfy various architecture/ABI requirements, such as the convention that 32-bit values are held sign-extended in 64-bit registers.
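The bit 31 problem can be modelled in plain C. The sketch below is only an illustration, not GCC code: `canonical_si`, `bset64` and `needs_sext_after_bset` are hypothetical helper names, and the model assumes that a 32-bit value lives in a 64-bit register in sign-extended form and that `bset` operates on the full 64-bit register.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed model: a 32-bit value held in a 64-bit register, sign-extended.
   The uint32_t -> int32_t conversion relies on the usual two's complement
   behaviour, which GCC guarantees. */
uint64_t canonical_si(uint32_t v)
{
    return (uint64_t)(int64_t)(int32_t)v;
}

/* Assumed model of bset on rv64: set bit (n mod 64) of the whole register. */
uint64_t bset64(uint64_t rs1, unsigned n)
{
    return rs1 | ((uint64_t)1 << (n & 63));
}

/* Returns 1 when a bare bset leaves the register in a form that is not
   sign-extended, so a follow-up extension would be required. */
int needs_sext_after_bset(uint32_t v, unsigned n)
{
    assert(n < 32);                       /* only bits of the 32-bit object */
    uint64_t reg  = bset64(canonical_si(v), n);
    uint64_t want = canonical_si(v | ((uint32_t)1 << n));
    return reg != want;
}
```

In this model the extension is only needed when bit 31 goes from clear to set. Every other bit position leaves the register correctly sign-extended, which is why the cases listed below are worth recognising.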


However, there are several cases where the compiler can know it is safe to avoid the extension.


  1. xalancbmk's bitset implementation has a redundant bit clear before setting the same bit.  This can be fixed in a generic way with an additional logical simplification pattern.
  2. bext can be used to extract a single bit, storing the result into an SImode object, even for rv64, since bits 1..63 will be zeroed by the (&1) operation in the bext specification.
  3. ~(1 << N) & C can be safely used for a 32-bit object on rv64 when C has 33 or more leading zeros.
  4. (1 << N) | C and (1 << N) ^ C can be safely used when the logical XOR/IOR is done in DImode, since we don't have to worry about sign-extending a DImode object.
  5. An explicit extension of SImode (1 << N) to DImode can be handled with a simple bset with x0 as a source operand.
  6. Occasionally GCC will use a "zero_extract" as a destination for some bitfield insertions, which can be handled with bset/bclr.
  7. When the shift count is masked such that we know bit 31 is not changed, we can more aggressively generate Zbs instructions.
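Some of the cases above correspond to source shapes like the following. This is a hedged sketch rather than the project's actual test cases: the function names and the constants `0x7fff` and `15` are made up for illustration. Whether a given GCC release emits bext/bclr/bset/binv for each function depends on the compiler version and options (for example an rv64 target with Zbs enabled).

```c
#include <assert.h>
#include <stdint.h>

/* Case 2: single-bit extract into a 32-bit object. The result is 0 or 1,
   so the upper bits are already zero and no extension is required. */
uint32_t extract_bit(uint32_t x, unsigned n)
{
    assert(n < 32);
    return (x >> n) & 1u;
}

/* Case 3: ~(1 << N) & C where C (0x7fff here) has at least 33 leading
   zeros as a 64-bit value, so bit 31 of the result is always clear. */
uint32_t clear_bit_in_small_const(unsigned n)
{
    assert(n < 32);
    return ~(1u << n) & 0x7fffu;
}

/* Case 4: IOR/XOR with a single bit performed on a 64-bit object. */
uint64_t set_bit_di(uint64_t c, unsigned n)
{
    return c | (1ull << (n & 63));
}

uint64_t flip_bit_di(uint64_t c, unsigned n)
{
    return c ^ (1ull << (n & 63));
}

/* Case 7: the shift count is masked to 0..15, so bit 31 cannot change. */
uint32_t set_low_bit(uint32_t x, unsigned n)
{
    return x | (1u << (n & 15));
}
```

Compiling functions like these to assembly and inspecting the output is one way to see which cases a particular compiler build handles.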

Stakeholders/Partners

RISE:

Ventana: Jeff Law – general oversight / guidance & implementation

Ventana: Raphael Zinsly – implementations

External:


Dependencies



Status

Development

IN PROGRESS


Development Timeline: 1H2024
Upstreaming

IN PROGRESS



Upstream Version

gcc-15 (target)

(Spring 2025)





Contacts

Jeff Law (Ventana)


Dependencies: None


Updates

  • Handling of (1 << N) | C and (1 << N) ^ C for DImode objects has been integrated
  • Explicit zero extension of (1 << N) in SImode using bset has been submitted & integrated
  • Handling of zero_extract destinations for single bit insertions has been submitted

 

  • Raphael's code for using bext to extract a single bit, storing the result in an SImode object for rv64 has been integrated
  • Raphael's code to handle ~(1 << N) & C where C has at least 33 leading zeros has been integrated
  • Jeff's code to handle (1 << N) | C and (1 << N) ^ C for DImode objects has been submitted

 

  • Raphael's code for using bext to extract a single bit, storing the result in an SImode object for rv64 has been submitted

 

  • (X | Y) & ~Y → X & ~Y simplification added to logical simplifications, eliminating xalancbmk's redundancy in its bitset code
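The (X | Y) & ~Y → X & ~Y identity can be checked with a few lines of C. This is a minimal sketch with hypothetical function names, not the simplification pattern itself (which lives inside the compiler); it only confirms that the two forms agree for single-bit masks over a handful of sample values.

```c
#include <assert.h>
#include <stdint.h>

/* The unsimplified form: set the bits in y, then clear the same bits. */
uint64_t set_then_clear(uint64_t x, uint64_t y)
{
    return (x | y) & ~y;
}

/* The simplified form: the set is redundant, only the clear matters. */
uint64_t clear_only(uint64_t x, uint64_t y)
{
    return x & ~y;
}

/* Compare both forms for every single-bit mask over a few sample values. */
void check_identity(void)
{
    static const uint64_t samples[] = {
        0, 1, 0x80000000ull, 0xffffffffull, 0xdeadbeefcafef00dull, ~0ull
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (unsigned n = 0; n < 64; n++) {
            uint64_t y = 1ull << n;
            assert(set_then_clear(samples[i], y) == clear_only(samples[i], y));
        }
}
```

The identity holds for any Y, not just single-bit masks, because every bit set by Y is cleared again by ~Y.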

 

  • About a dozen issues have been identified, with patches that are ready or nearly ready for upstreaming.  The first patch is going through the upstream process right now.

 

  • Ventana has discovered (and fixed internally) roughly a dozen cases where GCC was failing to utilize the Zbs extension as well as it could/should
  • Performance testing of those changes should start shortly
  • The plan is to start upstreaming them as soon as gcc-15 is open for development.

 

  • Noted as a 1H2024 work item.