CT_00_051 -- Zicond with if-conversion improvements (GCC)

About

The Zicond extension provides a conditional-zero primitive on which subsets of conditional move and conditional arithmetic/logical operations can be implemented. Transforming control flow into conditional operations can improve code performance by eliminating branch misprediction costs and by reducing the load on the branch predictors. The earlier in the optimizer pipeline these transformations are performed, the more likely they are to expose secondary optimization opportunities, since they produce larger basic blocks (the fundamental unit of code most compiler optimizations work on).
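
As a rough illustration of the primitive described above, the following C sketch models the two Zicond instructions (czero.eqz and czero.nez) in software and shows how a general conditional move and a conditional add can be built from them. The helper names are hypothetical and exist only for this example; they are not GCC or intrinsic APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of czero.eqz: the result is 0 when cond == 0, else val. */
static uint64_t czero_eqz(uint64_t val, uint64_t cond) {
    return cond == 0 ? 0 : val;
}

/* Software model of czero.nez: the result is 0 when cond != 0, else val. */
static uint64_t czero_nez(uint64_t val, uint64_t cond) {
    return cond != 0 ? 0 : val;
}

/* General conditional move (cond ? a : b): two conditional zeros and an OR.
   Exactly one of the two operands of the OR is zero. */
static uint64_t cmov(uint64_t cond, uint64_t a, uint64_t b) {
    return czero_eqz(a, cond) | czero_nez(b, cond);
}

/* Conditional add (cond ? x + y : x): a single conditional zero suffices,
   because adding 0 is the identity. */
static uint64_t cond_add(uint64_t cond, uint64_t x, uint64_t y) {
    return x + czero_eqz(y, cond);
}

/* Small self-check; call it from main() or a test harness. */
void check_zicond_model(void) {
    assert(cmov(1, 10, 20) == 10);
    assert(cmov(0, 10, 20) == 20);
    assert(cond_add(1, 3, 4) == 7);
    assert(cond_add(0, 3, 4) == 3);
}
```

The conditional add needs only one conditional zero, which is why conditional arithmetic is often cheaper than a fully general conditional move.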


This item tracks additional opportunities to optimize code using the Zicond extension.


  • Improve gimple→rtl expansion
    • Conditional moves can be expressed in multiple forms in gimple.  For the multiplication-by-a-boolean variant, the initial RTL generation step could be improved to first try the conditional move expanders, then fall back to other approaches
    • The RISC-V specific support (riscv_expand_conditional_move) should be improved
      • Conditionally extend the comparison inputs to word_mode so that we can handle comparisons narrower than word_mode
      • Extend the true/false arms to word_mode and conditionally use a word mode temporary and Jivan's narrowing subreg trick to handle some of the sub-word cases
  • Improvement of if-conversion pass in GCC to handle SUBREG and zero/sign extended objects
    • Requires adjustments to the conditional execution pass in RTL
    • Extensions probably aren't hard to support; subregs will require deeper thought
  • Improve combination of conditional moves
    • In some cases a generalized conditional move can be reformulated as conditional arithmetic
    • Instead of selecting on the final output of say a shift, recognize that we can emit a conditional move on the shift count and then unconditionally shift
  • If-convert the conditional in the move_one_fast loop of deepsjeng
    • Candidate approaches
      • Improve min/max discovery in gimple which should simplify the conditional code to optimizable form in the RTL if-converter code
      • Improve the RTL if-converter code to better handle multiple if-convertible instructions
      • Add backend pattern to recognize an if-then-else as a min/max
      • Robin has submitted some code for this, but it needs to be adjusted for reviewer feedback
  • Cost model adjustments
    • As touched on in upstream bug 112462, when we have a condition other than (reg) eq/ne (const_int 0) we need to bump the cost of using zicond as the condition will need canonicalization.
    • Similarly we may need to bump the cost depending on the true/false arms
    • May want to do some refactoring so that we can share code across costing & expansion.
  • When one arm of a conditional move can be trivially derived from the other, say by adding a small constant, we can emit a single zicond instruction plus an adjustment rather than a fully generalized conditional move built from two zicond instructions. Conceptually this is similar to how we handle something like x = cond ? C1 : C2; we just need to detect it earlier.  See these examples on godbolt.
    Matching this style would be one approach and probably generally profitable for the first case:
    (set (reg:DI 135 [ <retval> ])
        (plus:DI (if_then_else:DI (reg:DI 145)
                (const_int 0 [0])
                (reg:DI 143))
            (reg:DI 147)))
    
    Obviously we could replace the PLUS with a variety of operators.
    
    Another approach would likely be to match the following, which falls into the sub-word cases:
    (set (reg:DI 147)
        (if_then_else:DI (reg:DI 145)
            (sign_extend:DI (plus:SI (subreg:SI (reg:DI 138) 0)
                    (const_int 5 [0x5])))
            (const_int 0 [0])))
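
Several of the items in the list above amount to rewriting a select so that only one operand is conditionally zeroed. The C sketch below pairs a branchy form with a conditional-zero form for four of them: multiplication by a boolean, selecting on the shift count rather than on the shift result, one arm derived from the other by a small constant, and the PLUS-over-if_then_else shape from the first RTL example. The czero_eqz/czero_nez helpers are hypothetical software models of the instructions, and the function names are made up for illustration; this is not what GCC emits verbatim.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software models of the two Zicond instructions. */
static uint64_t czero_eqz(uint64_t val, uint64_t cond) { return cond ? val : 0; }
static uint64_t czero_nez(uint64_t val, uint64_t cond) { return cond ? 0 : val; }

/* Multiplication by a boolean is the select b ? x : 0. */
uint64_t mul_by_bool_branchy(bool b, uint64_t x) { return (uint64_t)b * x; }
uint64_t mul_by_bool_zicond(bool b, uint64_t x) { return czero_eqz(x, b); }

/* Select on the shift count, then shift unconditionally (requires n < 64). */
uint64_t shift_branchy(bool c, uint64_t x, unsigned n) { return c ? x << n : x; }
uint64_t shift_zicond(bool c, uint64_t x, unsigned n) { return x << czero_eqz(n, c); }

/* One arm is the other plus a small constant: one conditional zero and an add. */
uint64_t add5_branchy(bool c, uint64_t a) { return c ? a + 5 : a; }
uint64_t add5_zicond(bool c, uint64_t a) { return a + czero_eqz(5, c); }

/* Shape of the first RTL example: (plus (if_then_else c 0 x) y). */
uint64_t plus_branchy(bool c, uint64_t x, uint64_t y) { return c ? y : x + y; }
uint64_t plus_zicond(bool c, uint64_t x, uint64_t y) { return czero_nez(x, c) + y; }

/* Compare both forms over a few inputs; call from main() or a test harness. */
void check_rewrites(void) {
    const uint64_t vals[] = { 0, 1, 7, 0xffffffffu, UINT64_MAX };
    for (int c = 0; c <= 1; c++) {
        for (unsigned i = 0; i < sizeof vals / sizeof vals[0]; i++) {
            uint64_t x = vals[i];
            assert(mul_by_bool_branchy(c, x) == mul_by_bool_zicond(c, x));
            assert(add5_branchy(c, x) == add5_zicond(c, x));
            assert(plus_branchy(c, x, 3) == plus_zicond(c, x, 3));
            for (unsigned n = 0; n < 64; n += 21)
                assert(shift_branchy(c, x, n) == shift_zicond(c, x, n));
        }
    }
}
```

In each pair the second form uses one conditional zero followed by an unconditional operation, instead of the two conditional zeros and an OR that a fully general conditional move would need.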
    
    
    

Analysis has shown that the most common missed if-conversion cases for RISC-V are related to mode-changing operators such as SUBREG, ZERO_EXTEND and SIGN_EXTEND, which are commonly used when operating on 32-bit objects for rv64.  ESWIN and Ventana have differing implementations in this space that need to be resolved.  The core concern with the ESWIN implementation is that it directly modifies the objects in the IL, which in turn means that it's difficult (potentially impossible) to correctly handle certain cases (shifts).  In contrast, the Ventana implementation emits new IL for the converted sequence and deletes the old parts of the IL.
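
To make the sub-word case concrete, here is a minimal C model of a 32-bit conditional add held in a 64-bit register, assuming the rv64 convention that 32-bit results are kept sign-extended. The add and the sign extension are computed unconditionally and a single conditional zero selects the result. The helper and function names are hypothetical, and the sketch does not reflect either the ESWIN or the Ventana implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model of czero.eqz on signed 64-bit values. */
static int64_t czero_eqz(int64_t val, int64_t cond) {
    return cond != 0 ? val : 0;
}

/* Sign-extend the low 32 bits of v to 64 bits, i.e. a model of
   (sign_extend:DI (subreg:SI v 0)), written without relying on
   implementation-defined narrowing conversions. */
static int64_t sext32(uint64_t v) {
    uint64_t low = v & 0xffffffffu;
    return (int64_t)(low ^ 0x80000000u) - (int64_t)0x80000000u;
}

/* Branchy form: cond ? (int32_t)(a + 5) : 0. */
int64_t subword_branchy(int64_t cond, uint64_t a) {
    if (cond != 0)
        return sext32(a + 5);
    return 0;
}

/* If-converted form: compute the extended sum unconditionally,
   then zero it when cond == 0. */
int64_t subword_converted(int64_t cond, uint64_t a) {
    int64_t sum = sext32(a + 5);
    return czero_eqz(sum, cond);
}

/* Compare both forms, including 32-bit wraparound and garbage upper bits. */
void check_subword(void) {
    const uint64_t vals[] = { 0, 10, 0x7fffffffu, 0xfffffffbu,
                              0xdeadbeef00000001u };
    for (int c = 0; c <= 1; c++)
        for (unsigned i = 0; i < sizeof vals / sizeof vals[0]; i++)
            assert(subword_branchy(c, vals[i]) == subword_converted(c, vals[i]));
}
```

The model ignores the upper 32 bits of the input, which is the property a SUBREG-based narrowing relies on.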


Stakeholders/Partners

RISE:

Ventana: Raphael Zinsly, Jeff Law, Robin Dapp

ESWIN: Fei Gao

External:


Dependencies



Status

Development

IN PROGRESS


Development Timeline

1H2025

Upstreaming

IN PROGRESS



Upstream Version

gcc-16 (target)

(Spring 2026)





Contacts

Raphael Zinsly (Ventana)

Jeff Law (Ventana)


Dependencies

None


Updates


  

  • Remaining items from 2H2024 rolled into new task