CT_00_042 -- Additional Constant Synthesis Improvements
Constant synthesis was greatly improved during 2024. The areas noted here are exploratory, so some ideas may prove useful and others may not.
Areas to explore:
- When possible, derive the upper and lower halves from a common sequence, for example by sharing a lui
- Can this be handled during synthesis?
- May also be capturable during reload_move2add, but hard register dependencies may get in the way
- Reevaluate the mvconst_internal pattern
- Natural tension between exposing the synthesis early vs late.
- Exposing early makes the synthesis instructions optimizable, but leads to longer, more complex sequences, which may inhibit optimization in some cases
- Combine in particular is sensitive to these issues as it has limits on the number of instructions to process at a time
- Synthesis counts against those limits (and-shift32.c is a great example)
- Exposing late leads to an inability to optimize the synthesis itself. dup-{1,2,3} are a good example
- No good solutions
- General desire to move away from the mvconst_internal pattern
- Limiting constants handled by that pattern should generally be OK direction-wise as long as we don't regress code generation
- Combine needs improvement in its handling of REG_EQUAL notes. Right now they're only used for 2→1 cases, but we need them across the board.
- Refactor the relevant code
- Consider staging in the improvements
- Pay particular attention to note distribution which can be quite hairy
- Local constant propagation after combine
- The constant propagator is only supposed to work in cfglayout mode
- Perhaps disallowing changes to JUMP_INSNs would let it work properly outside cfglayout mode?
- reload_cse_move2add
- Limited in its scope due to hard register dependencies.
- Perhaps utilize REG_EQUAL notes and handle common cases for dependencies?
- vsetvl optimization
- Also limited by hard register dependencies, particularly for 2**n constants that don't fit into the immediate field
- Ideally, if we come up with a solution for reload_cse_move2add, we could do something similar here?
- Is it worth the effort?
- Partially redundant constants
- Some paths compute the constant more than once, while others do not compute it at all
- Thus a poor fit for PRE
- If register pressure is low, then it may be profitable to speculatively hoist more aggressively
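As a rough illustration of the first item above (sharing a lui between the upper and lower halves), the sketch below splits each 32-bit half of a 64-bit constant into lui/addi operands and checks whether both halves would use the same lui operand. The helper names `split_hi_lo` and `halves_share_lui` are hypothetical and are not GCC functions. The sketch also ignores the sign-extension adjustment needed when the halves are recombined on RV64, so treat it as a way to spot candidate constants rather than as a description of how the synthesis code works.

```python
MASK32 = 0xFFFFFFFF


def split_hi_lo(value32):
    """Split a 32-bit value into (hi20, lo12) operands for lui + addi.

    Hypothetical helper for illustration only. addi sign-extends its
    12-bit immediate, so lo12 is returned as a signed value and hi20 is
    adjusted to compensate.
    """
    value32 &= MASK32
    lo12 = value32 & 0xFFF
    if lo12 >= 0x800:
        lo12 -= 0x1000
    hi20 = ((value32 - lo12) >> 12) & 0xFFFFF
    return hi20, lo12


def halves_share_lui(const64):
    """Return True if both 32-bit halves need the same lui operand.

    Simplification (an assumption of this sketch): the sign-extension
    interaction between the halves on RV64 is not modelled.
    """
    upper = (const64 >> 32) & MASK32
    lower = const64 & MASK32
    return split_hi_lo(upper)[0] == split_hi_lo(lower)[0]


if __name__ == "__main__":
    for const in (0x1234511112345678, 0x1234567812345ABC):
        print(hex(const), halves_share_lui(const))
```

In the first constant both halves reduce to lui 0x12345 plus a different addi immediate, so a single lui could in principle feed both. In the second, the low half's negative addi immediate bumps its lui operand to 0x12346, so the halves no longer match.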
Stakeholders/Partners
RISE:
Ventana: Jeff Law & Raphael Zinsly – general oversight / guidance and implementation work.
Rivos: Vineet Gupta – generally interested in constant synthesis and has some state on #5 above.
External:
Dependencies
Status
Updates
- Items not likely to land in 2H 2024 were moved out into a new item for 2025