CT_00_007 - Fusion Support (GCC)
About
Many RISC-V processors support "Instruction Fusion" or "Macro-op Fusion" to improve performance. The basic idea is that certain instructions often appear together in a particular order to implement common idioms, for example lui+addi for constant synthesis. Under the right conditions the processor can "fuse" the two instructions to reduce the latency of the second instruction, reduce the use of internal processor resources, etc.
Fusion typically requires the instructions to be consecutive in the instruction stream. The goal of this project is to define, in a relatively generic way, a method to describe which fusions a particular micro-architecture supports and to provide mechanisms that keep those instructions consecutive in the instruction stream.
It is expected that a typical set of supported fusions can reduce the operation count within the processor's execution units by 1-3%.
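To make the "consecutive in the instruction stream" requirement concrete, here is a minimal Python sketch that scans a simplified instruction trace for the lui+addi idiom. It counts pairs that are adjacent (fusible) and pairs that are separated by other instructions (missed). This is an illustration only, not the GCC implementation or the project's trace tooling: the `Insn` record, the `count_lui_addi` function, and the fusion condition used (addi reads and writes the same register that the lui wrote) are all assumptions made for the example.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Insn(NamedTuple):
    """Hypothetical, simplified trace record (not a real tool's format)."""
    op: str                 # mnemonic, e.g. "lui"
    rd: Optional[str]       # destination register, None if there is none
    srcs: Tuple[str, ...]   # source registers


def count_lui_addi(trace: List[Insn]) -> Tuple[int, int]:
    """Return (fused, missed) counts for the lui+addi idiom.

    Assumed fusion condition: `lui rd, imm` immediately followed by
    `addi rd, rd, imm`. A pair is "missed" when the matching addi
    exists but other instructions sit between it and the lui.
    """
    fused = 0
    missed = 0
    live_lui: Dict[str, int] = {}  # register -> index of the lui that last wrote it

    for idx, insn in enumerate(trace):
        # An addi that reads and rewrites a register still holding a lui result.
        if insn.op == "addi" and insn.srcs == (insn.rd,) and insn.rd in live_lui:
            if live_lui[insn.rd] == idx - 1:
                fused += 1      # adjacent: hardware could fuse the pair
            else:
                missed += 1     # separated: a candidate the compiler split up

        # Any write to a register ends the lifetime of an earlier lui result.
        if insn.rd is not None:
            live_lui.pop(insn.rd, None)
        if insn.op == "lui":
            live_lui[insn.rd] = idx

    return fused, missed


example_trace = [
    Insn("lui", "a0", ()),
    Insn("addi", "a0", ("a0",)),      # adjacent to its lui
    Insn("lui", "a1", ()),
    Insn("add", "a2", ("a3", "a4")),  # unrelated instruction in between
    Insn("addi", "a1", ("a1",)),      # separated from its lui
]
print(count_lui_addi(example_trace))  # prints (1, 1)
```

In this sketch the "missed" count stands in for the cases a compiler would want to eliminate by keeping the pair adjacent. A real analysis would also need to account for control flow and for which register and immediate combinations a given core actually fuses.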
Stakeholders/Partners
RISE:
Ventana: 1 FTE (VRULL contract). Philipp Tomsich: Initial implementation
Ventana: 1 FTE ~2 weeks. Raphael Zinsly: Improve implementation to cover missed cases
Ventana: 1 FTE ~2 weeks. Jivan Hakobyan: Improve tooling to analyze instruction trace data for missed cases
External:
Dependencies
Status
Updates
- Infrastructure for fusion upstreamed to GCC. It currently covers the 10 fusion cases supported by Veyron V1
- Other ports can re-use those cases trivially and the framework is generic enough to add additional cases over time
- Development of store-store fusion support is effectively complete
- Some data on how to evaluate store-pair fusion is available, but it is very noisy
- Perhaps just focus on squashing out the obvious cases from the instruction stream data and call it done
- Working through implementation details on store-store case
- Thinking is to start upstreaming once store-store case is handled reasonably well
- Raphael has prototype to implement missing fusion case
- Under evaluation using tools from Jivan (dynamic instruction stream)
- Unclear how large end benefit will be, not sure if we have good insights from our emulator to tell us when this happens
- WIP to implement missing "fusion" case for Veyron V1 from Raphael
- Note Stakeholders/Partners in a consistent way
- Dates on or before June 1 are approximate
- Project reported as priority for 2H23