About
The WRF benchmark in SPEC CPU 2017 is reasonably friendly to vectorization; performance gains relative to a single-FPU scalar implementation should be on the order of 40%. The goal is to verify that the benchmark vectorizes and sees a comparable performance improvement on RISC-V.
Stakeholders/Partners
RISE:
Ventana: Jeff Law
External:
Dependencies
Status
Updates
- Currently seeing a 46% reduction in dynamic instruction count on RISC-V from vectorization
- Actual improvement from vectorization seen on x86_64 – 37%
- Actual improvement from vectorization seen on aarch64 – 37%
- Again, the RISC-V figure is a dynamic instruction count, while the figures for the competing architectures are actual measured improvements
- Target: the dynamic instruction count improvement on RISC-V needs to be at or better than the real improvement seen on the competing architectures
- Conclusion: the 46% reduction on RISC-V is better than the 37% seen on x86_64 and aarch64, so we are hitting the mark for this phase. The next step is to verify performance on real hardware
- Project reported as a priority for 1H2024
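
The comparison in the updates above can be sketched as a small check. This is only an illustration, not part of the project's tooling: the function names are hypothetical, the raw instruction counts are placeholders chosen to reproduce the reported 46%, and reading "at or better than" as "greater than or equal to the best reference improvement" is an assumption.

```python
from typing import Mapping


def percent_reduction(scalar_count: int, vector_count: int) -> float:
    """Percentage drop in dynamic instructions going from a scalar to a vectorized build."""
    if scalar_count <= 0:
        raise ValueError("scalar_count must be positive")
    return 100.0 * (scalar_count - vector_count) / scalar_count


def meets_target(riscv_reduction_pct: float,
                 reference_improvement_pct: Mapping[str, float]) -> bool:
    """Assumed criterion: the RISC-V reduction is at least the best reference improvement."""
    return riscv_reduction_pct >= max(reference_improvement_pct.values())


# Hypothetical instruction counts, picked only so the result matches the reported 46%.
riscv_reduction = percent_reduction(scalar_count=1_000_000, vector_count=540_000)

# Measured improvements reported in the updates above.
reference = {"x86_64": 37.0, "aarch64": 37.0}

print(f"RISC-V dynamic instruction reduction: {riscv_reduction:.0f}%")
print("Meets target for this phase:", meets_target(riscv_reduction, reference))
```

Note that this compares two different kinds of measurement (instruction counts against measured improvement), which is why verification on real hardware remains the next step.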