...
The fotonik3d benchmark in SPEC CPU 2017 is reasonably friendly to vectorization; performance gains relative to a single-FPU scalar implementation should be on the order of 12% (revised from an earlier estimate of 40%). Verify that the benchmark vectorizes and sees a comparable performance improvement on RISC-V.
As with cam4, fotonik3d appears to vectorize well, showing a 64% reduction in instruction counts on the k1, but performance with vector is poor: cycle counts regress by 74%. We believe this is due to a weak vector unit, possibly combined with additional weaknesses elsewhere in the design.
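To put the two k1 figures side by side, the arithmetic below combines them into a single relative-IPC number. It is only a sketch: it assumes the 64% figure is a reduction in dynamic instructions relative to the scalar build, and that the 74% regression means the vector build takes 74% more cycles than scalar. The function name `relative_ipc` is illustrative and does not come from any tooling used in the project.

```python
def relative_ipc(icount_reduction: float, cycle_increase: float) -> float:
    """Return vector IPC divided by scalar IPC.

    icount_reduction: fractional drop in dynamic instructions (0.64 for 64%).
    cycle_increase:   fractional rise in cycle count (0.74 for 74%),
                      assumed to be measured against the scalar build.
    """
    instr_ratio = 1.0 - icount_reduction  # vector instructions / scalar instructions
    cycle_ratio = 1.0 + cycle_increase    # vector cycles / scalar cycles
    return instr_ratio / cycle_ratio


if __name__ == "__main__":
    ratio = relative_ipc(0.64, 0.74)
    print(f"vector IPC is about {ratio:.2f}x scalar IPC")
    print(f"average cycles per instruction grow by about {1 / ratio:.1f}x")
```

Under those assumptions, each instruction in the vector build costs several times more cycles on average than in the scalar build, which would be consistent with the suspicion of a weak vector unit.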
Stakeholders/Partners
RISE:
Ventana: Robin Dapp – lead developer
Ventana: Jeff Law
External:
Dependencies
...
Updates
- Update with icount/cycle data from the k1.
- We're seeing roughly a 46% reduction in dynamic instruction counts. This is far higher than the performance improvements seen on other targets:
  - x86_64 sees a 17% performance improvement from vector
  - aarch64 sees a 10% performance improvement from vector
- Conclusion: We're doing well on this benchmark with RVV
- Project reported as a priority for 1H2024
...