About

The fotonik3d benchmark in SPEC2017 is reasonably friendly to vectorization; performance gains relative to a single-FPU scalar implementation should be on the order of 12%. Verify that the benchmark vectorizes and sees a comparable performance improvement on RISC-V.

As with cam4, fotonik appears to vectorize well and shows a 64% reduction in instruction counts on the k1. However, performance with vector enabled is poor: cycle counts regress by 74%. We believe this is due to a weak vector unit, possibly combined with additional weaknesses elsewhere in the design.
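To make the k1 numbers above easier to interpret, the following is a minimal sketch of the arithmetic relating instruction count and cycle count to cycles per instruction (CPI). It assumes that "64% reduction" means the vector build executes 36% of the scalar instruction count, and that "74% regression" means the vector build takes 74% more cycles than the scalar build. The function name `cpi_ratio` is hypothetical and is not part of any project tooling.

```python
def cpi_ratio(icount_reduction: float, cycle_increase: float) -> float:
    """Return (vector CPI) / (scalar CPI).

    icount_reduction: fractional drop in instruction count (0.64 for 64%).
    cycle_increase:   fractional rise in cycle count (0.74 for 74%).
    Both interpretations are assumptions about how the figures were reported.
    """
    icount_scale = 1.0 - icount_reduction  # vector icount relative to scalar
    cycle_scale = 1.0 + cycle_increase     # vector cycles relative to scalar
    return cycle_scale / icount_scale


# Figures reported for the k1 in the paragraph above.
ratio = cpi_ratio(0.64, 0.74)
print(f"Average cycles per instruction grow by about {ratio:.2f}x")
```

Under these assumptions, each instruction in the vector build costs several times as many cycles on average as in the scalar build, which is consistent with the suspicion that the vector unit is the bottleneck rather than the quality of the vectorized code.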

Stakeholders/Partners

RISE:

Ventana: Robin Dapp – lead developer

Ventana: Jeff Law

External:

Dependencies

Status

Development	COMPLETE
Development Timeline	1H2024
Upstreaming	COMPLETE
Upstream Version	gcc-14 Spring 2024
Contacts	Robin Dapp (Ventana) Jeff Law (Ventana)
Dependencies	Closure needs performance testing

Updates

29 May 2024

Updated with instruction count (icount) and cycle data from the k1.

14 Mar 2024

We're seeing roughly a 46% reduction in dynamic instruction counts. This reduction is far larger than the performance improvements from vectorization seen on other targets:
- x86_64 sees a 17% performance improvement from vector
- aarch64 sees a 10% performance improvement from vector

Conclusion: We're doing well on this benchmark with RVV.

29 Dec 2023

Project reported as a priority for 1H2024.

CT_00_019 -- Vectorize fotonik benchmark from spec2017