CT_00_019 -- Vectorize fotonik3d benchmark from spec2017

About

The fotonik3d benchmark in spec2017 is reasonably friendly to vectorization; performance gains relative to a single-FPU scalar implementation should be on the order of 12%. Verify that the benchmark vectorizes and sees a comparable performance improvement on RISC-V.


Much like we've seen with cam4, fotonik3d vectorizes well and shows a 64% reduction in instruction counts on the k1, but performance with vector is terrible: a 74% regression in cycle counts. We believe this is due to a weak vector unit, possibly combined with additional weaknesses elsewhere in the design.
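To make the gap between the two k1 numbers concrete, here is a minimal sketch of the arithmetic. It assumes "74% regression in cycle counts" means the vector build takes 74% more cycles than the scalar build; that reading, and the helper name relative_ipc, are illustrative assumptions and not part of the measurement setup.

```python
def relative_ipc(icount_reduction: float, cycle_increase: float) -> float:
    """Return vector IPC divided by scalar IPC.

    icount_reduction: fractional drop in dynamic instructions (0.64 for 64%).
    cycle_increase:   fractional rise in cycles (0.74 for 74%). Reading the
                      reported "regression" this way is an assumption.
    """
    instructions = 1.0 - icount_reduction  # vector icount relative to scalar
    cycles = 1.0 + cycle_increase          # vector cycles relative to scalar
    return instructions / cycles


if __name__ == "__main__":
    ratio = relative_ipc(0.64, 0.74)
    print(f"vector IPC is about {ratio:.2f}x the scalar IPC")
```

Under that assumption, far fewer instructions retire per cycle in the vector build than in the scalar build, which is consistent with the suspicion that the vector unit itself is the bottleneck.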




Stakeholders/Partners

RISE:

Ventana: Robin Dapp – lead developer

Ventana: Jeff Law


External:



Dependencies


Status

Development

COMPLETE


Development Timeline

1H2024

Upstreaming

COMPLETE


Upstream Version

gcc-14

Spring 2024



Contacts

Robin Dapp (Ventana)

Jeff Law (Ventana)


Dependencies

Closure needs

performance testing



Updates

 

  • Update with icount/cycle data from the k1.  

 

  • We're seeing roughly a 46% reduction in dynamic instruction counts. This is far larger than the performance improvements seen on other targets:
    • x86_64 sees a 17% performance improvement from vector
    • aarch64 sees a 10% performance improvement from vector
    • Conclusion: We're doing well on this benchmark with RVV

 

  • Project reported as a priority for 1H2024