About

The parest benchmark in SPEC CPU 2017 is reasonably friendly to vectorization; performance gains relative to a scalar implementation with a single FPU should be on the order of 35%. Verify that the benchmark vectorizes and sees a comparable performance improvement on RISC-V.


When run on the k1 chip (BPI-F3), the vectorized parest shows a 51.72% decrease in dynamic instruction count but a 1.94% regression in cycle count. As discussed on the cam4 work item, we believe this is an artifact of weaknesses in the k1's vector unit.
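To make the k1 result concrete, the two percentages can be combined into an implied change in instructions per cycle (IPC). The sketch below is only illustrative arithmetic: it assumes both figures are measured relative to the same scalar baseline and that the "regression" means cycles went up by 1.94%. The function name `relative_ipc` is hypothetical and not part of any benchmark tooling.

```python
def relative_ipc(instr_change_pct: float, cycle_change_pct: float) -> float:
    """Return vectorized IPC divided by scalar IPC.

    Both arguments are signed percentage changes relative to the scalar
    build: negative means fewer instructions or cycles, positive means more.
    """
    instr_ratio = 1.0 + instr_change_pct / 100.0
    cycle_ratio = 1.0 + cycle_change_pct / 100.0
    return instr_ratio / cycle_ratio


# Figures reported above for parest on the k1 (BPI-F3).
# Assumption: the 1.94% regression is an increase in cycle count.
ipc_ratio = relative_ipc(-51.72, +1.94)
cpi_ratio = 1.0 / ipc_ratio

print(f"IPC ratio (vector / scalar): {ipc_ratio:.3f}")
print(f"CPI ratio (vector / scalar): {cpi_ratio:.3f}")
```

Under those assumptions the vectorized build retires roughly 0.47 times as many instructions per cycle as the scalar build, or equivalently each instruction costs about 2.1 times as many cycles on average. That is consistent with the vector unit, rather than the generated code, limiting performance on this chip.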




Stakeholders/Partners

RISE:

Ventana: Robin Dapp – lead developer

Ventana: Jeff Law


External:



Dependencies


Status


Development


Development Timeline

1H2024

Upstreaming


Upstream Version

gcc-14

Spring 2024




Contacts

Robin Dapp (Ventana)

Jeff Law (Ventana)


Dependencies

None




Updates