CT_00_022 -- Vectorize bwaves from spec2017

About

The bwaves benchmark has some vectorization opportunities. x86_64 is seeing roughly a 10% runtime improvement due to vectorization, but aarch64 is getting no measurable improvement. This may point to a problem with VLA-style (vector-length-agnostic) vectorization.


Things have gone a bit backwards over the last several months. We're now seeing a 5% regression in dynamic instruction counts with vectorization enabled, and even larger regressions in cycle counts when run on the k1 board. The most pressing need here is to figure out what's going on with the instruction counts. Odds are we're not going to see improvement on the k1 design due to weakness in the vector architecture.




Stakeholders/Partners

RISE:

Ventana: Robin Dapp – lead developer

Ventana: Jeff Law


External:




Status

Development

IN PROGRESS


Development Timeline

1H2024

Upstreaming

IN PROGRESS


Upstream Version

gcc-15

Spring 2025



Contacts

Robin Dapp (Ventana)

Jeff Law (Ventana)


Dependencies

None



Updates

 

  • Data from k1.  Moving to 2H2024.

  • Seeing a roughly 5% improvement in dynamic instruction counts
    • x86_64 sees a 10% runtime improvement from vectorization
    • aarch64 sees no improvement from vectorization
    • Suspect there's a problem with VLA-style vectorization. Happy we're seeing a 5% count improvement, but it's not enough to get us on par with x86

 

  • Project reported as a priority for 1H2024