About

The bwaves, cactuBSSN, and parest benchmarks in SPEC CPU 2017 have some vectorizable code and should show small but measurable gains through vector enablement. Gains for any given benchmark in the set are expected to be around 5%. x86_64 is seeing roughly a 10% improvement due to vectorization, but aarch64 is getting no measurable improvement. This may point to a problem with VLA-style (vector-length-agnostic) vectorization.


Things have gone a bit backwards over the last several months. We're now seeing a 5% regression in dynamic instruction counts with vector enabled, and even larger regressions in cycle counts when run on the k1 board. The most pressing need here is to figure out what's going on with the instruction counts. Odds are we're not going to see improvement on the k1 design due to weaknesses in its vector architecture.
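For reference, the percentages quoted on this page are relative changes between a scalar build and a vector build of the same benchmark. A minimal sketch of that arithmetic follows. The helper name `percent_change` is hypothetical, and the counts are made-up placeholders chosen only to illustrate a 5% figure, not measurements from any run.

```python
def percent_change(baseline: float, candidate: float) -> float:
    """Relative change of candidate versus baseline, in percent.

    For instruction or cycle counts, a positive result means the
    candidate executed more than the baseline (a regression) and a
    negative result means it executed fewer (an improvement).
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (candidate - baseline) * 100.0 / baseline


# Placeholder values only, NOT real benchmark data.
scalar_insns = 1_000_000   # hypothetical dynamic instruction count, scalar build
vector_insns = 1_050_000   # hypothetical dynamic instruction count, vector build

change = percent_change(scalar_insns, vector_insns)
print(f"dynamic instruction count change: {change:+.1f}%")
```

Using the scalar build as the baseline keeps the sign convention consistent between instruction counts and cycle counts, so the two can be compared side by side.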




Stakeholders/Partners

RISE:

Ventana: Robin Dapp – lead developer

Ventana: Jeff Law


External:



Dependencies


Status


Development

Status
IN PROGRESS


Development Timeline

1H2024

Upstreaming

Status
IN PROGRESS


Upstream Version

gcc-15

Spring 2025



Contacts

Robin Dapp (Ventana)

Jeff Law (Ventana)


Dependencies

None




Updates

 

  • Data from k1.  Moving to 2H2024.

  • Seeing a roughly 5% improvement in dynamic instruction counts
    • x86_64 sees a 10% runtime improvement from vectorization
    • aarch64 sees no improvement from vectorization
    • Suspect there's a problem with VLA-style vectorization. Happy we're seeing a 5% count improvement, but it's not enough to get us on par with x86.

 

  • Project reported as a priority for 1H2024

...