The bwaves benchmark has some vectorization opportunities. x86_64 is seeing roughly a 10% runtime improvement due to vectorization, but aarch64 is getting no measurable improvement. This may point to a problem with VLA-style (vector-length-agnostic) vectorization.
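
For readers unfamiliar with the distinction, here is a minimal sketch of how a VLA-style loop differs in shape from a fixed-width one. It is plain scalar C that only mimics the control flow. The kernel (`y[i] += a * x[i]`), the `set_vl` helper, and the width of 4 are illustrative assumptions, not code taken from bwaves or from GCC's output.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a hardware "set vector length" request:
   returns how many elements the current strip may process. */
static size_t set_vl(size_t remaining, size_t max_vl) {
    return remaining < max_vl ? remaining : max_vl;
}

/* VLA-style (strip-mined) loop: the element count is requested on every
   iteration, so no separate scalar tail loop is needed.
   Returns the number of strips executed. */
size_t axpy_vla(size_t n, double a, const double *x, double *y, size_t max_vl) {
    assert(max_vl > 0); /* a zero vector length would never make progress */
    size_t strips = 0;
    for (size_t i = 0; i < n; ) {
        size_t vl = set_vl(n - i, max_vl);
        for (size_t j = 0; j < vl; j++) /* stands in for one vector operation */
            y[i + j] += a * x[i + j];
        i += vl;
        strips++;
    }
    return strips;
}

/* Fixed-width style: full vectors of WIDTH elements, then a scalar tail.
   Returns the number of full-width iterations executed. */
size_t axpy_fixed(size_t n, double a, const double *x, double *y) {
    enum { WIDTH = 4 }; /* assumed width, for illustration only */
    size_t i = 0, vec_iters = 0;
    for (; i + WIDTH <= n; i += WIDTH) {
        for (size_t j = 0; j < WIDTH; j++) /* stands in for one vector operation */
            y[i + j] += a * x[i + j];
        vec_iters++;
    }
    for (; i < n; i++) /* scalar tail for the leftover elements */
        y[i] += a * x[i];
    return vec_iters;
}
```

Both functions compute the same result. The VLA form carries the per-iteration length request and bookkeeping inside the loop, which is one place extra dynamic instructions can come from; whether that is what is happening here is still an open question.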

Things have gone a bit backwards over the last several months. We're now seeing a 5% regression in dynamic instruction counts with vector enabled, and even larger regressions in cycle counts when run on the k1 board. The most pressing need here is to figure out what's going on with the instruction counts. Odds are we're not going to see an improvement on the k1 design due to weaknesses in its vector architecture.

Stakeholders/Partners

RISE:

...

Page Properties

Development

Status


IN PROGRESS

Development Timeline

1H2024

Upstreaming

Status


IN PROGRESS

Upstream Version

gcc-15

Spring 2025

Contacts

Robin Dapp (Ventana)

Jeff Law (Ventana)

Dependencies

None

Updates

28 May 2024

Data from k1. Moving to 2H2024.

14 Mar 2024

Seeing a roughly 5% improvement in dynamic instruction counts
- x86_64 sees a 10% runtime improvement from vectorization
- aarch64 sees no improvement from vectorization
- Suspect there's a problem with VLA-style vectorization. Happy we're seeing a 5% instruction count improvement, but it's not enough to get us on par with x86.

...
