CT_00_017 -- Investigate and improve Vector code generation for cactuBSSN
About
CactuBSSN showed a 1-2% increase in instruction counts with vector enabled. That regression was fixed on the GCC trunk as of commit b7b387e1200f, and we now see roughly a 1% reduction in dynamic instruction count with vector enabled.
Beyond fixing the regression, vector is not expected to provide much benefit for CactuBSSN. Testing on the k1 chip shows a 1.2% improvement in dynamic instruction counts with vector, but a 1% performance regression. As noted in the cam4 work item, we believe this is due to weaknesses in the k1 vector unit design.
Stakeholders/Partners
RISE:
Ventana: Jeff Law
External:
Dependencies
Status
| Development | COMPLETE |
|---|---|
| Development Timeline | 1H2024 |
| Upstreaming | COMPLETE |
| Upstream Version | gcc-14 (Spring 2024) |
| Contacts | Jeff Law (Ventana) |
| Dependencies | None |
Updates
May 28, 2024
Added data from a run on the k1 (BPI-F3 board)
Mar 14, 2024
Regression has been fixed on the GCC trunk
Competitive data shows no significant improvement with vector
The 1% reduction in instruction counts we see with vector enabled is unlikely to affect performance in any significant way
Considering this done/closed.
Dec 29, 2023
Project reported as a priority for 1H2024