CT_00_027 -- Improve ceil/round code generation in GCC

About

Investigation of the 538.imagick benchmark from SPEC CPU 2017 shows that it uses the ceil/floor routines heavily.  While the Zfa extension can be used to optimize these calls into simple FP conversion instructions, we believe an alternate implementation based on just the F/D conversion instructions is possible and would significantly improve performance on designs that do not implement Zfa.  The change appears to give roughly a 9-10% reduction in the dynamic instruction count and a 17% cycle improvement for 538.imagick.

 

LLVM already has this optimization in place.
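As an illustration of the idea only, the sketch below computes ceil using nothing but a double-to-integer conversion, an integer-to-double conversion, and a sign copy.  It is portable C, not the code GCC or LLVM emits: the function name ceil_via_conversion is hypothetical, and the explicit "add 1.0" fix-up stands in for what we assume a RISC-V expansion would get for free by selecting a round-up rounding mode on the conversion instruction itself.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: ceil() built from FP<->integer conversions.
   Assumes IEEE 754 binary64 doubles and a 64-bit integer type. */
double ceil_via_conversion(double x)
{
    /* Doubles with |x| >= 2^52 are already integral.  NaN fails the
       comparison, so NaN and infinities are also returned unchanged. */
    if (!(fabs(x) < 4503599627370496.0))
        return x;

    int64_t i = (int64_t)x;   /* FP -> integer, truncating toward zero */
    double r = (double)i;     /* integer -> FP, exact in this range */

    /* Truncation rounds positive non-integers down, so step up by one.
       A conversion with a round-up rounding mode would not need this. */
    if (r < x)
        r += 1.0;

    /* Restore the sign so that inputs in (-1, 0] yield -0.0 as ceil does. */
    return copysign(r, x);
}
```

The range check keeps the integer conversion from overflowing, and the final sign copy is what preserves negative zero; floor and round would follow the same shape with a different rounding direction.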

 

 

 

Stakeholders/Partners

RISE:

Ventana: Jivan Hakobyan (under contract via RAU) – lead developer

Ventana: Jeff Law – general oversight

Rivos: ADLR – provided initial hint & data showing the extent of this problem

 

External:

 

 

Dependencies

 

Status

Development

COMPLETE

 

Development Timeline

1H2024

 

Upstreaming

COMPLETE

 

Upstream Version

gcc-15 (target)

Spring 2025

 

 

 

Contacts

Jeff Law (Ventana)

 

Dependencies

None

 

 

Updates

May 9, 2024

  • Jivan's patch has been upstreamed. 

Mar 20, 2024 

  • Patch posted upstream.  There seems to be general consensus to go forward, pending final review when gcc-15 is open for development.

Mar 17, 2024 

  • Added note about actual performance improvement seen (17%).

  • Jivan has been asked to post his patch to the gcc-patches list for review.

Mar 13, 2024 

  • An implementation borrowing heavily from LLVM is under evaluation.  It provides more efficient versions of ceil, round, nearbyint, rint, etc.

  • In addition to using existing conversions to implement those functions, the implementation also removes sign extensions in cases where the ultimate result is a GPR.

  • Expecting this to save 300-400 billion instructions for the imagick benchmark, enough that we expect to see a measurable (perhaps double-digit) improvement in the benchmark's overall performance.

Jan 29, 2024  

  • Project reported as a priority for 1H2024