...

Investigation of the 538.imagick benchmark from SPEC2017 shows that it uses the ceil/floor routines heavily. The Zfa extension can be used to optimize these calls into simple FP conversion instructions. For designs that do not implement Zfa, it is believed that an alternate implementation based on just the F/D conversions is possible and will significantly improve performance. It appears to give roughly a 9-10% improvement in the dynamic instruction count.


LLVM already has this optimization in place.
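
To illustrate the idea behind the conversion-based approach, here is a minimal portable C sketch. It is not the actual GCC or LLVM expansion: the function names floor_via_convert and ceil_via_convert are hypothetical, and the sketch assumes IEEE 754 binary64 doubles. A C cast to integer always truncates toward zero, so the sketch adds a compare-and-adjust step. A RISC-V expansion could presumably avoid that step by selecting a static rounding mode on the F/D conversion instruction itself.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only. */
static double floor_via_convert(double x)
{
    /* Doubles with |x| >= 2^52 are already integral. NaN fails the
       comparison too, so NaN and infinities are returned unchanged. */
    if (!(fabs(x) < 4503599627370496.0))
        return x;

    double t = (double)(int64_t)x;  /* round trip; the cast truncates toward zero */
    if (t > x)                      /* truncation rounded a negative value up */
        t -= 1.0;
    return copysign(t, x);          /* keeps -0.0 as -0.0 */
}

/* Hypothetical helper, for illustration only. */
static double ceil_via_convert(double x)
{
    if (!(fabs(x) < 4503599627370496.0))
        return x;

    double t = (double)(int64_t)x;  /* round trip; the cast truncates toward zero */
    if (t < x)                      /* truncation rounded a positive value down */
        t += 1.0;
    return copysign(t, x);          /* e.g. ceil of -0.5 yields -0.0 */
}
```

The magnitude check matters because the integer round trip is only safe when the value fits in a 64-bit integer, and any double at or above 2^52 in magnitude has no fractional part to remove.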

...