About

The compiler often has enough information to expand mem* and str* routines inline efficiently, for example a two-word copy or a string comparison against a constant.


Expanding these sequences inline avoids the overhead of a function call and exposes more of the call's underlying semantics to the optimizers, potentially allowing further optimization. It also creates larger scheduling blocks and fewer optimization barriers. VRULL has provided scalar implementations of the key routines, and Ventana has contributed vector versions.
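
As an illustration, the following C sketch shows the two kinds of call mentioned above: a fixed two-word copy and a comparison against a constant string. The type `struct pair` and the functions `copy_pair` and `is_ok` are hypothetical names made up for this example, and the two-word size assumes a 64-bit target. Whether a given compiler version actually expands these calls inline depends on the target and the optimization options in use.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical type: two machine words on a 64-bit target. */
struct pair {
    uint64_t lo;
    uint64_t hi;
};

/* Fixed-size, 16-byte copy: a candidate for two loads and two stores
   instead of a call to memcpy. */
void copy_pair(struct pair *dst, const struct pair *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* Comparison against a short constant string: a candidate for a few
   inline loads and compares instead of a call to strcmp. */
int is_ok(const char *s)
{
    return strcmp(s, "ok") == 0;
}
```

In both functions the size or the contents of one operand is known at compile time, which is the information the compiler needs to pick a short inline sequence.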


There are additional cases that can be handled in the vector space. In particular, when the source and destination of a memory copy may overlap and the entire amount copied fits in a vector register, the runtime test for forward vs. backward copying can be avoided. Sergei at Rivos has posted support for these cases, but it missed the gcc-14 development deadline. It is unclear whether an exception will be made for this work or whether it will be deferred to gcc-15.
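
The overlap case can be modelled in scalar C, assuming a hypothetical helper `move16` and a fixed 16-byte size standing in for one vector register: if all of the source is loaded before any of the destination is stored, overlap is harmless and no direction test is needed. This is only a sketch of the principle, not the code from the posted patches.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: move exactly 16 bytes between buffers that may
   overlap, without testing which direction to copy in. */
void move16(void *dst, const void *src)
{
    uint64_t lo, hi;

    /* Read the whole source into temporaries first... */
    memcpy(&lo, src, 8);
    memcpy(&hi, (const char *)src + 8, 8);

    /* ...so the stores cannot clobber source bytes that are still needed. */
    memcpy(dst, &lo, 8);
    memcpy((char *)dst + 8, &hi, 8);
}
```

The two temporaries play the role of the vector register: once the data is held there, the order of the stores no longer matters.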


Stakeholders/Partners

RISE:

Ventana: Jeff & Robin

Rivos: Palmer & Sergei


External:

VRULL: Christoph Mullner (under contract to Ventana)

Embecosm: Joern Rennecke


Dependencies


Status

Development

COMPLETE


Development Timeline

NA

Upstreaming

IN PROGRESS


Upstream Version





Contacts

Jeff Law (Ventana)


Dependencies

None



Updates

 

  • Robin Dapp from Ventana has submitted and integrated vector versions of str[n]cmp and strlen.
  • Christoph has submitted and integrated scalar versions of str[n]cmp, strlen, and memcpy.
  • Sergei Lewis has submitted vector versions of memset, memmove, and memcmp.

 

  • Joern has submitted Embecosm's work to inline vectorized memcpy.

  

  • Project reported as a priority for 1H2024.
  • Christoph has submitted VRULL's work for str[n]cmp inline expansion, and it has been integrated upstream.

