By expanding these sequences inline, the compiler avoids the overhead of a function call and exposes more of the call's underlying semantics to the optimizers, potentially allowing further optimization. Inline expansion also creates larger scheduling blocks and leaves fewer optimization barriers. VRULL has provided scalar implementations of the key routines, and Ventana has contributed vector versions.
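As a rough illustration of the idea, the sketch below contrasts a small fixed-size `memcpy` call with a hand-written equivalent of what an inline expansion might amount to. The `struct pair` type and both function names are hypothetical and exist only for this example; the code GCC actually emits depends on the target, tuning, and options.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record type, used only for this illustration. */
struct pair {
    uint32_t a;
    uint32_t b;
};

/* As written in the source: a memcpy call with a small constant size. */
static void copy_pair_call(struct pair *dst, const struct pair *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}

/* Conceptual equivalent of an inline expansion: plain loads followed by
 * plain stores. There is no call, so the optimizers can see every memory
 * access and schedule it together with the surrounding code. */
static void copy_pair_expanded(struct pair *dst, const struct pair *src)
{
    assert(dst != NULL && src != NULL);
    uint32_t a = src->a;
    uint32_t b = src->b;
    dst->a = a;
    dst->b = b;
}
```

Both functions leave `*dst` with the same contents; the second form simply makes visible the individual accesses that an opaque library call would hide.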

There are additional cases that can be handled in the vector space. In particular, when the source and destination of a memory copy may overlap and the entire amount copied fits in a single vector register, the runtime test for a forward versus backward copy can be avoided. Support for these cases has been posted by Sergei at Rivos, but it missed the GCC 14 development deadline. It is unclear whether an exception will be made for this work or whether it will be deferred to GCC 15.
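The reason the direction test becomes unnecessary can be sketched in portable C. In the sketch below, a local buffer stands in for one vector register: every source byte is read before any destination byte is written, so overlap cannot corrupt the data. The name `small_memmove` and the 16-byte `VREG_BYTES` size are assumptions made for illustration; this is not the code from the posted patches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical size of one vector register, in bytes. */
enum { VREG_BYTES = 16 };

/* Overlap-safe copy of at most VREG_BYTES bytes with no direction test. */
static void small_memmove(void *dst, const void *src, size_t n)
{
    unsigned char reg[VREG_BYTES]; /* stands in for one vector register */

    assert(n <= VREG_BYTES);
    memcpy(reg, src, n); /* "vector load": read all source bytes first */
    memcpy(dst, reg, n); /* "vector store": write only after the load */
}
```

Because the load completes before the store begins, the same sequence is correct whether the destination lies before or after the source, which is exactly the case distinction a general `memmove` has to make at run time.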

Stakeholders/Partners

RISE:

Ventana: Jeff Law & Robin Dapp

Rivos: Palmer & Sergei Lewis

External:

VRULL: Christoph Müllner (under contract to Ventana)

Embecosm: Joern Rennecke

Dependencies

Status


Development

Status

COMPLETE (previously IN PROGRESS)

Development Timeline

NA

Upstreaming

Status

IN PROGRESS

Upstream Version

Contacts

Jeff Law (Ventana)

Dependencies

None

Updates

29 Dec 2023

Robin Dapp from Ventana has submitted and integrated vector versions of str[n]cmp and strlen.
Christoph has submitted and integrated scalar versions of str[n]cmp, strlen, and memcpy.
Sergei Lewis has submitted vector versions of memset, memmove, and memcmp.

04 Oct 2023

Joern has submitted Embecosm's work to inline vectorized memcpy.

...
