CT_00_012 - mem* and str* -- inline expansion in GCC
About
This covers multiple efforts.
Scalar inline expansion of various mem* and str* routines in GCC. When some of the arguments are known at compile time (for example a constant length or a string literal), the compiler can emit an efficient inline version of the routine while also exposing its semantics to the optimizers, which allows further optimization. The scalar inline expansion work in GCC is largely complete at this point and was mostly done by VRULL under contract to Ventana.
Vector inline expansion. Similar to scalar expansion, but using vector instructions when possible. The bulk of this work has already been integrated for gcc-14 and combines work from Ventana (Robin Dapp) and EMBECOSM (Joern Rennecke).
Additional vector inline expansion. memmove, memset and memcmp can also be expanded inline by the compiler using vector instructions. Sergei Lewis from Rivos has submitted an implementation of these routines, but the submission missed the gcc-14 deadline. The implementation looks reasonable and is expected to be integrated into GCC in late spring. This work is expected to provide a further improvement of roughly one percent on the GCC workload within SPEC CPU 2017.
VRULL's engineers are also working on improving scalar codegen for inlined cases in GCC, for example using cbo.zero (Zicboz) for clearing cache lines and Zbb instructions for str[n]cmp.
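As a rough illustration of the cases described above, the following C sketch shows calls that are typical candidates for inline expansion because a length or a string argument is known at compile time. The function and type names (`copy_header`, `is_get`, `clear_block`, `struct header`) are hypothetical and exist only for this example. Whether a given call is actually expanded, and into which instructions, depends on the GCC version, the optimization level and the enabled ISA extensions.

```c
#include <string.h>

/* Hypothetical fixed-size record used only for this illustration. */
struct header {
    unsigned char bytes[16];
};

/* The length is a compile-time constant (16 bytes), so the compiler may
   replace the call with a short sequence of loads and stores. */
void copy_header(struct header *dst, const struct header *src)
{
    memcpy(dst, src, sizeof *dst);
}

/* One argument is a string literal, so the compiler knows both its
   contents and its length and may compare inline instead of calling
   strcmp. */
int is_get(const char *method)
{
    return strcmp(method, "GET") == 0;
}

/* Constant-length zeroing. On a target with a suitable cache-block size
   this is the kind of call that could, in principle, use a cache-line
   zeroing instruction; that outcome is an assumption, not a guarantee. */
void clear_block(unsigned char *p)
{
    memset(p, 0, 64);
}
```

Comparing the assembly produced with `-O2` against `-O2 -fno-builtin` is one way to see whether a call was expanded inline or left as a library call.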
Stakeholders/Partners
RISE:
Ventana: Jeff & Robin
Rivos: Palmer & Sergei
External:
VRULL: Christoph Müllner
EMBECOSM: Joern Rennecke
Dependencies
Status
| Development | COMPLETE |
|---|---|
| Development Timeline | 1H2024 |
| Upstreaming | COMPLETE |
| Upstream Version | gcc-15 (target), Spring 2025; glibc-2.40 (target) |
| Contacts | Jeff Law (Ventana) |
| Dependencies | None |
Updates
Jun 29, 2024
Sergei's last patch has been integrated upstream.
VRULL's patches have been integrated upstream.
We'll break out the glibc work into a separate 2H2024 item.
Jun 23, 2024
Sergei's 2nd patch (setmem) has been updated for the current trunk and re-submitted for inclusion.
May 9, 2024
VRULL is currently upstreaming several patches to improve the scalar inline expansion of these routines.
Ventana has upstreamed VRULL's patch that allows overlapping memory references when inlining memcpy and similar routines.
We're hoping to move on Sergei's vector versions shortly.
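To illustrate what overlapping memory references mean for an inlined copy, here is a minimal hand-written C sketch, not GCC's actual output. It copies 7 bytes with two 4-byte accesses that share byte 3, instead of using a 4-byte, a 2-byte and a 1-byte access. The function name `copy7_overlapping` is hypothetical, and the assumption that the target handles unaligned 4-byte accesses cheaply is ours.

```c
#include <stdint.h>
#include <string.h>

/* Copy exactly 7 bytes using two 4-byte accesses that overlap in byte 3.
   Both loads are performed before either store, mirroring how an inline
   expansion would read the source into registers first. */
void copy7_overlapping(void *dst, const void *src)
{
    uint32_t lo, hi;

    memcpy(&lo, src, 4);                             /* source bytes 0..3 */
    memcpy(&hi, (const unsigned char *)src + 3, 4);  /* source bytes 3..6 */

    memcpy(dst, &lo, 4);                             /* dest bytes 0..3 */
    memcpy((unsigned char *)dst + 3, &hi, 4);        /* dest bytes 3..6 */
}
```

Byte 3 is written twice with the same value, which is harmless and reduces the copy from three load/store pairs to two.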
Jan 30, 2024
Added some additional text for the cases covered by this change as well as some performance expectations.
Jeff Law has reviewed Sergei's patches and plans to incorporate them into his upstream GCC tester as soon as possible.
Dec 29, 2023
Robin Dapp from Ventana has submitted and integrated vector versions of str[n]cmp and strlen.
Christoph has submitted and integrated scalar versions of str[n]cmp, strlen and memcpy.
Sergei Lewis has submitted vector versions of memset, memmove and memcmp.
Oct 4, 2023
Joern has submitted Embecosm's work to inline vectorized memcpy.
Sep 27, 2023
Project reported as a priority for 1H2024.
Christoph has submitted VRULL's work for str[n]cmp inline expansion, and it has been integrated upstream.