CT_01_008 - Autovectorization -- Improvements (LLVM)

About

Enablement of auto-vectorization in LLVM for RISC-V, targeting version 1.0 of the V extension. While the long-term goal is to focus on vector length agnostic (VLA) approaches to vectorization, parts of LLVM's vectorizer may still be biased towards fixed vector sizes. We therefore expect to find cases that VLA approaches do not handle well, and we expect to support vector length specific (VLS) approaches as stop-gap alternatives.
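
As a rough illustration of the difference between the two strategies, both can be modeled in plain scalar C. This is a sketch only: `saxpy_vla_model`, `saxpy_vls_model`, `vlmax`, and `VLS_LANES` are hypothetical names introduced here, and each inner loop stands in for what would be a single vector operation in generated code.

```c
#include <stddef.h>

/* VLA model: the active length vl is chosen at run time on every trip
   (as a vsetvli-style instruction would), so the same code works for any
   hardware vector length and needs no scalar remainder loop.
   vlmax is an assumed hardware maximum and must be greater than zero. */
void saxpy_vla_model(size_t n, float a, const float *x, float *y, size_t vlmax)
{
    for (size_t i = 0; i < n; ) {
        size_t vl = (n - i < vlmax) ? (n - i) : vlmax;
        for (size_t j = 0; j < vl; j++)      /* stands in for one vector op */
            y[i + j] += a * x[i + j];
        i += vl;
    }
}

/* VLS model: the width is fixed at compile time (4 lanes assumed here),
   so leftover elements are handled by a scalar epilogue. */
#define VLS_LANES 4

void saxpy_vls_model(size_t n, float a, const float *x, float *y)
{
    size_t i = 0;
    for (; i + VLS_LANES <= n; i += VLS_LANES)
        for (size_t j = 0; j < VLS_LANES; j++)  /* one fixed-width vector op */
            y[i + j] += a * x[i + j];
    for (; i < n; i++)                          /* scalar epilogue */
        y[i] += a * x[i];
}
```

Both functions compute the same result; the difference is only in where the vector length is decided and how the tail is handled.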


LLVM's support for auto-vectorization on RISC-V appears to be improving regularly, but it is sensitive to having reasonable micro-architectural data available. When enabling auto-vectorization on a new micro-architecture, it may therefore be necessary to stub out values for the key cost parameters or to disable the cost model.
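
When the cost model is the obstacle, one option is to override it for an individual loop rather than globally. The sketch below uses Clang's documented `#pragma clang loop vectorize(enable)` hint; the function name `scale_add` is hypothetical, and whether the vectorized code is actually faster on a given micro-architecture is not something this example can establish.

```c
#include <stddef.h>

/* Hypothetical kernel: out[i] += k * in[i].
   The pragma asks Clang to vectorize the loop even if its cost model
   judges vectorization unprofitable. It is guarded so that other
   compilers simply see an ordinary loop. */
void scale_add(size_t n, int k, const int *restrict in, int *restrict out)
{
#if defined(__clang__)
#pragma clang loop vectorize(enable)
#endif
    for (size_t i = 0; i < n; i++)
        out[i] += k * in[i];
}
```

Clang's optimization remarks (`-Rpass=loop-vectorize`, `-Rpass-missed=loop-vectorize`) can then be used to check whether the hint took effect. Clang also documents a `vectorize_width(N, scalable)` form of the hint for requesting scalable vectors.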

Stakeholders/Partners

RISE:

Ventana: 1 FTE focused on getting necessary uarch data ready

Ventana: 1 FTE Reference/target implementation of key x264 loops, breakdown of tasks that need to be solved to achieve desired code generation

SiFive: Craig Topper, Alexey Bataev

Rivos:

External:

Alex Bradbury


Dependencies

The most pressing upstream dependencies are:

  1. PSABI specification for vector argument passing and return values
  2. Kernel support to enable discovery of the V extension
  3. glibc support for libmvec to enable a vector API for key math library functions such as sin, cos, sqrt, etc. (open question: does LLVM support libmvec calls?)


Status

Development

IN PROGRESS


Development Timeline

NA

Upstreaming

IN PROGRESS


Upstream Version

Development Trunk


Contacts

Jeff Law (Ventana)

Alexey Bataev (SiFive)


Dependencies

PSABI vector spec

Kernel discovery

glibc libmvec



Updates

 

  • Moved to 1H2025
  • EVL (explicit vector length) vectorization with tail folding shows good gains on 525.x264_r from SPEC CPU 2017 on the Banana Pi F3.
  • The EVL-based vectorizer is currently stable but lacks support for multi-exit loops and first-order recurrences. The SLP vectorizer supports segmented loads/stores, strided loads, and partially strided stores (-1 stride). Expand/compress support for SLP is still WIP.

  • Improving vectorization split off as distinct project
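
For readers unfamiliar with the loop shapes named in the updates above, the following C sketch shows a minimal instance of each. The function names are hypothetical and the loops are illustrative only; they are not taken from x264 or from LLVM's test suite, and no claim is made about how any particular LLVM version compiles them.

```c
#include <stddef.h>

/* First-order recurrence: each iteration uses a value (prev) produced
   by the immediately preceding iteration. */
void adjacent_diff(size_t n, const int *a, int *out)
{
    int prev = 0;
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] - prev;
        prev = a[i];
    }
}

/* Multi-exit loop: the loop ends either when the trip count is reached
   or early, when a match is found. Returns n if key is absent. */
size_t find_first(size_t n, const int *a, int key)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (a[i] == key)
            break;
    return i;
}

/* Stride -1 store: consecutive iterations write to descending addresses. */
void reverse_copy(size_t n, const int *restrict src, int *restrict dst)
{
    for (size_t i = 0; i < n; i++)
        dst[n - 1 - i] = src[i];
}
```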

