About

This project covers enabling auto-vectorization in LLVM for RISC-V, targeting version 1.0 of the V extension. While the long-term goal is to focus on vector-length-agnostic (VLA) approaches to vectorization, parts of LLVM's vectorizer may still be biased towards fixed vector sizes. We therefore expect to find cases that VLA approaches do not handle well, and we expect to support vector-length-specific (VLS) approaches to vectorization as stop-gap alternatives.
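To make the VLA/VLS distinction concrete, here is a minimal scalar model in portable C. It is only a sketch: `model_vsetvl`, `add_vla`, `add_vls`, and `VLS_WIDTH` are hypothetical names chosen for illustration, and `model_vsetvl` is a simplified stand-in for what the RVV `vsetvli` instruction does, not a real intrinsic. The VLA loop works unchanged for any hardware vector length and needs no scalar tail, while the VLS loop bakes in a fixed width and handles the remainder separately.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for vsetvli: number of elements the
   next step may process, given the hardware maximum (vlmax). */
static size_t model_vsetvl(size_t remaining, size_t vlmax) {
    return remaining < vlmax ? remaining : vlmax;
}

/* VLA-style strip-mined loop: the step size is queried at run time, so
   the same code is correct for any vlmax and has no scalar tail loop. */
void add_vla(int *dst, const int *a, const int *b, size_t n, size_t vlmax) {
    assert(vlmax > 0);
    for (size_t i = 0; i < n; ) {
        size_t vl = model_vsetvl(n - i, vlmax);
        for (size_t j = 0; j < vl; j++)   /* models one vector operation */
            dst[i + j] = a[i + j] + b[i + j];
        i += vl;
    }
}

/* VLS-style loop: a fixed width is assumed at compile time, followed by
   a scalar remainder loop for the leftover elements. */
enum { VLS_WIDTH = 4 };

void add_vls(int *dst, const int *a, const int *b, size_t n) {
    size_t i = 0;
    for (; i + VLS_WIDTH <= n; i += VLS_WIDTH)
        for (size_t j = 0; j < VLS_WIDTH; j++)  /* models one fixed-width op */
            dst[i + j] = a[i + j] + b[i + j];
    for (; i < n; i++)                          /* scalar remainder */
        dst[i] = a[i] + b[i];
}
```

Calling `add_vla` with different `vlmax` values produces the same result, which is the property that makes VLA code portable across implementations with different vector register sizes.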

...

LLVM's support for auto-vectorization on RISC-V appears to be improving regularly, but it is sensitive to having reasonable micro-architectural data available. Thus, when enabling auto-vectorization on a new micro-architecture, it may be necessary to stub out values for these key parameters or to disable the cost model.
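While micro-architectural data is still being tuned, one way to experiment on a single loop is Clang's loop hint pragma, which asks the vectorizer to vectorize that loop rather than leaving the decision entirely to its profitability heuristics. The sketch below is illustrative only: `row_sad` is a hypothetical function loosely modelled on the kind of pixel loop found in x264, not code taken from it, and compilers other than Clang will ignore the pragma.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sum of absolute differences over one pixel row. */
unsigned row_sad(const unsigned char *p, const unsigned char *q, size_t n) {
    assert(p != NULL && q != NULL);
    unsigned sum = 0;
    /* Clang loop hint: request vectorization of the loop that follows. */
#pragma clang loop vectorize(enable)
    for (size_t i = 0; i < n; i++) {
        int d = (int)p[i] - (int)q[i];
        sum += (unsigned)(d < 0 ? -d : d);
    }
    return sum;
}
```

Compiling with Clang's `-Rpass=loop-vectorize` and `-Rpass-missed=loop-vectorize` remark options is a convenient way to check whether the loop was actually vectorized.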

Stakeholders/Partners

RISE:

Ventana: 1 FTE focused on getting necessary uarch data ready

Ventana: 1 FTE on a reference/target implementation of key x264 loops and a breakdown of the tasks that need to be solved to achieve the desired code generation

SiFive: Craig Topper, Alexey Bataev, Kolya Panchenko

Rivos:

External:

Alex Bradbury


Dependencies

The most pressing upstream dependencies are:

  1. PSABI specification for vector argument passing and return values
  2. Kernel support to enable discovery of the V extension
  3. glibc support for libmvec to enable a vector API for key math library functions such as sin, cos, and sqrt (open question: does LLVM support libmvec calls?)


Status

Page Properties

Development

  • Status: COMPLETE / IN PROGRESS
  • Development Timeline: NA

Upstreaming

  • Status: COMPLETE / IN PROGRESS
  • Upstream Version: Development Trunk (will turn into llvm-17, fall 2023)


Contacts

Jeff Law (Ventana)

Craig Topper (SiFive)


Dependencies

PSABI vector spec

Kernel discovery

glibc libmvec




Updates

  • Evaluation of LLVM trunk indicates ongoing improvements for key routines in x264
  • The vectorizer notably does not yet generate strided memory accesses, which will impact x264 performance
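For readers unfamiliar with the term, a strided access reads or writes elements separated by a constant distance rather than adjacent ones. The sketch below shows the pattern on a row-major 8-bit image; `column_sum` is a hypothetical helper written for illustration, not an x264 routine. The V extension 1.0 defines strided load and store instructions (for example `vlse8.v`) that match this pattern; whether a particular LLVM version emits them for such a loop is not something this sketch asserts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sum one column of a row-major 8-bit image.
   Consecutive iterations touch addresses `stride` bytes apart, which is
   the strided access pattern referred to above. */
unsigned column_sum(const unsigned char *pix, size_t stride, size_t rows) {
    assert(pix != NULL);
    unsigned sum = 0;
    for (size_t r = 0; r < rows; r++)
        sum += pix[r * stride];   /* strided load: one element per row */
    return sum;
}
```

For a 3x4 image stored row by row, `column_sum(img + 1, 4, 3)` adds the second element of each row.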

 

  • SLP vectorization (superword-level parallelism, also called straight-line parallelism) enabled upstream for short vectors
  • Ventana has enabled auto-vectorization in its internal LLVM tree after adding uarch details to that tree
    • One of the simpler routines now generates code matching the reference/target implementation

 

...

  • Improving vectorization split off as a distinct project