RISE TSC Meeting Notes
Feb 6, 2025
Kernel and Virtualization Deep Dive - Anup Patel
What’s merged for Linux-6.14?
Linux-6.14-rc1 was released on 2nd February 2025
Noteworthy stuff merged for Linux-6.14:
Linux RISC-V
T-Head vector extensions support
KVM RISC-V
Svvptc, Zabha, and Ziccrse extension support for Guest/VM
Virtualize SBI system suspend extension for Guest/VM
Trap related exit statistics as SBI PMU firmware counters for Guest/VM
RVA23 profile: Discovery updates
HWPROBE additions in Linux-6.14
No updates
KVM ONE_REG additions in Linux-6.14
Zabha, Svvptc, and Ziccrse
Kernel HWPROBE (27-01-2025, wiki)
TBD - 7 (7.45%), NA - 47 (50%), COMPLETED - 40 (42.55%), TOTAL - 94
KVM ONE_REG (27-01-2025, wiki)
TBD - 24 (25.53%), NA - 13 (13.83%), COMPLETED - 57 (60.64%), TOTAL - 94
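The status lines in these notes are shares of a fixed total. A minimal Python sketch of that arithmetic, using the counts from the two status lines above. The function name is illustrative and not part of any RISE tooling.

```python
def status_line(counts):
    """Format status counts as 'NAME - n (p%)' entries plus a total."""
    total = sum(counts.values())
    parts = []
    for name, n in counts.items():
        pct = round(100 * n / total, 2)
        # ':g' drops trailing zeros, so 50.0 is shown as 50.
        parts.append(f"{name} - {n} ({pct:g}%)")
    parts.append(f"TOTAL - {total}")
    return ", ".join(parts)


# Counts taken from the HWPROBE and ONE_REG status lines.
hwprobe = {"TBD": 7, "NA": 47, "COMPLETED": 40}
one_reg = {"TBD": 24, "NA": 13, "COMPLETED": 57}

print(status_line(hwprobe))
print(status_line(one_reg))
```

Plain rounding to two decimals does not always sum to exactly 100%, which may explain small last-digit differences in some of the other status lines.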
2024-1H: Project updates
2024-1H: Recently upstreamed projects
LK_03_023 - QEMU-KVM Zawrs support
LK_02_025 - KVM System Suspend Virtualization
LK_03_008 - QEMU-KVM AIA user-space irqchip_split support
2024-1H: Development status (27-01-2025, wiki)
TBD - 2 (4.08%), ONGOING - 1 (2.04%), COMPLETED - 46 (93.88%), TOTAL - 49
2024-1H: Upstreaming status (27-01-2025, wiki)
TBD - 3 (6.12%), ONGOING - 5 (10.21%), COMPLETED - 41 (83.67%), TOTAL - 49
2024-2H: Project updates
2024-2H: Recently development complete projects
LK_01_024 - Supervisor Counter delegation (Smcdeleg and Ssccfg) based perf
LK_01_025 - Control Transfer Record (Smctr and Ssctr) support in perf driver
LK_01_044 - Firmware Feature support
LK_01_045 - Message Proxy support
LK_01_046 - RPMI Clock driver using SBI MPXY
LK_02_028 - KVM Firmware Feature virtualization
LK_03_026 - KVMTOOL Svadu support
LK_03_034 - KVMTOOL Smnpm and Ssnpm support
2024-2H: Recently upstreamed projects
LK_02_027 - KVM Svvptc virtualization
2024-2H: Development status (27-01-2025, wiki)
TBD - 13 (35.14%), ONGOING - 1 (2.70%), COMPLETED - 23 (62.16%), TOTAL - 37
2024-2H: Upstreaming status (27-01-2025, wiki)
TBD - 18 (48.64%), ONGOING - 13 (35.14%), COMPLETED - 6 (16.22%), TOTAL - 37
Jan 30, 2025
Distro and Integration - Brian Harrington
The most recent discussion revolved around the development of a unified database for RISC-V processors and related tools, aiming to streamline information access and collaboration within the community. Additionally, we discussed some needs and wishes regarding automated builders (specifically those hosted by GitHub/GitLab). Here's a structured summary of the key points:
RISC-V Unified Database:
Hosted on GitHub at riscv-software-src/riscv-unified-db: a machine-readable database of the RISC-V specification, and tools to generate various views
Purpose: To compile comprehensive details about RISC-V processors, platforms, and associated tools, facilitating easier access for developers.
Related Items:
Reminder of https://riscv.builders for building and compiling software for RISC-V, though it was called out that it is still insufficient for the U-Boot development desired by members.
Nathan Egge provided a PDF about LLVM-Mingw, detailing its use for cross-compiling RISC-V applications on Windows, which is crucial for developers in Windows environments.
Jan 23, 2025
System Libraries Deep Dive - Nathan Egge
Multimedia (and other) RVI extensions
Feedback from VideoLAN community fed into RVI proposals
Dot product, vector absolute difference (SAD), zip/unzip
Continued collaboration after VDD 2024
Got feedback on signed variants of SAD extension proposal
Plan to set up meeting across RISE, RVI Vector SIG and VideoLAN / FFmpeg
Goal: early feedback on the algorithms these instructions would be used in
May guide instruction coding (params) and find edge cases *before* ratification
RISE presence at FOSDEM 2025
Attend RISC-V devroom (thanks Bjorn)
Concurrent open-media devroom for in person collaboration
Community still asking for more diverse hardware
Larger VLEN, OOO cores, different IP to cross validate algorithms
Software Optimization Progress
XNNPACK still a priority
Ken Unger (Microchip) has patches in flight for quantized kernel optimizations for RVV
PyTorch progress
Philip Reames (Rivos) open to collaborating and providing public RISC-V packages
Build system issues discussed
sleef build issues
Currently broken for cross-compilation, not an easy fix
New project hosted on RISE gitlab for Mbed-TLS
Developer Images
Luca Barbato (Gentoo) sent a ROMA II laptop for testing
Working to get this booting to check status of graphics stack
clang-20 released, contains some fixes for auto-vectorization
Need to check if this allows zvl256b
gcc progress made fixing issues raised by Luca, but not ready yet
Minor issues with rustc ebuild on RISC-V addressed by Luca
System Libraries Priorities - H1 2025
High profile, in-demand projects, e.g., Tensorflow, Chromium, etc.
Planning to finalize this in next System Libraries meeting Dec 10, 2024
Instruction timings for optimization guide
Potential to use llvm-exegesis to extract these for help with RVV optimizations
Capture missing vector instructions from multimedia projects [1]
Key Idea: Collect gaps in extension opcodes preventing efficient multimedia DSP functions, e.g., no transpose op, no dotprods, missing sign-unsign mults, etc.
Work with multimedia developers
Testing framework for upstream projects
Issues raised by vendors and OSS contributors about inconsistent benchmarking
Example project + source code with best practices for statistically significant data
More guest speakers!
2025
Dec 5, 2024
System Libraries Deep Dive
System Libraries Priorities - H1 2024
SL_00_001 bionic (Done)
SL_00_006 chromium-zlib (Done)
SL_01_002 dav1d (In progress)
Landed 16bpc blend functions, presented how to contribute RVV to dav1d at VDD https://people.videolan.org/~negge/vdd24.pdf
Andes patch pending https://code.videolan.org/videolan/dav1d/-/merge_requests/1735
SL_01_003 x264 (In progress)
Bytedance added RISC-V support to build system, patches in review
https://code.videolan.org/videolan/x264/-/merge_requests/155
SL_01_004 Pixman (In progress)
Samsung added optimized pixel format conversion and blending
SL_00_006 DPDK (In progress)
14x speed up on CRC operations from Bytedance
Developer Images
Latest image based on Bianbu 2.0 and can be found here [1]
Boots Banana Pi BPI-F3 and possibly other k1/m1 based boards
Clang-19.1.2
Past experiments results
clang-20 snapshots fail to build [2]
gcc-15 still has problems, but it is reported that gcc-trunk has the alignment problems fixed.
The rust build system seems to fail to pick the Gentoo cross compiler put in config.toml (I'll investigate further why)
Presented these images at the RISC-V 101 session and RVI devboard sig [3]
[1] https://dev.gentoo.org/~lu_zero/riscv/gentoo-linux-k1_dev-sdcard-2.0.img.xz
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115789
[3] https://people.videolan.org/~negge/riscv101.pdf
Proposed RVV instructions for Multimedia
Initially discussed at RISC-V Summit EU 2024
Key Idea: Many implementers noticed the same ISA gaps porting multimedia libraries
Bytedance created a wiki page to track these instructions:
https://lf-rise.atlassian.net/wiki/spaces/HOME/pages/8588516/RISCV64+new+vector+instructions+requirements+for+video+multimedia
Vector transpose
Absolute difference
Zero-extended vmv.x.s
Rounded Shift Right Narrow
Signed saturate and Narrow to Unsigned
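To make the requests above concrete, here is a scalar Python sketch of what three of the wished-for operations would compute per element. The function names are hypothetical rather than proposed mnemonics, and the element widths (8-bit pixels, 16-bit intermediates) are an assumption chosen for illustration.

```python
def abs_diff_u8(a, b):
    """Element-wise unsigned absolute difference, the core of SAD."""
    # Without a dedicated instruction this is commonly built as
    # max(a, b) - min(a, b), which costs three vector ops instead of one.
    return [max(x, y) - min(x, y) for x, y in zip(a, b)]


def sad_u8(a, b):
    """Sum of absolute differences over two pixel rows."""
    return sum(abs_diff_u8(a, b))


def sat_narrow_s16_to_u8(v):
    """Clamp signed 16-bit values into the unsigned 8-bit range [0, 255]."""
    return [min(max(x, 0), 255) for x in v]


def rounded_shift_right(v, shift):
    """Right shift with round-to-nearest-up; shift must be >= 1."""
    return [(x + (1 << (shift - 1))) >> shift for x in v]


print(abs_diff_u8([10, 200], [30, 100]))      # [20, 100]
print(sad_u8([10, 200], [30, 100]))           # 120
print(sat_narrow_s16_to_u8([-5, 100, 300]))   # [0, 100, 255]
print(rounded_shift_right([7, 8, 9], 2))      # [2, 2, 2]
```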
Did a call for RISC-V Multimedia Instruction Wishlist at VDD 2024:
https://docs.google.com/presentation/d/16r4pT3YfI1UL5nX3na5SAnZ1DtHglbVPy64oPexwoN8/
Collected 3 pages of notes, many overlapping with the list above
Signed to unsigned, e.g. vnclipsu
Changing SEW while preserving the ratio of LMUL
Changing rounding mode is slow on Kendryte K230 and SpacemiT K1
Missing integer absolute difference instruction, e.g., vabs (equiv. of vfabs)
Limited usage of register based vsetvl is OK
Optimization guide cautions against using vsetvl
Long discussion with hardware engineers at LLVM Dev Meeting Santa Clara
Under some circumstances it is OK to use vsetvl, e.g., as in dav1d
May be possible to use this in auto-vectorization loop unrolling to “fold” the tail so it shares the same code as main body, potentially reducing cost of enabling RVV
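One possible reading of the tail-folding idea above, sketched as a Python model rather than real RVV code: the vtype (reduced here to LMUL) is chosen at run time, as register-based vsetvl allows, so the unrolled main body and the tail run through the same loop body. The constant VLEN_ELEMS, the helper names, and the LMUL selection policy are all assumptions for illustration, and vsetvl is simplified to min(AVL, VLMAX).

```python
VLEN_ELEMS = 4  # assumed elements per single vector register at this SEW


def vsetvl(avl, lmul):
    """Simplified vsetvl: grant min(AVL, VLMAX) for a run-time LMUL."""
    return min(avl, VLEN_ELEMS * lmul)


def pick_lmul(remaining, max_lmul=4):
    """Smallest LMUL in {1, 2, 4} whose register group covers the remainder."""
    lmul = 1
    while lmul < max_lmul and VLEN_ELEMS * lmul < remaining:
        lmul *= 2
    return lmul


def scale(data, k):
    """Multiply every element by k; returns (result, trace of (lmul, vl))."""
    out, trace, i = [], [], 0
    while i < len(data):
        remaining = len(data) - i
        lmul = pick_lmul(remaining)   # vtype computed into a register
        vl = vsetvl(remaining, lmul)
        # One shared body: wide LMUL acts like unrolling, small LMUL is the tail.
        out.extend(x * k for x in data[i:i + vl])
        trace.append((lmul, vl))
        i += vl
    return out, trace


result, trace = scale(list(range(19)), 2)
print(trace)  # [(4, 16), (1, 3)]
```

In this model a 19-element input takes one LMUL=4 pass over 16 elements and one LMUL=1 pass over the remaining 3, with no separate scalar tail loop.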
Register Pressure Example - 16bpc blend
Recall…
vwmulu.vv v24, v8, v16
vwmulu.vv v8, v12, v20
Section 5.2 Vector Operands
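The second instruction above writes a widened result into v8 while v12 is still a source, so whether it is legal depends on the overlap rule for widening operations in the vector spec's Vector Operands section: a destination with a larger EEW may overlap a source only if the source EMUL is at least 1 and the overlap is in the highest-numbered part of the destination register group. A minimal Python check of that rule, assuming LMUL=4 for the sources (the notes do not state the LMUL) and using a hypothetical helper name:

```python
def widening_overlap_ok(dst, src, src_emul):
    """Check a 2x-widening destination against one source register group.

    dst and src are base register numbers. The destination group spans
    2 * src_emul registers. Assumes an integer source EMUL >= 1 and
    correctly aligned register groups.
    """
    dst_regs = set(range(dst, dst + 2 * src_emul))
    src_regs = set(range(src, src + src_emul))
    if not (dst_regs & src_regs):
        return True  # disjoint groups never conflict
    # Overlap is legal only when the source sits at the top of the destination.
    return src + src_emul == dst + 2 * src_emul


LMUL = 4  # assumed; not stated in the notes

# vwmulu.vv v24, v8, v16 -> destination v24-v31, sources v8-v11 and v16-v19
first_ok = all(widening_overlap_ok(24, s, LMUL) for s in (8, 16))
# vwmulu.vv v8, v12, v20 -> destination v8-v15, sources v12-v15 and v20-v23
second_ok = all(widening_overlap_ok(8, s, LMUL) for s in (12, 20))

print(first_ok, second_ok)              # True True
print(widening_overlap_ok(8, 8, LMUL))  # False: overlap in the low half
```

Under that assumption the pair of multiplies keeps 24 of the 32 vector registers busy, which is the kind of register pressure the heading refers to.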
System Libraries Priorities - H1 2025
High profile, in-demand projects, e.g., Tensorflow, Chromium, etc.
Planning to finalize this in next System Libraries meeting Dec 10, 2024
Instruction timings for optimization guide
Potential to use llvm-exegesis to extract these for help with RVV optimizations
Capture missing vector instructions from multimedia projects [1]
Key Idea: Collect gaps in extension opcodes preventing efficient multimedia DSP functions, e.g., no transpose op, no dotprods, missing sign-unsign mults, etc.
Testing framework for upstream projects
Issues raised by vendors and OSS contributors about inconsistent benchmarking
Example project + source code with best practices for statistically significant data
More guest speakers!
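As a starting point for the benchmarking item above, a minimal Python sketch that reports a mean with a 95% confidence interval instead of a single timing. The sample timings are made up for illustration, and the normal approximation (z = 1.96) is an assumption; a t-distribution is more appropriate for small sample counts.

```python
import math
import statistics


def mean_ci95(samples):
    """Return (mean, low, high) using a normal-approximation 95% interval."""
    mean = statistics.mean(samples)
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    half = 1.96 * sem
    return mean, mean - half, mean + half


def significantly_different(a, b):
    """True if the two 95% intervals do not overlap (a conservative check)."""
    _, lo_a, hi_a = mean_ci95(a)
    _, lo_b, hi_b = mean_ci95(b)
    return hi_a < lo_b or hi_b < lo_a


# Hypothetical run times in milliseconds for two builds of the same kernel.
baseline = [10.2, 10.4, 10.1, 10.3, 10.2, 10.5, 10.3, 10.2]
optimized = [8.1, 8.3, 8.0, 8.2, 8.1, 8.4, 8.2, 8.1]

mean, low, high = mean_ci95(baseline)
print(f"baseline: {mean:.2f} ms (95% CI {low:.2f} to {high:.2f})")
print(significantly_different(baseline, optimized))
```

Reporting the interval alongside the mean makes it visible when a claimed speedup is within run-to-run noise.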
Nov 7, 2024
Compilers and Toolchains Deep Dive (GCC)
Good progress on cactuBSSN optimization for SPEC. Parts of this work will likely land upstream in the next week or two
“Bf16” support complete & integrated
Tons of work on saturating arithmetic done & integrated
Various codegen adjustments in the vector space have been integrated
Many more coming from Robin & myself
Glibc has accepted first vector ifunc’d mem* routine. Can adjust SiFive & Rivos implementations trivially and integrate
CRC work – still not integrated. Engineers chasing down one final issue before integration
Function multi-versioning submitted, going through review
GCC 15 feature freeze next week!
Compilers and Toolchains Deep Dive (LLVM)
Full register move instructions depend on vtype
Kernel setting vill at syscall points
Some concern that designs behave differently
Code generation implications, though perhaps not that bad?
Function Multi-versioning complete & integrated
Separate shrink wrapping ready for external review
Moves prologue/epilogue code to lesser executed positions within a function
Behavior looks good, particularly on perlbench & gcc in spec
Stack-clash – engineer busy, Craig recommends posting for review anyway
No updates on CRC, autovect (multiple), landing pads, openmp
Distro and Integration Deep Dive
Active group discussions
Ubuntu will be shipping RVA23 for all forthcoming releases
Fedora is currently at RVA20 “due to lack of hardware” (This is being taken back to the Fedora Council for discussion)
Baseline profile for distros
Actively recruiting members from Alma Linux for coordination of Python packaging
Oct 31, 2024
Kernel and Virtualization Working Group Deep Dive 31/10/2024
What’s merged for Linux-6.13?
Linux-6.13-rc1 will be available 1st week of December 2024
Noteworthy stuff merged for Linux-6.13:
Linux RISC-V
RISC-V IOMMU driver using device-tree
Userspace pointer masking and tagged address ABI
Support for Smnpm, Ssnpm, and Supm extensions
Support for Svade and Svadu extensions
KVM RISC-V
Host nested acceleration using SBI NACL
Perf support to collect KVM guest statistics
Virtualize Smnpm and Ssnpm extensions
Virtualize Svade and Svadu extensions
RVA23 profile: Discovery updates
HWPROBE additions in Linux-6.13
Supm
KVM ONE_REG additions in Linux-6.13
Smnpm, Ssnpm, Svade, and Svadu
Kernel HWPROBE (31-10-2024, wiki)
NA - 47 (50%), TBD - 7 (7.45%), COMPLETED - 40 (42.55%), TOTAL - 94
KVM ONE_REG (31-10-2024, wiki)
NA - 17 (18.08%), TBD - 23 (24.47%), COMPLETED - 54 (57.45%), TOTAL - 94
2023-2H: Project updates
2023-2H: Recently upstreamed projects
LK_01_007 - IOMMU driver with DT support
2023-2H: Development status (31-10-2024, wiki)
COMPLETED - 25 (100%), TOTAL - 25
2023-2H: Upstreaming status (31-10-2024, wiki)
TBD - 1 (4%), COMPLETED - 24 (96%), TOTAL - 25
2024-1H: Recently development completed projects
LK_02_025 - KVM System Suspend virtualization
LK_03_008 - QEMU-KVM AIA userspace irqchip_split support
LK_03_032 - KVMTOOL System Suspend support
LK_03_033 - QEMU-KVM System Suspend support
2024-1H: Recently upstreamed projects
LK_02_008 - KVM Nested acceleration
LK_03_022 - KVMTOOL Zawrs support
2024-1H: Development status (31-10-2024, wiki)
TBD - 2 (4.08%), ONGOING - 1 (2.04%), COMPLETED - 46 (93.88%), TOTAL - 49
2024-1H: Upstreaming status (31-10-2024, wiki)
TBD - 3 (6.12%), ONGOING - 8 (16.33%), COMPLETED - 38 (77.55%), TOTAL - 49
2024-2H: Project updates
2024-2H: Recently upstreamed projects
LK_01_021 - Svadu support in kernel
LK_01_037 - Userspace Pointer Masking and tagged address ABI
LK_01_042 - Optimize memory fences using Svvptc extension
LK_02_017 - KVM Svadu virtualization
LK_02_026 - KVM Pointer Masking virtualization
2024-2H: Development status (31-10-2024, wiki)
TBD - 21 (58.33%), ONGOING - 2 (5.55%), COMPLETED - 13 (36.12%), TOTAL - 36
2024-2H: Upstreaming status (31-10-2024, wiki)
TBD - 29 (80.56%), ONGOING - 2 (5.55%), COMPLETED - 5 (13.89%), TOTAL - 36
Oct 17, 2024
Firmware Deep Dive