RISC-V Architecture Deep Dive
Overview
RISC-V (pronounced "risk five") is a free and open instruction set architecture (ISA) originating at the University of California, Berkeley in 2010. Unlike x86, ARM, or MIPS, RISC-V carries no licensing fees, no royalties, and no confidentiality agreements. Any individual, company, or government can implement it, modify it, and ship products without paying a single dollar to a standards body or IP holder. This single property has made it one of the most consequential developments in computer architecture in decades — not because it is technically revolutionary, but because it removes the economic and legal barriers that previously confined hardware innovation to a small number of well-capitalized actors.
The architecture was designed by Krste Asanovic, David Patterson, and Andrew Waterman as a clean-slate academic ISA suitable for research, education, and production. It was explicitly not designed around backward compatibility with existing software — a constraint that has cost x86 enormous die area and ARM considerable complexity over decades.
Prerequisites
- Familiarity with general CPU pipeline concepts (fetch, decode, execute, writeback)
- Basic understanding of instruction encoding (opcode fields, register specifiers)
- Understanding of memory models and cache coherency at a conceptual level
- Exposure to at least one other ISA (x86, ARM, or MIPS) for comparison
- Understanding of privilege levels in operating systems
Historical Context
The ISA Licensing Problem
By 2010, the dominant ISAs were x86 (Intel/AMD, tightly controlled) and ARM (licensed from ARM Holdings at significant cost per design). MIPS was nominally licensable but carried historical baggage and a fragmented ecosystem. Every university research group designing a new processor had to either pay for a license, reverse-engineer something, or build their own small ISA from scratch — which then had no toolchain, no OS support, and no compiler.
David Patterson, one of the architects of the original RISC (Reduced Instruction Set Computer) concept in the 1980s, recognized this as a fundamental bottleneck. RISC-V is the fifth major RISC ISA design from Berkeley; RISC-I, RISC-II, SOAR, and SPUR preceded it. The "V" is the Roman numeral five, not a version number.
Open ISA Timeline
2010: RISC-V project starts at UC Berkeley
2011: First RISC-V chip taped out (Raven-1, ST 28 nm)
2015: First RISC-V Workshop held and community begins forming; RISC-V Foundation formed; SiFive founded
2016: Linux kernel port begins
2018: Linux 4.15 released with RISC-V merged into mainline (rv64)
2019: RISC-V Foundation announces move to Switzerland (geopolitical neutrality)
2020: Foundation reincorporated as RISC-V International; Western Digital ships 1B+ SweRV cores
2021: Alibaba T-Head XuanTie C910 available on public cloud (Alibaba Cloud); Vector extension 1.0 ratified
2023: StarFive JH7110 in VisionFive 2 SBC; SpacemiT K1 (8 X60 cores, laptop-class RISC-V SoC)
2024: First RISC-V AI accelerator chips ship
Base ISA Design Philosophy
Modularity Over Monolithism
RISC-V separates the mandatory base ISA from optional extensions. A minimal conforming implementation need only implement the base integer ISA. Everything else — multiply, divide, floating point, atomics, compressed instructions — is an extension that implementations choose to include or omit depending on the target application.
This is a deliberate inversion of x86's philosophy, where an x86-64 CPU must support thousands of legacy instructions going back to the 8086.
Base ISA Variants
RV32I: 32-bit address space, 32 registers, 40 instructions in the current spec (47 in older versions that still counted the CSR and FENCE.I instructions as part of the base)
RV64I: 64-bit address space, 32 registers, integer operations on 64-bit values
RV128I: 128-bit (experimental, forward-looking)
RV32E: Embedded variant: 32-bit, only 16 registers (reduced area for microcontrollers)
Register File
RISC-V has 32 general-purpose integer registers, x0 through x31. x0 is hardwired to zero — reads always return 0, writes are discarded. This eliminates the need for a special zero register opcode and simplifies many instruction encodings.
RISC-V Integer Register File (RV32I/RV64I)
==========================================
Register  ABI Name  Convention          Notes
--------  --------  ------------------  --------------------------
x0        zero      Hardwired zero      Always 0, writes discarded
x1        ra        Return address      Caller saves
x2        sp        Stack pointer       Callee saves
x3        gp        Global pointer      Linker-reserved
x4        tp        Thread pointer      Per-thread data
x5        t0        Temp/alt link reg   Caller saves
x6-x7     t1-t2     Temporaries         Caller saves
x8        s0/fp     Saved / frame ptr   Callee saves
x9        s1        Saved register      Callee saves
x10-x11   a0-a1     Args / return vals  Caller saves
x12-x17   a2-a7     Function arguments  Caller saves
x18-x27   s2-s11    Saved registers     Callee saves
x28-x31   t3-t6     Temporaries         Caller saves

Floating-point registers (F/D extensions add separate f0-f31):
f0-f7     ft0-ft7   FP temporaries      Caller saves
f8-f9     fs0-fs1   FP saved            Callee saves
f10-f11   fa0-fa1   FP args / return    Caller saves
f12-f17   fa2-fa7   FP arguments        Caller saves
f18-f27   fs2-fs11  FP saved            Callee saves
f28-f31   ft8-ft11  FP temporaries      Caller saves
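The x0 behavior and the ABI names in the table can be sketched as a toy register file. This is only an illustration: the class and helper names (RegFile, addi, ABI_NAMES) are made up for this example and do not come from any RISC-V tool.

```python
# Toy model (hypothetical names): 32 integer registers with x0 hardwired to zero.
ABI_NAMES = (
    ["zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2", "s0", "s1"]  # x0-x9
    + [f"a{i}" for i in range(8)]        # x10-x17: a0-a7
    + [f"s{i}" for i in range(2, 12)]    # x18-x27: s2-s11
    + [f"t{i}" for i in range(3, 7)]     # x28-x31: t3-t6
)


class RegFile:
    def __init__(self, xlen=64):
        self.mask = (1 << xlen) - 1
        self.regs = [0] * 32

    def read(self, n):
        return self.regs[n]

    def write(self, n, value):
        if n != 0:  # writes to x0 are silently discarded
            self.regs[n] = value & self.mask


def addi(rf, rd, rs1, imm):
    """addi rd, rs1, imm (immediate assumed to be already sign-extended)."""
    rf.write(rd, rf.read(rs1) + imm)


rf = RegFile()
addi(rf, 10, 0, 42)   # "li a0, 42" is addi a0, zero, 42
addi(rf, 11, 10, 0)   # "mv a1, a0" is addi a1, a0, 0
addi(rf, 0, 10, 1)    # a write to x0 has no effect
print(ABI_NAMES[10], rf.read(10), ABI_NAMES[11], rf.read(11), rf.read(0))
```

Because x0 always reads as zero, common operations such as li, mv, and nop need no opcodes of their own; assemblers expand them to addi with x0 as an operand.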
Instruction Encoding
RV32I instructions are uniformly 32 bits wide with a clean field layout. The opcode occupies bits 0-6, with a fixed position for register specifiers across instruction formats. This regularity simplifies decoder design significantly compared to x86's variable-length encoding.
RV32I Instruction Formats
==========================
R-type (register-register):
31          25 24     20 19     15 14    12 11          7 6        0
[   funct7   ] [  rs2  ] [  rs1  ] [funct3] [    rd     ] [ opcode ]
I-type (immediate + register):
31                    20 19     15 14    12 11          7 6        0
[      imm[11:0]       ] [  rs1  ] [funct3] [    rd     ] [ opcode ]
S-type (store):
31          25 24     20 19     15 14    12 11          7 6        0
[ imm[11:5]  ] [  rs2  ] [  rs1  ] [funct3] [ imm[4:0]  ] [ opcode ]
B-type (branch):
31          25 24     20 19     15 14    12 11          7 6        0
[imm[12|10:5]] [  rs2  ] [  rs1  ] [funct3] [imm[4:1|11]] [ opcode ]
U-type (upper immediate):
31                                       12 11          7 6        0
[               imm[31:12]                ] [    rd     ] [ opcode ]
J-type (jump):
31                                       12 11          7 6        0
[          imm[20|10:1|11|19:12]          ] [    rd     ] [ opcode ]
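The field layouts above translate directly into shifts and masks. The following is a minimal sketch of encoders for the R, I, and B formats plus a field extractor; the function names are illustrative, and the opcode and funct values used in the examples (0x33 for OP, 0x13 for OP-IMM, 0x63 for BRANCH) are the standard RV32I ones.

```python
def encode_r(opcode, rd, funct3, rs1, rs2, funct7):
    """R-type: funct7 | rs2 | rs1 | funct3 | rd | opcode."""
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def encode_i(opcode, rd, funct3, rs1, imm):
    """I-type: 12-bit two's-complement immediate in bits 31:20."""
    return ((imm & 0xFFF) << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


def encode_b(opcode, funct3, rs1, rs2, imm):
    """B-type: 13-bit branch offset (bit 0 is always zero) scattered over the word."""
    imm &= 0x1FFF
    return (
        (((imm >> 12) & 0x1) << 31)     # imm[12]
        | (((imm >> 5) & 0x3F) << 25)   # imm[10:5]
        | (rs2 << 20)
        | (rs1 << 15)
        | (funct3 << 12)
        | (((imm >> 1) & 0xF) << 8)     # imm[4:1]
        | (((imm >> 11) & 0x1) << 7)    # imm[11]
        | opcode
    )


def decode_fields(word):
    """Extract the fixed-position fields shared by most formats."""
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "rs2": (word >> 20) & 0x1F,
        "funct7": (word >> 25) & 0x7F,
    }


print(hex(encode_i(0x13, rd=1, funct3=0, rs1=0, imm=5)))             # addi x1, x0, 5
print(hex(encode_r(0x33, rd=3, funct3=0, rs1=1, rs2=2, funct7=0)))   # add x3, x1, x2
print(hex(encode_b(0x63, funct3=0, rs1=1, rs2=2, imm=8)))            # beq x1, x2, +8
```

Note that rs1, rs2, and rd sit at the same bit positions in every format that has them, which is what lets a decoder read the register file before it has finished decoding the opcode.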
Standard Extensions
Extension Matrix
Extension        Name                      Key Instructions            Typical Use
---------------  ------------------------  --------------------------  -----------------------------
M                Multiply/Divide           MUL, DIV, REM               General computation
A                Atomics (LR/SC + AMO)     LR.W, SC.W, AMOSWAP.W       Lock-free data structures
F                Single-precision FP       FADD.S, FMUL.S, FLW         Scientific, graphics
D                Double-precision FP       FADD.D, FMUL.D, FLD         Scientific computing
Q                Quad-precision FP         FADD.Q                      Numerical stability research
C                Compressed (16-bit)       Subset of common instrs     Code density improvement
B                Bit manipulation          ANDN, ROL, ROR, SH1ADD      Crypto, compression
                 (Zba + Zbb + Zbs)
V                Vector (scalable)         VADD.VV, VLE32.V            SIMD/HPC/AI inference
H                Hypervisor extension      HFENCE.VVMA/GVMA, HLV/HSV   Type-1 and Type-2 hypervisors
Zicsr            CSR instructions          CSRRW, CSRRS, CSRRC         Privileged state access
Zifencei         Instruction-fetch fence   FENCE.I                     Self-modifying code, JIT
Zba/Zbb/Zbc/Zbs  Bit manipulation subsets  SH1ADD, ANDN, CLMUL, BSET   Cryptography, string ops
Ztso             Total Store Order         (memory model variant)      x86 compat layer
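Extension letters are combined into ISA strings such as rv64gc, the form the compiler's -march option accepts. As a rough sketch, the parser below expands such a string into its extensions, treating G as shorthand for IMAFD plus Zicsr and Zifencei. It is deliberately simplified and is not how GCC or LLVM parse the string: it assumes multi-letter extensions are separated by underscores and it rejects version suffixes.

```python
import re

# "G" is shorthand for the general-purpose combination.
G_EXPANSION = ["i", "m", "a", "f", "d", "zicsr", "zifencei"]


def parse_march(march):
    """Return (xlen, [extensions]) for strings like 'rv64gc' or 'rv32imac_zicsr'."""
    m = re.fullmatch(r"rv(32|64|128)([a-z]*)((?:_[a-z][a-z0-9]*)*)", march.lower())
    if m is None:
        raise ValueError(f"unsupported ISA string: {march!r}")
    xlen = int(m.group(1))
    exts = []
    for letter in m.group(2):                       # single-letter extensions
        for ext in (G_EXPANSION if letter == "g" else [letter]):
            if ext not in exts:
                exts.append(ext)
    for name in filter(None, m.group(3).split("_")):  # multi-letter extensions
        if name not in exts:
            exts.append(name)
    return xlen, exts


print(parse_march("rv64gc"))
print(parse_march("rv32imac_zicsr"))
```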
Compressed Extension (C)
The C extension is one of the most practically important. It provides 16-bit encodings for the most commonly used 32-bit instructions (small register loads, small immediates, common arithmetic). On typical application code, 50-60% of instructions can be encoded as 16-bit, improving code density to near-ARM Thumb2 levels. This matters enormously for microcontrollers where flash memory is scarce and expensive.
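As a back-of-envelope check on what that fraction means for total code size: each compressed instruction saves 2 of its 4 bytes, so the overall reduction is half the compressible fraction. The helper below is just this arithmetic, assuming every uncompressed instruction is 4 bytes; the function name is made up for the example.

```python
def code_size_reduction(fraction_compressed):
    """Fractional size reduction when that share of 4-byte instructions shrinks to 2 bytes."""
    bytes_saved_per_instruction = 4 - 2
    return fraction_compressed * bytes_saved_per_instruction / 4


for fraction in (0.5, 0.6):
    print(f"{fraction:.0%} compressible -> {code_size_reduction(fraction):.0%} smaller")
```

So a program in which 50-60% of instructions have 16-bit forms shrinks by roughly 25-30%.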
Vector Extension (V)
The vector extension uses a scalable vector model rather than fixed-width SIMD. Instead of SSE's 128-bit or AVX-512's 512-bit fixed widths, RISC-V V makes the vector register width, VLEN, an implementation parameter. Vector-length-agnostic code runs correctly on a 128-bit VLEN implementation and on a 1024-bit VLEN implementation without recompilation. The vsetvli instruction sets the active vector length (vl) at runtime from the application's requested element count and the hardware's capability, so a loop processes vl elements per iteration until none remain.
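The strip-mining pattern that vsetvli enables can be sketched in plain Python. This is a software model rather than real vector code: the vsetvli function here simply returns min(AVL, VLMAX), which is one behavior the specification permits (hardware is allowed other choices when the remaining count is between VLMAX and twice VLMAX), and the SEW and LMUL defaults are assumptions for the example.

```python
def vsetvli(avl, vlen_bits, sew_bits=32, lmul=1):
    """Model of vsetvli: grant up to VLMAX = VLEN / SEW * LMUL elements."""
    vlmax = vlen_bits // sew_bits * lmul
    return min(avl, vlmax)


def vector_add(a, b, vlen_bits):
    """Element-wise add written as a vector-length-agnostic strip-mined loop."""
    n = len(a)
    out = [0] * n
    i = 0
    iterations = 0
    while i < n:
        vl = vsetvli(n - i, vlen_bits)   # ask for everything left, get what fits
        out[i:i + vl] = [x + y for x, y in zip(a[i:i + vl], b[i:i + vl])]
        i += vl
        iterations += 1
    return out, iterations


a = list(range(10))
b = [10 * x for x in a]
small, small_iters = vector_add(a, b, vlen_bits=128)    # 4 x 32-bit lanes
large, large_iters = vector_add(a, b, vlen_bits=1024)   # 32 x 32-bit lanes
print(small == large, small_iters, large_iters)
```

The loop body never mentions the hardware width; only the number of iterations changes between the two configurations.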
Privileged ISA and Execution Modes
RISC-V Privilege Levels
========================
┌───────────────────────────────────┐
│         Application Code          │  U-mode (User)
│ (unprivileged, no privileged CSRs)│
└─────────────────┬─────────────────┘
                  │ syscall / trap
┌─────────────────▼─────────────────┐
│      Operating System Kernel      │  S-mode (Supervisor)
│   (manages VM, interrupts, I/O)   │
└─────────────────┬─────────────────┘
                  │ SBI call (Supervisor Binary Interface)
┌─────────────────▼─────────────────┐
│      SEE / Hypervisor / SBI       │  M-mode (Machine)
│   (bare metal, always present)    │  (+ HS-mode if H ext)
└───────────────────────────────────┘
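M-mode: highest privilege, direct hardware access; all traps arrive here by default unless delegated (medeleg/mideleg)
S-mode: OS kernel runs here, controls page tables (satp CSR), handles traps that M-mode delegates to it
U-mode: user processes, no access to privileged CSRs (only unprivileged ones such as fcsr and, when enabled, the counters), memory access via page tables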
HS-mode: hypervisor supervisor mode (requires H extension) — hosts guest OS in VS-mode
Key CSRs (Control and Status Registers)
CSR       Mode  Purpose
--------  ----  -----------------------------------------------
mstatus   M     Global interrupt enable, privilege mode stack
mepc      M     Exception program counter (return address)
mcause    M     Trap cause (interrupt or exception code)
mtvec     M     Trap vector base address
mhartid   M     Hardware thread ID (which core is running)
misa      M     ISA features supported by this hart
satp      S     Supervisor address translation (page table root)
sstatus   S     Supervisor status (subset of mstatus)
sepc      S     Supervisor exception PC
scause    S     Supervisor trap cause
stvec     S     Supervisor trap vector
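Two of these CSRs have simple bit layouts worth seeing concretely. In misa, the top two bits (MXL) give the base width and bits 0-25 flag extensions A-Z; in mcause, the top bit distinguishes interrupts from exceptions and the rest is the cause code. The decoders below are a sketch with made-up function names, assuming a 64-bit hart; note that an implementation is allowed to return zero for misa.

```python
def decode_misa(misa, xlen=64):
    """Return (base width, extension letters) from a misa value."""
    mxl = misa >> (xlen - 2)                      # top two bits
    base = {1: 32, 2: 64, 3: 128}.get(mxl)        # None if misa reads as zero
    letters = "".join(
        chr(ord("A") + bit) for bit in range(26) if (misa >> bit) & 1
    )
    return base, letters


def decode_mcause(mcause, xlen=64):
    """Return (is_interrupt, cause code) from an mcause value."""
    is_interrupt = bool(mcause >> (xlen - 1))
    code = mcause & ((1 << (xlen - 1)) - 1)
    return is_interrupt, code


print(decode_misa(0x800000000014112D))    # an RV64 hart with A, C, D, F, I, M, S, U
print(decode_mcause(0x8000000000000007))  # interrupt, code 7 (machine timer)
print(decode_mcause(8))                   # exception, code 8 (ecall from U-mode)
```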
Memory Model: RVWMO
RISC-V uses the RVWMO (RISC-V Weak Memory Ordering) model as its default. This is a relaxed consistency model — more permissive than x86's TSO (Total Store Order) but more constrained than the minimal models of some research architectures.
Under RVWMO:
- Loads and stores from the same hart appear in program order to that hart
- Between harts, the order of memory operations is not guaranteed unless explicitly synchronized
- The FENCE instruction inserts ordering constraints between preceding and subsequent memory operations
- Fence variants specify which preceding and subsequent operation types are ordered (I: device input, O: device output, R: memory reads, W: memory writes)
FENCE ordering combinations:
FENCE RW, RW — full fence (most conservative)
FENCE W, R — store-load fence (prevents store-load reordering)
FENCE.I — instruction-fetch fence (required after writing executable code)
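The predecessor and successor sets of a FENCE are encoded as two 4-bit fields (I, O, R, W from high bit to low) in bits 27:24 and 23:20 of the instruction, with opcode 0x0F. A minimal encoder, assuming the remaining fields (fm, rs1, rd) are zero and using a hypothetical function name:

```python
SET_BITS = {"i": 8, "o": 4, "r": 2, "w": 1}


def encode_fence(pred, succ):
    """Encode 'fence pred, succ', e.g. encode_fence('rw', 'rw')."""
    def to_bits(access_set):
        value = 0
        for ch in access_set.lower():
            value |= SET_BITS[ch]
        return value

    return (to_bits(pred) << 24) | (to_bits(succ) << 20) | 0x0F


print(hex(encode_fence("rw", "rw")))      # full memory fence
print(hex(encode_fence("w", "r")))        # store-load fence
print(hex(encode_fence("iorw", "iorw")))  # what a bare "fence" assembles to
```

FENCE.I shares the same opcode but uses funct3 = 1, giving the encoding 0x0000100F.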
The Ztso extension allows implementations to advertise TSO semantics, enabling x86 compatibility layers to run without fence overhead.
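To see why cross-hart ordering needs explicit fences, consider the classic message-passing test: one hart stores data and then sets a flag, another reads the flag and then the data. The sketch below enumerates outcomes under a deliberately crude toy model in which an unfenced hart may perform its two independent accesses in either order. This is not the RVWMO definition (which is axiomatic, built on preserved program order); it only illustrates the outcome a fence rules out.

```python
from itertools import permutations

WRITER = [("store", "data", 1), ("store", "flag", 1)]
READER = [("load", "flag", "r1"), ("load", "data", "r2")]


def interleavings(a, b):
    """All merges of sequences a and b that keep each sequence's internal order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def thread_orders(ops, fenced):
    """Toy rule: a fence between the two accesses pins them to program order."""
    return [tuple(ops)] if fenced else list(permutations(ops))


def outcomes(fenced):
    seen = set()
    for w in thread_orders(WRITER, fenced):
        for r in thread_orders(READER, fenced):
            for trace in interleavings(list(w), list(r)):
                mem = {"data": 0, "flag": 0}
                regs = {}
                for kind, addr, arg in trace:
                    if kind == "store":
                        mem[addr] = arg
                    else:
                        regs[arg] = mem[addr]
                seen.add((regs["r1"], regs["r2"]))
    return seen


print("unfenced:", sorted(outcomes(fenced=False)))
print("fenced:  ", sorted(outcomes(fenced=True)))
```

The pair (r1, r2) = (1, 0), meaning the flag was seen set but the data was stale, appears only in the unfenced run. In real code the writer would place FENCE W, W between its stores and the reader FENCE R, R between its loads, or use acquire/release annotations on atomics.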
RISC-V vs ARM vs x86 Comparison
Property               RISC-V                ARM (A-profile)       x86-64
---------------------  --------------------  --------------------  ------------------------
ISA ownership          Open (RISC-V          ARM Holdings          Intel/AMD
                       International)
Licensing fees         None                  Licensing + royalty   Patent-encumbered
Instruction width      Fixed 32-bit          Fixed 32-bit (A64),   Variable 1-15 bytes
                       (16-bit w/ C)         16/32-bit (T32)
Register count         32 GP int (x0 = 0)    31 GP int             16 GP int
Memory model           RVWMO (relaxed)       Weakly ordered        TSO (strong)
Compressed encoding    C ext (optional)      Thumb-2 (AArch32)     Always variable-length
Privilege levels       M/S/U (+ H)           EL0-EL3               Rings 0-3 (0 and 3 used)
Virtualization         H extension           EL2 built in          VMX/SVM extensions
Vector/SIMD            V extension           NEON/SVE/SVE2         SSE/AVX/AVX-512
Ecosystem maturity     Developing            Mature                Very mature
Compiler support       GCC/LLVM              Excellent             Excellent
Linux support          Mainline since 4.15   Mainline              Mainline
Binary size (vs ARM)   ~same w/ C            Baseline              ~15% larger
Power efficiency       Competitive           Industry-leading      Generally lower than ARM
Custom extensions      Allowed               Not allowed           Not allowed
Production Deployments
SiFive Freedom Platform
SiFive, founded in 2015 by RISC-V co-creators Krste Asanović, Yunsup Lee, and Andrew Waterman, produces RISC-V IP cores and development boards. The HiFive Unmatched is a desktop-class RISC-V development board using the SiFive FU740 (4x U74 application cores + 1x S7 management core). SiFive cores power industrial controllers, embedded devices, and developer tooling.
Alibaba XuanTie C906/C910
Alibaba's T-Head semiconductor division designed the XuanTie (玄铁) C906 and C910 cores. The C906 implements RV64GC (base + multiply + atomics + float + compressed) plus a pre-ratification (0.7.1) version of the vector extension, and its RTL is available for free on GitHub under an open-source license. The C910 is a high-performance out-of-order core used in SoCs such as the T-Head TH1520. XuanTie cores are among the most widely shipped application-class RISC-V cores as of 2024.
Western Digital SweRV
Western Digital's SweRV EH1/EH2/EL2 are in-order pipeline cores designed for storage controller applications. Western Digital shipped over one billion RISC-V cores in hard drives, SSDs, and flash controllers. They open-sourced the SweRV core under Apache 2.0, demonstrating that RISC-V was production-ready for high-volume consumer electronics.
SpacemiT X60
SpacemiT's X60 is an RV64GCVB application core that implements the ratified vector extension; eight of them make up the SpacemiT K1 SoC, which targets laptop and tablet form factors. The K1 ships in the Banana Pi BPI-F3 and Milk-V Jupiter boards and in laptops such as the SpacemiT MuseBook. Performance is roughly in the class of ARM Cortex-A55 cores.
StarFive JH7110
The JH7110 SoC from StarFive powers the VisionFive 2 single-board computer, priced under $80. It features a quad-core SiFive U74 configuration with GPU and video acceleration. It achieved wide adoption in the hobbyist community as the "Raspberry Pi equivalent for RISC-V."
Linux on RISC-V
RISC-V support was merged into the mainline Linux kernel in version 4.15 (released January 2018). Support includes:
- arch/riscv/ architecture directory
- SMP (symmetric multi-processing) via hardware thread (hart) management
- Virtual memory via Sv39 (39-bit), Sv48 (48-bit), Sv57 (57-bit) page table modes
- Device drivers for SiFive UART, CLINT (Core Local Interruptor), PLIC (Platform-Level Interrupt Controller)
- Compressed kernel images (Image.gz)
- OpenSBI (Open Source Supervisor Binary Interface) as the M-mode runtime firmware
The RISC-V supervisor binary interface (SBI) provides a stable ABI between M-mode firmware and S-mode kernel, analogous to UEFI firmware services for x86.
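For the Sv39 mode listed above, a virtual address splits into a 12-bit page offset and three 9-bit virtual page numbers, one per level of the page table, and bits 63:39 must all equal bit 38. The helper below is a sketch of that split with a made-up function name; it does not walk real page tables.

```python
def sv39_split(va):
    """Split a 64-bit Sv39 virtual address into ([VPN0, VPN1, VPN2], offset)."""
    upper = va >> 38                       # bits 63:38, 26 bits in total
    if upper not in (0, (1 << 26) - 1):    # must be all zeros or all ones
        raise ValueError(f"non-canonical Sv39 address: {va:#x}")
    offset = va & 0xFFF
    vpn = [(va >> (12 + 9 * level)) & 0x1FF for level in range(3)]
    return vpn, offset


vpn, offset = sv39_split(0x40201234)
print(vpn, hex(offset))

# First address of the upper (kernel) half of the Sv39 address space.
print(sv39_split(0xFFFFFFC000000000))
```

A page-table walk uses VPN2 to index the root table named by satp, then VPN1 and VPN0 at the next two levels; Sv48 and Sv57 add one and two more 9-bit levels respectively.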
RISC-V for Security Research
The open ISA makes RISC-V uniquely valuable for hardware security research:
CHERI on RISC-V: The CHERI (Capability Hardware Enhanced RISC Instructions) project from Cambridge implements hardware-enforced capability pointers. CHERI-RISC-V adds capability registers that carry bounds and permissions alongside pointer values. Any pointer dereference outside its declared bounds raises a hardware exception. This eliminates entire classes of spatial memory safety vulnerabilities (buffer overflows, out-of-bounds reads) at the hardware level, without per-access bounds checks in software.
Custom security extensions: Because RISC-V allows custom instruction encodings in defined opcode space, researchers can implement custom cryptographic accelerators, attestation primitives, or isolation mechanisms as hardware extensions and test them with real hardware within 6-12 months of design — something impossible with locked ISAs.
PMP (Physical Memory Protection): The standard RISC-V PMP unit provides M-mode controlled physical address range permissions (read/write/execute). Checks apply to physical addresses, that is, after virtual memory translation, and also cover page-table accesses. This enables trusted execution environments without requiring full TrustZone IP licensing.
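PMP regions are most often described in NAPOT (naturally aligned power-of-two) form, where a single pmpaddr register encodes both base and size: the address is stored shifted right by 2, and the number of trailing one bits gives the region size. The following sketch, with made-up function names, shows the encoding and its inverse.

```python
def napot_encode(base, size):
    """Encode a naturally aligned power-of-two region (size >= 8 bytes) as a pmpaddr value."""
    if size < 8 or size & (size - 1):
        raise ValueError("size must be a power of two and at least 8")
    if base % size:
        raise ValueError("base must be aligned to size")
    return (base | (size // 2 - 1)) >> 2


def napot_decode(pmpaddr):
    """Recover (base, size) from a NAPOT pmpaddr value."""
    trailing_ones = 0
    while (pmpaddr >> trailing_ones) & 1:
        trailing_ones += 1
    size = 1 << (trailing_ones + 3)
    # Clear the size-encoding bits, then undo the right shift by 2.
    base = (pmpaddr >> (trailing_ones + 1) << (trailing_ones + 1)) << 2
    return base, size


value = napot_encode(0x80000000, 0x1000)   # a 4 KiB region at 0x8000_0000
print(hex(value), [hex(x) for x in napot_decode(value)])
```

Firmware would write such a value to a pmpaddr CSR and select NAPOT mode and the R/W/X bits in the matching pmpcfg entry.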
Debugging Notes
Toolchain setup:
# RISC-V GCC cross-compiler (Ubuntu/Debian)
sudo apt install gcc-riscv64-linux-gnu
# Compile a RISC-V binary
riscv64-linux-gnu-gcc -march=rv64gc -mabi=lp64d -o hello hello.c
# Inspect generated instructions
riscv64-linux-gnu-objdump -d hello | head -60
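# QEMU system emulation (full system): OpenSBI as M-mode firmware, Linux as payload
# ("Image" is an assumed kernel file name; substitute your own kernel build)
qemu-system-riscv64 -machine virt -nographic \
-bios fw_jump.elf -kernel Image \
-drive file=rootfs.img,format=raw,if=virtio
# QEMU user-mode (run RISC-V binaries on x86 host)
# -L points at the cross sysroot so a dynamically linked binary can find its libc
qemu-riscv64 -L /usr/riscv64-linux-gnu ./hello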
RISC-V specific issues:
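- Misaligned memory access: RISC-V allows implementations to trap on misaligned loads/stores instead of handling them in hardware (x86 handles them in hardware). Trapped accesses are usually emulated by firmware or the kernel, which is correct but very slow. Check the kernel's misaligned-access options (CONFIG_RISCV_MISALIGNED in recent kernels; option names vary by kernel version).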
- Relaxation: RISC-V linker relaxation substitutes GP-relative addressing for global variable accesses, shrinking code. This can confuse debuggers if -mno-relax is not set.
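- FENCE.I requirement: When JIT-compiling code, FENCE.I must be executed before the CPU can safely fetch the new instructions. Missing this causes stale instruction cache execution. FENCE.I only synchronizes the hart that executes it, so on multi-hart Linux systems user-space JITs should call __builtin___clear_cache, which asks the kernel to flush the other harts as well.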
Security Implications
- Open source hardware reference implementations (Rocket, BOOM, CVA6) allow security researchers to audit the microarchitecture — impossible with ARM or Intel.
- Side-channel research: RISC-V's simplicity makes it an ideal platform for studying speculative execution side channels (Spectre/Meltdown variants). BOOM (Berkeley Out-of-Order Machine) has been used to research and mitigate these attacks.
- PMP misuse: Incorrectly configured PMP entries can leave M-mode code accessible from S-mode, creating privilege escalation paths. Linux kernel and OpenSBI must coordinate PMP configuration carefully.
- Custom extension trust: The ability to add custom instructions is powerful but also means implementations may silently diverge from published specs. Security-critical code should test on multiple RISC-V implementations.
Performance Implications
- No legacy overhead: RV64GC chips avoid the area and power budget spent on x86 legacy instruction decoding (by some estimates 10-20% of x86 core area, though the figure varies widely by design).
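- Compressed extension impact: Enabling the C extension typically reduces code size by 25-30%, which lets more of the working set fit in the I-cache and generally lowers I-cache miss rates (the gain depends on the workload and is not strictly proportional).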
- Vector extension efficiency: RISC-V V's scalable design means a single binary can exploit 128-bit, 256-bit, or 512-bit vector units depending on the CPU. No separate AVX, AVX2, AVX-512 code paths needed.
- Memory model cost: RVWMO requires explicit fences in lock-free code. ARM requires them too, so portable code that already runs correctly on ARM pays no extra cost. Code ported from x86, which could rely on TSO, may need fences added.
Failure Modes
- Ecosystem fragmentation: Custom vendor extensions (Alibaba T-Head has several non-standard extensions) can fragment the binary ecosystem — a binary compiled for T-Head extensions won't run on standard RISC-V.
- Sparse hardware availability: As of 2025, RISC-V application-class hardware is still limited compared to ARM or x86, making deployment to production infrastructure non-trivial.
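- Interrupt controller diversity: CLINT and PLIC are the common platform interrupt controllers, but the ISA does not mandate them and alternatives exist (for example the Advanced Interrupt Architecture with APLIC/IMSIC). Platform variation requires Device Tree or ACPI to describe the actual topology, and driver gaps still exist.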
- Unratified extension churn: Using extensions before they are ratified (e.g., early vector implementations before V1.0 ratification) led to incompatible binaries across hardware generations.
Modern Usage (2025)
- Microcontrollers: Espressif ESP32-C3/C6 (RISC-V cores replacing Xtensa in flagship products)
- Storage controllers: Western Digital, Marvell (billions of units shipped)
- RISC-V Linux laptop SoCs: SpacemiT K1 (X60 cores), Alibaba T-Head TH1520 (C910 cores)
- Cloud development VMs: Alibaba Cloud provides bare-metal RISC-V instances
- AI accelerators: Several startups using RISC-V control cores alongside tensor processing units
- Space: European Space Agency research payloads use RISC-V due to rad-hard implementation freedom
Future Directions
- RISC-V profiles: Standardized feature sets (RVA22, RVA23) define minimum extensions for application-class Linux, simplifying distribution support decisions.
- CHERI-RISC-V mainstreaming: Arm's Morello program, developed with the University of Cambridge, demonstrated CHERI on ARM; RISC-V CHERI prototypes exist and may influence mainstream security extensions.
- Ratification pipeline: Scalar cryptography (Zkn, Zks suites), cache management (CMO), and the hypervisor extension were ratified in 2021; newer extensions, including debug and further hypervisor work, are in various stages of ratification.
- Data center adoption: Alibaba, StarFive, and several funded startups are targeting server-class RISC-V chips with PCIe 5.0, DDR5, and CXL interfaces for 2025-2026.
- Automotive/safety: ISO 26262 ASIL-D compliant RISC-V cores from Andes and SiFive target ADAS applications.
Exercises
- Download and install the RISC-V GCC cross-compiler. Write a simple C program, compile it for rv64gc, and disassemble it. Identify at least five instruction types from the base ISA and two compressed instructions.
- Boot a RISC-V Linux VM under QEMU (qemu-system-riscv64 with the virt machine type and a Debian RISC-V image). Examine /proc/cpuinfo and identify the reported ISA extensions.
- Write a spinlock in RISC-V assembly using the A extension's LR.W/SC.W instructions. Explain why the SC must retry on failure and why a FENCE is needed before releasing the lock.
- Read the RISC-V privileged ISA specification section on PMP. Write a description of how a Type-1 hypervisor on RISC-V would use PMP to isolate guest physical memory from firmware.
- Compare code size of a simple benchmark compiled with -march=rv64g versus -march=rv64gc. Measure the percentage reduction and correlate it with I-cache miss rate, measured with hardware performance counters or a cache-modeling tool such as QEMU's cache plugin.
References
- Patterson, D. & Waterman, A. (2017). The RISC-V Reader: An Open Architecture Atlas. Strawberry Canyon LLC.
- RISC-V International. (2021). The RISC-V Instruction Set Manual, Volume I: Unprivileged ISA. https://riscv.org/technical/specifications/
- RISC-V International. (2021). The RISC-V Instruction Set Manual, Volume II: Privileged Architecture.
- Waterman, A. et al. (2014). The RISC-V Instruction Set Manual, Volume I: User-Level ISA, Version 2.0. Berkeley EECS Technical Report UCB/EECS-2014-54.
- Asanovic, K. & Patterson, D. (2014). Instruction Sets Should Be Free: The Case For RISC-V. EECS Technical Report.
- Sewell, P. et al. (2010). x86-TSO: A Rigorous and Usable Programmer's Model for x86 Multiprocessors. CACM.
- CHERI RISC-V project: https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/
- Linux kernel RISC-V architecture: https://www.kernel.org/doc/html/latest/riscv/