PVM2 spec — differential against PVM (jar1)

Companion to rv64e-xjar-eei.md. That doc specifies PVM2 from the RV side; this one specifies it from the PVM side. Every PVM jar1 opcode is listed with what happens to it in PVM2.

The architecture is mostly unchanged: the same 13 hot registers used by jar-produced programs (within RV64E’s full 15-GPR file), the same 4 GiB guest memory period (now specified as a 2^32-fold alias), the same gas metering shape, and the same host-call mechanism.

PVM2 uses plain RISC-V control flow: function calls are jal or auipc+jalr, returns are jalr (c.jr ra), and PC is a real low-4 GiB virtual address (code is mapped read-only at CODE_BASE). PVM’s jump_ind dispatch through a single global jump table becomes a native jalr; the only runtime divergence is that a jalr target must be a basic-block start, validated when the jalr executes (see rv64e-xjar-eei.md §8). An earlier draft routed all calls and returns through a custom br_table backed by Image.jump_table; that static-dispatch model has been removed.

The wire encoding adopts RV+C for the ~97 ops with strict 1:1 mappings, keeps the jar-specific operations in the RV custom-0 slot, and drops everything else.

Status legend

  • KEEP (R) — same architectural semantics, new encoding is the corresponding standard RV instruction.
  • KEEP (R, CF) — control-flow op, encoded as the standard RV instruction; PC is a real virtual address (code mapped at CODE_BASE), same as RV.
  • KEEP (custom-0) — kept, but moved to RV custom-0 space.
  • DROP (static) — removed from PVM2 because the corresponding PVM dynamic-dispatch opcode has no direct analogue; the linker lowers the source pattern to native RV control flow (jal / auipc+jalr, with jalr targets validated against the basic-block-start set at runtime).
  • DROP — removed from the PVM2 spec. If the operation is ever needed, software lowers it to a sequence of standard RV ops.
  • DROP (fallback: …) — removed; expected RV lowering shown.

Per-opcode mapping (all 141 jar1 opcodes)

No-arg (0–3)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 0 | trap | KEEP (custom-0) | funct3 000 |
| 1 | fallthrough | KEEP (custom-0) | funct3 100. Repurposed: PVM’s gas-hint nop is now PVM2’s bb_start-widening terminator (no architectural state change, but the next instruction is a bb_start) |
| 2 | unlikely | DROP (fallback: c.nop) | gas hint only, no semantic effect |
| 3 | ecall (jar) | KEEP (custom-0) | funct3 001, semantically unchanged |

One-immediate (10)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 10 | ecalli | KEEP (custom-0) | funct3 010, 20-bit signed immediate |

One reg + 8-byte imm (20)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 20 | load_imm_64 | DROP (fallback: lui + addi + slli + addi, or compressed where smaller) | unused in benches; not worth a custom op |
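The lui + addi pair at the start of that fallback relies on a hi/lo split that compensates for addi sign-extending its 12-bit immediate. The following is a minimal sketch of that split, not the linker's actual code; the helper names (sext, split_hi_lo, lui_addi) are hypothetical, and the slli + addi tail of the sequence is not modelled.

```python
def sext(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def split_hi_lo(imm32: int) -> tuple[int, int]:
    """Split a signed 32-bit constant into (hi20, lo12) for `lui` + `addi`.

    addi sign-extends lo12, so when bit 11 of the constant is set the
    upper part is bumped by one to compensate.
    """
    lo12 = sext(imm32, 12)
    hi20 = ((imm32 - lo12) >> 12) & 0xFFFFF
    return hi20, lo12


def lui_addi(hi20: int, lo12: int) -> int:
    """Model RV64 `lui rd, hi20; addi rd, rd, lo12` as a signed 64-bit result."""
    rd = sext(hi20 << 12, 32)      # lui sign-extends its 32-bit result on RV64
    return sext(rd + lo12, 64)     # addi is a full 64-bit add


# Caveat: constants in 0x7FFFF800..0x7FFFFFFF do not round-trip through this
# model, because the bumped hi20 makes lui produce a negative value; on RV64
# that range needs addiw (32-bit wrap) instead of addi.
if __name__ == "__main__":
    for imm in (0x800, 0x12345FFF, -1):
        hi20, lo12 = split_hi_lo(imm)
        print(f"{imm:#x}: lui {hi20:#x}; addi {lo12}")
```

The same split applies to the auipc + jalr far-call form, where the offset is PC-relative rather than absolute.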

Two immediates / store-imm-direct (30–33)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 30 | store_imm_u8 | DROP (fallback: li tmp, val; sb tmp, off(x0)) | unused in benches |
| 31 | store_imm_u16 | DROP | unused |
| 32 | store_imm_u32 | DROP | unused |
| 33 | store_imm_u64 | DROP | unused |

One offset / unconditional jump (40)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 40 | jump | KEEP (R, CF) | RV jal x0, off. Optionally c.j off when in range |

One reg + one imm (50–62)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 50 | jump_ind | KEEP (R, CF) | RV jalr x0, rs1, 0 (indirect jump). PVM validated the target against a global jump-table; PVM2 validates it against the basic-block-start set at runtime. See rv64e-xjar-eei.md §8 |
| 51 | load_imm | KEEP (R) | RV addi rd, x0, imm (= li). With C, c.li (2 B) when imm fits 6-bit |
| 52 | load_u8 | KEEP (R) | RV lbu rd, imm(x0) |
| 53 | load_i8 | KEEP (R) | RV lb rd, imm(x0) |
| 54 | load_u16 | KEEP (R) | RV lhu rd, imm(x0) |
| 55 | load_i16 | KEEP (R) | RV lh rd, imm(x0) |
| 56 | load_u32 | KEEP (R) | RV lwu rd, imm(x0) |
| 57 | load_i32 | KEEP (R) | RV lw rd, imm(x0) |
| 58 | load_u64 | KEEP (R) | RV ld rd, imm(x0) |
| 59 | store_u8 | KEEP (R) | RV sb rs, imm(x0) |
| 60 | store_u16 | KEEP (R) | RV sh rs, imm(x0) |
| 61 | store_u32 | KEEP (R) | RV sw rs, imm(x0) |
| 62 | store_u64 | KEEP (R) | RV sd rs, imm(x0) |

One reg + two imm / store-imm-ind (70–73)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 70 | store_imm_ind_u8 | DROP (fallback: li tmp, val; sb tmp, off(rA)) | 75 occurrences in benches; JIT can peephole li + sb → mov [mem], imm8 |
| 71 | store_imm_ind_u16 | DROP | unused |
| 72 | store_imm_ind_u32 | DROP | unused |
| 73 | store_imm_ind_u64 | DROP (fallback: li tmp, val; sd tmp, off(rA)) | 65 occurrences |

One reg + imm + offset (80–90)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 80 | load_imm_jump | KEEP (R, CF) | a direct call: native RV jal ra, callee (4 B), or auipc ra, hi; jalr ra, ra, lo for far targets. ra holds the real return VA (code_base + next_pc); the callee returns with jalr x0, ra, 0 (c.jr ra) |
| 81 | branch_eq_imm | DROP (fallback: imm=0 → c.beqz; else li tmp; beq rA, tmp, off) | rare |
| 82 | branch_ne_imm | DROP (fallback: imm=0 → c.bnez; else li tmp; bne rA, tmp, off) | rare |
| 83 | branch_lt_u_imm | DROP | unused |
| 84 | branch_le_u_imm | DROP | unused |
| 85 | branch_ge_u_imm | DROP | unused |
| 86 | branch_gt_u_imm | DROP | unused |
| 87 | branch_lt_s_imm | DROP (fallback: li tmp; blt rA, tmp, off) | rare |
| 88 | branch_le_s_imm | DROP | unused |
| 89 | branch_ge_s_imm | DROP (fallback: li tmp; bge rA, tmp, off) | rare |
| 90 | branch_gt_s_imm | DROP | unused |

Two registers (100–111, jar1)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 100 | move_reg | KEEP (R) | RV addi rd, rs, 0 (= mv). With C, c.mv (2 B) |
| 101 | sbrk | DROP | unused in benches; production heap-grow goes via ecalli host fn |
| 102 | popcount64 | KEEP (R) | RV Zbb cpop |
| 103 | popcount32 | KEEP (R) | RV Zbb cpopw |
| 104 | clz64 | KEEP (R) | RV Zbb clz |
| 105 | clz32 | KEEP (R) | RV Zbb clzw |
| 106 | ctz64 | KEEP (R) | RV Zbb ctz |
| 107 | ctz32 | KEEP (R) | RV Zbb ctzw |
| 108 | sign_extend_8 | KEEP (R) | RV Zbb sext.b |
| 109 | sign_extend_16 | KEEP (R) | RV Zbb sext.h |
| 110 | zero_extend_16 | KEEP (R) | RV Zbb zext.h |
| 111 | reverse_bytes | KEEP (R) | RV Zbb rev8 |

Two reg + one imm (120–161)

Loads / stores indirect (120–130)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 120 | store_ind_u8 | KEEP (R) | RV sb rs, imm(rb) |
| 121 | store_ind_u16 | KEEP (R) | RV sh rs, imm(rb) |
| 122 | store_ind_u32 | KEEP (R) | RV sw rs, imm(rb). RVC c.sw when in range |
| 123 | store_ind_u64 | KEEP (R) | RV sd rs, imm(rb). RVC c.sd when in range |
| 124 | load_ind_u8 | KEEP (R) | RV lbu rd, imm(rb) |
| 125 | load_ind_i8 | KEEP (R) | RV lb rd, imm(rb) |
| 126 | load_ind_u16 | KEEP (R) | RV lhu rd, imm(rb) |
| 127 | load_ind_i16 | KEEP (R) | RV lh rd, imm(rb) |
| 128 | load_ind_u32 | KEEP (R) | RV lwu rd, imm(rb) |
| 129 | load_ind_i32 | KEEP (R) | RV lw rd, imm(rb). RVC c.lw when in range |
| 130 | load_ind_u64 | KEEP (R) | RV ld rd, imm(rb). RVC c.ld when in range |

ALU 32-bit + imm (131–146) + cmov-imm (147–148)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 131 | add_imm_32 | KEEP (R) | RV addiw rd, rs, imm. RVC c.addiw when in range |
| 132 | and_imm | KEEP (R) | RV andi rd, rs, imm. RVC c.andi when in range |
| 133 | xor_imm | KEEP (R) | RV xori rd, rs, imm |
| 134 | or_imm | KEEP (R) | RV ori rd, rs, imm |
| 135 | mul_imm_32 | DROP (fallback: li tmp; mulw rd, rs, tmp) | unused |
| 136 | set_lt_u_imm | KEEP (R) | RV sltiu rd, rs, imm |
| 137 | set_lt_s_imm | DROP | unused. Fallback: slti rd, rs, imm if needed |
| 138 | shlo_l_imm_32 | DROP | unused. Fallback: slliw rd, rs, sh |
| 139 | shlo_r_imm_32 | KEEP (R) | RV srliw rd, rs, sh |
| 140 | shar_r_imm_32 | DROP | unused. Fallback: sraiw rd, rs, sh |
| 141 | neg_add_imm_32 | DROP | unused |
| 142 | set_gt_u_imm | DROP | unused (operand-swap pseudo) |
| 143 | set_gt_s_imm | DROP | unused |
| 144 | shlo_l_imm_alt_32 | DROP | unused (imm-source shift) |
| 145 | shlo_r_imm_alt_32 | DROP | unused |
| 146 | shar_r_imm_alt_32 | DROP | unused |
| 147 | cmov_iz_imm | DROP (fallback: li tmp, imm; czero.nez t1, tmp, rB; czero.eqz t2, rA, rB; or rA, t1, t2) | 472 occurrences; falls back to ~4 RV insns via Zicond (imm is selected when rB = 0, so tmp goes through czero.nez and rA through czero.eqz). To be measured |
| 148 | cmov_nz_imm | DROP | unused |
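The Zicond lowering of cmov_iz_imm can be checked with a small model. This is a sketch under one stated assumption: that cmov_iz_imm means "rA = imm if rB == 0, else rA unchanged". Under that reading, the immediate has to pass through czero.nez and rA through czero.eqz, so that exactly one of the two temporaries is non-zero before the or. The function names are illustrative only.

```python
MASK64 = (1 << 64) - 1


def czero_eqz(rs1: int, rs2: int) -> int:
    """Zicond czero.eqz: result is 0 if rs2 == 0, otherwise rs1."""
    return 0 if rs2 == 0 else rs1


def czero_nez(rs1: int, rs2: int) -> int:
    """Zicond czero.nez: result is 0 if rs2 != 0, otherwise rs1."""
    return 0 if rs2 != 0 else rs1


def cmov_iz_imm_reference(ra: int, rb: int, imm: int) -> int:
    """Assumed PVM semantics: rA = imm if rB == 0, else rA."""
    return imm & MASK64 if rb == 0 else ra


def cmov_iz_imm_lowered(ra: int, rb: int, imm: int) -> int:
    """Four-instruction RV lowering, one Python statement per instruction."""
    tmp = imm & MASK64          # li        tmp, imm
    t1 = czero_nez(tmp, rb)     # czero.nez t1, tmp, rB  -> imm if rB == 0 else 0
    t2 = czero_eqz(ra, rb)      # czero.eqz t2, rA, rB   -> rA  if rB != 0 else 0
    return t1 | t2              # or        rA, t1, t2


if __name__ == "__main__":
    print(cmov_iz_imm_lowered(ra=5, rb=0, imm=7))   # rB is zero: imm selected
    print(cmov_iz_imm_lowered(ra=5, rb=3, imm=7))   # rB non-zero: rA kept
```

Swapping the two czero variants gives the cmov_nz_imm behaviour instead, which is why the operand pairing matters.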

ALU 64-bit + imm (149–161)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 149 | add_imm_64 | KEEP (R) | RV addi rd, rs, imm. RVC c.addi/c.addi16sp/c.li per range |
| 150 | mul_imm_64 | DROP (fallback: li tmp; mul rd, rs, tmp) | rare |
| 151 | shlo_l_imm_64 | KEEP (R) | RV slli. RVC c.slli when in range |
| 152 | shlo_r_imm_64 | KEEP (R) | RV srli. RVC c.srli when in range |
| 153 | shar_r_imm_64 | KEEP (R) | RV srai. RVC c.srai when in range |
| 154 | neg_add_imm_64 | DROP | unused |
| 155 | shlo_l_imm_alt_64 | DROP | unused |
| 156 | shlo_r_imm_alt_64 | DROP | unused |
| 157 | shar_r_imm_alt_64 | DROP | unused |
| 158 | rot_r_64_imm | KEEP (R) | RV Zbb rori rd, rs, sh |
| 159 | rot_r_64_imm_alt | DROP | unused (imm-source rotate) |
| 160 | rot_r_32_imm | DROP | unused |
| 161 | rot_r_32_imm_alt | DROP | unused |

Two reg + offset / branches (170–175)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 170 | branch_eq | KEEP (R, CF) | RV beq rs1, rs2, off. RVC c.beqz when rs2=x0 and in range |
| 171 | branch_ne | KEEP (R, CF) | RV bne. RVC c.bnez when rs2=x0 and in range |
| 172 | branch_lt_u | KEEP (R, CF) | RV bltu |
| 173 | branch_lt_s | KEEP (R, CF) | RV blt |
| 174 | branch_ge_u | KEEP (R, CF) | RV bgeu |
| 175 | branch_ge_s | KEEP (R, CF) | RV bge |

Two reg + two imm (180)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 180 | load_imm_jump_ind | KEEP (R, CF) | the fused “load-imm + indirect-jump-with-link” pattern: native auipc+jalr ra (or a register-indirect jalr ra, rs1, 0). The jalr target is validated against the basic-block-start set at runtime (§8), so fn-pointer / vtable dispatch works without link-time enumeration |

Three registers — 32-bit ALU (190–199)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 190 | add_32 | DROP | unused (rustc emits 64-bit by default on RV64). Fallback: addw rd, rs1, rs2 |
| 191 | sub_32 | DROP | unused. Fallback: subw |
| 192 | mul_32 | DROP | unused. Fallback: mulw |
| 193 | div_u_32 | DROP | unused. Fallback: divuw |
| 194 | div_s_32 | DROP | unused. Fallback: divw |
| 195 | rem_u_32 | DROP | unused. Fallback: remuw |
| 196 | rem_s_32 | DROP | unused. Fallback: remw |
| 197 | shlo_l_32 | DROP | unused. Fallback: sllw |
| 198 | shlo_r_32 | DROP | unused. Fallback: srlw |
| 199 | shar_r_32 | DROP | unused. Fallback: sraw |

Note: dropping these is encoding-level only. The fallback RV-W opcodes (addw, mulw, etc.) are still valid RV instructions in PVM2 (since we include the M extension and base 64-bit ALU); they just aren’t called out as “kept PVM ops” here because no bench guest emits them.
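As a reminder of what the RV-W fallbacks compute, here is a minimal model of addw, subw, and mulw: the operation is done on the low 32 bits and the 32-bit result is sign-extended into the 64-bit register. It assumes registers are held as unsigned 64-bit Python ints; the helper names are illustrative.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sext32_to_u64(x: int) -> int:
    """Sign-extend the low 32 bits of x into an unsigned 64-bit register value."""
    x &= MASK32
    if x & 0x8000_0000:
        return x | (MASK64 ^ MASK32)   # fill bits 63..32 with ones
    return x


def addw(rs1: int, rs2: int) -> int:
    """RV64 addw: 32-bit add, result sign-extended to 64 bits."""
    return sext32_to_u64(rs1 + rs2)


def subw(rs1: int, rs2: int) -> int:
    """RV64 subw: 32-bit subtract, result sign-extended to 64 bits."""
    return sext32_to_u64(rs1 - rs2)


def mulw(rs1: int, rs2: int) -> int:
    """RV64 mulw: low 32 bits of the product, sign-extended to 64 bits."""
    return sext32_to_u64(rs1 * rs2)


if __name__ == "__main__":
    # 0x7FFFFFFF + 1 overflows into the 32-bit sign bit, so the upper half fills with ones.
    print(hex(addw(0x7FFF_FFFF, 1)))
```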

Three registers — 64-bit ALU (200–209)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 200 | add_64 | KEEP (R) | RV add. RVC c.add when rd=rs1 |
| 201 | sub_64 | KEEP (R) | RV sub. RVC c.sub |
| 202 | mul_64 | KEEP (R) | RV mul |
| 203 | div_u_64 | DROP | unused. Fallback: divu |
| 204 | div_s_64 | DROP | unused. Fallback: div |
| 205 | rem_u_64 | DROP | unused. Fallback: remu |
| 206 | rem_s_64 | DROP | unused. Fallback: rem |
| 207 | shlo_l_64 | KEEP (R) | RV sll |
| 208 | shlo_r_64 | KEEP (R) | RV srl |
| 209 | shar_r_64 | DROP | unused. Fallback: sra |

Three registers — bitwise + upper-mul + slt + cmov (210–219)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 210 | and | KEEP (R) | RV and. RVC c.and |
| 211 | xor | KEEP (R) | RV xor. RVC c.xor |
| 212 | or | KEEP (R) | RV or. RVC c.or |
| 213 | mul_upper_s_s | DROP | unused. Fallback: mulh |
| 214 | mul_upper_u_u | KEEP (R) | RV mulhu |
| 215 | mul_upper_s_u | DROP | unused. Fallback: mulhsu |
| 216 | set_lt_u | KEEP (R) | RV sltu |
| 217 | set_lt_s | DROP | unused. Fallback: slt |
| 218 | cmov_iz | DROP (fallback: Zicond czero.* + or sequence) | unused |
| 219 | cmov_nz | DROP | unused |

Three registers — rotations + inverted-bitwise + min/max (220–230)

| op | name | PVM2 status | notes |
|---|---|---|---|
| 220 | rot_l_64 | DROP | unused. Fallback: Zbb rol |
| 221 | rot_l_32 | DROP | unused. Fallback: Zbb rolw |
| 222 | rot_r_64 | DROP | unused. Fallback: Zbb ror |
| 223 | rot_r_32 | DROP | unused. Fallback: Zbb rorw |
| 224 | and_inv | DROP | unused. Fallback: Zbb andn |
| 225 | or_inv | DROP | unused. Fallback: Zbb orn |
| 226 | xnor | DROP | unused. Fallback: Zbb xnor |
| 227 | max_s | DROP | unused. Fallback: Zbb max |
| 228 | max_u | DROP | unused. Fallback: Zbb maxu |
| 229 | min_s | DROP | unused. Fallback: Zbb min |
| 230 | min_u | DROP | unused. Fallback: Zbb minu |

Tally

| status | count |
|---|---|
| KEEP (R) | 55 |
| KEEP (R, CF) | 10 (jump, jump_ind, load_imm_jump, load_imm_jump_ind, 6 reg-reg branches) |
| KEEP (custom-0) | 4 (trap, fallthrough, ecall, ecalli) |
| DROP | 72 |
| total | 141 |

So PVM2 has 69 of PVM’s 141 opcodes in its primary encoding (65 RV + 4 custom-0), counting the statuses row by row in the tables above. The other 72 are either unused (per bench stats) or have a trivial 1-to-N RV-instruction lowering.

The 4 custom-0 opcodes (trap, fallthrough, ecall.jar, ecalli) carry the entire “PVM is not RV” semantic content. Everything else — including all control flow — is standard RV (with PVM2-Base’s divergences applied uniformly).

Architecture preserved

For clarity, all of these stay the same between PVM and PVM2:

  • RV64E’s 15 GPRs, with jar-produced programs using the same 13 hot registers and x3/x4 valid but host-spilled
  • 4 GiB guest memory period, specified as a 2^32-fold alias across the RV64E 64-bit address space
  • Gas metering model (cost-per-instruction, basic-block accounting)
  • Host-call interface (ecalli selector → host function dispatch)
  • Memory mapping format (pinned slots, initial slots, RW regions), plus the code region mapped read-only at CODE_BASE
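One way to read the "2^32-fold alias" wording is that every 64-bit virtual address resolves to its value modulo 2^32, so each guest byte is reachable from 2^64 / 2^32 = 2^32 distinct addresses. The sketch below illustrates that reading only; the authoritative rule is in rv64e-xjar-eei.md, and the name guest_offset is hypothetical.

```python
GUEST_PERIOD = 1 << 32          # 4 GiB guest memory period
ADDRESS_SPACE = 1 << 64         # RV64E virtual address space


def guest_offset(va: int) -> int:
    """Fold a 64-bit virtual address onto the 4 GiB guest period."""
    return va & (GUEST_PERIOD - 1)


# Number of 64-bit addresses that alias to any single guest offset.
ALIASES_PER_BYTE = ADDRESS_SPACE // GUEST_PERIOD

if __name__ == "__main__":
    # Two addresses one period apart land on the same guest byte.
    print(hex(guest_offset(0x0000_0000_0000_1000)))
    print(hex(guest_offset(0x0000_0001_0000_1000)))
```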

Architecture changed: dispatch model

PVM has a single global runtime-validated indirect jump (jump_ind / load_imm_jump_ind); targets are checked against a deblob-time valid_pc bitmap.

PVM2 uses plain RISC-V control flow:

  • jal / auipc+jalr for calls, jalr (c.jr ra) for returns, branches unchanged. ra holds a real return VA (code_base + next_pc), not an opaque handle.
  • auipc, jalr, c.jr, c.jalr are all standard — they are not rejected at deblob.
  • A jalr target is validated against the basic-block-start set at runtime (the recompiler runs untrusted code and never trusts a target table). jal/branch targets are immediates, validated at recompile time against the same set.
  • Indirect dispatch (Rust fn pointers, vtables, switch tables) works natively through jalr — no link-time target enumeration.
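The runtime check on jalr described above can be sketched as an interpreter step. This is an illustration under assumptions, not the engine's code: bb_starts is modelled as a set of virtual addresses, the failure is modelled as a hypothetical GuestTrap exception, and the precise fault behaviour (and any address folding) is left to rv64e-xjar-eei.md §8.

```python
MASK64 = (1 << 64) - 1


class GuestTrap(Exception):
    """Hypothetical fault raised when a jalr target is not a basic-block start."""


def exec_jalr(regs: list[int], rd: int, rs1: int, imm: int,
              pc: int, insn_len: int, bb_starts: set[int]) -> int:
    """Execute jalr rd, rs1, imm and return the new PC.

    insn_len is 4 for jalr and 2 for c.jr / c.jalr.
    """
    # Standard RISC-V: target = rs1 + sext(imm), with bit 0 cleared.
    # Computed before rd is written, since rd may equal rs1.
    target = (regs[rs1] + imm) & MASK64 & ~1
    if target not in bb_starts:
        raise GuestTrap(f"jalr target {target:#x} is not a basic-block start")
    if rd != 0:                               # x0 is hard-wired to zero
        regs[rd] = (pc + insn_len) & MASK64   # link: real return VA
    return target


if __name__ == "__main__":
    bb = {0x10000, 0x10010}
    regs = [0] * 16                 # RV64E register file, x0..x15
    regs[1] = 0x10010               # ra
    print(hex(exec_jalr(regs, rd=0, rs1=1, imm=0, pc=0x10020, insn_len=2, bb_starts=bb)))
```

A recompiler would perform the same membership test in generated code rather than in an interpreter loop, but the check itself is the same.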

The Image format changes: code is re-encoded as RV+C+custom-0 in one or more CodeRegions, referenced by a MappingSource::Code mapping that maps the region read-only at CODE_BASE; the bitmask and the jump table / jump-table-offsets are gone (RV+C is self-describing in length, and bb_starts is derived from code by both engines rather than carried on the wire).
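The reason the bitmask can go is that an RV+C stream carries instruction lengths in the encoding itself: the low two bits of the first halfword are 11 for a 32-bit instruction and anything else for a 16-bit compressed one. A minimal sketch of walking a code region this way follows. It assumes PVM2 has no instructions longer than 32 bits, and it only recovers instruction boundaries; deriving bb_starts from those boundaries follows rules in rv64e-xjar-eei.md that are not reproduced here.

```python
def insn_length(first_halfword: int) -> int:
    """Length in bytes of an RV+C instruction, from its first 16 bits."""
    return 4 if first_halfword & 0b11 == 0b11 else 2


def insn_offsets(code: bytes) -> list[int]:
    """Return the byte offset of every instruction in a code region."""
    offsets = []
    pc = 0
    while pc < len(code):
        if pc + 2 > len(code):
            raise ValueError(f"truncated halfword at offset {pc}")
        half = int.from_bytes(code[pc:pc + 2], "little")
        length = insn_length(half)
        if pc + length > len(code):
            raise ValueError(f"truncated instruction at offset {pc}")
        offsets.append(pc)
        pc += length
    return offsets


if __name__ == "__main__":
    # c.nop (0x0001), addi x0, x0, 0 (0x00000013), c.nop
    code = bytes.fromhex("0100" "13000000" "0100")
    print(insn_offsets(code))
```

Custom-0 words have major opcode 0001011, whose low two bits are 11, so they are walked as ordinary 32-bit instructions.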