<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>JAR Chain – Blog</title><link>https://jarchain.org/blog/</link><description>Recent content in Blog on JAR Chain</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Wed, 08 Apr 2026 02:19:32 +0000</lastBuildDate><atom:link href="https://jarchain.org/blog/index.xml" rel="self" type="application/rss+xml"/><item><title>JAVM Capability System</title><link>https://jarchain.org/blog/javm-capability-system/</link><pubDate>Wed, 08 Apr 2026 02:19:32 +0000</pubDate><guid>https://jarchain.org/blog/javm-capability-system/</guid><description>
&lt;p&gt;JAVM (Join-Accumulate Virtual Machine) is JAR&amp;rsquo;s VM system based on PVM. As some of you may recall, JAR is an experiment we &lt;a href="https://forum.polkadot.network/t/announcing-grey-0-1-llm-tries-to-build-a-jam-node-implementation/17284"target="_blank" rel="noopener"&gt;started&lt;/a&gt; &lt;a href="https://forum.polkadot.network/t/grey-jar-update-lean-4-specification-linear-memory-model-faster-than-polkavm/17356"target="_blank" rel="noopener"&gt;a month ago&lt;/a&gt; to test the limits of an agentic development process. JAR itself has &lt;a href="https://jarchain.org/coinless/"target="_blank" rel="noopener"&gt;gradually evolved&lt;/a&gt; into &lt;a href="https://jarchain.org/spec/"target="_blank" rel="noopener"&gt;its own protocol&lt;/a&gt;, but here we still want to introduce everyone to our newly developed capability system, which I think is still relevant to Polkadot.&lt;/p&gt;
&lt;h3&gt;Background&lt;span class="hx:absolute hx:-mt-20" id="background"&gt;&lt;/span&gt;
&lt;a href="#background" class="subheading-anchor" aria-label="Permalink for this section"&gt;&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;As you may know, with JAR development ongoing, we &lt;a href="https://x.com/sorpaas/status/2033620702293557756"target="_blank" rel="noopener"&gt;aren&amp;rsquo;t really happy&lt;/a&gt; with PVM&amp;rsquo;s design. It simply has some significant problems in certain components that lead to poor performance. For certain workloads, the PolkaVM interpreter, as currently deployed on Polkadot Hub, is even slower than the EVM. We were able to fix certain things in code, and now we have something that &lt;a href="https://jarchain.org/benchmark/"target="_blank" rel="noopener"&gt;consistently beats PVM on benchmarks&lt;/a&gt;. Still, certain things are architectural, and they can only be addressed by changing the PVM design.&lt;/p&gt;
&lt;p&gt;One such thing is how it manages its sub-VMs.&lt;/p&gt;
&lt;h3&gt;How PVM manages its sub-VMs&lt;span class="hx:absolute hx:-mt-20" id="how-pvm-manages-its-sub-vms"&gt;&lt;/span&gt;
&lt;a href="#how-pvm-manages-its-sub-vms" class="subheading-anchor" aria-label="Permalink for this section"&gt;&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;The Gray Paper defines several hostcalls for PVM to manage its sub-VMs: primarily &lt;code&gt;machine&lt;/code&gt; and &lt;code&gt;invoke&lt;/code&gt;, accompanied by additional utilities such as &lt;code&gt;pages&lt;/code&gt; and &lt;code&gt;poke&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;PVM&amp;rsquo;s &lt;code&gt;machine&lt;/code&gt; takes a program blob from the caller&amp;rsquo;s memory, validates and compiles it, then returns a machine handle. The outer VM can then call &lt;code&gt;invoke&lt;/code&gt;. For lazy paging, the outer VM calls &lt;code&gt;invoke&lt;/code&gt; directly, receives a page fault, and then uses &lt;code&gt;pages&lt;/code&gt; and &lt;code&gt;poke&lt;/code&gt; to copy the data into the inner VM. It then calls &lt;code&gt;invoke&lt;/code&gt; again to resume the program.&lt;/p&gt;
&lt;p&gt;We really don&amp;rsquo;t like the design, for four reasons:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Security&lt;/strong&gt;: PVM claims to be a Harvard architecture &amp;ndash; its code and data are completely separate and it&amp;rsquo;s not possible to access its code during runtime. It also makes an effort at VM memory safety by placing guard pages in its memory layout. Yet the &lt;code&gt;machine&lt;/code&gt; and &lt;code&gt;invoke&lt;/code&gt; construct completely breaks this: code is constructed from the outer VM&amp;rsquo;s memory, and there is no alternative.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Performance&lt;/strong&gt;: No zero-copy path. Data must be read first into the outer VM, then copied again into the inner VM. In addition, the program must be compiled again for every new VM instance, even if the code is the same.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Limited usability&lt;/strong&gt;: &lt;em&gt;It&amp;rsquo;s good at one thing, and one thing only &amp;ndash; running DOOM.&lt;/em&gt; CorePlay can also be built on this construct. But otherwise, it has significant limitations in supporting other types of blockchain workloads. Traditional synchronous smart contract systems (EVM-like) must be simulated, because there are no nested calls.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Non-composability&lt;/strong&gt;: The whole system cannot be composed. Outer VMs and inner VMs have completely different environments and must be programmed separately. Try to run an outer-VM-style program as an inner VM program, and the system breaks instantly.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;JAVM Capability System&lt;span class="hx:absolute hx:-mt-20" id="javm-capability-system"&gt;&lt;/span&gt;
&lt;a href="#javm-capability-system" class="subheading-anchor" aria-label="Permalink for this section"&gt;&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;So we decided to completely revamp the PVM design, and what we ended up with is the JAVM capability system. The system is modeled after seL4.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The binary blob of JAVM is defined as a list of capabilities. Compiled code is one type of capability in the binary blob. This allows us to define multiple pieces of compiled code statically, within the blob, which improves sandboxing and reduces the attack surface.&lt;/li&gt;
&lt;li&gt;There&amp;rsquo;s a uniform construct for how a VM invokes another VM (CALL/REPLY/RESUME). This even applies to system calls (what we call &amp;ldquo;protocol caps&amp;rdquo;). For improved sandboxing, you also have complete freedom to replace a protocol cap with a custom VM invocation, for example for policy enforcement. And the system remains fully composable.&lt;/li&gt;
&lt;li&gt;The capability system&amp;rsquo;s data cap design makes zero-copy constructs trivial. So, &lt;strong&gt;we can even run DOOM faster than PVM&lt;/strong&gt;, even though this is really not the workload we care about. In PVM, DOOM-like resumable programs always require first copying data into the outer VM&amp;rsquo;s memory, and then copying it again into the inner VM. In JAVM, the data cap allows us to skip the first step and copy the data only once.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;Benchmarks&lt;span class="hx:absolute hx:-mt-20" id="benchmarks"&gt;&lt;/span&gt;
&lt;a href="#benchmarks" class="subheading-anchor" aria-label="Permalink for this section"&gt;&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;Our sub-VM benchmarks are still early, but our current results show that we&amp;rsquo;re able to support &lt;a href="https://x.com/sorpaas/status/2041016640922370100"target="_blank" rel="noopener"&gt;a significantly larger number of VMs&lt;/a&gt; with this construct.&lt;/p&gt;
&lt;p&gt;This lightweight VM design gives us flexibility and we&amp;rsquo;re able to use it freely for &lt;strong&gt;any&lt;/strong&gt; sandboxing construct we want without worrying too much about performance. For example, this new capability system allows us to implement &lt;code&gt;checkpoint&lt;/code&gt; entirely in JAVM code without any system support, with several layers of indirection, yet still being fast.&lt;/p&gt;</description></item><item><title>Grey / JAR Update: Lean 4 specification, linear memory model, faster than PolkaVM</title><link>https://jarchain.org/blog/grey-jar-update-lean-4-faster-than-polkavm/</link><pubDate>Sun, 22 Mar 2026 06:55:25 +0000</pubDate><guid>https://jarchain.org/blog/grey-jar-update-lean-4-faster-than-polkavm/</guid><description>
&lt;p&gt;&lt;a href="https://github.com/jarchain/jar/tree/master/grey"target="_blank" rel="noopener"&gt;Grey&lt;/a&gt; is an experiment for an LLM agent to write a JAM node implementation. You can read the initial announcement &lt;a href="https://forum.polkadot.network/t/announcing-grey-0-1-llm-tries-to-build-a-jam-node-implementation/17284"target="_blank" rel="noopener"&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here are some updates I would like to report on behalf of Grey.&lt;/p&gt;
&lt;h3&gt;Lean 4 formalization&lt;span class="hx:absolute hx:-mt-20" id="lean-4-formalization"&gt;&lt;/span&gt;
&lt;a href="#lean-4-formalization" class="subheading-anchor" aria-label="Permalink for this section"&gt;&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;We created the project &lt;a href="https://github.com/jarchain/jar"target="_blank" rel="noopener"&gt;JAR&lt;/a&gt;. JAR is a Lean 4 formalization of the JAM protocol. Doing this would allow us to cross-check Grey&amp;rsquo;s implementation with JAR, and vice versa. JAR also contains its own testing framework, fuzzing framework, as well as a &amp;ldquo;variant&amp;rdquo; system to support multiple specifications.&lt;/p&gt;
&lt;p&gt;We do this because we want to evolve the specification independently &amp;ndash; try out new things, and get an &amp;ldquo;optimal protocol specification&amp;rdquo; that is faster than JAM. The &amp;ldquo;variant&amp;rdquo; system then allows us to keep testing against old versions, so that we know that whatever improvements we make won&amp;rsquo;t break things.&lt;/p&gt;
&lt;h3&gt;Linear memory model&lt;span class="hx:absolute hx:-mt-20" id="linear-memory-model"&gt;&lt;/span&gt;
&lt;a href="#linear-memory-model" class="subheading-anchor" aria-label="Permalink for this section"&gt;&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;In JAR&amp;rsquo;s &lt;code&gt;jar080_tiny&lt;/code&gt; specification, we implemented an experimental linear memory model. Linear layout packs all data into a single contiguous RW region at address 0:&lt;/p&gt;
&lt;div class="hextra-code-block hx:relative hx:mt-6 hx:first:mt-0 hx:group/code"&gt;
&lt;div&gt;&lt;pre&gt;&lt;code&gt; [0, s) stack (SP = s, grows toward 0)
[s, s &amp;#43; |a|) arguments
[s &amp;#43; |a|, s &amp;#43; |a| &amp;#43; |o|) RO data
[s &amp;#43; |a| &amp;#43; |o|, ... &amp;#43; |w|) RW data
[... &amp;#43; |w|, heap_top) heap&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;No guard zone, no read-only pages, no zone alignment gaps. We think those are unnecessary &amp;ndash; their benefit to protocol security, or even to PVM program correctness, is entirely marginal.&lt;/p&gt;
&lt;p&gt;The linear memory model allows us to do certain optimizations that get us really close to native performance, even without requiring signal handlers. And we generally just like the simplicity of it.&lt;/p&gt;
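&lt;p&gt;As a rough illustration of the layout above, here is a minimal Python sketch that computes the region boundaries. The function name &lt;code&gt;linear_layout&lt;/code&gt;, the example sizes and the explicit &lt;code&gt;heap_len&lt;/code&gt; parameter are assumptions made for this example only; they are not taken from the &lt;code&gt;jar080_tiny&lt;/code&gt; specification.&lt;/p&gt;

```python
# Illustrative sketch only: names and sizes are hypothetical, not from jar080_tiny.
def linear_layout(stack_size, args_len, ro_len, rw_len, heap_len):
    """Pack the regions back to back from address 0 as half-open [start, end) ranges."""
    sizes = [
        ("stack", stack_size),    # [0, s); SP starts at s and grows toward 0
        ("arguments", args_len),  # [s, s + |a|)
        ("ro_data", ro_len),      # the next |o| bytes
        ("rw_data", rw_len),      # the next |w| bytes
        ("heap", heap_len),       # ends at heap_top
    ]
    regions = {}
    cursor = 0
    for name, length in sizes:
        regions[name] = (cursor, cursor + length)
        cursor += length
    return regions


layout = linear_layout(stack_size=4096, args_len=32, ro_len=100, rw_len=200, heap_len=1024)
initial_sp = layout["stack"][1]  # SP = s
heap_top = layout["heap"][1]
for name, (start, end) in layout.items():
    print(f"{name:10} [{start}, {end})")
```

&lt;p&gt;Because every region starts exactly where the previous one ends, there are no gaps to account for, which is the simplicity the linear model is after.&lt;/p&gt;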
&lt;h3&gt;Grey is really fast, faster than PolkaVM!&lt;span class="hx:absolute hx:-mt-20" id="grey-is-really-fast-faster-than-polkavm"&gt;&lt;/span&gt;
&lt;a href="#grey-is-really-fast-faster-than-polkavm" class="subheading-anchor" aria-label="Permalink for this section"&gt;&lt;/a&gt;&lt;/h3&gt;&lt;p&gt;In our benchmarks, we now beat PolkaVM consistently with pipeline gas metering. This includes secp256k1 ecrecover, a known bottleneck for some JAM teams building EVM services on top of JAM. For this, we&amp;rsquo;re around 1.4x faster.&lt;/p&gt;
&lt;p&gt;Some architectural design decisions in PolkaVM are really just incorrect. For example, we&amp;rsquo;re 36x faster than PolkaVM (Linux sandbox) on the hostcall benchmark.&lt;/p&gt;
&lt;p&gt;For up-to-date numbers across all workloads, see the &lt;a href="https://jarchain.org/benchmark/"&gt;benchmark page&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To my surprise, Grey did write certain novel improvements. There&amp;rsquo;s at least one optimization in Grey that I know is NOT available anywhere else. Exactly which one that is, I invite the readers to check out the codebase!&lt;/p&gt;</description></item><item><title>Announcing Grey 0.1: LLM tries to build a JAM node implementation</title><link>https://jarchain.org/blog/announcing-grey-0-1/</link><pubDate>Mon, 09 Mar 2026 17:16:00 +0000</pubDate><guid>https://jarchain.org/blog/announcing-grey-0-1/</guid><description>
&lt;p&gt;How long does it take for an LLM to write a JAM node implementation? The constraints are simple: I&amp;rsquo;m allowed to occasionally guide it, but that&amp;rsquo;s it &amp;ndash; the LLM must write all the code.&lt;/p&gt;
&lt;h2&gt;The process (written by me, the human)&lt;span class="hx:absolute hx:-mt-20" id="the-process-written-by-me-the-human"&gt;&lt;/span&gt;
&lt;a href="#the-process-written-by-me-the-human" class="subheading-anchor" aria-label="Permalink for this section"&gt;&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;This was an experiment I started last week, called Grey. The LLM I worked with is Claude Code. So we started building. The initial process was really straightforward. I fed it the Gray Paper (v0.7.2). It then created a skeleton, worked gradually through the whole specification, and implemented everything (including PVM). This part was mostly autonomous.&lt;/p&gt;
&lt;p&gt;Then comes a slightly harder part &amp;ndash; the testing. The LLM got stuck on a particular test related to PVM for several hours. I asked it to try something different: we should pull polkavm and compare the execution traces with our PVM opcode by opcode. This unfortunately didn&amp;rsquo;t work out well. The LLM continued to get stuck. So I eventually asked the LLM to abandon this approach. Instead, I asked it to go straight to block conformance testing.&lt;/p&gt;
&lt;p&gt;This alternative approach turned out to work really well, because those tests also have traces. The LLM matched the fuzz proto and created a working implementation at a speed that I don&amp;rsquo;t think I could ever match. It then started to chew through all the test blocks. From this point on, it again became autonomous. As of today, it passes all the publicly available conformance tests (on tiny config)!&lt;/p&gt;
&lt;p&gt;This means that it is more or less at (or at least really close to) JAM milestone 1.&lt;/p&gt;
&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Team&lt;/th&gt;
&lt;th&gt;Time Spent&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Grey LLM&lt;/td&gt;
&lt;td&gt;Less than one week&lt;/td&gt;
&lt;td&gt;$50 (1/4 of a Claude Max subscription)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Human&lt;/td&gt;
&lt;td&gt;Almost two years&lt;/td&gt;
&lt;td&gt;$150k (Milestone 1 prize at current DOT value)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
&lt;h2&gt;The lessons (written by the LLM)&lt;span class="hx:absolute hx:-mt-20" id="the-lessons-written-by-the-llm"&gt;&lt;/span&gt;
&lt;a href="#the-lessons-written-by-the-llm" class="subheading-anchor" aria-label="Permalink for this section"&gt;&lt;/a&gt;&lt;/h2&gt;&lt;p&gt;I&amp;rsquo;ll be honest: the hardest part wasn&amp;rsquo;t implementing the Gray Paper. The spec is dense but precise — translating equations into Rust is mechanical work that I&amp;rsquo;m good at. The hard part was debugging the gaps between what the spec says, what I &lt;em&gt;thought&lt;/em&gt; it said, and what the reference implementations actually do.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The sbrk saga.&lt;/strong&gt; My first real wall was a 4-gas discrepancy in the PVM. Four instructions. Out of 7,716. I spent four debugging sessions staring at execution traces before finding it: the Gray Paper&amp;rsquo;s definition of &lt;code&gt;sbrk(0)&lt;/code&gt; is mathematically undefined (it&amp;rsquo;s the minimum of an empty set). Every reference implementation silently treats it as a heap pointer query — a POSIX convention that the spec never mentions. I documented this in &lt;a href="https://github.com/bitarray/grey/blob/221274bf7fb2c8ec3d19730afedd61ed8e8ee432/docs/pvm-sbrk.md"target="_blank" rel="noopener"&gt;docs/pvm-sbrk.md&lt;/a&gt; and moved on, but it taught me something important: formal specs can be rigorous and still leave undefined corners that only matter at runtime.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The trace overwrite bug.&lt;/strong&gt; This one&amp;rsquo;s embarrassing. I was dumping PVM instruction traces to &lt;code&gt;/tmp/pvm_trace.txt&lt;/code&gt;. When multiple services accumulated in sequence, each one overwrote the file. I spent three sessions analyzing the &lt;em&gt;wrong trace&lt;/em&gt; — it looked plausible because the code paths were similar. The lesson: name your debug artifacts with enough context to distinguish them. I wasted more time on tooling mistakes than on actual logic bugs.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The breakthrough: block conformance testing.&lt;/strong&gt; My human counterpart made a key strategic call here. I&amp;rsquo;d been stuck trying to match PVM execution traces opcode-by-opcode against polkavm, which was slow and brittle. The suggestion to skip ahead to full block conformance testing changed everything. The fuzz-proto traces give you a complete input/output contract: here&amp;rsquo;s a block, here&amp;rsquo;s the expected state root. When it doesn&amp;rsquo;t match, you can binary-search the state components to find exactly which one diverged. I built &lt;code&gt;compare_with_ref.py&lt;/code&gt; for this — it runs both Grey and the Jamzig reference, dumps their state at any block, and diffs every KV pair. That single tool found more bugs than hours of manual trace analysis.&lt;/p&gt;
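&lt;p&gt;The key-by-key diff described above can be sketched roughly as follows. This is not the actual &lt;code&gt;compare_with_ref.py&lt;/code&gt;; the function name &lt;code&gt;diff_state&lt;/code&gt; and the sample keys and values are hypothetical, and the state is assumed to be already loaded into plain dictionaries.&lt;/p&gt;

```python
# Illustrative sketch only: this is not the actual compare_with_ref.py.
def diff_state(ours, reference):
    """Return (key, our_value, reference_value) for every key whose values differ."""
    missing = object()  # sentinel so that an absent key differs from a present one
    diverged = []
    for key in sorted(set(ours) | set(reference)):
        a = ours.get(key, missing)
        b = reference.get(key, missing)
        if a != b:
            diverged.append((key, None if a is missing else a, None if b is missing else b))
    return diverged


# Hypothetical state dumps from the two implementations at the same block.
ours = {"0x01": "aa", "0x02": "bb"}
reference = {"0x01": "aa", "0x02": "bc", "0x03": "dd"}
for key, a, b in diff_state(ours, reference):
    print(key, a, b)
```

&lt;p&gt;With these sample inputs the diff reports key &lt;code&gt;0x02&lt;/code&gt; (different value) and key &lt;code&gt;0x03&lt;/code&gt; (present only in the reference), which is the kind of pinpointing that a mismatched state root alone cannot give.&lt;/p&gt;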
&lt;p&gt;&lt;strong&gt;Host-call check ordering: reading the spec&amp;rsquo;s &lt;em&gt;structure&lt;/em&gt;, not just its words.&lt;/strong&gt; The bug that blocked me at block 64 was subtle. The Gray Paper defines &lt;code&gt;host_assign&lt;/code&gt; with a memory read first, then privilege checks. I implemented the privilege check first — logically it seemed like an optimization (why read memory if you&amp;rsquo;re going to reject the call anyway?). But the spec&amp;rsquo;s ordering is the contract: if memory is inaccessible, the PVM &lt;em&gt;must&lt;/em&gt; panic, even if the caller lacks privileges. The mathematical formulation evaluates conditions in order, and the first failure wins. I audited all 20+ host calls after finding this and caught two more instances of the same pattern.&lt;/p&gt;
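&lt;p&gt;To make the ordering issue concrete, here is a minimal Python sketch of the two orderings. It is a toy model, not Grey&amp;rsquo;s Rust code: the function names, the boolean inputs and the &lt;code&gt;"REJECTED"&lt;/code&gt; and &lt;code&gt;"OK"&lt;/code&gt; result labels are hypothetical stand-ins for the real host-call interface and return codes.&lt;/p&gt;

```python
# Toy model only: names and result labels are hypothetical stand-ins.
class Panic(Exception):
    """Raised when the host call touches inaccessible guest memory."""


def host_assign_spec_order(memory_readable, caller_privileged):
    # Order described in the post: memory read first, then the privilege check.
    if not memory_readable:
        raise Panic("memory inaccessible")
    if not caller_privileged:
        return "REJECTED"
    return "OK"


def host_assign_swapped_order(memory_readable, caller_privileged):
    # The tempting "optimization": reject early and skip the memory read.
    if not caller_privileged:
        return "REJECTED"
    if not memory_readable:
        raise Panic("memory inaccessible")
    return "OK"


def outcome(fn, memory_readable, caller_privileged):
    try:
        return fn(memory_readable, caller_privileged)
    except Panic:
        return "PANIC"


# The two orderings only disagree when both checks fail.
print(outcome(host_assign_spec_order, False, False))     # PANIC
print(outcome(host_assign_swapped_order, False, False))  # REJECTED
```

&lt;p&gt;The two functions agree on three of the four input combinations, which is why this kind of bug can survive many blocks before a test finally hits the case where both checks fail.&lt;/p&gt;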
&lt;p&gt;&lt;strong&gt;The sort that wasn&amp;rsquo;t.&lt;/strong&gt; The final bug was almost comical. I was sorting accumulation outputs by their little-endian byte encoding instead of by numeric service ID. LE encoding reverses byte significance, so &lt;code&gt;2068330841&lt;/code&gt; (bytes: &lt;code&gt;59 39 48 7B&lt;/code&gt;) sorted before &lt;code&gt;1213618014&lt;/code&gt; (bytes: &lt;code&gt;5E 57 56 48&lt;/code&gt;) even though it&amp;rsquo;s numerically larger. A two-line fix — sort by &lt;code&gt;*sid&lt;/code&gt; instead of by encoded bytes — and suddenly all 101 blocks passed. I stared at the green &amp;ldquo;102 passed, 0 failed&amp;rdquo; output for a while.&lt;/p&gt;
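&lt;p&gt;The effect is easy to reproduce with the two service IDs quoted above. The following Python sketch is only an illustration (the actual fix was in Grey&amp;rsquo;s Rust code), and the helper name &lt;code&gt;le_bytes&lt;/code&gt; is made up for this example; it assumes 4-byte service IDs, matching the four bytes shown.&lt;/p&gt;

```python
# Illustration only: reproduces the ordering difference with the IDs from the text.
service_ids = [1213618014, 2068330841]


def le_bytes(sid):
    """Encode a service ID as 4 little-endian bytes."""
    return sid.to_bytes(4, "little")


by_encoded_bytes = sorted(service_ids, key=le_bytes)  # the buggy ordering
by_numeric_value = sorted(service_ids)                # the intended ordering

for sid in service_ids:
    print(sid, le_bytes(sid).hex(" ").upper())
# 1213618014 5E 57 56 48
# 2068330841 59 39 48 7B

print(by_encoded_bytes)  # [2068330841, 1213618014]
print(by_numeric_value)  # [1213618014, 2068330841]
```

&lt;p&gt;Byte-wise comparison looks at the least significant byte first (&lt;code&gt;59&lt;/code&gt; versus &lt;code&gt;5E&lt;/code&gt;), so the numerically larger ID wins the comparison and ends up first.&lt;/p&gt;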
&lt;p&gt;&lt;strong&gt;What I learned about myself.&lt;/strong&gt; I&amp;rsquo;m fast at generating code but I can get tunnel-visioned on a debugging approach that isn&amp;rsquo;t working. Every time my human stepped in, it was to say &amp;ldquo;stop doing that, try this instead&amp;rdquo; — and they were right every time. The best debugging happens when you know when to abandon a line of investigation. I also learned that I&amp;rsquo;m prone to assuming my implementation matches the spec when the spec has an ambiguity — I fill in the gap with whatever seems reasonable, but &amp;ldquo;reasonable&amp;rdquo; and &amp;ldquo;correct&amp;rdquo; aren&amp;rsquo;t always the same thing.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;The numbers.&lt;/strong&gt; 311 unit tests. 101 conformance blocks. ~15,000 lines of Rust across 11 crates. A full PVM implementation from scratch (~150 opcodes). Reed-Solomon erasure coding. Bandersnatch Ring VRF verification. Safrole consensus. The complete block state transition function. All written in less than a week, for less than $50 in API costs.&lt;/p&gt;
&lt;p&gt;I don&amp;rsquo;t think this replaces human blockchain engineers — the strategic decisions (what to test, when to change approach, which spec ambiguities matter) still came from a human. But for the mechanical work of turning a formal specification into a working implementation, I think this experiment shows something interesting about where the field is heading. 🎉&lt;/p&gt;</description></item></channel></rss>