vm16bit

Python Virtual Machine Assembly CPU Architecture

What It Is

vm16bit is a fully custom 16-bit virtual CPU — instruction set, assembler, and runtime — written in Python. You give it an .asm file in my custom assembly language, the assembler compiles it to bytecode, and the VM executes it. The whole thing runs in about 1,500 lines of Python.

Why I Built It

I wanted to understand how CPUs actually work: fetch-decode-execute, register files, status flags, stack frames, calling conventions. Reading about these concepts is one thing; implementing them forces you to make every decision yourself. The long-term goal is to target this VM with a simple C compiler.

Architecture Decisions

I went with a Harvard architecture: a separate 32 KB ROM for program code and a 32 KB RAM for data. This keeps the design clean and prevents programs from accidentally overwriting their own code.
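
A minimal sketch of how this split might look in Python. It is not the project's actual code: the class name, method names, and the choice to wrap out-of-range addresses are all assumptions made for illustration. The point is that the fetch path and the data path never touch the same array, so no store can reach program code.

```python
ROM_SIZE = 32 * 1024  # 32 KB of program code, per the description above
RAM_SIZE = 32 * 1024  # 32 KB of data


class Memory:
    """Hypothetical Harvard-style memory: code and data in separate arrays."""

    def __init__(self, program: bytes):
        if len(program) > ROM_SIZE:
            raise ValueError("program does not fit in ROM")
        # ROM is an immutable bytes object; the unused tail is zero-padded.
        self._rom = bytes(program) + bytes(ROM_SIZE - len(program))
        self._ram = bytearray(RAM_SIZE)

    def fetch(self, addr: int) -> int:
        """Read one byte of code. Only the fetch stage would use this."""
        return self._rom[addr % ROM_SIZE]  # wrapping is an assumption

    def load(self, addr: int) -> int:
        """Read one byte of data."""
        return self._ram[addr % RAM_SIZE]

    def store(self, addr: int, value: int) -> None:
        """Write one byte of data. ROM has no equivalent method."""
        self._ram[addr % RAM_SIZE] = value & 0xFF
```

With this layout, address 0 means two different things depending on which method is called: `fetch(0)` returns the first program byte, while `load(0)` returns the first data byte.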

The VM is accumulator-based with 8 registers: four general-purpose (A, B, C, D), a stack pointer, program counter, frame pointer, and a flags register carrying zero, carry, overflow, and negative bits. Instructions are variable-length (1–4 bytes), which made decoding more interesting than a fixed-width format.
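
To make the four flag bits concrete, here is a sketch of how a 16-bit ADD could derive them. The `Flags` dataclass and `add16` function are hypothetical names, not taken from the VM's source, and the flag definitions follow the common convention (carry for unsigned wraparound, overflow for signed wraparound), which I am assuming matches the VM.

```python
from dataclasses import dataclass

MASK16 = 0xFFFF
SIGN16 = 0x8000


@dataclass
class Flags:
    zero: bool = False
    carry: bool = False
    overflow: bool = False
    negative: bool = False


def add16(a: int, b: int) -> tuple[int, Flags]:
    """Add two 16-bit values and derive the four status flags."""
    full = (a & MASK16) + (b & MASK16)
    result = full & MASK16
    flags = Flags(
        zero=result == 0,
        # Carry: the unsigned sum did not fit in 16 bits.
        carry=full > MASK16,
        # Overflow: both operands share a sign that the result does not.
        overflow=bool(~(a ^ b) & (a ^ result) & SIGN16),
        # Negative: bit 15 of the result is set.
        negative=bool(result & SIGN16),
    )
    return result, flags
```

For example, `add16(0x7FFF, 1)` yields `0x8000` with overflow and negative set and carry clear, while `add16(0xFFFF, 1)` yields `0` with zero and carry set and overflow clear. The two cases show why carry and overflow need to be separate bits.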

The instruction set has 38 opcodes across five categories: data movement (MOV, PUSH, POP, LEA), arithmetic (ADD, SUB, MUL, DIV, INC, DEC, NEG), logic/shifts (AND, OR, XOR, NOT, SHL, SHR, SAR), control flow (JMP, JZ, JNZ, CALL, RET, CMP), and I/O (IN, OUT). I modeled the calling convention after x86: a downward-growing stack, a frame pointer for stack frames, and arguments pushed by the caller from right to left.
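
The calling convention is easier to follow as a step-by-step walk-through. The sketch below simulates one call in plain Python rather than in the VM's assembly. Everything in it is an assumption for illustration: the `Stack` class, little-endian 16-bit words, SP starting one past the top of RAM, and caller-side argument cleanup (as in x86 cdecl).

```python
RAM_SIZE = 32 * 1024


class Stack:
    """Hypothetical downward-growing stack of 16-bit words in RAM."""

    def __init__(self):
        self.ram = bytearray(RAM_SIZE)
        self.sp = RAM_SIZE  # assumed: SP starts one past the top of RAM
        self.fp = RAM_SIZE

    def push(self, value: int) -> None:
        self.sp -= 2  # grow downward first, then write
        self.ram[self.sp:self.sp + 2] = (value & 0xFFFF).to_bytes(2, "little")

    def pop(self) -> int:
        value = self.word_at(self.sp)
        self.sp += 2
        return value

    def word_at(self, addr: int) -> int:
        return int.from_bytes(self.ram[addr:addr + 2], "little")


def call_sub(stack: Stack, a: int, b: int, return_addr: int) -> int:
    """Walk through what a call to sub(a, b) would do to the stack."""
    # Caller: push arguments right to left, so `a` ends up nearest the frame.
    stack.push(b)
    stack.push(a)
    stack.push(return_addr)  # what CALL would push

    # Callee prologue: save the old frame pointer and start a new frame.
    stack.push(stack.fp)
    stack.fp = stack.sp

    # Callee body: arguments sit at fixed positive offsets from FP.
    first = stack.word_at(stack.fp + 4)   # past saved FP and return address
    second = stack.word_at(stack.fp + 6)
    result = (first - second) & 0xFFFF

    # Callee epilogue: drop the frame, restore FP, consume the return address.
    stack.sp = stack.fp
    stack.fp = stack.pop()
    stack.pop()  # what RET would pop into the program counter

    # Caller: discard the two arguments it pushed.
    stack.sp += 4
    return result
```

Calling `call_sub(Stack(), 10, 3, 0x1234)` returns 7 and leaves SP and FP where they started. Pushing right to left is what puts the first argument at the smallest offset from FP.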

The Assembler

The assembler is a two-pass design: first pass resolves labels and addresses, second pass emits bytecode. It supports directives like .ORG, .DB, .DW, and .DS for laying out data, plus both line and block comments. The output is a binary that the VM loads directly into ROM.
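
The two-pass idea can be shown with a toy assembler. This is a sketch under stated assumptions, not the real assembler: the three mnemonics' encodings, the `label:` syntax, the `;` line-comment character, and little-endian addresses are all invented here, and directives and block comments are left out.

```python
# Hypothetical encodings, chosen only for this sketch.
OPCODES = {"NOP": 0x00, "JMP": 0x10, "HALT": 0xFF}
SIZES = {"NOP": 1, "JMP": 3, "HALT": 1}  # JMP = opcode + 16-bit address


def parse(source: str) -> list:
    """Split source into (label, tokens) pairs, dropping ';' comments."""
    lines = []
    for raw in source.splitlines():
        text = raw.split(";", 1)[0].strip()
        if not text:
            continue
        label = None
        if ":" in text:
            label, text = (part.strip() for part in text.split(":", 1))
        lines.append((label, text.split()))
    return lines


def assemble(source: str) -> bytes:
    lines = parse(source)

    # Pass 1: track the current address and record where each label lands.
    symbols = {}
    addr = 0
    for label, tokens in lines:
        if label is not None:
            symbols[label] = addr
        if tokens:
            addr += SIZES[tokens[0].upper()]

    # Pass 2: every label is known now, so forward references are lookups.
    out = bytearray()
    for _, tokens in lines:
        if not tokens:
            continue
        mnemonic = tokens[0].upper()
        out.append(OPCODES[mnemonic])
        if mnemonic == "JMP":
            out += symbols[tokens[1]].to_bytes(2, "little")
    return bytes(out)
```

Assembling the three lines `start: JMP end`, `NOP`, `end: HALT` produces the bytes `10 04 00 00 FF`: the jump to `end` is encoded correctly even though `end` is defined after it, because pass 1 already placed it at address 4.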

Sample Programs

I wrote several programs to validate the VM: fibonacci sequence generation, factorial computation, bubble sort, string routines, math routines, and a memory dump utility. Each one exercises different parts of the instruction set — the bubble sort, for example, tests indirect addressing, comparison, and conditional branching together.

What I Learned

Designing an ISA involves constant tradeoffs: instruction width vs. expressiveness, number of addressing modes vs. decoder complexity, opcode space allocation. I also gained a much better intuition for how high-level constructs like function calls and local variables map down to stack manipulation and pointer arithmetic. Building the assembler taught me how label resolution works in practice: a forward reference cannot be encoded until the label's address is known, so you need either a second pass, as I used, or a single pass that backpatches the address later.

What's Next

The roadmap includes building a simple C compiler frontend that targets this VM, adding a step-through debugger with register/memory inspection, and implementing memory-mapped I/O for simulated peripherals.