Building a Virtual Machine, JVM-inspired — Byte-code execution (Part 8)

Ondrej Kvasnovsky
10 min read · Jan 16, 2025


Introduction

In our previous article, we implemented compilation that transforms our source code into bytecode. Now we’ll tackle the next crucial step: executing this bytecode efficiently. This brings us closer to how real virtual machines like the JVM operate, moving away from interpreting text directly to working with compiled bytecode.

Implementation goals

With our compiler (tiny_vm_compile) now producing bytecode files, we need to:

  1. Load bytecode when the VM starts, requiring the bytecode file path as an argument to tiny_vm_run
  2. Parse the loaded bytecode
  3. Execute bytecode operations directly instead of interpreting string instructions
  4. Maintain compatibility with all our existing features (threading, synchronization, etc.)

Let’s see this in action with a simple example. Here’s our source file (hello.tvm):

function greet
set message 42
print message

function main
sync greet
exit

When compiled, this produces the following bytecode:

[Compiler] Bytecode for function 'greet':
0: LOAD_CONST message = 42
1: PRINT message
[Compiler] Bytecode for function 'main':
0: INVOKE_SYNC greet
1: RETURN

Running the VM with this bytecode produces:

[2025-01-02 19:14:42.723389] [VM] Starting TinyVM...

[2025-01-02 19:14:42.723875] [VM] Loaded function 'greet' with 2 instructions
[2025-01-02 19:14:42.723879] [VM] Loaded function 'main' with 2 instructions
[2025-01-02 19:14:42.723882] [VM] Successfully loaded compiled bytecode from: /Users/ondrej/Documents/Projects/c/tiny-vm/tiny-vm_07_compilation/examples/hello.tvmc

[2025-01-02 19:14:42.723910] [Thread 0] Thread instructions started
[2025-01-02 19:14:42.723916] [Thread 0] Sync function 'greet' started
[2025-01-02 19:14:42.723920] [Thread 0] SET message = 42
[2025-01-02 19:14:42.723922] [Thread 0] PRINT message = 42
[2025-01-02 19:14:42.723923] [Thread 0] Thread instructions finished

[2025-01-02 19:14:42.723937] [VM] TinyVM finished.

The implementation

VM entry point

We are going to rename main.c to vm_main.c to avoid ambiguity with the main that contains the compiler code. The main function now takes one argument, which has to be the path to the compiled bytecode, a file ending with .tvmc.

// src/vm_main.c
#include "utils/logger.h"
#include "core/vm.h"
#include "compiler/compiler.h"

int main(const int argc, char* argv[]) {
    if (argc < 2) {
        print("Usage: %s <bytecode_file>", argv[0]);
        return 1;
    }

    print("[VM] Starting TinyVM...");

    CompilationResult* compiled = load_compiled_bytecode(argv[1]);
    if (!compiled) {
        print("[VM] Failed to load bytecode");
        return 1;
    }
    print_compilation_result(compiled);

    // Create and run VM
    VM* vm = create_vm(compiled);
    run_vm(vm);

    // Cleanup
    destroy_vm(vm);

    print("[VM] TinyVM finished.");
    return 0;
}

VM updates

The VM creation process has been simplified to work directly with compiled bytecode:

// src/core/vm.h
#ifndef TINY_VM_CORE_VM_H
#define TINY_VM_CORE_VM_H

#include "../types.h"
#include "../compiler/compiler.h"

// Core VM functions
VM* create_vm(const CompilationResult* compiled);

void run_vm(VM* vm);

void destroy_vm(VM *vm);

#endif

The implementation simply transfers the compiled functions to the VM’s functions array:

// src/core/vm.c
// ... no changes here

VM* create_vm(const CompilationResult* compiled) {
    VM* vm = malloc(sizeof(VM));

    // ... no changes here

    for (int i = 0; i < compiled->function_count; i++) {
        vm->functions[i] = compiled->functions[i]; // Shallow copy
        vm->function_count++;
    }

    return vm;
}

// ... the rest is untouched
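
run_vm itself is unchanged from the previous parts. For orientation, here is a rough sketch of what it conceptually does: it looks up main, starts the first thread, and waits for every thread to finish. This is an illustrative sketch, not the repository implementation:

// Hypothetical sketch of run_vm's responsibilities (not the actual code).
void run_vm_sketch(VM* vm) {
    const Function* entry = find_function(vm, "main");
    if (!entry) {
        print("[VM] Error: Function 'main' not found");
        return;
    }
    create_thread(vm, entry);  // the first thread executes 'main'

    // Join every thread; async calls may add threads while we loop, and
    // re-reading thread_count picks them up.
    for (int i = 0; i < vm->thread_count; i++) {
        pthread_join(vm->threads[i].thread, NULL);
    }
}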

Function management

Function handling has been simplified significantly. We now only need to find functions in the VM’s function table:

// src/function/function.h

#ifndef TINY_VM_FUNCTION_H
#define TINY_VM_FUNCTION_H

#include "../types.h"

// Function management
Function* find_function(const VM* vm, const char* name);

#endif
// src/function/function.c

#include <stdio.h>
#include <string.h>

#include "function.h"

Function* find_function(const VM* vm, const char* name) {
    Function* found = NULL;
    for (int i = 0; i < vm->function_count; i++) {
        if (strcmp(vm->functions[i]->name, name) == 0) {
            found = vm->functions[i];
            break;
        }
    }
    return found;
}

Byte-code loading

The bytecode loader reads our compiled format and reconstructs the function structures:

// src/compiler/bytecode.c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "../utils/logger.h"
#include "compiler.h"

// File operations for saving compiled bytecode
void save_compiled_bytecode(const char* filename, CompilationResult* compiled) {
    // ... implemented in the previous article
}

// File operations for loading compiled bytecode
CompilationResult* load_compiled_bytecode(const char* filename) {
    FILE* file = fopen(filename, "rb");
    if (!file) {
        print("[VM] Error: Could not open file: %s", filename);
        return NULL;
    }

    CompilationResult* result = malloc(sizeof(CompilationResult));

    // Read number of functions
    fread(&result->function_count, sizeof(int), 1, file);
    result->functions = malloc(sizeof(Function*) * result->function_count);

    // Read each function
    for (int i = 0; i < result->function_count; i++) {
        Function* func = malloc(sizeof(Function));
        result->functions[i] = func;

        // Read function name
        int name_len;
        fread(&name_len, sizeof(int), 1, file);
        func->name = malloc(name_len);
        fread(func->name, 1, name_len, file);

        // Read code length
        fread(&func->code_length, sizeof(int), 1, file);

        // Allocate and read bytecode
        func->byte_code = malloc(sizeof(BytecodeInstruction) * func->code_length);

        // Read each instruction
        for (int j = 0; j < func->code_length; j++) {
            BytecodeInstruction* instr = &func->byte_code[j];

            // Read opcode and indexes
            fread(&instr->opcode, sizeof(OpCode), 1, file);
            fread(&instr->var_index, sizeof(uint16_t), 1, file);
            fread(&instr->var_index2, sizeof(uint16_t), 1, file);
            fread(&instr->var_index3, sizeof(uint16_t), 1, file);
            fread(&instr->constant, sizeof(int32_t), 1, file);

            // Read name if present
            int has_name;
            fread(&has_name, sizeof(int), 1, file);
            if (has_name) {
                int instr_name_len;
                fread(&instr_name_len, sizeof(int), 1, file);
                instr->name = malloc(instr_name_len);
                fread(instr->name, 1, instr_name_len, file);
            } else {
                instr->name = NULL;
            }
        }

        // Read constant pool
        fread(&func->constant_pool_size, sizeof(int), 1, file);
        func->constant_pool = malloc(sizeof(char*) * func->constant_pool_size);

        for (int j = 0; j < func->constant_pool_size; j++) {
            int const_len;
            fread(&const_len, sizeof(int), 1, file);
            func->constant_pool[j] = malloc(const_len);
            fread(func->constant_pool[j], 1, const_len, file);
        }

        print("[VM] Loaded function '%s' with %d instructions", func->name, func->code_length);
    }

    fclose(file);
    print("[VM] Successfully loaded compiled bytecode from: %s", filename);
    return result;
}

Notice that each function carries its own constant pool: string constants (such as the variable names referenced by instructions like ADD) are stored there and the instructions refer to them by index, while numeric literals travel in the instruction's constant field.
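
For reference, the loader assumes an instruction and function layout along these lines. This is a simplified sketch reconstructed from the fields used above (field order and naming details are assumptions); the real definitions live in the headers from the previous parts:

// Simplified sketch of the structures the loader fills in (see src/types.h
// and src/compiler/compiler.h from earlier parts for the real definitions).
typedef struct {
    OpCode opcode;       // operation to perform (OP_PRINT, OP_ADD, ...)
    uint16_t var_index;  // indexes into the function's constant pool
    uint16_t var_index2;
    uint16_t var_index3;
    int32_t constant;    // immediate value (e.g. for LOAD_CONST or SLEEP)
    char* name;          // optional symbolic name (variable, function, lock)
} BytecodeInstruction;

typedef struct {
    char* name;                      // function name, e.g. "main"
    BytecodeInstruction* byte_code;  // compiled instructions
    int code_length;                 // number of instructions
    char** constant_pool;            // string constants referenced by index
    int constant_pool_size;
} Function;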

Bytecode execution

The first step is to adjust the execute_bytecode function signature: we no longer need to pass the instructions in, because the bytecode is reachable through the ThreadContext via its current function.

// src/execution/execution.h

#ifndef TINY_VM_EXECUTION_H
#define TINY_VM_EXECUTION_H

#include "../types.h"

void execute_bytecode(ThreadContext* thread);

// synchronous and asynchronous function execution
void sync_function(ThreadContext* caller, const Function* function);
ThreadContext* async_function(VM* vm, const Function* function);

#endif

Now we implement the core of the VM: a switch-based interpreter that dispatches on each instruction's opcode.

// src/execution/execution.c

#include "execution.h"

#include <unistd.h>
#include <pthread.h>

#include "../types.h"
#include "../thread/thread.h"
#include "../function/function.h"
#include "../utils/logger.h"
#include "../memory/memory.h"
#include "../synchronization/synchronization.h"

void execute_bytecode(ThreadContext* thread) {
    const BytecodeInstruction* instr = &thread->current_function->byte_code[thread->pc];

    switch (instr->opcode) {
        case OP_NOP:
            break;

        case OP_PRINT: {
            const char* var_name = instr->name;
            const jint value = get_value(thread, var_name);
            print("[Thread %d] PRINT %s = %d", thread->thread_id, var_name, value);
            break;
        }

        case OP_LOAD_CONST: {
            Variable* var = get_variable(thread, instr->name);
            if (var) {
                var->value = instr->constant;
                print("[Thread %d] SET %s = %d", thread->thread_id, var->name, var->value);
            }
            break;
        }

        case OP_ADD: {
            // Operand names are resolved through the function's constant pool
            const char* target = thread->current_function->constant_pool[instr->var_index];
            const char* op1 = thread->current_function->constant_pool[instr->var_index2];
            const char* op2 = thread->current_function->constant_pool[instr->var_index3];

            const jint val1 = get_value(thread, op1);
            const jint val2 = get_value(thread, op2);
            Variable* target_var = get_variable(thread, target);

            if (target_var) {
                target_var->value = val1 + val2;
                print("[Thread %d] ADD %s = %s(%d) + %s(%d) = %d",
                      thread->thread_id, target, op1, val1, op2, val2, target_var->value);
            }
            break;
        }

        case OP_SLEEP: {
            usleep(instr->constant * 1000);
            break;
        }

        case OP_SETSHARED: {
            Variable* var = get_shared_variable(thread, instr->name);
            if (var) {
                var->value = instr->constant;
                print("[Thread %d] SETSHARED %s = %d", thread->thread_id, var->name, var->value);
            }
            break;
        }

        case OP_MONITOR_ENTER: {
            SynchronizationLock* mutex = get_sync_lock(thread->vm, instr->name);
            if (mutex) {
                print("[Thread %d] Waiting for lock '%s'", thread->thread_id, mutex->name);
                pthread_mutex_lock(&mutex->mutex);
                mutex->locked = 1;
                print("[Thread %d] Acquired lock '%s'", thread->thread_id, mutex->name);
            }
            break;
        }

        case OP_MONITOR_EXIT: {
            SynchronizationLock* mutex = get_sync_lock(thread->vm, instr->name);
            if (mutex && mutex->locked) {
                pthread_mutex_unlock(&mutex->mutex);
                mutex->locked = 0;
                print("[Thread %d] Released lock '%s'", thread->thread_id, mutex->name);
            }
            break;
        }

        case OP_INVOKE_SYNC: {
            const Function* function = find_function(thread->vm, instr->name);
            if (function) {
                sync_function(thread, function);
            } else {
                print("[Thread %d] Error: Function '%s' not found", thread->thread_id, instr->name);
            }
            break;
        }

        case OP_INVOKE_ASYNC: {
            const Function* function = find_function(thread->vm, instr->name);
            if (function) {
                async_function(thread->vm, function);
            } else {
                print("[Thread %d] Error: Function '%s' not found", thread->thread_id, instr->name);
            }
            break;
        }

        case OP_RETURN: {
            // This should only end the current function on the stack,
            // not kill the thread; we will need a function call stack soon!
            thread->is_running = 0;
            break;
        }

        default:
            print("[Thread %d] Error: Unknown opcode: %d", thread->thread_id, instr->opcode);
    }
}

ThreadContext* async_function(VM* vm, const Function* function) {
    return create_thread(vm, function);
}

Synchronous function calls

The execution of synchronous functions reveals a current limitation in our design:

// src/execution/execution.c

void sync_function(ThreadContext* caller, const Function* function) {
    // Save caller's context (because we don't have a function call stack)
    const Function* original_function = caller->current_function;
    const int original_pc = caller->pc;

    // Set up function context
    caller->current_function = function;
    caller->pc = 0;

    print("[Thread %d] Sync function '%s' started", caller->thread_id, function->name);

    // Execute function instructions
    while (caller->is_running && caller->pc < function->code_length) {
        execute_bytecode(caller);
        caller->pc++;
    }

    // Restore caller's context
    caller->current_function = original_function;
    caller->pc = original_pc;
}

This approach of saving and restoring context is a temporary solution. A proper function call stack would allow us to handle nested function calls more elegantly.
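
As a preview, a call stack could replace this save/restore with explicit frames. A minimal sketch, with assumed names (this is not code from this part):

// Hypothetical call-stack frame; field and type names are assumptions.
typedef struct {
    const Function* function;  // function executing in this frame
    int return_pc;             // instruction to resume in the caller
} CallFrame;

// sync_function would push a CallFrame before entering the callee and pop it
// when the callee returns, so calls can nest to any depth and OP_RETURN can
// end just the current frame instead of stopping the whole thread.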

Threading support

The TinyVM implementation focuses on core virtual machine mechanics rather than complex language features. Our language model is intentionally minimal, supporting only functions as the primary unit of execution. This simplification allows us to concentrate on fundamental VM concepts like bytecode execution, threading, and synchronization without getting bogged down in language grammar complexities.

The thread management functions have been updated to work with bytecode execution:

// src/thread/thread.h
#ifndef TINY_VM_THREAD_H
#define TINY_VM_THREAD_H

#include "../types.h"

ThreadContext* create_thread(VM* vm, const Function* function);

void* execute_thread_instructions(void* arg);

#endif

The key change in our thread implementation is how we handle execution. Instead of processing text instructions line by line, each thread now executes a complete function with its associated bytecode.

// src/thread/thread.c
#include "thread.h"
#include "../core/vm.h"
#include "../execution/execution.h"
#include "../utils/logger.h"
#include "../memory/memory.h"

ThreadContext* create_thread(VM* vm, const Function* function) {
    pthread_mutex_lock(&vm->thread_mgmt_lock);

    if (vm->thread_count >= vm->thread_capacity) {
        pthread_mutex_unlock(&vm->thread_mgmt_lock);
        return NULL;
    }

    ThreadContext* thread = &vm->threads[vm->thread_count++];
    thread->local_scope = create_local_scope();
    thread->current_function = function;

    thread->pc = 0;
    thread->is_running = 1;
    thread->thread_id = vm->next_thread_id++; // Assign and increment thread ID
    thread->vm = vm;

    pthread_create(&thread->thread, NULL, execute_thread_instructions, thread);

    pthread_mutex_unlock(&vm->thread_mgmt_lock);
    return thread;
}

void* execute_thread_instructions(void* arg) {
    ThreadContext* thread = (ThreadContext*) arg;
    print("[Thread %d] Thread instructions started", thread->thread_id);

    while (thread->is_running && thread->pc < thread->current_function->code_length) {
        execute_bytecode(thread);
        thread->pc++;
    }

    print("[Thread %d] Thread instructions finished", thread->thread_id);
    return NULL;
}

Testing the new VM

Let’s test our VM with a more complex example that uses threads and synchronization:

function createCounter
setshared counter 1000
print counter

function incrementCounter
lock counter_lock
set increment 10
add counter increment counter
print counter
unlock counter_lock
exit

function decrementCounter
lock counter_lock
set decrement -10
add counter decrement counter
print counter
unlock counter_lock
exit

function main
sync createCounter
async incrementCounter
async decrementCounter
exit

The output demonstrates that our bytecode execution correctly handles threading, synchronization, and shared memory:

[2025-01-02 20:57:21.297375] [VM] Starting TinyVM...
[2025-01-02 20:57:21.297850] [VM] Loaded function 'createCounter' with 2 instructions
[2025-01-02 20:57:21.297856] [VM] Loaded function 'incrementCounter' with 6 instructions
[2025-01-02 20:57:21.297860] [VM] Loaded function 'decrementCounter' with 6 instructions
[2025-01-02 20:57:21.297864] [VM] Loaded function 'main' with 4 instructions
[2025-01-02 20:57:21.297868] [VM] Successfully loaded compiled bytecode from: /Users/ondrej/Documents/Projects/c/tiny-vm/tiny-vm_07_compilation/examples/counter.tvmc

[2025-01-02 20:57:21.297913] [Thread 0] Thread instructions started
[2025-01-02 20:57:21.297917] [Thread 0] Sync function 'createCounter' started
[2025-01-02 20:57:21.297921] [Thread 0] Created shared variable counter
[2025-01-02 20:57:21.297923] [Thread 0] SETSHARED counter = 1000
[2025-01-02 20:57:21.297924] [Thread 0] Found shared variable counter
[2025-01-02 20:57:21.297925] [Thread 0] PRINT counter = 1000

[2025-01-02 20:57:21.297939] [Thread 1] Thread instructions started
[2025-01-02 20:57:21.297941] [Thread 1] Waiting for lock 'counter_lock'
[2025-01-02 20:57:21.297942] [Thread 1] Acquired lock 'counter_lock'
[2025-01-02 20:57:21.297945] [Thread 1] SET increment = 10
[2025-01-02 20:57:21.297942] [Thread 0] Thread instructions finished
[2025-01-02 20:57:21.297956] [Thread 1] Found shared variable counter
[2025-01-02 20:57:21.297961] [Thread 1] ADD counter = increment(10) + counter(1000) = 1010
[2025-01-02 20:57:21.297965] [Thread 1] PRINT counter = 1010
[2025-01-02 20:57:21.297971] [Thread 1] Released lock 'counter_lock'
[2025-01-02 20:57:21.297973] [Thread 1] Thread instructions finished

[2025-01-02 20:57:21.297944] [Thread 2] Thread instructions started
[2025-01-02 20:57:21.297978] [Thread 2] Waiting for lock 'counter_lock'
[2025-01-02 20:57:21.297979] [Thread 2] Acquired lock 'counter_lock'
[2025-01-02 20:57:21.297981] [Thread 2] SET decrement = -10
[2025-01-02 20:57:21.297982] [Thread 2] Found shared variable counter
[2025-01-02 20:57:21.297984] [Thread 2] ADD counter = decrement(-10) + counter(1000) = 990
[2025-01-02 20:57:21.297985] [Thread 2] PRINT counter = 990
[2025-01-02 20:57:21.297986] [Thread 2] Released lock 'counter_lock'
[2025-01-02 20:57:21.297988] [Thread 2] Thread instructions finished

[2025-01-02 20:57:21.297998] [VM] TinyVM finished.

Bytecode execution differences

Our bytecode execution model is also much simpler than the JVM's execution engine. Let's look at the main differences to learn more about how the JVM executes bytecode:

Execution engine

TinyVM:

  • Direct bytecode interpretation

JVM — Multiple execution modes:

  • Interpreted mode for initial execution
  • Just-In-Time (JIT) compilation for hot methods
  • Adaptive optimization based on runtime profiling
  • Support for different JIT compilers (C1/C2 in HotSpot)

Memory management

TinyVM:

  • Allocates global variables directly in the heap
  • Basic heap allocation without garbage collection

JVM: Sophisticated memory management:

  • Generational garbage collection
  • Multiple GC algorithms (Serial, Parallel, G1, ZGC)
  • Object allocation optimization
  • Escape analysis for stack allocation

Thread management

TinyVM:

  • Basic thread creation and synchronization

JVM — Advanced threading features:

  • Thread pooling and management
  • Biased locking
  • Lock coarsening and elimination
  • Integration with OS thread scheduling

Function call handling

TinyVM:

  • Simple function calls without a proper call stack

JVM: Complete method invocation system:

  • Method resolution and dispatch
  • Virtual method table (vtable), sketched in C after this list
  • Interface method tables
  • Dynamic dispatch optimization
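
To make the vtable idea concrete, here is a tiny conceptual sketch of dispatch through a table of function pointers in C. It is only an analogy with assumed names, not how HotSpot implements its method tables:

// Conceptual analogy of virtual dispatch: each "class" carries a table of
// function pointers, and a call goes through a table slot, so the target
// is chosen by the object's runtime type.
typedef struct Animal Animal;

typedef struct {
    void (*speak)(Animal* self);  // one vtable slot
} AnimalVTable;

struct Animal {
    const AnimalVTable* vtable;   // every object points at its class's table
};

static void dog_speak(Animal* self) { (void)self; /* would print "woof" */ }
static const AnimalVTable DOG_VTABLE = { dog_speak };

// A "virtual call": look up the slot in the receiver's table and invoke it.
static void speak(Animal* a) { a->vtable->speak(a); }

// Usage: Animal dog = { &DOG_VTABLE }; speak(&dog);  // dispatches to dog_speak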

Error handling

TinyVM:

  • Minimal error reporting

JVM: Comprehensive exception handling:

  • Stack trace generation
  • Exception tables
  • Try-catch-finally blocks
  • Stack unwinding

Performance features

TinyVM:

  • No runtime optimization

JVM — Extensive runtime optimization:

  • Method inlining and loop optimizations
  • On-stack replacement (OSR) of hot code
  • Speculative optimization with deoptimization when assumptions fail
  • Intrinsics for common operations

Native integration

TinyVM:

  • No native code integration

JVM: Complete native interface (JNI):

  • Native method resolution
  • Data marshalling
  • Native library loading
  • Security checks

The complete source code for this article is available in the tiny-vm_07_compilation directory of the TinyVM repository.

The next steps

Our current implementation has one significant limitation: the lack of a proper function call stack. This makes it difficult to handle nested function calls elegantly and limits our ability to implement more advanced features. In the next article, we’ll implement a proper function call stack to address these limitations.
