Building a Virtual Machine, JVM-inspired — Multithreading (Part 2)
Introduction
In Part 1, we built the foundations of TinyVM with basic instruction parsing and variable management. Now we’ll extend our VM to support multithreading. Just like the JVM, our VM will allow multiple threads to execute code concurrently while maintaining thread safety and proper resource isolation.
The evolution of threading JVMs
It is fascinating how concurrency in the JVM evolved over time.
Green Threads (1997–2000)
The original JVM used green threads, managed entirely by the VM:
Advantages:
- Simple to implement
- Worked on any operating system
- Efficient context switching
Disadvantages:
- Couldn’t utilize multiple CPU cores
- Poor performance with blocking I/O operations
- Limited by a single OS thread
Native Threads (2000–present)
From Java 1.3 onwards, the JVM switched to native threads:
Advantages:
- True parallelism across CPU cores
- Better system integration
- Improved I/O operations
Disadvantages:
- Higher memory overhead
- More expensive context switching
- Limited by OS thread capacity
Project Loom (Java 21+)
Virtual threads introduce a new threading model with some interesting trade-offs:
Advantages:
- Extremely lightweight (millions possible)
- Efficient I/O operations
- Simpler concurrent programming model
Disadvantages:
- Pinning problems with synchronized blocks/methods. Pinning binds the virtual thread to its carrier thread, so that carrier can’t run other virtual threads, which defeats the purpose of having “millions” of virtual threads.
- Not ideal for CPU-heavy tasks
- Higher memory overhead with ThreadLocal
- Native method calls cause pinning (e.g., System.nanoTime())
Our implementation choice
Our TinyVM is going to use native threads through pthreads for several reasons:
- Simplicity — Direct mapping between VM threads and OS threads makes the implementation clearer
- Learning value — Understanding native threading is fundamental to VM design
- Performance — True parallelism without the complexity of virtual thread scheduling
Implementing virtual threads would require sophisticated scheduling and continuation support that would obscure the basic concepts we’re trying to demonstrate. However, understanding our simple native threading implementation provides a foundation for appreciating more advanced approaches like virtual threads.
Implementation goals
The aim of this article is to implement:
- Parallel execution — Multiple tasks can run simultaneously, utilizing modern multi-core processors. Long-running tasks don’t block the entire program.
- Resource sharing — Threads can share the VM’s resources while maintaining isolation where needed.
Prerequisites
- Understanding of Part 1: Foundations implementation
- Basic familiarity with threading concepts (threads, mutexes, race conditions)
Programming language
For simplicity, and to avoid the need for a language grammar and parser, our VM will be able to run code like this:
const char* program[] = {
    // Thread-0 starts when the program starts
    "set x 5",      // Line 0
    "thread 5",     // Line 1 - Start thread at line 5
    "print x",      // Line 2
    "sleep 5000",   // Line 3 - Wait for new thread
    "exit",         // Line 4
    "set y 10",     // Line 5 - Second thread starts here
    "print y",      // Line 6
    "sleep 1000",   // Line 7
    "exit",         // Line 8
    NULL
};
Each thread has its own local variables, so x is only visible in Thread-0 and y in Thread-1.
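To make this isolation concrete, here is a minimal, hypothetical sketch of a per-thread scope. DemoScope, demo_set, and demo_get are illustrative stand-ins invented for this example, not Part 1’s actual LocalScope API; the point is only that a variable set in one scope is invisible from another.

```c
// Hypothetical stand-in for a per-thread scope (not TinyVM's real LocalScope).
#include <string.h>

typedef struct { char name[8]; int value; int used; } DemoVar;
typedef struct { DemoVar vars[16]; } DemoScope;

// Set a variable in this scope, updating it if the name already exists.
static void demo_set(DemoScope* s, const char* name, int value) {
    for (int i = 0; i < 16; i++) {
        if (s->vars[i].used && strcmp(s->vars[i].name, name) == 0) {
            s->vars[i].value = value;
            return;
        }
    }
    for (int i = 0; i < 16; i++) {
        if (!s->vars[i].used) {
            strncpy(s->vars[i].name, name, sizeof s->vars[i].name - 1);
            s->vars[i].name[sizeof s->vars[i].name - 1] = '\0';
            s->vars[i].value = value;
            s->vars[i].used = 1;
            return;
        }
    }
}

// Returns 1 and fills *out if the variable exists in this scope, else 0.
static int demo_get(const DemoScope* s, const char* name, int* out) {
    for (int i = 0; i < 16; i++) {
        if (s->vars[i].used && strcmp(s->vars[i].name, name) == 0) {
            *out = s->vars[i].value;
            return 1;
        }
    }
    return 0;
}
```

Because each ThreadContext owns its own scope, looking up x in Thread-1’s scope simply fails, mirroring what the example program above relies on.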
Thread management design
Thread context
Let’s look at the core structures we’ll add to support multi-threading:
// tiny_vm.h
// ..
// Thread context
typedef struct {
    // Each thread has its own local scope
    LocalScope* local_scope;
    // The program (code) this thread is executing
    const char** program;
    // Program counter (index of the current line in the program above)
    int pc;
    // Native thread handle
    pthread_t thread;
    // Logical thread ID (0, 1, etc.)
    int thread_id;
    // Thread status (to know when the thread is done)
    int is_running;
    // Reference to the VM
    VM* vm;
} ThreadContext;
// ...
Each thread context includes a pthread_t thread handle, a unique identifier that the operating system and our VM use to track and manage the thread. Our VM needs the pthread_t to control the thread’s lifecycle:
// Create a new thread
pthread_create(&thread->thread, NULL, execute_thread_instructions, thread);
// Wait for thread completion
pthread_join(vm->threads[i].thread, NULL);
Thread safety considerations
We are going to protect shared resources (like threads, thread_count, and next_thread_id) using a VM-wide mutex (mutual exclusion) lock:
// tiny_vm.h
// ..
struct VM {
    ThreadContext* threads;
    int thread_count;
    int thread_capacity;
    pthread_mutex_t thread_lock; // <-- mutex
    int next_thread_id;
};
Critical sections will be wrapped in pthread_mutex_lock/unlock calls, which prevents race conditions during thread creation and management.
Without this mutex lock, there could be race conditions when multiple threads try to create new threads simultaneously, potentially corrupting the VM’s thread management data structures or exceeding the thread capacity in an unsafe way.
This design choice of using a single mutex is simple but effective for our needs. In production VMs, finer-grained locking would be used for better performance. We will add more locks in the following articles.
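To see why the lock matters, here is a small standalone sketch, separate from TinyVM (run_mutex_demo and bump are illustrative names): two threads repeatedly increment a shared counter, mirroring how next_thread_id is bumped under thread_lock. With the mutex, the final count is exact; without it, increments can be lost to interleaving.

```c
// Standalone sketch (not TinyVM code): a mutex-protected shared counter.
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

// Each thread increments the shared counter 100000 times.
static void* bump(void* arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);   // enter the critical section
        counter++;                   // only one thread mutates at a time
        pthread_mutex_unlock(&lock); // leave the critical section
    }
    return NULL;
}

// Runs two bumping threads and returns the final counter value.
// With the mutex this is always 200000; without it, updates can be lost.
int run_mutex_demo(void) {
    pthread_t a, b;
    counter = 0;
    pthread_create(&a, NULL, bump, NULL);
    pthread_create(&b, NULL, bump, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```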
Implementation deep dive
Thread creation and initialization
Here’s how we create new threads:
// tiny_vm.c
// ..
ThreadContext* create_thread(VM* vm, const char** program, int start_line) {
    pthread_mutex_lock(&vm->thread_lock);
    if (vm->thread_count >= vm->thread_capacity) {
        pthread_mutex_unlock(&vm->thread_lock);
        return NULL;
    }
    ThreadContext* thread = &vm->threads[vm->thread_count++];
    thread->local_scope = create_local_scope();
    thread->program = program;
    thread->pc = start_line;
    thread->is_running = 1;
    thread->thread_id = vm->next_thread_id++; // Assign and increment thread ID
    thread->vm = vm;                          // Store reference to VM
    pthread_create(&thread->thread, NULL, execute_thread_instructions, thread);
    pthread_mutex_unlock(&vm->thread_lock);
    return thread;
}
The mutex ensures thread-safe access to shared VM structures during thread creation.
The pthread_create function from pthread.h creates a new thread. Let’s explain the four arguments that we pass to it:
- &thread->thread — where to store the new thread’s identifier
- NULL — default thread attributes
- execute_thread_instructions — the function the thread will run (we will define it in the next chapter)
- thread — the argument passed to that function (accessible as void* arg)
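As a standalone illustration of these four arguments, the following sketch passes a struct to a new thread and joins it. DemoArg, demo_entry, and run_create_demo are hypothetical names invented for this example, not part of TinyVM.

```c
// Standalone sketch of pthread_create's four arguments (not TinyVM code).
#include <pthread.h>

// The argument we hand to the thread (pthread_create's 4th argument).
typedef struct { int start_line; int visited; } DemoArg;

// 3rd argument: the function the new thread runs; it receives the
// 4th argument as a void* and must cast it back.
static void* demo_entry(void* arg) {
    DemoArg* a = (DemoArg*)arg;
    a->visited = a->start_line; // record where this thread "started"
    return NULL;
}

// Creates one thread and returns the line it reported visiting, or -1 on error.
int run_create_demo(int start_line) {
    pthread_t handle;                  // 1st argument: where the ID is stored
    DemoArg arg = { start_line, -1 };
    // 2nd argument NULL = default thread attributes
    if (pthread_create(&handle, NULL, demo_entry, &arg) != 0)
        return -1;
    pthread_join(handle, NULL);        // keep `arg` alive until the thread is done
    return arg.visited;
}
```

Note that the creator joins before `arg` goes out of scope; passing a pointer to stack data to a thread that outlives it would be a bug.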
Thread execution loop
Each thread runs its own execution loop, where each iteration of the while loop executes one line of code, until the thread is finished:
// tiny_vm.c
// ..
void* execute_thread_instructions(void* arg) {
    ThreadContext* thread = (ThreadContext*)arg;
    printf("[Thread %d] Thread started\n", thread->thread_id);
    while (thread->is_running) {
        const char* line = thread->program[thread->pc];
        if (line == NULL) break;
        Instruction instr = parse_instruction(line);
        execute_instruction(thread->vm, thread, &instr);
        thread->pc++;
    }
    printf("[Thread %d] Thread finished\n", thread->thread_id);
    return NULL;
}
Now we extend our instruction parser and executor with new instructions to create and exit a thread.
// tiny_vm.c
// ..
Instruction parse_instruction(const char* line) {
    Instruction instr;
    memset(&instr, 0, sizeof(Instruction));
    char cmd[32];
    sscanf(line, "%s", cmd);
    // ...
    else if (strcmp(cmd, "thread") == 0) {
        instr.type = THREAD;
        sscanf(line, "%s %s", cmd, instr.args[0]);
    }
    else if (strcmp(cmd, "exit") == 0) {
        instr.type = EXIT;
    }
    return instr;
}

void execute_instruction(VM* vm, ThreadContext* thread, Instruction* instr) {
    LocalScope* local_scope = thread->local_scope;
    switch (instr->type) {
        // ...
        case THREAD: {
            int start_line = atoi(instr->args[0]);
            create_thread(vm, thread->program, start_line);
            break;
        }
        case EXIT: {
            thread->is_running = 0;
            printf("[Thread %d] Thread exiting\n", thread->thread_id);
            break;
        }
    }
}
This mirrors how the JVM executes bytecode in each thread, though our implementation is simplified.
VM creation and management
The VM initialization sets up the thread management:
// tiny_jvm.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include "tiny_vm.h"
VM* create_vm() {
    VM* vm = malloc(sizeof(VM));
    vm->thread_capacity = 10;
    vm->thread_count = 0;
    vm->threads = malloc(sizeof(ThreadContext) * vm->thread_capacity);
    vm->next_thread_id = 0; // Initialize thread ID counter
    pthread_mutex_init(&vm->thread_lock, NULL);
    return vm;
}

void destroy_vm(VM* vm) {
    for (int i = 0; i < vm->thread_count; i++) {
        if (vm->threads[i].local_scope) {
            destroy_local_scope(vm->threads[i].local_scope);
        }
    }
    free(vm->threads);
    pthread_mutex_destroy(&vm->thread_lock);
    free(vm);
}
Starting the VM
Finally, we implement the VM startup function. It creates the main thread and starts the program from line 0.
// tiny_jvm.c
// ...
void start_vm(VM* vm, const char** program) {
    // Create main thread starting at line 0
    create_thread(vm, program, 0);
    // Wait for all threads to finish
    for (int i = 0; i < vm->thread_count; i++) {
        printf("[Main] Waiting for thread Thread %d\n", vm->threads[i].thread_id);
        pthread_join(vm->threads[i].thread, NULL);
    }
}
We also need to make sure the VM does not exit too soon, so we iterate through all threads and wait for their completion using the pthread_join function.
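To illustrate what pthread_join buys us, here is a small standalone sketch (slow_worker and run_join_demo are illustrative names, not TinyVM code): the join blocks until the worker has returned, so the result the worker wrote is guaranteed to be visible afterwards.

```c
// Standalone sketch (not TinyVM code): pthread_join blocks until the
// target thread finishes, and makes the worker's writes visible.
#include <pthread.h>
#include <unistd.h>

// Simulates a long-running thread that produces a result.
static void* slow_worker(void* arg) {
    int* result = (int*)arg;
    usleep(50 * 1000); // pretend to work for 50 ms
    *result = 42;      // write the result just before finishing
    return NULL;
}

// Returns the worker's result; the join guarantees it is ready.
int run_join_demo(void) {
    pthread_t t;
    int result = 0;
    pthread_create(&t, NULL, slow_worker, &result);
    pthread_join(t, NULL); // blocks here until slow_worker returns
    return result;         // safe to read: join orders the worker's write before this
}
```

Without the join, main could read `result` before the worker wrote it, or even free the stack it points to.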
Example program
Let’s look at a practical example that demonstrates thread creation and interaction:
// main.c
#include <stdio.h>
#include "tiny_vm.h"
int main() {
    // Our "Java" program with threads
    const char* program[] = {
        "set x 5",      // Line 0
        "thread 5",     // Line 1 - Start thread at line 5
        "print x",      // Line 2
        "sleep 5000",   // Line 3 - Sleep till the new thread finishes
        "exit",         // Line 4
        "set y 10",     // Line 5 - Second thread starts here
        "print y",      // Line 6
        "sleep 1000",   // Line 7
        "exit",         // Line 8
        NULL
    };
    printf("Starting TinyVM with threads...\n");
    VM* vm = create_vm();
    start_vm(vm, program);
    destroy_vm(vm);
    printf("TinyVM finished.\n");
    return 0;
}
The program starts two threads. The first thread (Thread-0) is started before the program executes the first line. The second thread (Thread-1) is started at line 5.
Output analysis
When we run our multi-threaded program, the output shows the interleaved execution of both threads.
./build/tiny_vm
Starting TinyVM with threads...
[Main] Waiting for thread Thread 0
[Thread 0] Thread started
[Thread 0] 5
[Thread 1] Thread started
[Thread 1] 10
[Thread 1] Thread exiting
[Thread 1] Thread finished
[Thread 0] Thread exiting
[Thread 0] Thread finished
[Main] Waiting for thread Thread 1
TinyVM finished.
It might be difficult to see what exactly happened from the console logs. Have a look at the following sequence diagram to understand what and when each thread executed:
Next steps
With threading support in place, we’re ready to implement heap memory in Part 3. This will allow threads to share data safely and set the stage for more advanced features like object allocation and garbage collection.
The complete source code for this article is available in the tiny-vm_02_multithreading directory of the TinyVM repository.
- Introduction
- Part 1 — Foundations
- Part 2 — Multithreading (you are here)
- Part 3 — Heap
- Part 4 — Synchronized
- Part 5 — Refactoring
- Part 6 — Functions
- Part 7 — Compilation
- Part 8 — Byte-code execution
- Part 9 — Function call stack (not started)
- Part 10 — Garbage collector (not started)