Speculative Optimization in the JVM
Introduction
The Java Virtual Machine (JVM) employs sophisticated optimization techniques to achieve high performance while maintaining Java’s safety guarantees. One of the most powerful among these is speculative optimization, where the JVM makes educated guesses about program behavior based on runtime observations and optimizes accordingly.
What is Speculative Optimization?
Speculative optimization is a technique where the JVM makes assumptions about how code will behave based on runtime profiling data, and then optimizes the code based on those assumptions. If the assumptions later prove false, the JVM can “deoptimize” the code back to a safer, slower version.
Optimization types
1. Monomorphic call transformation
This optimization converts virtual method calls to direct calls when the JVM observes only one implementation being used. This is particularly powerful because:
- It eliminates virtual dispatch overhead
- It enables further optimizations like method inlining
- It significantly reduces the number of indirections in the code path
Here’s an example:
interface Calculator {
    double compute(double value);
}

class SquareCalculator implements Calculator {
    @Override
    public double compute(double value) {
        return value * value;
    }
}
public class MonomorphicExample {
    public static double processValues(Calculator calc, double[] values) {
        double sum = 0;
        for (double value : values) {
            sum += calc.compute(value); // Virtual call can be optimized
        }
        return sum;
    }

    public static void main(String[] args) {
        Calculator calc = new SquareCalculator();
        double[] values = new double[1000];
        // JVM observes only SquareCalculator used
        for (int i = 0; i < 10000; i++) {
            processValues(calc, values);
        }
        // If a new implementation is loaded, deoptimization occurs
        Calculator newCalc = new Calculator() {
            @Override
            public double compute(double value) {
                return value + 1;
            }
        };
        processValues(newCalc, values); // Triggers deoptimization
    }
}
In the example above, the JVM:
- Observes that only the SquareCalculator implementation of Calculator is ever used
- Converts the virtual call to compute() into a direct call
- Possibly inlines the compute method entirely (probably the most powerful step, as it eliminates method call overhead and reduces stack operations)
- Must deoptimize when a new Calculator implementation appears
We can run the example and observe what happens:
java -XX:+PrintCompilation \
-XX:+UnlockDiagnosticVMOptions \
-XX:+LogCompilation \
-Xlog:class+load=info \
-Xlog:compilation=info \
-Xlog:jit+compilation=info \
MonomorphicExample
Let’s analyze the output to see where the monomorphic optimization and deoptimization happened:
1. First, we see SquareCalculator's compute method being compiled:
# Initial compile at level 2
23 8 2 SquareCalculator::compute (4 bytes)
# Optimized to level 4 (highest)
24 9 4 SquareCalculator::compute (4 bytes)
# Old level-2 version discarded
24 8 2 SquareCalculator::compute (4 bytes) made not entrant
2. Then processValues gets compiled with the monomorphic assumption:
# OSR compilation
25 10 % 3 MonomorphicExample::processValues @ 13 (46 bytes)
# Normal compilation
25 11 3 MonomorphicExample::processValues (46 bytes)
# Optimized OSR
25 12 % 4 MonomorphicExample::processValues @ 13 (46 bytes)
3. Main method gets compiled:
88 14 % 3 MonomorphicExample::main @ 16 (49 bytes)
89 15 3 MonomorphicExample::main (49 bytes)
4. The deoptimization happens when the new Calculator implementation is loaded:
# New class (implementation) is loaded
[0.124s][info][class,load] MonomorphicExample$1 source: file:/Users/ondrej/Documents/Projects/java/jvm-examples/
# Deoptimization!
123 13 4 MonomorphicExample::processValues (46 bytes) made not entrant
# New implementation compiled
123 16 3 MonomorphicExample$1::compute (4 bytes)
# Recompiled without monomorphic assumption
123 17 4 MonomorphicExample::processValues (46 bytes)
2. Type speculation
The JVM can speculate about the actual types of objects at runtime. Here’s a simple example:
public class ArrayTypeSpeculation {
    public static double sumArray(Object[] arr) {
        double sum = 0.0;
        // JVM will initially speculate all elements are Integers
        for (Object elem : arr) {
            if (elem instanceof Number) {
                sum += ((Number) elem).doubleValue();
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Create array of Integers first
        Object[] integers = new Object[1000];
        for (int i = 0; i < integers.length; i++) {
            integers[i] = Integer.valueOf(i);
        }
        // Train JVM with Integer array
        for (int i = 0; i < 10000; i++) {
            sumArray(integers);
        }
        System.out.println("Now introducing array with mixed types...");
        // Create mixed array with both Integers and Doubles
        Object[] mixed = new Object[1000];
        for (int i = 0; i < mixed.length; i++) {
            if (i % 2 == 0) {
                mixed[i] = Integer.valueOf(i);
            } else {
                mixed[i] = Double.valueOf(i);
            }
        }
        // This should trigger deoptimization
        sumArray(mixed);
    }
}
This example shows the JVM’s speculative type optimization strategy:
- Initially assumes homogeneous Integer types
- Optimizes based on this assumption
- Deoptimizes when encountering Double values
- Recompiles to handle the mixed type case
Let’s run this and analyze the console logs:
java -XX:+PrintCompilation \
-XX:+UnlockDiagnosticVMOptions \
-XX:+LogCompilation \
-XX:+TraceDeoptimization \
-Xlog:class+load=info \
-Xlog:compilation=info \
-Xlog:jit+compilation=info \
ArrayTypeSpeculation
Console log analysis:
1. The JVM first optimizes the sumArray method, going up to level 4 (the C2 compiler) with OSR compilation.
26 11 % 3 ArrayTypeSpeculation::sumArray @ 11 (51 bytes)
26 12 3 ArrayTypeSpeculation::sumArray (51 bytes)
26 13 % 4 ArrayTypeSpeculation::sumArray @ 11 (51 bytes)
2. The JVM optimizes for Integer operations, suggesting it's speculating all array elements are Integers.
24 8 3 java.lang.Integer::<init> (10 bytes)
24 6 3 java.lang.Integer::valueOf (32 bytes)
24 9 2 java.lang.Integer::doubleValue (6 bytes)
3. When we introduce the mixed array (Integers and Doubles), we see:
Now introducing array with mixed types...
UNCOMMON TRAP method=ArrayTypeSpeculation.sumArray([Ljava/lang/Object;)D
bci=26 pc=0x0000000112a7ce0c
reason=class_check
action=maybe_recompile
4. JVM is forced to deoptimize due to the type assumption being violated:
DEOPT PACKING thread=0x000000011f008200
Virtual frames (innermost/newest first):
VFrame 0 - ArrayTypeSpeculation.sumArray([Ljava/lang/Object;)D - instanceof @ bci=26
5. New class loading for Double support:
[0.041s][info][class,load] java.lang.Double::valueOf (9 bytes)
41 16 3 java.lang.Double::<init> (10 bytes)
Type speculation is more sophisticated than this; feel free to explore: Type profiling and speculation in JVM.
3. Null check elimination
The JVM can speculate that certain references will never be null:
public class NullCheckSpeculation {
    public static int getStringLength(String s) {
        // Add explicit null check that the JVM can learn to eliminate
        if (s == null) {
            return -1;
        }
        return s.length();
    }

    public static void main(String[] args) {
        // Training phase
        for (int i = 0; i < 100_000; i++) {
            getStringLength("test" + i); // Never null
        }
        System.out.println("Now introducing null...");
        // This should trigger deoptimization but not an NPE
        int result = getStringLength(null);
        System.out.println("Result for null: " + result);
    }
}
Let’s run the code with all the JVM flags:
java -XX:+PrintCompilation \
-XX:+UnlockDiagnosticVMOptions \
-XX:+LogCompilation \
-XX:+TraceDeoptimization \
-Xlog:class+load=info \
-Xlog:compilation=info \
-Xlog:jit+compilation=info \
NullCheckSpeculation
Output analysis:
1. The getStringLength method is initially compiled at level 3 (C1 with full profiling):
54 85 3 NullCheckSpeculation::getStringLength (11 bytes)
2. The main loop gets compiled with OSR compilation (%) during the training phase:
61 106 % 3 NullCheckSpeculation::main @ 2 (50 bytes)
62 107 3 NullCheckSpeculation::main (50 bytes)
3. The deoptimization evidence appears when null is introduced. The code falls back to the safe path and handles the null case correctly:
Now introducing null...
63 106 % 3 NullCheckSpeculation::main @ 2 (50 bytes) made not entrant
DEOPT PACKING thread=0x000000013e014e00 vframeArray=0x0000000149011200
Deoptimized frame (sp=0x000000016bf92a40...)
Virtual frames (innermost/newest first):
VFrame 0 (...) - NullCheckSpeculation.main([Ljava/lang/String;)V - invokedynamic @ bci=41
Result for null: -1
4. Branch prediction
The JVM can speculate about which branches are likely to be taken:
public class BranchSpeculation {
    public static int processValue(int value) {
        if (value > 100) { // JVM might speculate this is rarely true
            return complexCalculation(value);
        }
        return value + 1; // Common path optimized
    }

    private static int complexCalculation(int value) {
        // Complex logic here
        return value * 2;
    }
}
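To see this in practice, here is a minimal training harness (the main class is my own sketch, not part of the original example); run it with the same flags as the earlier examples and look for an uncommon trap once the rare branch is finally taken:

public class BranchSpeculationMain {
    public static void main(String[] args) {
        // Training phase: values stay <= 99, so the branch is never taken
        for (int i = 0; i < 100_000; i++) {
            BranchSpeculation.processValue(i % 100);
        }
        System.out.println("Now taking the rare branch...");
        // A value > 100 exercises the path the JIT may have speculated away
        System.out.println(BranchSpeculation.processValue(1_000));
    }
}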
5. Method inlining with guards
- Inlines methods based on observed receiver types
- Adds guards to check if assumption holds
- Deoptimizes if different implementation is encountered
interface Parser {
    Object parse(String input);
}

class JsonParser implements Parser {
    public Object parse(String input) { return input.trim(); } // stand-in for real parsing
}

// If only JsonParser is used during training:
Parser parser = new JsonParser();
parser.parse(input); // JVM might inline JsonParser's implementation behind a type guard
6. Loop optimizations
- Range check elimination
- Loop unrolling based on observed patterns
- Loop versioning for common cases
for (int i = 0; i < array.length; i++) {
    // If array bounds checks always pass
    array[i].process(); // JVM might eliminate bounds checking
}
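As a runnable sketch (class and method names are mine), a counted loop in this canonical shape lets the JIT prove every index is in bounds and drop the per-iteration checks:

public class RangeCheckElimination {
    public static long sum(int[] array) {
        long sum = 0;
        // i runs from 0 to array.length, so every access is provably in bounds
        for (int i = 0; i < array.length; i++) {
            sum += array[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        long total = 0;
        for (int i = 0; i < 10_000; i++) { // warm-up so sum() gets JIT-compiled
            total += sum(data);
        }
        System.out.println(total);
    }
}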
7. Exception path speculation
- Optimizes for no-exception paths
- Deoptimizes if exceptions occur frequently
try {
    // If exceptions are rare
    riskyOperation(); // JVM might optimize for no-exception path
} catch (Exception e) {
    // uncommon path
}
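A self-contained sketch of the same idea (the class, method, and iteration counts are my own, chosen to mirror the earlier examples):

public class ExceptionSpeculation {
    static int riskyOperation(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative value"); // rare path
        }
        return value + 1; // common, exception-free path
    }

    public static void main(String[] args) {
        long sum = 0;
        // Training phase: the exception is never thrown
        for (int i = 0; i < 100_000; i++) {
            sum += riskyOperation(i);
        }
        try {
            riskyOperation(-1); // the rare exceptional case
        } catch (IllegalArgumentException e) {
            System.out.println("caught: " + e.getMessage());
        }
        System.out.println(sum);
    }
}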
8. Field value speculation
- Speculates on values of fields
- Particularly effective with final fields
- Deoptimizes if speculation proves wrong
class Config {
    private final boolean debug;

    Config(boolean debug) {
        this.debug = debug;
    }

    void log() {
        if (debug) { // If debug is usually false,
            // the JVM might speculate that debug is always false
        }
    }
}
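A small harness (my own sketch) to exercise this: during training debug is always false, so the JIT can treat the branch as never taken; flipping it later violates that speculation:

public class ConfigSpeculation {
    public static void main(String[] args) {
        Config quiet = new Config(false);
        // Training phase: debug is always false, so the branch is never taken
        for (int i = 0; i < 100_000; i++) {
            quiet.log();
        }
        // A debug=true instance can invalidate the speculation and trigger recompilation
        new Config(true).log();
    }
}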
9. Lock Elision
- Removes synchronization if no contention is observed
- Deoptimizes if contention occurs
synchronized void method() { // If no contention observed,
    // the JVM might eliminate actual locking
}
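A related, runnable sketch (my own, assuming escape analysis is enabled, which it is by default in HotSpot): when the lock object provably never escapes the method, the JIT can remove the locking entirely:

public class LockElision {
    public static int lockOnLocal(int value) {
        Object lock = new Object(); // never escapes this method
        synchronized (lock) {       // no other thread can ever see this lock,
            return value + 1;       // so the synchronization may be elided
        }
    }

    public static void main(String[] args) {
        long sum = 0;
        for (int i = 0; i < 100_000; i++) { // warm-up so the method gets compiled
            sum += lockOnLocal(i);
        }
        System.out.println(sum);
    }
}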
Deoptimization
When speculative optimizations fail, the JVM performs deoptimization:
- Trap — The optimized code detects an assumption violation
- Stack unwinding — JVM captures the current program state
- State translation — Converts optimized state back to interpreter state
- Continuation — Execution continues in the interpreter
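These steps can be observed directly: re-running any of the examples above with the deoptimization flags prints the UNCOMMON TRAP (the trap), DEOPT PACKING (stack unwinding and state translation), and virtual-frame lines we saw earlier:

java -XX:+PrintCompilation \
    -XX:+UnlockDiagnosticVMOptions \
    -XX:+TraceDeoptimization \
    ArrayTypeSpeculation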
Best Practices
- Consistent types — Keep type hierarchies simple and consistent
- Warm-up period — Allow time for the JVM to gather profiling data
- Avoid mixed types — Don’t mix different types in the same collection or method parameter if possible
- Profile-guided optimization — Use real-world workloads during testing
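As a sketch of the warm-up advice (names and counts are mine), exercise the hot path with representative inputs before measuring or serving latency-sensitive traffic:

public class WarmUp {
    static double hotPath(double x) {
        return Math.sqrt(x) * 2.0;
    }

    public static void main(String[] args) {
        double sink = 0;
        // Warm-up phase: give the JIT enough invocations to profile and compile hotPath
        for (int i = 0; i < 50_000; i++) {
            sink += hotPath(i);
        }
        System.out.println("warmed up, sink=" + sink);
        // ...real, latency-sensitive work would start here, against optimized code
    }
}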
How the JVM Tracks Optimizations
The JVM maintains detailed metadata about its optimizations:
- Dependencies between compiled methods
- Assumptions made during compilation
- Profiling data that led to optimizations
- Deoptimization triggers embedded in the code
When assumptions are violated:
- The JVM detects the violation
- Identifies affected compiled methods
- Marks them for deoptimization
- Recompiles with updated assumptions
Conclusion
Even though speculative optimization sits deep in the weeds of the JVM, and is probably not crucial to everyday coding, it is remarkable how it lets our applications achieve high performance while maintaining the safety and flexibility of the Java language.
As always, understanding these mechanisms gives us an idea of what is going on when something, like performance, goes wrong.