Speculative Optimization in the JVM
Introduction
The Java Virtual Machine (JVM) employs sophisticated optimization techniques to achieve high performance while maintaining Java’s safety guarantees. One of the most powerful among these is speculative optimization, where the JVM makes educated guesses about program behavior based on runtime observations and optimizes accordingly.
What is Speculative Optimization?
Speculative optimization is a technique where the JVM makes assumptions about how code will behave based on runtime profiling data, and then optimizes the code based on those assumptions. If the assumptions later prove false, the JVM can “deoptimize” the code back to a safer, slower version.
Optimization types
1. Monomorphic call transformation
This optimization converts virtual method calls to direct calls when the JVM observes only one implementation being used. This is particularly powerful because:
- It eliminates virtual dispatch overhead
- It enables further optimizations like method inlining
- It significantly reduces the number of indirections in the code path
Here’s an example:
interface Calculator {
    double compute(double value);
}

class SquareCalculator implements Calculator {
    @Override
    public double compute(double value) {
        return value * value;
    }
}
public class MonomorphicExample {
    public static double processValues(Calculator calc, double[] values) {
        double sum = 0;
        for (double value : values) {
            sum += calc.compute(value); // Virtual call can be optimized
        }
        return sum;
    }

    public static void main(String[] args) {
        Calculator calc = new SquareCalculator();
        double[] values = new double[1000];
        // JVM observes only SquareCalculator used
        for (int i = 0; i < 10000; i++) {
            processValues(calc, values);
        }
        // If a new implementation is loaded, deoptimization occurs
        Calculator newCalc = new Calculator() {
            @Override
            public double compute(double value) {
                return value + 1;
            }
        };
        processValues(newCalc, values); // Triggers deoptimization
    }
}
In the example above, the JVM:
- Observes that only the SquareCalculator implementation of Calculator is ever used
- Converts the virtual call to compute() into a direct call
- Possibly inlines the compute method entirely (probably the most powerful step, as it eliminates method call overhead and reduces stack operations)
- Must deoptimize when a new Calculator implementation appears
We can run the example and observe what happens:
java -XX:+PrintCompilation \
-XX:+UnlockDiagnosticVMOptions \
-XX:+LogCompilation \
-Xlog:class+load=info \
-Xlog:compilation=info \
-Xlog:jit+compilation=info \
MonomorphicExample
Let’s analyze the output to see where the monomorphic optimization and deoptimization happened:
1. First, we see SquareCalculator's compute method being compiled:
# Initial compile at level 2
23 8 2 SquareCalculator::compute (4 bytes)
# Optimized to level 4 (highest)
24 9 4 SquareCalculator::compute (4 bytes)
# Old level-2 version discarded
24 8 2 SquareCalculator::compute (4 bytes) made not entrant
2. Then processValues gets compiled with the monomorphic assumption:
# OSR compilation
25 10 % 3 MonomorphicExample::processValues @ 13 (46 bytes)
# Normal compilation
25 11 3 MonomorphicExample::processValues (46 bytes)
# Optimized OSR
25 12 % 4 MonomorphicExample::processValues @ 13 (46 bytes)
3. Main method gets compiled:
88 14 % 3 MonomorphicExample::main @ 16 (49 bytes)
89 15 3 MonomorphicExample::main (49 bytes)
4. The deoptimization happens when the new Calculator implementation is loaded:
# New class (implementation) is loaded
[0.124s][info][class,load] MonomorphicExample$1 source: file:/Users/ondrej/Documents/Projects/java/jvm-examples/
# Deoptimization!
123 13 4 MonomorphicExample::processValues (46 bytes) made not entrant
# New implementation compiled
123 16 3 MonomorphicExample$1::compute (4 bytes)
# Recompiled without monomorphic assumption
123 17 4 MonomorphicExample::processValues (46 bytes)
2. Type speculation
The JVM can speculate about the actual types of objects at runtime. Here’s a simple example:
public class ArrayTypeSpeculation {
    public static double sumArray(Object[] arr) {
        double sum = 0.0;
        // JVM will initially speculate all elements are Integers
        for (Object elem : arr) {
            if (elem instanceof Number) {
                sum += ((Number) elem).doubleValue();
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        // Create array of Integers first
        Object[] integers = new Object[1000];
        for (int i = 0; i < integers.length; i++) {
            integers[i] = Integer.valueOf(i);
        }
        // Train JVM with Integer array
        for (int i = 0; i < 10000; i++) {
            sumArray(integers);
        }
        System.out.println("Now introducing array with mixed types...");
        // Create mixed array with both Integers and Doubles
        Object[] mixed = new Object[1000];
        for (int i = 0; i < mixed.length; i++) {
            if (i % 2 == 0) {
                mixed[i] = Integer.valueOf(i);
            } else {
                mixed[i] = Double.valueOf(i);
            }
        }
        // This should trigger deoptimization
        sumArray(mixed);
    }
}
This example shows the JVM’s speculative type optimization strategy:
- Initially assumes homogeneous Integer types
- Optimizes based on this assumption
- Deoptimizes when encountering Double values
- Recompiles to handle the mixed type case
Let’s run this and analyze the console logs:
java -XX:+PrintCompilation \
-XX:+UnlockDiagnosticVMOptions \
-XX:+LogCompilation \
-XX:+TraceDeoptimization \
-Xlog:class+load=info \
-Xlog:compilation=info \
-Xlog:jit+compilation=info \
ArrayTypeSpeculation
Console log analysis:
1. The JVM first optimizes the sumArray method, going up to level 4 (the C2 compiler) with OSR compilation.
26 11 % 3 ArrayTypeSpeculation::sumArray @ 11 (51 bytes)
26 12 3 ArrayTypeSpeculation::sumArray (51 bytes)
26 13 % 4 ArrayTypeSpeculation::sumArray @ 11 (51 bytes)
2. The JVM optimizes for Integer operations, suggesting it's speculating all array elements are Integers.
24 8 3 java.lang.Integer::<init> (10 bytes)
24 6 3 java.lang.Integer::valueOf (32 bytes)
24 9 2 java.lang.Integer::doubleValue (6 bytes)
3. When we introduce the mixed array (Integers and Doubles), we see:
Now introducing array with mixed types...
UNCOMMON TRAP method=ArrayTypeSpeculation.sumArray([Ljava/lang/Object;)D
bci=26 pc=0x0000000112a7ce0c
reason=class_check
action=maybe_recompile
4. JVM is forced to deoptimize due to the type assumption being violated:
DEOPT PACKING thread=0x000000011f008200
Virtual frames (innermost/newest first):
VFrame 0 - ArrayTypeSpeculation.sumArray([Ljava/lang/Object;)D - instanceof @ bci=26
5. New class loading for Double support:
[0.041s][info][class,load] java.lang.Double::valueOf (9 bytes)
41 16 3 java.lang.Double::<init> (10 bytes)
Type speculation is more sophisticated than this; feel free to explore: Type profiling and speculation in JVM.
3. Null check elimination
The JVM can speculate that certain references will never be null:
public class NullCheckSpeculation {
    public static int getStringLength(String s) {
        // Add explicit null check that the JVM can learn to eliminate
        if (s == null) {
            return -1;
        }
        return s.length();
    }

    public static void main(String[] args) {
        // Training phase
        for (int i = 0; i < 100_000; i++) {
            getStringLength("test" + i); // Never null
        }
        System.out.println("Now introducing null...");
        // This should trigger deoptimization but not an NPE
        int result = getStringLength(null);
        System.out.println("Result for null: " + result);
    }
}
Let’s run the code with all the JVM flags:
java -XX:+PrintCompilation \
-XX:+UnlockDiagnosticVMOptions \
-XX:+LogCompilation \
-XX:+TraceDeoptimization \
-Xlog:class+load=info \
-Xlog:compilation=info \
-Xlog:jit+compilation=info \
NullCheckSpeculation
Output analysis:
1. The getStringLength method is initially compiled at level 3 (C1 with full profiling):
54 85 3 NullCheckSpeculation::getStringLength (11 bytes)
2. The main loop gets compiled with OSR compilation (%) during the training phase:
61 106 % 3 NullCheckSpeculation::main @ 2 (50 bytes)
62 107 3 NullCheckSpeculation::main (50 bytes)
3. The deoptimization evidence appears when null is introduced. The code falls back to the safe path and handles the null case correctly:
Now introducing null...
63 106 % 3 NullCheckSpeculation::main @ 2 (50 bytes) made not entrant
DEOPT PACKING thread=0x000000013e014e00 vframeArray=0x0000000149011200
Deoptimized frame (sp=0x000000016bf92a40...)
Virtual frames (innermost/newest first):
VFrame 0 (...) - NullCheckSpeculation.main([Ljava/lang/String;)V - invokedynamic @ bci=41
Result for null: -1
4. Branch prediction
The JVM can speculate about which branches are likely to be taken:
public class BranchSpeculation {
    public static int processValue(int value) {
        if (value > 100) { // JVM might speculate this is rarely true
            return complexCalculation(value);
        }
        return value + 1; // Common path optimized
    }

    private static int complexCalculation(int value) {
        // Complex logic here
        return value * 2;
    }
}
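To see this in practice, here is a minimal training harness (the main class is my own sketch, not part of the original example); run it with the same flags as the earlier examples and look for an uncommon trap once the rare branch is finally taken:

public class BranchSpeculationMain {
    public static void main(String[] args) {
        // Training phase: values stay <= 99, so the branch is never taken
        for (int i = 0; i < 100_000; i++) {
            BranchSpeculation.processValue(i % 100);
        }
        System.out.println("Now taking the rare branch...");
        // A value > 100 exercises the path the JIT may have speculated away
        System.out.println(BranchSpeculation.processValue(1_000));
    }
}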
5. Method inlining with guards
- Inlines methods based on observed receiver types
- Adds guards to check if assumption holds
- Deoptimizes if different implementation is encountered
interface Parser {
    Object parse(String input);
}

class JsonParser implements Parser {
    public Object parse(String input) { return input.trim(); } // stand-in for real parsing
}

// If only JsonParser is used during training:
Parser parser = new JsonParser();
parser.parse(input); // JVM might inline JsonParser's implementation behind a type guard
6. Loop optimizations
- Range check elimination
- Loop unrolling based on observed patterns
- Loop versioning for common cases
for (int i = 0; i < array.length; i++) {
    // If array bounds checks always pass
    array[i].process(); // JVM might eliminate bounds checking
}
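As a runnable sketch (class and method names are mine), a counted loop in this canonical shape lets the JIT prove every index is in bounds and drop the per-iteration checks:

public class RangeCheckElimination {
    public static long sum(int[] array) {
        long sum = 0;
        // i runs from 0 to array.length, so every access is provably in bounds
        for (int i = 0; i < array.length; i++) {
            sum += array[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[1_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        long total = 0;
        for (int i = 0; i < 10_000; i++) { // warm-up so sum() gets JIT-compiled
            total += sum(data);
        }
        System.out.println(total);
    }
}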
7. Exception path speculation
- Optimizes for no-exception paths
- Deoptimizes if exceptions occur frequently
try {
    // If exceptions are rare
    riskyOperation(); // JVM might optimize for no-exception path
} catch (Exception e) {
    // uncommon path
}
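A self-contained sketch of the same idea (the class, method, and iteration counts are my own, chosen to mirror the earlier examples):

public class ExceptionSpeculation {
    static int riskyOperation(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("negative value"); // rare path
        }
        return value + 1; // common, exception-free path
    }

    public static void main(String[] args) {
        long sum = 0;
        // Training phase: the exception is never thrown
        for (int i = 0; i < 100_000; i++) {
            sum += riskyOperation(i);
        }
        try {
            riskyOperation(-1); // the rare exceptional case
        } catch (IllegalArgumentException e) {
            System.out.println("caught: " + e.getMessage());
        }
        System.out.println(sum);
    }
}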
8. Field value speculation
- Speculates on values of fields
- Particularly effective with final fields
- Deoptimizes if speculation proves wrong
class Config {
    private final boolean debug;

    Config(boolean debug) {
        this.debug = debug;
    }

    void log() {
        if (debug) { // If debug is usually false,
            // the JVM might speculate that debug is always false
        }
    }
}
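A small harness (my own sketch) to exercise this: during training debug is always false, so the JIT can treat the branch as never taken; flipping it later violates that speculation:

public class ConfigSpeculation {
    public static void main(String[] args) {
        Config quiet = new Config(false);
        // Training phase: debug is always false, so the branch is never taken
        for (int i = 0; i < 100_000; i++) {
            quiet.log();
        }
        // A debug=true instance can invalidate the speculation and trigger recompilation
        new Config(true).log();
    }
}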
9. Lock Elision
- Removes synchronization if no contention is observed
- Deoptimizes if contention occurs
synchronized void method() { // If no contention observed,
    // the JVM might eliminate actual locking
}
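A related, runnable sketch (my own, assuming escape analysis is enabled, which it is by default in HotSpot): when the lock object provably never escapes the method, the JIT can remove the locking entirely:

public class LockElision {
    public static int lockOnLocal(int value) {
        Object lock = new Object(); // never escapes this method
        synchronized (lock) {       // no other thread can ever see this lock,
            return value + 1;       // so the synchronization may be elided
        }
    }

    public static void main(String[] args) {
        long sum = 0;
        for (int i = 0; i < 100_000; i++) { // warm-up so the method gets compiled
            sum += lockOnLocal(i);
        }
        System.out.println(sum);
    }
}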
Deoptimization
When speculative optimizations fail, the JVM performs deoptimization:
- Trap — The optimized code detects an assumption violation
- Stack unwinding — JVM captures the current program state
- State translation — Converts optimized state back to interpreter state
- Continuation — Execution continues in the interpreter
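These steps can be observed directly: re-running any of the examples above with the deoptimization flags prints the UNCOMMON TRAP (the trap), DEOPT PACKING (stack unwinding and state translation), and virtual-frame lines we saw earlier:

java -XX:+PrintCompilation \
    -XX:+UnlockDiagnosticVMOptions \
    -XX:+TraceDeoptimization \
    ArrayTypeSpeculation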
Best Practices
- Consistent types — Keep type hierarchies simple and consistent
- Warm-up period — Allow time for the JVM to gather profiling data
- Avoid mixed types — Don’t mix different types in the same collection or method parameter if possible
- Profile-guided optimization — Use real-world workloads during testing
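As a sketch of the warm-up advice (names and counts are mine), exercise the hot path with representative inputs before measuring or serving latency-sensitive traffic:

public class WarmUp {
    static double hotPath(double x) {
        return Math.sqrt(x) * 2.0;
    }

    public static void main(String[] args) {
        double sink = 0;
        // Warm-up phase: give the JIT enough invocations to profile and compile hotPath
        for (int i = 0; i < 50_000; i++) {
            sink += hotPath(i);
        }
        System.out.println("warmed up, sink=" + sink);
        // ...real, latency-sensitive work would start here, against optimized code
    }
}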
How the JVM Tracks Optimizations
The JVM maintains detailed metadata about its optimizations:
- Dependencies between compiled methods
- Assumptions made during compilation
- Profiling data that led to optimizations
- Deoptimization triggers embedded in the code
When assumptions are violated:
- The JVM detects the violation
- Identifies affected compiled methods
- Marks them for deoptimization
- Recompiles with updated assumptions
Conclusion
Even though speculative optimization sits deep in the weeds of the JVM, and is probably not crucial to everyday coding, it is remarkable how it lets our applications achieve high performance while maintaining the safety and flexibility of the Java language.
As always, understanding these mechanisms gives us an idea of what is going on when something, like performance, goes wrong.