"Dark Horse JVM" 1. Basics

Introduction to JVM#

What is JVM#

JVM stands for Java Virtual Machine, which is essentially a program that runs on a computer, responsible for executing Java bytecode files.

First, use javac to compile the source code .java into .class bytecode files, then use the java command to start the JVM to run the bytecode. The JVM will use a combination of interpretation and JIT compilation to execute the code on the computer.

Execution Flow of Java Program|500

Functions of JVM#

Functions of JVM:

Interpretation and Execution:
1. Interprets the instructions in the bytecode file into machine code in real-time for execution by the computer.
Memory Management:
1. Automatically allocates memory for objects, methods, etc.
2. Automatic garbage collection mechanism to reclaim unused objects.
Just-In-Time Compilation (JIT):
1. Optimizes hot code to improve execution efficiency.

Machines cannot recognize bytecode directly; the JVM has to interpret it into machine code at run time. Compared with C, where .c source code is compiled directly into machine code, this is significantly less efficient. JIT compilation was therefore added: it compiles and optimizes hot bytecode into machine code and keeps the result in memory for direct reuse, which removes the repeated interpretation and improves performance.

image.png|500

Java Virtual Machine Specification#

The specification defines the requirements that any JVM implementation of the current version, including ones built through secondary development, must meet: the definition of class bytecode files, the loading and initialization of classes and interfaces, the instruction set, and more.

The "Java Virtual Machine Specification" outlines the requirements for JVM design, not for Java design, meaning that the JVM can run class bytecode files generated from other languages like Groovy and Scala.

| Name | Author | Supported Versions | Community Activity (GitHub Stars) | Features | Applicable Scenarios |
| --- | --- | --- | --- | --- | --- |
| HotSpot (Oracle JDK) | Oracle | All versions | High (closed source) | Most widely used, stable and reliable, active community, JIT support; default JVM for Oracle JDK | Default |
| HotSpot (OpenJDK) | Oracle | All versions | Medium (16.1k) | Same as above, open source; default JVM for OpenJDK | Default; secondary development of the JDK |
| GraalVM | Oracle | 11, 17, 19 (Enterprise edition also supports 8) | High (18.7k) | Multi-language support, high performance, JIT and AOT support | Microservices and cloud-native architectures that need mixed-language programming |
| Dragonwell JDK | Alibaba | Standard edition 8, 11, 17; Extended edition 11, 17 | Low (3.9k) | Performance enhancements on top of OpenJDK, bug fixes, security improvements; supports JWarmup, ElasticHeap, and Wisp | E-commerce, logistics, and finance, where performance requirements are high |
| Eclipse OpenJ9 (originally IBM J9) | IBM | 8, 11, 17, 19, 20 | Low (3.1k) | High performance, scalable; JIT and AOT support | Microservices and cloud-native architectures |

HotSpot is the most widely used.

HotSpot Development History|500

Detailed Explanation of Bytecode Files#

Composition of the Java Virtual Machine#

image.png|500

ClassLoader: The core component responsible for loading the contents of bytecode files into memory.

Runtime Data Area: Manages the memory used by the JVM, where created objects, class information, and other content are stored.

Execution Engine: Includes the JIT compiler, interpreter, and garbage collector; the execution engine uses the interpreter to convert bytecode instructions into machine code, optimizes performance with the JIT compiler, and reclaims unused objects with the garbage collector.

Native Interface: Calls methods compiled in C/C++, declared in Java with the native keyword, such as public static native void sleep(long millis) throws InterruptedException;.

Composition of Bytecode Files#

Viewing Bytecode#

Bytecode file viewer: jclasslib

image.png|500

Composition of Bytecode Files#

Components of bytecode files:

Basic Information: Magic number, Java version number corresponding to the bytecode file, access flags (public, final, etc.), parent class and interface information.
Constant Pool: Stores string constants, class or interface names, field names, mainly used in bytecode instructions.
Fields: Information about fields declared in the current class or interface.
Methods: Information about methods declared in the current class or interface, with the core content being the bytecode instructions of the methods.
Attributes: Class attributes, such as the source file name, list of inner classes, etc.

Basic Information#

Files cannot be identified by file extension alone, as file extensions can be arbitrarily modified without affecting the file's content. Software verifies the file type using the first few bytes (file header). If the software does not support that type, it will throw an error.

In Java bytecode files, the file header is referred to as the magic number. The JVM checks whether the first four bytes of the bytecode file are 0xcafebabe; if not, the bytecode file cannot be used normally, and the JVM will throw the corresponding error.

The major version number is used to determine whether the current bytecode version is compatible with the JVM.

image.png|500

The major and minor version numbers identify the JDK version that compiled the bytecode file. The major version number identifies the major release, while the minor version number distinguishes minor releases. JDK 1.0 - 1.1 used 45.0 - 45.3. From JDK 1.2 onward, the JDK version is the major version number minus 44; for example, major version 52 corresponds to JDK 8.
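The header check and the version rule can be sketched in a few lines of Java. This is only an illustration: the class name `ClassFileHeader` and its methods are made up for this example, and it reads just the first 8 bytes of a class file (a 4-byte magic number, a 2-byte minor version, and a 2-byte major version).

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper: parses only the first 8 bytes of a class file.
class ClassFileHeader {
    final int magic;
    final int minor;
    final int major;

    ClassFileHeader(int magic, int minor, int major) {
        this.magic = magic;
        this.minor = minor;
        this.major = major;
    }

    /** Reads u4 magic, u2 minor_version, u2 major_version (big-endian). */
    static ClassFileHeader parse(byte[] classBytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt();
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();
            return new ClassFileHeader(magic, minor, major);
        } catch (IOException e) {
            throw new IllegalArgumentException("Input is shorter than a class file header", e);
        }
    }

    boolean hasValidMagic() {
        return magic == 0xCAFEBABE;
    }

    /** Valid for JDK 1.2 and later: JDK version = major version - 44. */
    int jdkVersion() {
        return major - 44;
    }

    public static void main(String[] args) {
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        ClassFileHeader h = parse(header);
        // prints "true major=52 JDK 8"
        System.out.println(h.hasValidMagic() + " major=" + h.major + " JDK " + h.jdkVersion());
    }
}
```

To inspect a real class, pass the bytes of any .class file to `parse` instead of the hand-written header.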

If there is an incompatibility, such as a bytecode file of version 52 running on a JVM that only supports up to version 50, there are two options:

1. Upgrade the JDK.
2. Lower the required version of the bytecode file: downgrade the dependency version or switch to a different dependency.

Generally, option 2 is chosen, as upgrading the JDK is a significant change that may cause compatibility issues.

image.png|500

Constant Pool#

The constant pool saves space by storing each string literal only once: a string constant is stored as a String-type entry that points to a UTF-8 literal entry.

Example: String a = "abc"; String abc = "abc"; In this case there is only one UTF-8 literal entry "abc", referenced both by the String constant and as the name of the variable abc.

image.png|500

Methods#

Example: int i=0; i=i++; What is the final value of i?

image.png|500

The local variable table uses the order of declaration as the index, where the passed argument args is 0, i and j are 1 and 2, respectively, and the operand stack is used for operands.

Bytecode instruction parsing for int i=0; int j=i+1;:

iconst_0 pushes the constant 0 onto the operand stack, leaving only 0 on the stack.
istore_1 pops the top of the operand stack and stores it in the local variable table at index 1, completing the assignment of variable i.
iload_1 pushes the value at local variable table index 1, which is 0, onto the operand stack.
iconst_1 pushes the constant 1 onto the operand stack.
iadd pops the top two values (0 and 1) from the operand stack, adds them, and pushes the result back, leaving only 1 on the stack.
istore_2 pops the top element 1 from the operand stack and stores it in the local variable table at index 2, completing the assignment of variable j.
return ends the method and returns to the caller.
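The walk-through above can be replayed with an ordinary stack and array. The following is a minimal sketch, not how the JVM is implemented: `OperandStackSketch` is a name invented for this example, a `Deque` stands in for the operand stack, and an `int[]` stands in for the local variable table.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical simulation of the bytecode for: int i = 0; int j = i + 1;
class OperandStackSketch {

    /** Replays the instructions one by one and returns the local variable table. */
    static int[] run() {
        int[] locals = new int[3];                 // slot 0: args (unused here), slot 1: i, slot 2: j
        Deque<Integer> stack = new ArrayDeque<>(); // operand stack

        stack.push(0);                             // iconst_0
        locals[1] = stack.pop();                   // istore_1
        stack.push(locals[1]);                     // iload_1
        stack.push(1);                             // iconst_1
        int right = stack.pop();                   // iadd: pop the top two values
        int left = stack.pop();
        stack.push(left + right);                  //       and push their sum
        locals[2] = stack.pop();                   // istore_2
        return locals;                             // return
    }

    public static void main(String[] args) {
        int[] locals = run();
        // prints "i=0 j=1"
        System.out.println("i=" + locals[1] + " j=" + locals[2]);
    }
}
```

The stack never holds more than two values at once, which is why a maximum operand stack depth of 2 is enough for this method.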

Bytecode for int i=0; i=i++;
image.png|500

Bytecode for int i=0; i=++i;
image.png|500
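The two cases can also be checked by running them. In the sketch below (the class name `IncrementDemo` is made up for this example), the comments list the order in which the load, increment, and store instructions are emitted for each statement; the slot index depends on the method, so it is left out.

```java
// Hypothetical demo class comparing i = i++ with i = ++i.
class IncrementDemo {

    static int postIncrement() {
        int i = 0;
        // iload (push the old value 0), iinc (slot becomes 1), istore (slot overwritten with 0)
        i = i++;
        return i;
    }

    static int preIncrement() {
        int i = 0;
        // iinc (slot becomes 1), iload (push 1), istore (slot stays 1)
        i = ++i;
        return i;
    }

    public static void main(String[] args) {
        // prints "0 1"
        System.out.println(postIncrement() + " " + preIncrement());
    }
}
```

So the answer to the question above is 0: the old value is pushed onto the operand stack before the increment and then written back over the incremented slot.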

Generally, the more bytecode instructions the interpreter has to execute, the worse the performance. For the following three ways of incrementing by 1, which has the best performance?

int i=0,j=0,k=0;
i++;
j = j + 1;
k += 1;

javac typically generates the following bytecode. In practice the difference rarely matters, because once the code is hot the JIT compiler can compile all three forms into equivalent machine code:

i++; (iinc)
j = j + 1; (iload, iconst_1, iadd, istore)
k += 1; (iinc)

Fields#

Fields store the information of fields declared in the current class or interface. In the figure below, two fields a1 and a2 are defined, which will appear in the fields section along with their names, descriptors (field types), and access flags (public/private static final, etc.).

image.png|500

Attributes#

Attributes mainly refer to class attributes, such as the source file name, list of inner classes, etc.

image.png|500

Common Tools for Bytecode#

javap is the disassembler that ships with the JDK; it prints the contents of bytecode files to the console.

Run javap without arguments to see all options, and run javap -v xxx.class to view detailed bytecode information. For a JAR package, first extract it with the jar -xvf command.

image.png|500

jclasslib also has an IDEA plugin version that can view the contents of the bytecode file after code compilation.

New tool: Alibaba Arthas

Arthas is an online monitoring and diagnostic product that shows application load, memory, GC, and thread status in real time from a global perspective, and can diagnose business issues without modifying application code.

image.png|500

Download the Arthas jar package as described in the Arthas documentation and run it.

Related commands:

dump -d /tmp/output java.lang.String saves the bytecode file to the local machine.
jad --source-only demo.MathGame decompiles the class bytecode into source code for verification.

Class Lifecycle#

Overview of Lifecycle#

Loading, Linking, Initialization, Usage, Unloading

Loading Phase#

The first step of the Loading phase is for the class loader to obtain the bytecode as a binary stream, through whatever channel is available, based on the fully qualified name of the class. Programmers can extend these channels in Java code:
- Read files from the local disk.
- Generate bytecode dynamically at runtime, as the Spring framework does for dynamic proxies.
- Obtain bytecode files over the network, as Applet technology did.
After the class loader has loaded the class, the JVM stores the information from the bytecode in the method area, generating an InstanceKlass object there that holds all the class information, including the data needed to implement features such as polymorphism.
The JVM also generates a java.lang.Class object on the heap that mirrors part of the data in the method area. Java code uses this object to obtain class information, and from JDK 8 onwards it also holds the static field data.

Linking Phase#

The Linking phase is divided into three sub-phases:

Verification: Checks whether the content complies with the "Java Virtual Machine Specification."
Preparation: Allocates memory for static variables and sets initial values.
Resolution: Replaces symbolic references in the constant pool with direct references pointing to memory.

Linking Phase - Verification#

The main purpose of verification is to check whether the Java bytecode file adheres to the constraints in the "Java Virtual Machine Specification." This phase generally does not require programmer involvement:

File format verification, such as whether the file starts with 0xCAFEBABE and whether the major and minor version numbers are supported by the current JVM (the JDK version must not be lower than the file version).
Metadata verification, such as whether the class has a superclass (every class except java.lang.Object must have one).
Verification of the semantics of the instructions executed by the program, such as whether the instruction jumps to an incorrect location within a method.
Symbolic reference verification, such as whether private methods in other classes are accessed.

Linking Phase - Preparation#

The preparation phase allocates memory for static variables and sets initial values. Each basic data type and reference data type has its initial value. Note: The initial value here is the default value for each type, not the initial value set in the code.

| Data Type | Initial Value |
| --- | --- |
| int | 0 |
| long | 0L |
| short | 0 |
| char | '\u0000' |
| byte | 0 |
| boolean | false |
| double | 0.0 |
| Reference data type | null |

In the following example, during the linking phase - preparation sub-phase, memory will be allocated for value and initialized to 0, and the value will be modified to 1 only during the initialization phase.

public class Student {
    public static int value = 1;
}

The exception is static variables declared final whose value is a compile-time constant: because such a value never changes, it is assigned directly during the preparation phase.
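The difference between the default value and the assigned value can be observed from Java code. In this sketch (the names `PreparationDemo`, `Holder`, `peek`, and `peekConst` are invented for the example), a static initializer calls a method that reads a static variable declared further down, before that variable's own assignment has run.

```java
// Hypothetical demo: reading static variables before their initializers have run.
class PreparationDemo {

    static class Holder {
        static int early = peek();            // runs first; value still holds its default 0
        static int earlyConst = peekConst();  // CONST is a compile-time constant, so this is 5
        static int value = 1;                 // assigned later, during initialization
        static final int CONST = 5;           // constant: never observed with a default value

        static int peek() {
            return value;
        }

        static int peekConst() {
            return CONST;
        }
    }

    public static void main(String[] args) {
        // prints "0 5 1"
        System.out.println(Holder.early + " " + Holder.earlyConst + " " + Holder.value);
    }
}
```

`early` captures 0 because `value` has only received its default value at that point, while the final constant is already 5 wherever it is read.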

Linking Phase - Resolution#

The resolution phase mainly replaces symbolic references in the constant pool with direct references. Symbolic references use numbers in the bytecode file to access the contents of the constant pool.

Direct references use memory addresses to access specific data.

Initialization Phase#

The initialization phase executes the bytecode instructions of the <clinit> (class initialization) method in the bytecode file, which assigns values to static variables and runs the code in static blocks.

The execution order inside <clinit> is consistent with the order in the source code.

The putstatic instruction pops the value from the operand stack and stores it in the static variable's location on the heap. Its operand #2 refers to the constant pool entry for the static variable value, a symbolic reference that is replaced with the variable's address during the resolution phase.

image.png|500

The following methods will trigger class initialization:

Accessing a class's static variable or static method. Note that if the variable is final and the right side of the equals sign is a constant, it will not trigger class initialization, as this variable has already been assigned a value in the preparation phase of the linking stage.
Calling Class.forName(String className). The overload Class.forName(String name, boolean initialize, ClassLoader loader) lets the caller control whether to initialize.
Creating an object of that class with new.
Executing the current class's main method.

Adding the -XX:+TraceClassLoading parameter at Java startup prints the classes as they are loaded; on JDK 9 and later the equivalent option is -Xlog:class+load.
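These triggers can be observed by letting each static block record its class name. The sketch below is illustrative only; `InitTriggerDemo` and its nested classes are names invented for the example, and `run()` caches its result because a class is initialized at most once per class loader.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: which operations trigger class initialization.
class InitTriggerDemo {
    /** Records the order in which the static blocks below run. */
    static final List<String> LOG = new ArrayList<>();

    static class Constant {
        static final int MAX = 10;              // compile-time constant
        static { LOG.add("Constant"); }
    }

    static class ViaForName {
        static { LOG.add("ViaForName"); }
    }

    static class ViaNew {
        static { LOG.add("ViaNew"); }
    }

    private static List<String> result;

    static synchronized List<String> run() {
        if (result == null) {
            try {
                int max = Constant.MAX;                      // constant access: no initialization
                String name = ViaForName.class.getName();    // a class literal does not initialize either
                ClassLoader loader = InitTriggerDemo.class.getClassLoader();
                Class.forName(name, false, loader);          // initialize = false: still not initialized
                LOG.add("loaded without init, max=" + max);
                Class.forName(name);                         // one-argument form initializes the class
                new ViaNew();                                // creating an object initializes the class
            } catch (ClassNotFoundException e) {
                throw new IllegalStateException(e);
            }
            result = new ArrayList<>(LOG);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints "[loaded without init, max=10, ViaForName, ViaNew]"
        System.out.println(run());
    }
}
```

"Constant" never appears in the log, because reading a final constant does not initialize its class.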

Example questions:

image.png|500

The <clinit> method may not be generated in specific cases, such as:

No static code blocks and no static variable assignment statements.
Static variable declarations exist, but no assignment statements.
Static variable definitions use the final keyword with constant values; these variables are assigned directly during the preparation phase of the linking stage.

Inheritance situations:

Directly accessing a superclass's static variable will not trigger the subclass's initialization.
The subclass's initialization clinit will call the superclass's clinit initialization method first.

Example questions:

Initializing the subclass first triggers the superclass.
image.png|500

Directly accessing the superclass's static variable will not trigger subclass initialization.
image.png|500
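Both inheritance rules can be verified with a small logging example. This is a sketch with invented names (`InheritanceInitDemo`, `Parent`, `Child`); `run()` caches its result because each class is initialized only once.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: initialization order of a superclass and a subclass.
class InheritanceInitDemo {
    /** Records the order in which the static blocks below run. */
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static int shared = 1;
        static { LOG.add("Parent"); }
    }

    static class Child extends Parent {
        static { LOG.add("Child"); }
    }

    private static List<String> result;

    static synchronized List<String> run() {
        if (result == null) {
            int v = Child.shared;        // field is declared in Parent: only Parent is initialized
            LOG.add("read shared=" + v);
            new Child();                 // now Child is initialized; Parent already is
            result = new ArrayList<>(LOG);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints "[Parent, read shared=1, Child]"
        System.out.println(run());
    }
}
```

Even though the field is accessed through `Child`, only the class that declares it is initialized at that point.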

Creating an array does not trigger initialization of the array's element class.

public class Test2 {
    public static void main(String[] args) {
        Test2_A[] arr = new Test2_A[10];

    }
}

class Test2_A {
    static {
        System.out.println("Static block of Test2_A executed");
    }
}

If the value assigned to a final variable can only be obtained by executing instructions, the <clinit> method is still executed to initialize it.

public class Test4 {
    public static void main(String[] args) {
        System.out.println(Test4_A.a);
    }
}

class Test4_A {
    public static final int a = Integer.valueOf(1);

    static {
        System.out.println("Static block of Test4_A executed");
    }
}

Class Loader#

The ClassLoader is the mechanism the JVM provides for applications to obtain the bytecode of classes and interfaces. The class loader only takes part in the loading phase, where it obtains the bytecode and loads it into memory.

image.png|500

The class loader obtains the contents of the bytecode file in binary stream form and then hands the obtained data to the JVM, which generates corresponding objects in the method area and heap to store bytecode information.

Classification of Class Loaders#

Class loaders are divided into two categories: those implemented in Java code and those implemented in the JVM's underlying source code.

JDK 8 and Earlier#

image.png|500

The Bootstrap class loader loads the core JAR packages of the JRE. It is implemented inside the JVM and cannot be obtained from Java code.

image.png|500

The Extension class loader and the Application class loader are both static inner classes of sun.misc.Launcher that inherit from URLClassLoader, so they can load bytecode files into memory from directories or specified JAR packages.

image.png|500

Using the -Djava.ext.dirs=<JAR directory> parameter extends the directories searched for extension JAR packages; separate multiple directory paths with ; (Windows) or : (macOS/Linux).

image.png|500

The Application class loader loads class files under the classpath, primarily loading classes in the project and classes from third-party JAR packages introduced via Maven.

Parent Delegation Mechanism of Class Loaders#

Since there are multiple class loaders in the Java Virtual Machine, the core of the parent delegation mechanism is to decide which class loader should load a given class.

Mechanism functions:

Prevents malicious code from replacing core libraries in the JDK, such as java.lang.String, ensuring the integrity and security of core libraries.
Prevents duplicate loading, ensuring that a class is loaded only once by a class loader.

The parent delegation mechanism means that when a class loader receives a request to load a class, it first checks upwards, parent by parent, whether the class has already been loaded, and then tries to load it from the top downwards.

In the downward pass, loading is attempted starting from the Bootstrap class loader; a loader succeeds if it finds the class in its own loading directory.

image.png|500

Example: dev.chanler.my.C is in the classpath; it checks from Application upwards and finds it hasn't been loaded; it checks from Bootstrap downwards and finds it is not in the loading directory, so only Application can successfully load it because C is in the classpath.

Questions:

If a class appears in the loading locations of all three class loaders, which one loads it?
- The Bootstrap class loader, because under the parent delegation mechanism it has the highest priority.
Can the String class be overridden? If a java.lang.String class is created in your project, will it be loaded?
- No. The request is delegated upwards and returns the String class that the Bootstrap class loader loaded from rt.jar.
What is the parent delegation mechanism of class loaders?
- When a class loader attempts to load a class, it checks upwards to see whether the class has already been loaded; if it has, it is returned directly. If the top-level class loader is reached without finding it, loading is attempted from the top downwards.
- The parent of the Application class loader is the Extension class loader, and the parent of the Extension class loader is the Bootstrap class loader; in code, however, getParent() on the Extension class loader returns null because the Bootstrap class loader cannot be accessed from Java.
- The benefits of the parent delegation mechanism are twofold: first, it prevents malicious code from replacing core libraries in the JDK, such as java.lang.String, ensuring the integrity and security of core libraries; second, it prevents a class from being loaded multiple times.
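The parent chain described in these answers can be printed from Java. This is a small sketch (the class name `LoaderChainDemo` is made up); the concrete loader class names it prints depend on the JDK version, so no fixed output is shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: walking the class loader parent chain.
class LoaderChainDemo {

    /** Walks from the given loader up through getParent() until null, which stands for Bootstrap. */
    static List<String> chain(ClassLoader start) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("bootstrap (null)");
        return names;
    }

    /** Core classes report null as their loader because Bootstrap is not visible to Java code. */
    static boolean stringLoadedByBootstrap() {
        return String.class.getClassLoader() == null;
    }

    public static void main(String[] args) {
        // JDK 8 lists sun.misc.Launcher$AppClassLoader and sun.misc.Launcher$ExtClassLoader;
        // JDK 9+ lists jdk.internal.loader.ClassLoaders$AppClassLoader and $PlatformClassLoader.
        System.out.println(chain(ClassLoader.getSystemClassLoader()));
        System.out.println(stringLoadedByBootstrap());
    }
}
```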

Breaking the Parent Delegation Mechanism#

There are three ways to break the parent delegation mechanism, but essentially only the first one truly breaks it:

Custom class loaders: Custom class loaders that override the loadClass method, such as Tomcat, use this method to achieve class isolation between applications.
Thread context class loaders: Using context class loaders to load classes, such as JDBC and JNDI.
OSGi framework class loaders: Historically, the OSGi framework implemented a new class loader mechanism that allows peer-to-peer delegation for class loading, which is rarely used today.

Breaking the Parent Delegation Mechanism - Custom Class Loader#

A single Tomcat instance can run multiple web applications. If two applications contain classes with the same fully qualified name, such as a Servlet class, Tomcat must ensure that both can be loaded and that they are treated as different classes. Without breaking the parent delegation mechanism, the second Servlet class could not be loaded.

image.png|500

Tomcat uses a custom class loader to achieve class isolation between applications, with each application having an independent class loader to load the corresponding classes.

image.png|500

Four core methods of ClassLoader:

public Class<?> loadClass(String name)
Entry point for class loading and the place where the parent delegation mechanism is implemented. Internally calls findClass. Important.

protected Class<?> findClass(String name)
Implemented by class loader subclasses: retrieves the binary data and calls defineClass. For example, `URLClassLoader` retrieves the binary data of the class file based on the file path. Important.

protected final Class<?> defineClass(String name, byte[] b, int off, int len)
Performs some class name validation and then calls the underlying method of the virtual machine to load the bytecode information into the virtual machine memory.

protected final void resolveClass(Class<?> c)
Executes the linking phase in the class lifecycle; `loadClass` passes resolve = false by default, so it is not called.

Because loadClass defaults to resolve = false, it does not proceed to the linking and initialization phases. Class.forName performs loading, linking, and initialization.
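How these methods fit together can be sketched by re-implementing the delegation order in a subclass. This is a simplified illustration, not the JDK source: `DelegationSketchLoader` and `demo` are invented names, and unlike the real ClassLoader.loadClass the sketch requires a non-null parent instead of falling back to the Bootstrap class loader.

```java
import java.util.Objects;

// Hypothetical sketch of the parent delegation order inside loadClass.
class DelegationSketchLoader extends ClassLoader {

    DelegationSketchLoader(ClassLoader parent) {
        super(Objects.requireNonNull(parent, "this sketch needs a non-null parent"));
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // 1. Has this loader already loaded the class?
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                try {
                    // 2. Delegate upwards first.
                    c = getParent().loadClass(name);
                } catch (ClassNotFoundException notFoundByParents) {
                    // 3. Only when no parent finds the class, look in this loader's own source.
                    //    The inherited findClass always throws ClassNotFoundException.
                    c = findClass(name);
                }
            }
            // 4. Linking is only forced when resolve is true.
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    /** Loads a class through a fresh sketch loader; returns null when nothing finds it. */
    static Class<?> demo(String name) {
        try {
            return new DelegationSketchLoader(ClassLoader.getSystemClassLoader()).loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(demo("java.lang.String") == String.class); // prints "true"
        System.out.println(demo("no.such.Type"));                     // prints "null"
    }
}
```

Because step 2 always runs before step 3, a subclass that only overrides findClass keeps the delegation intact, while a subclass that changes this order in loadClass breaks it.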

To break the parent delegation mechanism, the core logic inside loadClass needs to be re-implemented.

image.png|500

The parent of a custom class loader defaults to the Application class loader (AppClassLoader).

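import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.regex.Matcher;

import org.apache.commons.io.IOUtils; // Apache Commons IO must be on the classpath

/**
 * Breaking the parent delegation mechanism - Custom Class Loader
 */
public class BreakClassLoader1 extends ClassLoader {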

    private String basePath;
    private final static String FILE_EXT = ".class";

    // Set loading directory
    public void setBasePath(String basePath) {
        this.basePath = basePath;
    }

    // Load files from the specified directory using commons io
    private byte[] loadClassData(String name)  {
        try {
            String tempName = name.replaceAll("\\.", Matcher.quoteReplacement(File.separator));
            FileInputStream fis = new FileInputStream(basePath + tempName + FILE_EXT);
            try {
                return IOUtils.toByteArray(fis);
            } finally {
                IOUtils.closeQuietly(fis);
            }

        } catch (Exception e) {
            System.out.println("Custom class loader failed to load, error reason: " + e.getMessage());
            return null;
        }
    }

    // Override loadClass method
    @Override
    public Class<?> loadClass(String name) throws ClassNotFoundException {
        // If it is in the java package, still follow the parent delegation mechanism
        if(name.startsWith("java.")){
            return super.loadClass(name);
        }
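        // Load from the specified directory on disk
        byte[] data = loadClassData(name);
        // loadClassData returns null on failure; report that instead of a NullPointerException
        if (data == null) {
            throw new ClassNotFoundException(name);
        }
        // Call the underlying method of the virtual machine to create objects in the method area and heap
        return defineClass(name, data, 0, data.length);
    }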

    public static void main(String[] args) throws ClassNotFoundException, InstantiationException, IllegalAccessException, IOException {
        // First custom class loader object
        BreakClassLoader1 classLoader1 = new BreakClassLoader1();
        classLoader1.setBasePath("D:\\lib\\");

        Class<?> clazz1 = classLoader1.loadClass("com.itheima.my.A");
         // Second custom class loader object
        BreakClassLoader1 classLoader2 = new BreakClassLoader1();
        classLoader2.setBasePath("D:\\lib\\");

        Class<?> clazz2 = classLoader2.loadClass("com.itheima.my.A");

        System.out.println(clazz1 == clazz2);

        Thread.currentThread().setContextClassLoader(classLoader1);

        System.in.read();
     }
}

Question: Will two custom class loaders that load a class with the same fully qualified name conflict?

No. Within one JVM, two classes are considered the same only if they have the same class loader and the same fully qualified name.
In Arthas, you can use sc -d className to inspect the details.

The parent delegation mechanism lives in loadClass, and loadClass calls findClass. Overriding findClass is therefore the proper way to load bytecode from other channels, for example reading a class from a database, converting it into a byte array, and calling defineClass to store it in memory.
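A findClass-based loader can be sketched without any files or database by generating the bytes of a minimal class in memory. Everything here is illustrative: `InMemoryLoader`, `emptyClassBytes`, and the class name demo.GeneratedPayload are invented for the example, and the generated class is an empty public class (version 52) with no fields or methods.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical loader that overrides findClass and defines one generated class.
class InMemoryLoader extends ClassLoader {
    private final String className;

    InMemoryLoader(String className) {
        super(InMemoryLoader.class.getClassLoader());
        this.className = className;
    }

    /** Called by loadClass only after the parents failed to find the class. */
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        if (!name.equals(className)) {
            throw new ClassNotFoundException(name);
        }
        byte[] bytes = emptyClassBytes(name.replace('.', '/'));
        return defineClass(name, bytes, 0, bytes.length);
    }

    /** Builds the bytes of an empty public class that extends java.lang.Object. */
    static byte[] emptyClassBytes(String internalName) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeInt(0xCAFEBABE);                            // magic number
            out.writeShort(0);                                   // minor version
            out.writeShort(52);                                  // major version (JDK 8)
            out.writeShort(5);                                   // constant pool count = entries + 1
            out.writeByte(1); out.writeUTF(internalName);        // #1 Utf8: this class name
            out.writeByte(7); out.writeShort(1);                 // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 Utf8: superclass name
            out.writeByte(7); out.writeShort(3);                 // #4 Class -> #3
            out.writeShort(0x0021);                              // access flags: public, super
            out.writeShort(2);                                   // this_class
            out.writeShort(4);                                   // super_class
            out.writeShort(0);                                   // interfaces count
            out.writeShort(0);                                   // fields count
            out.writeShort(0);                                   // methods count
            out.writeShort(0);                                   // attributes count
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** True when two loaders yield distinct classes and one loader yields the same class twice. */
    static boolean loadersAreIsolated() {
        try {
            String name = "demo.GeneratedPayload";
            InMemoryLoader first = new InMemoryLoader(name);
            InMemoryLoader second = new InMemoryLoader(name);
            Class<?> a = first.loadClass(name);
            Class<?> b = second.loadClass(name);
            return a != b && a.getName().equals(b.getName()) && a == first.loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(loadersAreIsolated()); // prints "true"
    }
}
```

Since only findClass is overridden, the inherited loadClass still delegates to the parents first and also caches the class, which is why loading the same name twice through one loader returns the same Class object.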

Breaking the Parent Delegation Mechanism - Thread Context Class Loader JDBC Example#

The DriverManager class manages different drivers.

image.png|500

The DriverManager class is located in rt.jar, so it is loaded by the Bootstrap class loader. The driver JAR packages, however, sit on the classpath, so DriverManager delegates loading the driver classes to the Application class loader.

image.png|500

Question: How does DriverManager know which driver class to load from the JAR package?
Through the Service Provider Interface (SPI), a service discovery mechanism built into the JDK: the driver JAR lists its implementation class in the file META-INF/services/java.sql.Driver, and DriverManager uses ServiceLoader to find and load it.

The class loader that SPI uses is the thread context class loader, which is the Application class loader by default.

image.png|500
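The discovery step can be sketched with java.util.ServiceLoader and the thread context class loader. This is not the DriverManager source: `SpiSketch` and the `Plugin` interface are hypothetical stand-ins for DriverManager and java.sql.Driver, and since no provider is registered for `Plugin`, the lookup simply finds nothing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical sketch of SPI-style discovery through the thread context class loader.
class SpiSketch {

    /** Hypothetical service interface, standing in for java.sql.Driver. */
    interface Plugin {
        String name();
    }

    /** Looks up providers listed under META-INF/services/ for Plugin via the context class loader. */
    static List<String> discover() {
        ClassLoader contextLoader = Thread.currentThread().getContextClassLoader();
        List<String> names = new ArrayList<>();
        for (Plugin plugin : ServiceLoader.load(Plugin.class, contextLoader)) {
            names.add(plugin.name());
        }
        return names;
    }

    public static void main(String[] args) {
        // No provider file exists for Plugin, so the list is empty.
        System.out.println(discover());
    }
}
```

Passing the context class loader explicitly is the key point: code loaded by a parent-level loader can still locate implementations that only a child-level loader can see.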

Viewpoint: Does the JDBC example really break the parent delegation mechanism?

Yes: the DriverManager loaded by the Bootstrap class loader delegates to the Application class loader to load the driver class, which reverses the direction of delegation.
No: JDBC only triggers the loading of the driver class after DriverManager has been loaded, and that loading still goes through the Application class loader's loadClass method, which contains the parent delegation logic.

From a macro perspective, a parent-level loader hands work to a child-level loader. From a micro perspective, the loader that does the work still delegates upwards first inside loadClass; the parents simply cannot find the class, so the Application class loader loads it.

Breaking the Parent Delegation Mechanism - OSGi Modular Framework#

image.png|500

Using Arthas Hot Deployment to Solve Online Bugs#

image.png|500

Notes:

After the program restarts, the change is lost, because the bytecode is only replaced in memory; to make it permanent, update the class files in the JAR package.
Using retransform cannot add methods or fields, nor can it update methods that are currently executing.

Class Loaders After JDK 9#

In JDK 8 and earlier, the Extension and Application class loaders were defined in sun.misc.Launcher and inherited from URLClassLoader.
image.png|500

From JDK 9 onward, the concept of modules was introduced, and the design of class loaders changed significantly.

The Bootstrap class loader is implemented in Java as BootClassLoader, located in jdk.internal.loader.ClassLoaders. BootClassLoader inherits from BuiltinClassLoader, which implements locating the bytecode resources to load from modules. For consistency, Java code still gets null when it asks for the Bootstrap class loader.
image.png|500

The Extension class loader is replaced by the Platform class loader. It loads bytecode files in a modular way, so its parent class changes from URLClassLoader to BuiltinClassLoader, which implements loading bytecode files from modules. The Platform class loader exists mainly for compatibility with the older design and has no special logic of its own.
image.png|500

JVM Memory Area - Runtime Data Area#

The runtime data area is responsible for managing the memory used by the JVM, such as creating and destroying objects.

The "Java Virtual Machine Specification" defines the role of each part, divided into two main blocks: thread-private and thread-shared.

Thread-private: Program counter, Java Virtual Machine stack, native method stack.
Thread-shared: Method area, heap.

image.png|500

Program Counter#

The Program Counter Register, also known as the PC register, records the address of the bytecode instruction currently being executed by each thread.

image.png|500

Example:

During the loading phase, the virtual machine reads the instructions from the bytecode file into memory and converts the offsets from the original file into memory addresses. Each bytecode instruction has a memory address.

image.png|500

During code execution, the program counter records the address of the next bytecode instruction. After executing the current instruction, the virtual machine's execution engine uses the program counter to find and execute the next one. For simplicity, offsets are used here; in actual execution the program counter stores memory addresses.

image.png|500

Continuing down to the last line, the return statement is executed, ending the current method, and the program counter will hold the address of the method exit, which is the address to return to the calling method.

image.png|500

Thus, the program counter can control the flow of program instructions, implementing branching, jumping, exceptions, and other logic by simply placing the address of the next instruction to be executed in the program counter.

In a multi-threaded scenario, the program counter can also record the instruction address to be executed next before the CPU switches, facilitating a switch back to continue execution.

image.png|500

Question: Can the program counter experience memory overflow during runtime?

Memory overflow refers to a situation where the data that needs to be stored in a certain memory area exceeds the upper limit of memory that the virtual machine can provide.
Since each thread only stores a fixed-length memory address, the program counter will not experience memory overflow.
Programmers do not need to manage the program counter.

JVM Stack#

The Java Virtual Machine Stack uses a stack data structure to manage basic data during method calls, following the First In Last Out (FILO) principle. Each method call uses a stack frame to save its data.

image.png|500

The stack frame in the Java Virtual Machine Stack mainly contains three aspects:

Local variable table: Stores all local variables during method execution.
Operand stack: A section of the stack frame used by the virtual machine to store temporary data during instruction execution.
Frame data: Mainly contains dynamic linking, method exit, and references to the exception table.

Local Variable Table#

The local variable table is used to store all local variables during method execution.

The local variable table exists in two forms:

One is in the bytecode file.
The other is in the stack frame, stored in memory. The local variable table in the stack frame is generated based on the contents of the bytecode file.

Effective range: the range of bytecode in which the local variable can be accessed.
Starting PC is the offset from which the variable can be accessed, which ensures the variable has been initialized.
Length is the size of the effective range counted from the starting PC; for example, j stays valid up to the return instruction on line 4 of the bytecode.

The local variable table in the stack frame is an array, with one position being one slot. long and double occupy two slots.

image.png|500

For instance methods, the this reference and the method parameters are also stored at the beginning of the local variable table, in the order of their definitions.

image.png|500

Question: How many slots does the following code occupy?

public void test4(int k,int m){
    {
        int a = 1;
        int b = 2;
    }
    {
        int c = 1;
    }
    int i = 0;
    long j = 1;
}

Is this, k, m, a, b, c, i, j a total of 9 slots? Not quite.

To save space, slots in the local variable table can be reused: once a local variable is out of scope, its slot becomes available again. Here a, b, and c are not used after their blocks end, so their slots are reused by i and j. The this reference and the method parameters live for the entire method, so their slots are never reused. The method therefore needs 6 slots: this, k, and m occupy slots 0 to 2; a and b take slots 3 and 4; c and then i reuse slot 3; and the long j takes slots 4 and 5.

Thus, the number of slots in the local variable table should be the minimum required at runtime, which can be determined at compile time, and during execution, the stack frame simply creates a local variable table array of the corresponding length.

image.png|500
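The slot reuse described above can be replayed with a small model. This is only a sketch: SlotAllocator and its methods are hypothetical names, not part of the JDK, and the assumption is that a compiler hands out slots in declaration order and frees them when a block ends.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical model of how a compiler could assign local variable slots. */
class SlotAllocator {
    private final Deque<Integer> scopeStarts = new ArrayDeque<>();
    private int nextSlot = 0;  // first free slot in the current scope
    private int maxLocals = 0; // high-water mark = required table length

    /** Reserves width slots (1 for int or reference, 2 for long or double). */
    int declare(int width) {
        int slot = nextSlot;
        nextSlot += width;
        maxLocals = Math.max(maxLocals, nextSlot);
        return slot;
    }

    void enterScope() { scopeStarts.push(nextSlot); }

    /** Leaving a block frees its slots so later variables can reuse them. */
    void exitScope() { nextSlot = scopeStarts.pop(); }

    int maxLocals() { return maxLocals; }

    /** Replays the declarations of test4(int k, int m) from the text. */
    static int slotsForTest4() {
        SlotAllocator t = new SlotAllocator();
        t.declare(1);   // this -> slot 0
        t.declare(1);   // k    -> slot 1
        t.declare(1);   // m    -> slot 2
        t.enterScope();
        t.declare(1);   // a    -> slot 3
        t.declare(1);   // b    -> slot 4
        t.exitScope();
        t.enterScope();
        t.declare(1);   // c    -> slot 3 (reused)
        t.exitScope();
        t.declare(1);   // i    -> slot 3 (reused)
        t.declare(2);   // j    -> slots 4 and 5 (long)
        return t.maxLocals();
    }

    public static void main(String[] args) {
        System.out.println(slotsForTest4()); // prints 6
    }
}
```

The high-water mark is what matters: the stack frame only needs an array as long as the largest number of slots in use at any one time.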

Operand Stack#

The operand stack is a section of the stack frame used by the virtual machine to store intermediate data during instruction execution, following a stack structure.

The maximum depth of the operand stack can be determined at compile time, allowing for correct memory allocation during execution.

Example: The maximum depth of the operand stack is 2.

image.png|500
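To make the idea of a maximum depth concrete, here is a toy stack machine. It is a sketch under stated assumptions: it understands only iconst_N and iadd, and OperandStackDemo is a hypothetical name rather than anything the JVM exposes.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class OperandStackDemo {
    /** Runs a tiny instruction list and returns {result, maxDepth}. */
    static int[] run(String... instructions) {
        Deque<Integer> stack = new ArrayDeque<>();
        int maxDepth = 0;
        for (String ins : instructions) {
            if (ins.equals("iadd")) {
                // Pop two operands, push their sum.
                int right = stack.pop();
                int left = stack.pop();
                stack.push(left + right);
            } else if (ins.startsWith("iconst_")) {
                // Push the constant encoded in the instruction name.
                stack.push(Integer.parseInt(ins.substring("iconst_".length())));
            } else {
                throw new IllegalArgumentException("unsupported: " + ins);
            }
            maxDepth = Math.max(maxDepth, stack.size());
        }
        return new int[] { stack.pop(), maxDepth };
    }

    public static void main(String[] args) {
        int[] r = run("iconst_1", "iconst_2", "iadd");
        System.out.println("result=" + r[0] + ", max depth=" + r[1]); // result=3, max depth=2
    }
}
```

Because the instruction sequence is fixed, the same walk can be done at compile time, which is why the depth can be recorded in the bytecode file in advance.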

Frame Data#

Frame data mainly contains dynamic linking, method exit, and references to the exception table.

Dynamic Linking#

When bytecode instructions in the current class reference a field or method of another class, the symbolic reference (a number that indexes the constant pool) must be converted into the corresponding memory address in the runtime constant pool.

Dynamic linking stores this mapping from symbolic reference numbers to memory addresses in the runtime constant pool.

image.png|500

Method Exit#

Method exit refers to the situation where the current stack frame is popped when the method ends correctly or with an exception, and the program counter should point to the address of the next instruction in the previous stack frame, which is the address of the next line of instructions for the caller.

image.png|500

Exception Table#

The exception table stores information about exception handling in the code, including the effective range of exception catching and the bytecode instruction positions to jump to after an exception occurs.

Example: In this exception table, the starting offset for exception catching is 2 and the ending offset is 4. If an object of java.lang.Exception or one of its subclasses is thrown while executing in that range (the start offset is inclusive, the end offset is exclusive), it is caught and execution jumps to the instruction at offset 7.

image.png|500
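The lookup described above can be sketched as a linear scan over the table. This is a simplified model, not HotSpot's implementation: Entry, findHandler, and sampleTable are hypothetical names, and the catch type is represented directly as a Class object instead of a constant pool index.

```java
class ExceptionTableDemo {
    /** One row of a simplified exception table. */
    static final class Entry {
        final int startPc;   // inclusive
        final int endPc;     // exclusive
        final int handlerPc; // where execution continues when the row matches
        final Class<? extends Throwable> catchType;

        Entry(int startPc, int endPc, int handlerPc, Class<? extends Throwable> catchType) {
            this.startPc = startPc;
            this.endPc = endPc;
            this.handlerPc = handlerPc;
            this.catchType = catchType;
        }
    }

    /** Returns the handler offset for an exception thrown at pc, or -1 if no row matches. */
    static int findHandler(Entry[] table, int pc, Throwable thrown) {
        for (Entry e : table) {
            boolean inRange = pc >= e.startPc && pc < e.endPc;
            if (inRange && e.catchType.isInstance(thrown)) {
                return e.handlerPc;
            }
        }
        return -1; // not handled in this method; the exception propagates to the caller
    }

    /** The single-row table from the example: range 2 to 4, handler at 7, type Exception. */
    static Entry[] sampleTable() {
        return new Entry[] { new Entry(2, 4, 7, Exception.class) };
    }

    public static void main(String[] args) {
        Entry[] table = sampleTable();
        System.out.println(findHandler(table, 2, new IllegalStateException())); // 7
        System.out.println(findHandler(table, 4, new IllegalStateException())); // -1
    }
}
```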

Stack Memory Overflow#

If the stack frames exceed the maximum size that can be allocated for the stack memory, a memory overflow will occur, resulting in a StackOverflowError.

image.png|500

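You can set the stack size with the virtual machine parameter -Xss, for example -Xss1m or -Xss1024K. In this demo, a 1M virtual machine stack accommodated 10,676 stack frames; the exact number depends on how large each frame is.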

Each version of the JVM will also have requirements for stack size; the HotSpot JVM requires a minimum of 180K and a maximum of 1024M for JDK 8 on Windows 64-bit.
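A rough way to observe the limit is to recurse until StackOverflowError is thrown and count the frames. The sketch below is an illustration rather than a benchmark: StackDepthDemo is a hypothetical class, and the stack size passed to the Thread constructor is only a hint that the JVM may round or ignore, so the printed depth will differ between machines and JVM versions.

```java
class StackDepthDemo {
    private static int depth;

    private static void recurse() {
        depth++;
        recurse(); // no exit condition: each call pushes one more stack frame
    }

    /** Recurses on a thread with the requested stack size until the stack overflows. */
    static int maxDepth(long stackSizeBytes) {
        depth = 0;
        Thread t = new Thread(null, () -> {
            try {
                recurse();
            } catch (StackOverflowError expected) {
                // The thread's stack is exhausted; depth holds the number of frames reached.
            }
        }, "stack-depth-demo", stackSizeBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames with a 1 MB stack: " + maxDepth(1024 * 1024));
    }
}
```

Adding parameters or local variables to recurse() makes each frame larger and lowers the count, which is why a single fixed number of frames per megabyte does not exist.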

Native Method Stack#

In the HotSpot JVM, the Java Virtual Machine stack and the native method stack use the same stack space. The native method stack stores information such as parameters, local variables, and return values for native methods.

Native methods are methods implemented in native code such as C and compiled into the JVM; they are declared in Java with the native keyword so that Java code can call them.

Heap Memory#

In general, the heap memory in a Java program is the largest memory area, shared among threads.

All created objects exist on the heap, and the local variable table on the stack can store references to objects on the heap. Static variables can also store references to heap objects, allowing objects to be shared between threads through static variables.

image.png|500

Heap Memory Overflow#

The heap memory size has an upper limit. When objects are continuously added to the heap and reach this limit, an OutOfMemoryError (OOM) will be thrown. In this code, continuously creating 100M-sized byte arrays and placing them in an ArrayList will eventually exceed the heap memory limit and throw an OOM error.

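import java.io.IOException;
import java.util.ArrayList;

/**
 * Usage and recycling of heap memory
 */
public class Demo1 {
    public static void main(String[] args) throws InterruptedException, IOException {
        ArrayList<Object> objects = new ArrayList<Object>();
        // Wait for Enter so a monitoring tool can be attached first.
        System.in.read();
        while (true){
            // Every 100 MB array stays reachable through the list, so the GC cannot reclaim it.
            objects.add(new byte[1024 * 1024 * 100]);
            Thread.sleep(1000);
        }
    }
}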

Three Important Values#

There are three values to pay attention to in heap space: used, total, and max.

used refers to the currently used heap memory.
total is the available heap memory allocated by the JVM.
max is the maximum heap memory allowed by the JVM, meaning that total can expand to a maximum of max.

In Arthas, you can see the three values used, total, and max with the dashboard command. Its -i option sets the refresh interval in milliseconds (default 5000 ms), for example dashboard -i 5000.

If no virtual machine parameters are set, max defaults to 1/4 of the system memory, and total defaults to 1/64 of the system memory.

As the number of objects in the heap increases, used grows larger. If the available memory in total is insufficient, it will continue to request memory, with an upper limit of max.
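The same three values can also be read from inside the program through java.lang.Runtime. A minimal sketch, where HeapValuesDemo is a hypothetical class name and used is derived as totalMemory() minus freeMemory():

```java
class HeapValuesDemo {
    /** Returns {used, total, max} in bytes as reported by java.lang.Runtime. */
    static long[] snapshot() {
        Runtime rt = Runtime.getRuntime();
        long total = rt.totalMemory();        // heap currently reserved by the JVM
        long used = total - rt.freeMemory();  // part of total that is occupied
        long max = rt.maxMemory();            // upper limit that total may grow to
        return new long[] { used, total, max };
    }

    public static void main(String[] args) {
        long[] v = snapshot();
        long mb = 1024 * 1024;
        System.out.println("used=" + v[0] / mb + "M total=" + v[1] / mb + "M max=" + v[2] / mb + "M");
    }
}
```

Running it once with default settings and once with explicit -Xms and -Xmx values shows how total and max follow the parameters.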

Question: So, does the heap memory overflow when used = max = total?

No, the conditions for heap memory overflow are more complex, which will be detailed in the GC explanation.

Setting Heap Size#

To modify the size of the heap, you can use the virtual machine parameters -Xmx (the max value) and -Xms (the initial total).
Syntax: -Xmxvalue -Xmsvalue
Units: bytes (default, must be a multiple of 1024), k or K (KB), m or M (MB), g or G (GB).
Limitations: -Xmx max must be greater than 2 MB, and -Xms total must be greater than 1 MB.

Recommendation: Set -Xmx (max) and -Xms (total) to the same value. This avoids the overhead of repeatedly requesting and allocating more memory as the heap grows, and of the heap shrinking again when it has excess memory.

Method Area#

The method area is where basic information is stored, shared among threads, including:

Class metadata, which stores basic information about all classes.
Runtime constant pool, which stores the contents of the constant pool in bytecode files.
String constant pool, which stores string constants.

Class Metadata#

The method area stores the basic information of each class, also called its metadata; in HotSpot this is an InstanceKlass object.

This is completed during the class loading phase, which includes the fields, methods, and other contents from the bytecode file, as well as information needed during runtime, such as the virtual method table (the basis for implementing polymorphism).

image.png|500

Runtime Constant Pool#

In addition to storing class metadata, the method area also stores the runtime constant pool, which contains the contents of the constant pool in bytecode.

In the bytecode file, constants are found through a lookup table by number. This constant pool is called the static constant pool. When the constant pool is loaded into memory, it can be quickly located by memory address, which is called the runtime constant pool.

image.png|500

Implementation of Method Area#

The method area is a virtual concept designed in the "Java Virtual Machine Specification." Each Java virtual machine implements it differently. The Hotspot design is as follows:

In JDK 7 and earlier, the method area was stored in the permanent generation (PermGen) space within the heap, and its size was controlled by virtual machine parameters such as -XX:MaxPermSize.
In JDK 8 and later, the method area is stored in the metaspace, which lives in direct memory managed by the operating system. By default it can keep growing as long as it does not exceed the operating system's upper limit; -XX:MaxMetaspaceSize sets an explicit cap.

Overflow of Method Area#

Dynamically generating bytecode data using the ByteBuddy tool and continuously loading it into memory can simulate overflow in the method area.

String Constant Pool#

In addition to class metadata and the runtime constant pool, there is a section called the string constant pool StringTable in the method area.

The string constant pool stores constant string content defined in the code, such as “123”, where 123 will be placed in the string constant pool.

Objects created with new are stored in heap memory.

image.png|500

In early designs, the string constant pool was part of the runtime constant pool, and the two were stored in the same place. They were later split: from JDK 7 onward, the string constant pool is placed in heap memory.

image.png|500

Question: Are the addresses equal?

/**
 * String Constant Pool Example
 */
public class Demo2 {
    public static void main(String[] args) {
        String a = "1";
        String b = "2";
        String c = "12";
        String d = a + b;
        System.out.println(c == d);
    }
}

They do not point to the same address.

image.png|500

Question: Do the pointed addresses match?

package chapter03.stringtable;

/**
 * String Constant Pool Example
 */
public class Demo3 {
    public static void main(String[] args) {
        String a = "1";
        String b = "2";
        String c = "12";
        String d = "1" + "2";
        System.out.println(c == d);
    }
}

Checking the bytecode file reveals that during the compilation phase, 1 and 2 were concatenated directly, so both point to the same object in the string constant pool.

image.png|500

Summary of the two questions:

Concatenating string variables builds a new String object in heap memory at runtime (with StringBuilder in JDK 8), while concatenating constants is done directly by the compiler.

image.png|500

From JDK 7 onward, String.intern() returns the string held in the string constant pool. If no equal string is in the pool yet, it places a reference to this string into the pool and returns that reference.

Here, the string constant pool is automatically managed by the JVM.

image.png|500
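The two questions and the intern() behavior can be checked side by side. A small sketch (StringPoolDemo is a hypothetical class name); the local variables are deliberately not final, because final locals initialized with literals would be treated as compile-time constants:

```java
class StringPoolDemo {
    /** Variable concatenation happens at runtime and yields a new heap object. */
    static boolean variableConcatIsPooled() {
        String a = "1";
        String b = "2";
        String c = "12";
        String d = a + b;
        return c == d;
    }

    /** Constant concatenation is folded to "12" by the compiler. */
    static boolean constantConcatIsPooled() {
        String c = "12";
        String d = "1" + "2";
        return c == d;
    }

    /** intern() returns the pooled instance that the literal "12" also refers to. */
    static boolean internedConcatIsPooled() {
        String a = "1";
        String b = "2";
        String c = "12";
        String d = (a + b).intern();
        return c == d;
    }

    public static void main(String[] args) {
        System.out.println(variableConcatIsPooled()); // false
        System.out.println(constantConcatIsPooled()); // true
        System.out.println(internedConcatIsPooled()); // true
    }
}
```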

Question: Where are static variables stored?

In versions 6 and earlier, static variables were stored in the method area, which is the permanent generation.
In versions 7 and later, static variables are stored in the heap within the Class object, separating them from the permanent generation.

Direct Memory#

Direct memory does not exist in the "Java Virtual Machine Specification" and is not part of the Java runtime memory area. It was introduced in JDK 1.4 with the NIO mechanism, utilizing direct memory mainly to solve two problems:

If an object in the Java heap is no longer used, its reclamation will affect the creation and use of objects.
IO operations, such as reading files, require first reading the file into direct memory (buffer) and then copying the data to the Java heap.

Files can be placed in direct memory, and the heap can maintain references to direct memory, avoiding the overhead of data copying and the creation and reclamation of file objects.

image.png|500

The maximum size of direct memory can be set with the parameter -XX:MaxDirectMemorySize=size.
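In NIO, direct memory is requested through ByteBuffer.allocateDirect, while ByteBuffer.allocate returns a buffer backed by an array on the Java heap. A minimal sketch contrasting the two (DirectBufferDemo is a hypothetical class name):

```java
import java.nio.ByteBuffer;

class DirectBufferDemo {
    /** Buffer backed by a byte[] on the Java heap. */
    static ByteBuffer heapBuffer(int size) {
        return ByteBuffer.allocate(size);
    }

    /** Buffer backed by direct memory outside the Java heap. */
    static ByteBuffer directBuffer(int size) {
        return ByteBuffer.allocateDirect(size);
    }

    public static void main(String[] args) {
        ByteBuffer direct = directBuffer(1024);
        direct.putInt(0, 42); // write at absolute index 0
        System.out.println(direct.isDirect() + " " + direct.getInt(0)); // true 42
        System.out.println(heapBuffer(1024).isDirect());                // false
    }
}
```

The ByteBuffer object itself is a small heap object; only the data it points to lives in direct memory, which matches the description of the heap holding references to direct memory.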

JVM Garbage Collection#

In languages like C/C++ that do not have an automatic garbage collection mechanism, if an object is no longer used, it must be manually released; otherwise, memory leaks will occur. Memory leaks refer to unused objects that are not reclaimed in the system, and the accumulation of memory leaks may lead to memory overflow.

The process of releasing objects is called garbage collection. Java simplifies the release of objects by introducing an automatic garbage collection (GC) mechanism, with the garbage collector mainly responsible for reclaiming memory on the heap.

image.png|500

Question: Which parts of memory does the garbage collector manage?
For thread-private parts, they are created with thread creation and destroyed with thread destruction; method stack frames automatically pop from the stack and release memory after method execution, so they do not require garbage collection. Therefore, the parts that need garbage collection are the thread-shared method area and heap.

Collection of Method Area#

The contents that can be reclaimed in the method area mainly consist of classes that are no longer used.

To determine whether a class can be unloaded, it must meet the following conditions:

All instance objects of this class have been reclaimed, meaning there are no instances of this class or its subclasses in the heap.

Class<?> clazz = loader.loadClass("com.itheima.my.A");
Object o = clazz.newInstance();
o = null;

The class loader that loaded this class has been reclaimed.
The java.lang.Class object corresponding to this class is not referenced anywhere.

The two virtual machine parameters -XX:+TraceClassLoading and -XX:+TraceClassUnloading can be used to see logs of class loading and unloading (i.e., reclamation).

If you need to manually trigger garbage collection, you can call the System.gc() method, but it does not guarantee immediate garbage collection; it merely sends a request to the Java virtual machine for garbage collection, and whether garbage collection occurs will be determined by the JVM.

Heap Collection#

Reference Counting and Reachability Analysis#

To determine whether an object can be reclaimed, the GC checks whether the object is referenced. If the object is referenced, it indicates that the object is still in use and cannot be reclaimed.

Question: Do A and B need to remove their mutual references?

No, because they cannot be accessed through references anymore.

image.png|500

Common methods for determining whether an object can be reclaimed include reference counting and reachability analysis.

Reference Counting#

The reference counting method maintains a reference counter for each object. When an object is referenced, the counter increases by 1; when the reference is canceled, it decreases by 1.

In this case, canceling two references can make the reference counter return to 0, allowing it to be reclaimed.

image.png|500

However, in the following situation, objects A and B reference each other, so both counters are 1. No local variable references either object and the code can no longer reach them, so they should be reclaimable; under reference counting, however, their counters never drop to 0 and they are never reclaimed.
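The circular case can be simulated with explicit counters. This is a toy model, not how any JVM works: Node, refCount, and pointTo are hypothetical names, and the counters are adjusted by hand to mirror what a reference-counting collector would do.

```java
import java.util.ArrayList;
import java.util.List;

class RefCountDemo {
    static final class Node {
        final String name;
        int refCount; // number of incoming references
        final List<Node> fields = new ArrayList<>();

        Node(String name) { this.name = name; }

        /** Stores a reference to target in this object and bumps target's counter. */
        void pointTo(Node target) {
            fields.add(target);
            target.refCount++;
        }
    }

    /** Builds A <-> B, drops the two local references, and returns the remaining counts. */
    static int[] countsAfterDroppingLocals() {
        Node a = new Node("A");
        Node b = new Node("B");
        a.refCount++;  // a local variable references A
        b.refCount++;  // a local variable references B
        a.pointTo(b);  // A references B
        b.pointTo(a);  // B references A
        a.refCount--;  // the local variable for A is set to null
        b.refCount--;  // the local variable for B is set to null
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingLocals();
        System.out.println(counts[0] + " " + counts[1]); // 1 1
    }
}
```

Both counters stay at 1 even though nothing outside the pair can reach it, so a pure reference-counting collector would keep both objects forever.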

Reachability Analysis#

Java uses the reachability analysis algorithm to determine whether an object can be reclaimed.

Reachability analysis divides objects into two categories: garbage collection roots (GC Roots) and ordinary objects; there is a reference relationship between objects.

If an object can be reached from the root object (GC Roots), it is considered not reclaimable; GC Roots are not reclaimable.

image.png|500

GC Roots include:

Thread objects that reference method parameters, local variables, etc., in the thread stack frame.
java.lang.Class objects loaded by the system class loader.
Monitor objects that hold the synchronized lock.
Global objects used in native method calls.

Example: Thread object.
image.png|500
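Reachability analysis is essentially a graph traversal that starts at the GC Roots. The sketch below models objects as strings and references as an adjacency map; ReachabilityDemo, reachable, and sampleGraph are hypothetical names, and a real collector of course walks actual object fields rather than a map.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {
    /** Returns every object reachable from the roots by following references. */
    static Set<String> reachable(Map<String, List<String>> references, Collection<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (marked.add(current)) { // first visit: follow its outgoing references
                pending.addAll(references.getOrDefault(current, Collections.emptyList()));
            }
        }
        return marked;
    }

    /** root -> C, plus an unreachable cycle A <-> B. */
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("root", Arrays.asList("C"));
        graph.put("A", Arrays.asList("B"));
        graph.put("B", Arrays.asList("A"));
        return graph;
    }

    public static void main(String[] args) {
        Set<String> live = reachable(sampleGraph(), Collections.singletonList("root"));
        System.out.println(live.contains("C")); // true
        System.out.println(live.contains("A")); // false
    }
}
```

The mutually referencing pair A and B is never visited, so it is treated as garbage; this is the case that reference counting could not handle.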

Five Types of Object References#

The references described in the reachability analysis algorithm generally refer to strong references, meaning that if a GC Root object has a reference to an ordinary object, that ordinary object cannot be reclaimed.

Java has designed five types of reference methods:

Strong Reference
Soft Reference
Weak Reference
Phantom Reference
Finalizer Reference

Soft Reference#

Soft references are weaker than strong references. If an object is only associated with a soft reference, it will be reclaimed when the program runs out of memory.

image.png|500

The execution process of soft references is as follows:

Wrap the object using a soft reference: new SoftReference<ObjectType>(object).
When memory is insufficient, the virtual machine attempts to perform garbage collection.
If garbage collection still cannot resolve the memory shortage, the objects in the soft reference will be reclaimed.
If memory is still insufficient, an OutOfMemoryError will be thrown.

The code below places 100M of data behind a soft reference. Setting bytes = null removes the strong reference to the data, leaving only the soft reference held by SoftReference. If the virtual machine's max memory is set to 200M with -Xmx200m, the second call to softReference.get() returns null: when the second 100M array is created, memory is still insufficient after a normal GC, so the object in the soft reference is reclaimed, which frees enough memory to accommodate the new 100M of data.

byte[] bytes = new byte[1024 * 1024 * 100];
SoftReference<byte[]> softReference = new SoftReference<byte[]>(bytes);
bytes = null;
System.out.println(softReference.get());

byte[] bytes2 = new byte[1024 * 1024 * 100];
System.out.println(softReference.get());

If the object held by a soft reference is reclaimed because memory is short, the SoftReference object itself also needs to be reclaimed. SoftReference provides a queue mechanism for this:

When creating a soft reference, pass a reference queue through the constructor.
When the object contained in the soft reference is reclaimed, that soft reference object will be placed in the reference queue.
By traversing the reference queue through code, the strong reference to the SoftReference can be removed.

image.png|500

With a ReferenceQueue, the program keeps the SoftReference objects alive through strong references (here, the ArrayList). When the object wrapped by a soft reference is reclaimed, the SoftReference itself is placed into the reference queue. Polling the queue yields these SoftReference objects, so their strong references can be removed and they become reclaimable as well.

ArrayList<SoftReference<byte[]>> softReferences = new ArrayList<>();
ReferenceQueue<byte[]> queues = new ReferenceQueue<byte[]>();
for (int i = 0; i < 10; i++) {
	byte[] bytes = new byte[1024 * 1024 * 100];
	SoftReference<byte[]> studentRef = new SoftReference<byte[]>(bytes, queues);
	softReferences.add(studentRef);
}

// Count the soft references whose wrapped object was reclaimed and that were enqueued.
SoftReference<byte[]> ref = null;
int count = 0;
while ((ref = (SoftReference<byte[]>) queues.poll()) != null) {
	count++;
}
System.out.println(count);

You can extend SoftReference, storing _key in the constructor to clean up the key in the HashMap when the soft reference object is reclaimed.

image.png|500

private void cleanCache() {
	StudentRef ref = null;
	while ((ref = (StudentRef) q.poll()) != null) {
		StudentRefs.remove(ref._key);
	}
}
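Putting the pieces together, a cache built on keyed soft references might look like the sketch below. It is an assumption-laden illustration: SoftCacheDemo, KeyedRef, and simulateReclaim are hypothetical names, the values are plain byte arrays rather than student objects, and simulateReclaim calls Reference.enqueue() by hand so that the cleanup path can be exercised without waiting for a real memory shortage.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

class SoftCacheDemo {
    /** Soft reference that remembers the map key of its value. */
    static final class KeyedRef extends SoftReference<byte[]> {
        final String key;

        KeyedRef(String key, byte[] value, ReferenceQueue<byte[]> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<String, KeyedRef> refs = new HashMap<>();
    private final ReferenceQueue<byte[]> queue = new ReferenceQueue<>();

    void put(String key, byte[] value) {
        cleanCache(); // drop stale entries before adding a new one
        refs.put(key, new KeyedRef(key, value, queue));
    }

    /** Returns the cached value, or null if it is absent or was reclaimed. */
    byte[] get(String key) {
        KeyedRef ref = refs.get(key);
        return ref == null ? null : ref.get();
    }

    /** Removes map entries whose soft references have been enqueued. */
    void cleanCache() {
        KeyedRef ref;
        while ((ref = (KeyedRef) queue.poll()) != null) {
            refs.remove(ref.key, ref); // only remove if the entry still holds this reference
        }
    }

    int size() {
        return refs.size();
    }

    /** Demo hook: enqueues the reference manually instead of waiting for the GC. */
    boolean simulateReclaim(String key) {
        KeyedRef ref = refs.get(key);
        return ref != null && ref.enqueue();
    }

    public static void main(String[] args) {
        SoftCacheDemo cache = new SoftCacheDemo();
        cache.put("student-1", new byte[1024]);
        cache.simulateReclaim("student-1");
        cache.cleanCache();
        System.out.println(cache.size()); // 0
    }
}
```

Without the stored key, cleanCache would have no way of knowing which map entry belongs to an enqueued reference, since the wrapped object is already gone by then.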

Weak Reference#

Weak references are similar to soft references. The difference is that an object reachable only through weak references is reclaimed at the next garbage collection, regardless of whether memory is sufficient. The implementation class is WeakReference, which is mainly used in ThreadLocal. Weak references also support a reference queue: when the wrapped data is reclaimed, the weak reference is placed into the queue.

image.png|500

Manual GC causes the data wrapped by the weak reference to be directly reclaimed, resulting in null on the second attempt.

byte[] bytes = new byte[1024 * 1024 * 100];
WeakReference<byte[]> weakReference = new WeakReference<byte[]>(bytes);
bytes = null;
System.out.println(weakReference.get());

System.gc();

System.out.println(weakReference.get());

Phantom Reference and Finalizer Reference#

These two types of references are generally not used in regular development.

Phantom references, also known as ghost references, cannot be used to access the contained object. The only purpose of a phantom reference is to receive notifications when the object is reclaimed by the garbage collector. In Java, phantom references are implemented using PhantomReference, which is used in direct memory to know when direct memory objects are no longer used, allowing for memory reclamation.

Finalizer references mean that when an object needs to be reclaimed, it is placed in the reference queue of the Finalizer class and later taken from the queue by the FinalizerThread, which executes the object's finalize method. During finalize it is possible to attach the object to a strong reference again, but this is not recommended: if finalize takes too long, it can delay the reclamation of other objects.

Garbage Collection Algorithms#

Introduction#

For garbage collection, there are only two steps:

Find the live objects in memory.
Release the memory of the dead objects, allowing the program to reuse this space.

In 1960, John McCarthy published the first GC algorithm: the mark-and-sweep algorithm.
In 1963, Marvin L. Minsky published the copying algorithm.
Subsequent garbage collection algorithms, such as mark-and-compact and generational GC, are optimizations based on these two algorithms.

Standard Evaluation#

The Java garbage collection process is completed by a separate GC thread. However, regardless of which GC algorithm is used, there will be phases that require stopping all user threads. This process is called Stop The World (STW). If the STW time is too long, it will affect user experience.

User code execution and garbage collection execution alternate, causing user threads to stop during STW. To evaluate whether a GC algorithm is excellent, there are three aspects:

Throughput: Throughput refers to the ratio of CPU time spent executing user code to the total CPU execution time, i.e., throughput = execution time of user code / (execution time of user code + GC time). The higher the throughput value, the more efficient the garbage collection.
Maximum Pause Time: Maximum pause time refers to the maximum value of all STW times during the garbage collection process.
Heap Utilization Efficiency: Different garbage collection algorithms use heap memory differently. For example, the mark-and-sweep algorithm can utilize the entire heap memory, while the copying algorithm divides the heap memory into two, using only half at a time. In terms of heap utilization efficiency, the mark-and-sweep algorithm is superior to the copying algorithm.

Generally, the larger the heap memory, the longer the maximum pause time, and reducing the maximum pause time usually lowers throughput.
Moreover, heap utilization efficiency, throughput, and maximum pause time cannot all be optimized simultaneously.
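As a worked example of the first two metrics, the sketch below computes throughput and maximum pause time from a list of STW pauses. The numbers are made up for illustration, and GcMetricsDemo is a hypothetical class name.

```java
class GcMetricsDemo {
    /** throughput = user code time / (user code time + GC time) */
    static double throughput(long userMillis, long[] pauseMillis) {
        long gcMillis = 0;
        for (long pause : pauseMillis) {
            gcMillis += pause;
        }
        return (double) userMillis / (userMillis + gcMillis);
    }

    /** The longest single STW pause. */
    static long maxPause(long[] pauseMillis) {
        long max = 0;
        for (long pause : pauseMillis) {
            max = Math.max(max, pause);
        }
        return max;
    }

    public static void main(String[] args) {
        long[] pauses = { 50, 30, 20 };                   // three STW pauses, 100 ms in total
        System.out.println(throughput(9900, pauses));     // 0.99
        System.out.println(maxPause(pauses));             // 50
    }
}
```

Splitting the same 100 ms of GC work into ten 10 ms pauses would leave throughput unchanged in this formula while cutting the maximum pause to 10 ms; in practice more frequent collections add overhead, which is where the trade-off comes from.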

Mark-and-Sweep Algorithm#

The mark-and-sweep algorithm consists of two core phases:

Marking Phase: Marks all live objects. In Java, the reachability analysis algorithm is used, starting from GC Roots to traverse all reachable objects.
Sweeping Phase: Deletes unmarked (dead) objects from memory.

For example, if object D is unmarked, it will be cleared.
image.png|500

Advantages: Simple implementation, requiring only a flag for each object in the first phase and deleting objects in the second phase.
Disadvantages:

Fragmentation Problem: Memory is continuous, but deleted objects may not be adjacent, which leaves many small free memory units that cannot be used for larger allocations. For example, a total of 9 bytes may have been reclaimed, yet a 5-byte object still cannot be allocated because no single free fragment is large enough.
Slow Allocation Speed: Requires maintaining a free list to record memory fragments, and traversing to find suitable-sized memory fragments takes too long.
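The fragmentation problem can be reproduced with a toy heap of 18 cells. This is a sketch, not a real collector: MarkSweepDemo and Obj are hypothetical names, the mark bits are set by hand instead of by reachability analysis, and each cell stands for one byte.

```java
import java.util.ArrayList;
import java.util.List;

class MarkSweepDemo {
    static final class Obj {
        final int start;
        final int size;
        final boolean marked; // true = live, as decided by the marking phase

        Obj(int start, int size, boolean marked) {
            this.start = start;
            this.size = size;
            this.marked = marked;
        }
    }

    /** Sweep phase: drops unmarked objects and returns the occupancy map of the heap. */
    static boolean[] sweep(List<Obj> objects, int heapSize) {
        objects.removeIf(o -> !o.marked);
        boolean[] used = new boolean[heapSize];
        for (Obj o : objects) {
            for (int i = o.start; i < o.start + o.size; i++) {
                used[i] = true;
            }
        }
        return used;
    }

    static int totalFree(boolean[] used) {
        int free = 0;
        for (boolean cell : used) {
            if (!cell) free++;
        }
        return free;
    }

    static int largestFreeBlock(boolean[] used) {
        int best = 0;
        int run = 0;
        for (boolean cell : used) {
            run = cell ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    /** Six 3-byte objects fill the heap; every second one is dead. */
    static boolean[] sampleHeapAfterSweep() {
        List<Obj> objects = new ArrayList<>();
        for (int i = 0; i < 6; i++) {
            objects.add(new Obj(i * 3, 3, i % 2 == 0));
        }
        return sweep(objects, 18);
    }

    public static void main(String[] args) {
        boolean[] heap = sampleHeapAfterSweep();
        System.out.println(totalFree(heap));        // 9
        System.out.println(largestFreeBlock(heap)); // 3
    }
}
```

Nine bytes are free in total, but the largest hole is only three bytes, so a 5-byte allocation fails.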

Copying Algorithm#

The copying algorithm's core idea is:

Prepare two spaces, From space and To space. During the object allocation phase, only the From space is used.
During the garbage collection (GC) phase, all live objects in the From space are copied to the To space.
The names of the two spaces are swapped, ensuring that the From space is always the allocated and used space.

image.png|500

Advantages:

High throughput: Only requires one traversal of live objects and copying them to the To space, which is more efficient than the mark-and-compact algorithm, as it avoids an additional traversal, but it is not as efficient as the mark-and-sweep algorithm due to the extra object movement.
No fragmentation: Objects are placed in order.

Disadvantages:

Only half of the memory space can be used at a time.
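The From/To mechanics can be sketched with two arrays. This is a simplified model under stated assumptions: CopyingGcDemo is a hypothetical class, objects are strings occupying one slot each, and the set of live objects is passed in instead of being found by reachability analysis.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CopyingGcDemo {
    private String[] from;
    private String[] to;
    private int top = 0; // next free index in From space

    CopyingGcDemo(int semiSpaceSize) {
        from = new String[semiSpaceSize];
        to = new String[semiSpaceSize];
    }

    /** Allocation only ever uses From space; returns false when it is full. */
    boolean allocate(String obj) {
        if (top == from.length) {
            return false;
        }
        from[top++] = obj;
        return true;
    }

    /** Copies live objects to To space, then swaps the roles of the two spaces. */
    void collect(Set<String> live) {
        int toTop = 0;
        for (int i = 0; i < top; i++) {
            if (live.contains(from[i])) {
                to[toTop++] = from[i]; // survivors end up packed next to each other
            }
        }
        Arrays.fill(from, null);       // the old From space is now entirely free
        String[] tmp = from;
        from = to;
        to = tmp;
        top = toTop;
    }

    String[] fromSpace() {
        return from.clone();
    }

    int used() {
        return top;
    }

    public static void main(String[] args) {
        CopyingGcDemo heap = new CopyingGcDemo(4);
        for (String name : new String[] { "A", "B", "C", "D" }) {
            heap.allocate(name);
        }
        heap.collect(new HashSet<>(Arrays.asList("A", "C")));
        System.out.println(Arrays.toString(heap.fromSpace())); // [A, C, null, null]
    }
}
```

Allocation after a collection is just a pointer bump at the end of the packed survivors, which is why the copying algorithm has no fragmentation, at the cost of keeping the second array idle.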

Mark-and-Compact Algorithm#

The mark-and-compact algorithm, also known as the mark-and-compress algorithm, solves the fragmentation problem that arises from the mark-and-sweep algorithm.

The mark-and-compact algorithm's core idea is:

Marking Phase: Marks all live objects, using the reachability analysis algorithm from GC Roots to traverse all reachable objects.
Compaction Phase: Moves live objects to one end of the heap and clears the memory space of the non-live objects.

image.png|500

Advantages:

High memory utilization efficiency: The entire heap memory can be used, unlike the copying algorithm, which can only use half of the heap memory.
No fragmentation: During the compaction phase, objects can be moved to one side of memory, leaving the remaining space as effective space for allocating objects.

Disadvantages:

The efficiency of the compaction phase is not high. There are many types of compaction algorithms; for example, the Lisp2 compaction algorithm requires searching through the entire heap for objects three times, resulting in poor overall performance.
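The compaction phase can be illustrated with a single sliding pass over an array. Note that this is a deliberately simplified sketch and not the Lisp2 algorithm: it does not update references between objects, which is what makes real compaction need additional passes. MarkCompactDemo is a hypothetical name, and null stands for a free slot.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkCompactDemo {
    /** Slides live objects to the start of the heap and returns how many slots they occupy. */
    static int compact(String[] heap, Set<String> live) {
        int free = 0; // next destination index; never ahead of the scan position
        for (int scan = 0; scan < heap.length; scan++) {
            String obj = heap[scan];
            if (obj != null && live.contains(obj)) {
                heap[free++] = obj;
            }
        }
        for (int i = free; i < heap.length; i++) {
            heap[i] = null; // everything behind the live block becomes free space
        }
        return free;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", null, "C", "D" };
        Set<String> live = new HashSet<>(Arrays.asList("A", "C", "D"));
        int used = compact(heap, live);
        System.out.println(used + " " + Arrays.toString(heap)); // 3 [A, C, D, null, null]
    }
}
```

Unlike the copying algorithm, the whole array stays usable, but every surviving object may have to move.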

Generational Garbage Collection Algorithm#

The generational garbage collection algorithm combines the ideas of the above algorithms, dividing the entire memory area into young and old generations.
image.png|500

JDK 8 can use the JVM parameter -XX:+UseSerialGC to run the program with generational GC. You can view the three areas using the memory command in Arthas.

Eden space + Survivor space form the young generation; tenured_gen refers to the promotion area, i.e., the old generation.

Related JVM Parameters:

Parameter Name	Parameter Meaning	Example
-Xms	Sets the minimum and initial size of the heap; must be a multiple of 1024 and greater than 1MB	Initial size 6MB: -Xms6291456, -Xms6144k, -Xms6m
-Xmx	Sets the maximum size of the heap; must be a multiple of 1024 and greater than 2MB	Maximum heap 80MB: -Xmx83886080, -Xmx81920k, -Xmx80m
-Xmn	Sets the size of the young generation	Young generation 256MB: -Xmn256m, -Xmn262144k, -Xmn268435456
-XX:SurvivorRatio	Ratio of Eden to each Survivor space, default 8. With 1g of memory in the young generation, Eden gets about 800MB and S0 and S1 about 100MB each	To adjust the ratio to 4: -XX:SurvivorRatio=4
-XX:+PrintGCDetails or -verbose:gc	Prints GC logs	None

Heap refers to available heap, while the survivor area can only use one block at a time.

Note that the SurvivorRatio ratio means eden:s0:s1 = SurvivorRatio:1:1.

Execution Flow:

Under the generational garbage collection algorithm, newly created objects are first placed in the Eden area.
As objects accumulate in the Eden area, when it becomes full and cannot accommodate new objects, it triggers a young generation GC, known as Minor GC or Young GC; Minor GC will reclaim objects from Eden and the From space, placing the remaining objects in the To space. Here, From and To refer to the two survivor areas, using the idea of the copying algorithm.
The two survivor areas, From and To, are then swapped, with S1 becoming From and S0 becoming To. When the Eden area fills up, Minor GC will still occur, reclaiming objects from Eden and the From S1 area, placing the remaining objects in the To area, i.e., S0 space; note that each Minor GC records the age of objects, starting at 0 and incrementing by 1 after each GC.
If the age of an object reaches the threshold (maximum 15, default value depends on the garbage collector), it will be promoted to the old generation.
In the G1 collector, where the heap is divided into Regions, an object larger than half a Region is placed directly into the old generation. This kind of old generation area is called the Humongous area. For example, if the heap memory is 4G and each Region is 2M, any object larger than 1M is placed in the Humongous area, and a very large object may span multiple Regions.
Also in G1, many old generation Regions appear after multiple collections. When the total heap occupancy reaches the threshold -XX:InitiatingHeapOccupancyPercent (default 45%), a mixed GC is triggered. It reclaims objects from the young generation, part of the old generation, and the large object areas, using the copying algorithm.
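The first steps of the flow above (allocation in Eden, Minor GC into the To survivor, swapping From and To, promotion by age) can be sketched as a small simulation. It rests on several assumptions: GenerationalDemo is a hypothetical class, the spaces are unbounded lists, the live set is passed in by the caller, and the promotion threshold is a constructor argument rather than a JVM setting.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class GenerationalDemo {
    final List<String> eden = new ArrayList<>();
    List<String> from = new ArrayList<>(); // survivor space currently holding objects
    List<String> to = new ArrayList<>();   // empty survivor space
    final List<String> old = new ArrayList<>();
    private final Map<String, Integer> age = new HashMap<>();
    private final int threshold;

    GenerationalDemo(int threshold) {
        this.threshold = threshold;
    }

    /** New objects start in Eden with age 0. */
    void allocate(String obj) {
        eden.add(obj);
        age.put(obj, 0);
    }

    /** Minor GC: survivors of Eden and From are aged, then copied to To or promoted. */
    void minorGc(Set<String> live) {
        List<String> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (String obj : candidates) {
            if (!live.contains(obj)) {
                age.remove(obj); // dead object: simply not copied
                continue;
            }
            int newAge = age.merge(obj, 1, Integer::sum);
            if (newAge >= threshold) {
                old.add(obj);    // promoted to the old generation
            } else {
                to.add(obj);
            }
        }
        eden.clear();
        from.clear();
        List<String> tmp = from; // swap the roles of the two survivor spaces
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        GenerationalDemo heap = new GenerationalDemo(2);
        heap.allocate("x");
        heap.minorGc(Collections.singleton("x"));
        System.out.println(heap.from); // [x]
        heap.allocate("y");
        heap.minorGc(Collections.singleton("x"));
        System.out.println(heap.old);  // [x]
    }
}
```

In the run above, y dies in Eden and is never copied, while x survives two collections and reaches the threshold, which is the behavior that makes frequent, cheap young generation collections worthwhile.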

Why does the generational GC algorithm divide the heap into young and old generations?

The characteristics of objects in heap memory:

Most objects in the system are created and quickly become unused and reclaimable, such as user order data that can be released after being returned to the user.
The old generation stores long-lived objects, such as most Spring bean objects, which will not be reclaimed after the program starts.
In the default settings of the virtual machine, the size of the young generation is much smaller than that of the old generation.

The reasons for dividing the heap into young and old generations in the generational GC algorithm are:

It allows adjusting the ratio of the young and old generations to adapt to different types of applications, improving memory utilization and performance.
The young and old generations use different garbage collection algorithms; the young generation typically uses the copying algorithm, while the old generation can use mark-and-sweep or mark-and-compact algorithms, providing greater flexibility for programmers.
The design of generational GC allows for only reclaiming the young generation (Minor GC). If it meets the object allocation requirements, there is no need to perform a full heap collection (Full GC), reducing STW time.

Garbage Collector#

The garbage collector is the specific implementation of a garbage collection algorithm. Because collectors are designed for either the young generation or the old generation, every collector other than G1 must be used in a pair, one for each generation.

image.png|500

Serial - Serial Old#

JVM parameter -XX:+UseSerialGC uses this pair of GC.

Young Generation - Serial Garbage Collector

Serial is a single-threaded garbage collector for the young generation.

image.png|500

Old Generation - Serial Old Garbage Collector

SerialOld is the old generation version of the Serial garbage collector, using single-threaded collection.

image.png|500

ParNew - CMS#

Young Generation - ParNew Garbage Collector

JVM parameter -XX:+UseParNewGC uses ParNew GC.

ParNew garbage collector is essentially an optimization of Serial for multi-CPU environments, using multi-threading for garbage collection.

image.png|500

Old Generation - CMS Concurrent Mark Sweep Garbage Collector

JVM parameter -XX:+UseConcMarkSweepGC uses CMS GC.

CMS garbage collector focuses on system pause time, allowing user threads and garbage collection threads to execute concurrently during certain steps, reducing user thread wait times.

image.png|500

CMS execution steps:

Initial marking: Quickly marks objects that can be directly associated with GC Roots.
Concurrent marking: Marks all objects without pausing user threads.
Final marking: Due to changes during the concurrent marking phase, some objects may be incorrectly marked or missed, requiring re-marking.
Concurrent cleaning: Cleans up dead objects without pausing user threads.

Disadvantages:

CMS uses the mark-and-sweep algorithm, which can leave significant memory fragmentation after garbage collection. CMS performs compaction during Full GC, which pauses user threads. The -XX:CMSFullGCsBeforeCompaction=N parameter (default 0) sets how many Full GCs run before a compaction is performed.
It cannot handle the "floating garbage" generated during concurrent cleaning, meaning it cannot achieve complete garbage collection.
If the old generation runs out of memory and cannot allocate objects, CMS degrades to Serial Old, a single-threaded collector that pauses all user threads while it collects the old generation.

Parallel Scavenge - Parallel Old#

Young Generation - Parallel Scavenge Garbage Collector

Parallel Scavenge is the default young generation garbage collector in JDK 8, using multi-threaded parallel collection and focusing on system throughput, automatically adjusting heap memory size.

image.png|500

Old Generation - Parallel Old Garbage Collector

JVM parameter -XX:+UseParallelGC or -XX:+UseParallelOldGC can use the combination of Parallel Scavenge + Parallel Old.

Parallel Old is the old generation version designed for the Parallel Scavenge garbage collector, using multi-threaded parallel collection.

image.png|500

G1 Garbage Collector#

From JDK 9 onward, the default garbage collector is the G1 (Garbage First) garbage collector.

Parallel Scavenge focuses on throughput and allows users to set maximum pause