Introduction to Java Virtual Machine (JVM)

Tuesday, 02/05/2023

Tram Ho

1. Introduction

The Java Virtual Machine (JVM) is a central component of the Java platform and is what actually runs Java applications. It is a virtual machine that executes Java code compiled to bytecode, which is what makes the “Write Once, Run Anywhere” (WORA) model possible. The purpose of the JVM is to let Java applications run on a variety of operating systems and hardware without recompiling the source code. It does this by acting as an abstraction layer between the compiled program and the operating system: the JVM executes the bytecode and manages memory on the application’s behalf. Understanding the JVM’s structure, how it works, and the role it plays in running applications helps Java programmers get the most out of the platform, optimize application performance, and diagnose memory and performance issues.

2. JVM structure and main components

2.1. Class Loader

Class Loader is the component responsible for loading Java classes into the JVM. It reads .class files containing bytecode and turns them into the JVM’s in-memory representation of each class, which programs see as java.lang.Class objects. There are three built-in class loaders:

Bootstrap Class Loader : loads the core Java classes (for example, those in java.lang)
Extension Class Loader : loads Java extension classes (since Java 9 this role is filled by the Platform Class Loader)
Application Class Loader : loads the classes defined by the Java program, found on the class path

Class loaders make class loading flexible, since classes can come into the JVM from various sources, but they can also cause problems such as different versions of the same class being loaded or class name conflicts. The class loading process in the JVM includes the following stages:

Loading : The Class Loader reads the class’s bytecode and creates an in-memory representation of the class in the Runtime Data Area.
Linking : The JVM links the class in three steps:
- Verification : Checks that the bytecode is well formed and obeys the JVM’s structural and safety rules.
- Preparation : Allocates memory for the class’s static variables and sets them to their default values.
- Resolution : Replaces symbolic references to other classes, fields, and methods with direct references.
Initialization : Runs the static variable initializers and static blocks of the class.
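
The class loaders described above can be observed from a running program. The following is a minimal sketch, not part of the original article: ClassLoaderDemo and describeLoader are illustrative names, and the exact loader class names printed depend on the JDK in use.

```java
// Illustrative sketch: the class and method names here are assumptions, not a standard API.
final class ClassLoaderDemo {

    // Returns a readable name for the loader that defined the given class.
    // Classes loaded by the bootstrap loader report null from getClassLoader().
    static String describeLoader(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("String          -> " + describeLoader(String.class));
        System.out.println("ClassLoaderDemo -> " + describeLoader(ClassLoaderDemo.class));

        // Walk the parent chain from the application loader upward.
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        while (loader != null) {
            System.out.println("loader: " + loader);
            loader = loader.getParent();
        }
        System.out.println("loader: bootstrap (represented as null)");
    }
}
```

A core class such as String reports the bootstrap loader, while a class written by the application reports a loader further down the chain.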

2.2. Runtime Data Area

Runtime Data Area is an area of memory used by the JVM to store data during program execution. It includes the following parts:

2.2.1. Method Area

The Method Area is a memory area that stores per-class structures: the class name, information about the class’s fields and methods, and the bytecode of the methods. It is shared among all threads in the JVM and is created when the JVM starts. Conceptually, it holds the following:

Constant Pool : contains constants used by the class, such as integers, strings, and references to classes, methods, and fields.
Field information : the fields of the class, including field names, data types, and access modifiers.
Method information : the methods of the class, including the method name, return type, parameter list, access modifiers, and the method’s bytecode.
Class information : the class name, the parent class, the list of interfaces that the class implements, and other information about the class.

All threads in the JVM can use the Method Area to access information about the classes and methods that have been loaded. However, the Method Area can run out of space, for example when very many or very large classes are loaded, in which case the JVM throws an OutOfMemoryError.

2.2.2. Heap

The heap is the memory area that stores the objects and arrays created by a Java program. Like the Method Area, it is shared among all threads. When the program creates an object, the JVM finds free space in the heap and allocates memory for it. The heap is managed by the Garbage Collector (GC), which finds objects that are no longer reachable and reclaims their memory. This reduces the risk of running out of memory and spares the program from freeing objects by hand. The heap can still cause problems: it can fill up (the JVM then throws an OutOfMemoryError), threads can interfere with each other when they modify the same object without synchronization, and very large objects put pressure on the collector. Java programs should therefore be designed to use the heap efficiently.
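
As a rough illustration of heap allocation, the sketch below reads the heap figures that java.lang.Runtime exposes before and after allocating some arrays. HeapStats and usedBytes are hypothetical names chosen for this example, and the numbers printed vary from run to run and between JVMs.

```java
// Illustrative sketch: HeapStats is an assumed name, not part of the JDK.
final class HeapStats {

    // Approximate heap bytes in use: memory reserved by the JVM minus the free part of it.
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap (bytes):  " + rt.maxMemory());
        System.out.println("used before alloc: " + usedBytes());

        // Allocate 16 arrays of 1 MiB each; they live on the heap while 'blocks' is reachable.
        byte[][] blocks = new byte[16][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1024 * 1024];
        }
        System.out.println("used after alloc:  " + usedBytes());
        System.out.println("arrays held:       " + blocks.length);
    }
}
```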

2.2.3. Stack

The Stack is the memory area that holds a stack frame for each method call in each thread of a Java program. Each frame contains the method’s local variables and parameters, along with an operand stack for intermediate results. Every thread has its own stack, so the execution of one thread does not affect the frames of another. When a method is called, a new frame is created and pushed onto the current thread’s stack; when the method completes, the frame is popped and its memory is released. Frames also carry what is needed to return to the caller, such as the return address and the place to put the return value. When an exception is thrown, the JVM searches the stack, starting from the frame where the exception occurred, for a method with a matching handler, and discards the frames it passes on the way. Using too much stack memory, for example through very deep recursion, leads to a StackOverflowError. Programs should therefore be designed to keep call depth and per-frame data within reasonable limits.
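
To make stack exhaustion concrete, the sketch below recurses without a base case until the thread’s stack is full and reports how many frames were pushed. StackDepthDemo and measureDepth are illustrative names; the depth reached depends on the JVM, the stack size setting, and the size of each frame, so no particular number should be expected.

```java
// Illustrative sketch: deliberately overflows the current thread's stack.
final class StackDepthDemo {
    private static int depth;

    // Each call pushes one more frame onto the calling thread's stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the stack is exhausted and returns the number of calls made.
    static int measureDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The frames have been unwound by the time control reaches this handler.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("frames before StackOverflowError: " + measureDepth());
    }
}
```

Catching StackOverflowError is done here only for demonstration; real code should avoid unbounded recursion rather than recover from it.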

2.2.4. Program Counter (PC) Register

The Program Counter (PC) Register holds the address of the bytecode instruction currently being executed. Each thread has its own PC Register, so the JVM always knows where each thread is in the program. When a method is called, the PC Register is updated to point to the first instruction of that method. As execution proceeds, it is updated each time the JVM moves on to the next instruction. When the method returns, the PC Register goes back to the instruction that follows the method call. Jump instructions work the same way: executing one sets the PC Register to the address of the target instruction. The PC Register does not store program data; it only holds the address of the instruction being executed.

2.2.5. Native Method Stack

The Native Method Stack is a separate stack used for native methods, i.e. methods written in a programming language other than Java (typically C or C++). It stores the information needed to execute those methods. Each thread has its own Native Method Stack, and its frames contain the parameters and local variables of the native methods being executed. It is independent of the Java stack described above. While a thread is executing a native method, the value of its PC Register is undefined. When a native method finishes, its frame is removed from the Native Method Stack. Because native methods are written in another language, they can introduce problems related to memory management and compatibility between languages that the JVM cannot guard against.
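
Native methods are visible through reflection, because they are declared in Java with the native modifier and implemented elsewhere. The sketch below checks that modifier for two JDK methods. NativeMethodCheck is an illustrative name, and the claim that Object.hashCode is native reflects OpenJDK-based runtimes; other implementations may differ.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Illustrative sketch: reports whether a public no-argument method is declared native.
final class NativeMethodCheck {

    static boolean isNative(Class<?> type, String methodName) throws NoSuchMethodException {
        Method method = type.getMethod(methodName);
        return Modifier.isNative(method.getModifiers());
    }

    public static void main(String[] args) throws NoSuchMethodException {
        // On OpenJDK-based runtimes, Object.hashCode is implemented in native code.
        System.out.println("Object.hashCode native? " + isNative(Object.class, "hashCode"));
        // String.length is ordinary Java code.
        System.out.println("String.length native?   " + isNative(String.class, "length"));
    }
}
```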

2.3. Execution Engine

Execution Engine is the component responsible for executing bytecode in JVM. It includes:

2.3.1. Interpreter

The Interpreter is the component that reads and executes the bytecode of a Java program. When a Java program is compiled, the source code is converted to bytecode; the Interpreter then reads the bytecode instructions and executes them one at a time. It uses the values on the operand stack to perform calculations, assignments, and the other operations that the instructions describe. Because the same bytecode is interpreted by each platform’s JVM, a Java program can run on different platforms without being recompiled for each one. However, executing one instruction at a time is relatively slow. To improve performance, the JVM can use a Just-In-Time (JIT) Compiler to compile frequently used portions of bytecode into machine code and keep it in memory for reuse the next time that code runs.
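
The fetch-and-execute loop of an interpreter can be sketched with a toy stack machine. This is not how the real JVM is implemented: the opcodes PUSH, ADD, MUL, and HALT are invented for this example and are not real JVM bytecodes. The sketch only shows how a program counter and an operand stack cooperate.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative sketch of an interpreter loop with made-up opcodes.
final class ToyInterpreter {
    static final int PUSH = 0; // followed by one operand: the value to push
    static final int ADD = 1;  // pops two values, pushes their sum
    static final int MUL = 2;  // pops two values, pushes their product
    static final int HALT = 3; // stops and returns the top of the stack

    static int run(int[] code) {
        Deque<Integer> operands = new ArrayDeque<>();
        int pc = 0; // index of the next instruction to execute
        while (true) {
            int opcode = code[pc++];
            switch (opcode) {
                case PUSH:
                    operands.push(code[pc++]);
                    break;
                case ADD:
                    operands.push(operands.pop() + operands.pop());
                    break;
                case MUL:
                    operands.push(operands.pop() * operands.pop());
                    break;
                case HALT:
                    return operands.pop();
                default:
                    throw new IllegalStateException("unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        // Encodes (2 + 3) * 4.
        int[] program = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT};
        System.out.println(run(program)); // prints 20
    }
}
```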

2.3.2. Just-In-Time (JIT) Compiler

The Just-In-Time (JIT) Compiler speeds up Java programs by compiling frequently invoked portions of bytecode into machine code while the program is running. As the program executes, the JVM counts how often methods and loops run and identifies the frequently executed sections (hot code). The JIT Compiler compiles these sections into machine code and stores it in memory for reuse on subsequent calls. Because compilation happens only at run time, and only for code that has been identified as hot, the JIT Compiler is described as a dynamic optimizer. Compilation itself costs time and system resources, so JIT-compiled programs typically start slower and speed up as they warm up; this trade-off is worth keeping in mind for short-lived programs.
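
JIT activity can be observed, roughly, through the standard java.lang.management API. The sketch below calls a small method many times so that it becomes hot and then prints the JIT compiler’s name and accumulated compilation time when the JVM reports them. JitObserver and sumOfSquares are illustrative names, and the loop counts are arbitrary.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Illustrative sketch: warms up a method and reports JIT compilation time if available.
final class JitObserver {

    static long sumOfSquares(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += (long) i * i;
        }
        return total;
    }

    public static void main(String[] args) {
        long result = 0;
        // Repeated calls make sumOfSquares a candidate for JIT compilation.
        for (int round = 0; round < 20_000; round++) {
            result += sumOfSquares(1_000);
        }
        System.out.println("result: " + result);

        // The bean is null when the JVM has no compilation system.
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit != null && jit.isCompilationTimeMonitoringSupported()) {
            System.out.println(jit.getName() + " spent " + jit.getTotalCompilationTime() + " ms compiling");
        }
    }
}
```

On HotSpot, running the same program with -Xint (interpreter only) or with -XX:+PrintCompilation is a simple way to compare interpreted and compiled execution.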

2.3.3. Garbage Collector (GC)

The Garbage Collector (GC) is the component that reclaims heap memory occupied by objects the Java program no longer uses. When an object can no longer be reached by the program, the GC releases its memory and returns it to the heap for use by other objects. This helps prevent out-of-memory situations, keeps the program running stably, and avoids a whole class of memory-related errors. The GC runs automatically, so the Java program does not need to manage memory in detail.

The GC frees heap memory using one of several collection algorithms. Each has its own advantages and limitations and suits different kinds of Java programs. Here are some common garbage collectors in the JVM:

Serial GC : One of the oldest and simplest collectors in the JVM. It uses a single thread for garbage collection and pauses the whole application while it runs.
Parallel GC : Uses multiple threads to collect garbage, which shortens collection time on multi-core machines; the application is still paused during collection.
Concurrent Mark Sweep (CMS) GC : Designed to minimize pause times by doing most of its work concurrently with the application threads. CMS was deprecated in JDK 9 and removed in JDK 14.
Garbage First (G1) GC : Supported since JDK 7u4 and the default collector since JDK 9. G1 is designed to address the performance and scalability issues of CMS while keeping pause times low.
Z Garbage Collector (ZGC) : Oracle’s collector, introduced as an experimental feature in Java 11. ZGC is designed for applications that need very low pause times, even on large heaps.
Shenandoah GC : A concurrent, low-pause collector developed by Red Hat and added to OpenJDK as an experimental feature in JDK 12 (2019).

However, garbage collection can also hurt performance if it runs too often or if each collection takes too long. It is therefore important to design the program, and to choose and tune the collector, so that garbage collection does not undermine the efficiency and stability of the program.
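
Which collector a JVM is actually using can be checked at run time through the standard management API. The sketch below lists the active collectors with their collection counts and times. GcReport and collectorNames are illustrative names, and the collector names printed depend on the JVM and on the options it was started with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: lists the garbage collectors the running JVM reports.
final class GcReport {

    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are -1 when the collector does not report them.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

On HotSpot the collector is selected with options such as -XX:+UseSerialGC, -XX:+UseParallelGC, -XX:+UseG1GC, -XX:+UseZGC, or -XX:+UseShenandoahGC, subject to what the particular JDK build includes.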

3. How the JVM works and its role in running Java applications

The JVM plays an important role in running Java applications. The main steps in this process include:

3.1. JVM startup process

When you run a Java application, the JVM is started by the java launcher that ships with the Java Development Kit (JDK) or the Java Runtime Environment (JRE). Both include the components needed to run Java programs: the JVM, the class libraries, and supporting tools. From Java 11 onwards, Oracle no longer ships a separate JRE; the JDK itself contains everything needed to run applications. The JVM startup process goes through the following steps:

3.1.1. JVM Initialization

Since Java 9, the JDK’s runtime is packaged as a modular runtime image, and the jlink tool can build a custom image that contains only the modules a program needs. During startup, the JVM loads the modules it needs from that image to execute the program. JVM parameters can be set with command-line options; they can also be supplied through the JAVA_TOOL_OPTIONS environment variable or through argument files passed to the launcher as @file. Some important parameters and configurations of the JVM include:

Heap Memory Size: The size of the heap available to the Java program, for example the initial size (-Xms) and the maximum size (-Xmx).
Garbage collection parameters: These choose the garbage collector and tune how it behaves, for example by setting a target maximum pause time.
Runtime parameters: These control other aspects of run-time behavior, such as the per-thread stack size (-Xss) or system properties passed with -D.
Memory management parameters: These specify how memory is managed in the JVM, including how it is allocated and how the memory areas are sized.

These parameters and configurations can be customized to meet the different requirements and conditions of the Java program.
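
A program can inspect the options its own JVM was started with. The sketch below uses the standard RuntimeMXBean to list them. JvmOptionsDump and jvmArguments are illustrative names, and the launch command in the comment is only an example of commonly used HotSpot options.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Illustrative sketch. Example launch (values chosen arbitrarily):
//   java -Xms256m -Xmx1g -Xss512k -XX:+UseG1GC JvmOptionsDump
final class JvmOptionsDump {

    // Options passed to the JVM itself, not the arguments passed to main().
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("JVM options: " + jvmArguments());
        System.out.println("max heap (bytes): " + Runtime.getRuntime().maxMemory());
        System.out.println("java.version: " + System.getProperty("java.version"));
    }
}
```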

3.1.2. Load starter class

After the JVM is initialized, it starts loading Java classes. Class loading is performed by the Class Loaders and takes place in the stages of loading, linking, and initialization (details in the Class Loader section). The Bootstrap Class Loader runs first and loads the core Java classes, including those in the java.lang, java.util, and java.io packages. Once the core classes are available, the JVM loads the application class that the user specified when starting the Java application. This class contains the main() method and is the starting point for the execution of the program.

3.1.3. Initialize the application class

After loading the application class, the JVM initializes it and then calls its static main() method. No instance of the class is created for this call. The main() method is the first application method to run, and it is the starting point for the program’s execution flow.

3.2. Application’s code execution

After the JVM startup is complete, the application’s code execution begins. Here are the main steps in the application’s code execution:

3.2.1. Loading and linking classes

During the execution of the application’s code, the JVM loads and links the classes the application uses. Class loading is the process of reading a .class file from the file system or from the network, parsing its structure, and generating an in-memory representation of the class. The JVM uses Class Loaders to perform class loading (details in the Class Loader section). Application classes are usually loaded by the Application Class Loader, which first delegates to its parent loaders and then searches the directories and JAR files on the application’s CLASSPATH. If the class cannot be found, loading fails with a ClassNotFoundException or a NoClassDefFoundError.
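
Loading a class by name, and the failure that results when it cannot be found, can be sketched with Class.forName. LoadByName and tryLoad are illustrative names, and com.example.DoesNotExist is a made-up class name used only to trigger the failure path.

```java
import java.util.Optional;

// Illustrative sketch: asks the class loader for a class by its fully qualified name.
final class LoadByName {

    // Returns the loaded class, or an empty Optional if no loader can find it.
    static Optional<Class<?>> tryLoad(String className) {
        try {
            Class<?> type = Class.forName(className);
            return Optional.of(type);
        } catch (ClassNotFoundException notFound) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println("java.util.ArrayList found? " + tryLoad("java.util.ArrayList").isPresent());
        System.out.println("com.example.DoesNotExist found? " + tryLoad("com.example.DoesNotExist").isPresent());
    }
}
```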

3.2.2. Initialize classes and create objects

After loading and linking a class, the JVM initializes it before it is first used. Class initialization assigns the initial values of static fields and runs the class’s static initialization blocks. These blocks are executed only once, when the class is initialized, and before any objects of that class are created. When the program creates an object, the JVM allocates memory for it on the heap and calls the corresponding constructor, which assigns initial values to the object’s instance variables. Once initialized, the object lives on the heap and can be used during the execution of the program.
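
The difference between class initialization and object creation can be seen by recording when each piece of code runs. In the sketch below, InitOrderDemo and Widget are hypothetical names; the static block runs once, on first use of Widget, while the constructor runs for every object.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: records the order of static initialization and construction.
final class InitOrderDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Widget {
        static {
            // Runs once, when Widget is initialized on its first use.
            EVENTS.add("Widget static initializer");
        }

        Widget() {
            // Runs every time a Widget object is created.
            EVENTS.add("Widget constructor");
        }
    }

    // Creates two Widgets and returns a snapshot of everything recorded so far.
    static List<String> run() {
        EVENTS.add("before first use");
        new Widget();
        new Widget();
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        for (String event : run()) {
            System.out.println(event);
        }
    }
}
```

Calling run() a second time adds two more constructor entries but no further static initializer entry, because the class is already initialized.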

3.2.3. Executing the application’s code

Finally, the JVM executes the application’s bytecode by running the methods of the loaded and initialized classes. It first calls the main method of the application’s main class. During execution, the JVM may call other methods of the same class or of different classes, depending on the execution flow of the application. The JVM uses the program counter to keep track of the bytecode instruction being executed and uses the stack to store intermediate values and the stack frames of the methods being called. Where it pays off, the JVM uses the Just-In-Time (JIT) Compiler to optimize the execution performance of the program.

3.2.4. Garbage Collection and Memory Management

When executing the application’s code, the JVM manages the allocation and release of memory for the objects. Memory is allocated for objects in the heap, and the JVM uses a garbage collection algorithm to automatically free the memory of objects that are no longer in use.

3.2.5. End of application

The execution of the application will end when the JVM encounters one of the following cases:

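The main method or another running method calls System.exit() or an equivalent method such as Runtime.exit().
All non-daemon threads of the application have ended.
The JVM encounters a fatal error and is forced to stop. An uncaught exception ends only the thread that threw it, so it stops the application only when that was the last non-daemon thread.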

When the program finishes executing, the JVM releases all the resources used by the program and terminates the execution. The JVM will perform resource-freeing operations, including:

Freeing heap memory: All objects and arrays created in the JVM’s heap are released.
Freeing stack memory: All stack frames created in the threads’ stacks are released.
Freeing other resources: All other resources used by the Java program, including I/O resources and network connections, are released.

Once all resources have been released, the JVM process exits and its memory is returned to the operating system. This ensures that the resources used by the Java program do not continue to occupy memory and system resources after the program has terminated.
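
Applications can take part in an orderly shutdown by registering a shutdown hook, which the JVM runs when it begins to exit normally. The sketch below registers a hook and leaves only a daemon thread running when main returns. ShutdownDemo and registerHook are illustrative names, and the message text is arbitrary.

```java
// Illustrative sketch: a shutdown hook plus a daemon thread that does not keep the JVM alive.
final class ShutdownDemo {

    // Registers a hook that prints the given message during JVM shutdown.
    static Thread registerHook(String message) {
        Thread hook = new Thread(() -> System.out.println(message));
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        registerHook("shutdown hook: releasing resources");

        // A daemon thread is ignored when the JVM decides whether it may exit.
        Thread background = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException ignored) {
                // Nothing to clean up in this sketch.
            }
        });
        background.setDaemon(true);
        background.start();

        System.out.println("main is returning");
        // After main returns, no non-daemon thread remains, so the JVM shuts down and runs the hook.
    }
}
```

Shutdown hooks are not run if the process is killed abruptly or if Runtime.halt() is called, so they are best kept for short, best-effort cleanup.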

4. Conclusion

In this article, we learned about the Java Virtual Machine (JVM), an important component in running Java applications. We explored the main structure and components of the JVM, including the Class Loader, the Runtime Data Area, and the Execution Engine. We also looked at how the JVM works and its role in starting, executing, and terminating Java applications. A better understanding of the JVM helps Java programmers develop and optimize applications more efficiently, and deepens their understanding of the Java language and the Java platform.

Source : Viblo