Monday, February 28, 2011

Magnifying HelloWorld App Part Two


(Java Virtual Machine Insights)

In the first part of this article, we saw what happens when we compile and run the HelloWorld program, and you got to know how the JDK directories are structured. On the Java platform, the Java Virtual Machine (JVM) is the heavy lifter that delivers most of the Java buzzwords. Today is a good time to look at what really goes on inside the JVM. Tighten your buckles, we are planning a big leap (just kidding :)). As a beginner you don't need to go through each and every aspect of the JVM, but there are several things worth keeping in your permanent memory. If you like to dig deeper, the JVM is a good area to explore. I will try to keep things as simple as possible.

In a nutshell, JVM stands for Java Virtual Machine. What does that mean to you? A virtual machine is a machine that has no physical existence. In essence, it lives virtually on top of your operating system. You have already seen this in the JDK folder structure, where the jvm.dll file lives. That is why most people say the Java platform is a software-only platform. The simple point to highlight is that the JVM is part of the Java Runtime Environment and sits on top of the operating system.

The biggest buzzword Java people are proud of, 'platform independence' ("hell yeah, we write code once and run it anywhere"), exists because of the JVM. The JVM is the guy working behind the scenes so that you can write platform-independent code. This might be confusing, because the JVM implementation for a specific operating system is itself platform dependent: the JVM implementation for Windows NT is different from the JVM implementation for Mac OS X.


Sun Microsystems was the company that invented the Java technology. Oracle acquired Sun in 2010, so Java is now part of Oracle Corporation. Back to the JVM: strictly speaking, the Java Virtual Machine is just a specification, originally published by Sun Microsystems. After agreeing to the terms and conditions, anyone can implement a JVM that conforms to the specification. By default, most developers use Sun's implementation of the JVM. OpenJDK is another project, which provides an open source implementation of the JVM. Azul Systems provides an enterprise-level Java runtime for large-scale projects; it is a runtime where you can have a very large garbage-collected heap (on a 32-bit JVM the maximum garbage-collected heap size is usually around 2GB). Now let's jump to how the JVM is defined by the specification at a high level. The first point is that the JVM specification is very flexible, meaning it imposes few constraints. That gives implementers a lot of freedom to design a JVM implementation in their own way.


In its simplest form, the JVM has two parts: the class loader and the execution engine. Back in our simple HelloWorld program, you saw the JVM load classes from the rt.jar file. The class loader is entirely responsible for that job. The execution engine does the work of running your HelloWorld program; I would say it is responsible for interpreting the byte code produced by the Java compiler. That statement is not entirely correct once the JVM is combined with just-in-time compilation and adaptive optimization. You will learn about those in my next articles. Perhaps you have heard about the HotSpot JVM, or you use it without knowing it. (Type java -version at the command prompt to check which version and type of JVM you are using.) HotSpot is a recent JVM that has just-in-time compilation and adaptive optimization. It is capable of identifying hot spots (heavily executed code) and compiling them into native machine code. The rest of the code is interpreted as the program runs.
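If you would rather check from inside a program, here is a minimal sketch that reads the standard java.vm.* system properties. The class name JvmInfo is just my choice for this illustration, and the values printed depend entirely on the JVM you have installed.

```java
// Prints which JVM implementation is running this code.
class JvmInfo {
    // Builds a one-line description from standard system properties.
    static String describe() {
        return System.getProperty("java.vm.name")
                + " " + System.getProperty("java.vm.version")
                + " (" + System.getProperty("java.vm.vendor") + ")";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

On a HotSpot-based JVM the name usually contains the word HotSpot, but treat that as a hint and not as a guarantee.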




I will simplify it this way. Think of the class loader and the execution engine as pieces of software that interact with each other to run a Java program. When a JVM instance starts, these components are in main memory. They also need memory to store things, such as the loaded class files. This figure shows how that memory is laid out to support the JVM's work and gives you a glimpse of the memory areas the JVM uses.

The runtime data area is the entire memory used by the class loader subsystem and the execution engine. Based on the JVM specification and the purpose each serves, the runtime data area is divided into five areas: the method area, the heap, the stacks, the PC registers and the native method stacks. How those memory areas are implemented may vary depending on the implementer's decisions. However, every JVM should have these five components inside its runtime data area.

Before I explain each component, you need to understand that the main method of our HelloWorld program runs in its own thread, known as the main thread. As a beginner you might not be familiar with threads yet, so here are some quick tips to catch up on threading concepts. By definition, a thread is a lightweight process: an independent path of execution within a single program. Threads share resources among themselves. You will get to know which memory is shared among threads when I explain each component. Java threading is an important area that will make your brain spin. For the sake of this article, you just need to know these points.
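To see the main thread for yourself, here is a small sketch. MainThreadDemo and the thread name worker-1 are names I made up for this example; Thread.currentThread() and getName() are standard library calls. When launched normally with the java command, the first line printed is typically main.

```java
class MainThreadDemo {
    // Name of whichever thread calls this method.
    static String currentThreadName() {
        return Thread.currentThread().getName();
    }

    // Starts a second thread and reports the name that thread sees for itself.
    static String nameSeenByWorker() {
        final String[] seen = new String[1];
        Thread worker = new Thread(() -> seen[0] = currentThreadName(), "worker-1");
        worker.start();
        try {
            worker.join(); // wait until the worker has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(currentThreadName()); // the thread running main
        System.out.println(nameSeenByWorker());  // the second thread's own name
    }
}
```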

Method Area:
The class loader loads information about the classes, interfaces and types that you use in your program into this memory area. Simply put, it contains all the details relevant to classes. For instance, in our HelloWorld program, information about the HelloWorld class is stored in the method area. In effect, the method area holds the in-memory representation of the .class file. It contains areas such as the runtime constant pool, method code, attributes and fields. If you are interested in these terms, I will leave them for you as homework :).
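The JVM does not let you peek into the method area directly, but reflection gives a rough feel for the kind of class information it keeps. This is only a sketch: Greeter is a hypothetical stand-in for our HelloWorld class, and where exactly a JVM stores this data is up to the implementation.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class ClassInfoDemo {
    // Hypothetical stand-in for the HelloWorld class.
    static class Greeter {
        static final String MESSAGE = "Hello World Java";

        static void greet() {
            System.out.println(MESSAGE);
        }
    }

    // Collects the names of the methods declared by the given class.
    static List<String> declaredMethodNames(Class<?> type) {
        List<String> names = new ArrayList<>();
        for (Method method : type.getDeclaredMethods()) {
            names.add(method.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        Class<?> type = Greeter.class;
        System.out.println("class:      " + type.getName());
        System.out.println("superclass: " + type.getSuperclass().getName());
        System.out.println("methods:    " + declaredMethodNames(type));
    }
}
```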

Heap:
When you create an object using the new keyword in your program, that object is placed in heap memory. The heap stores the objects that you create. In our example, our own code doesn't create any objects with new, so it makes no explicit use of the heap (although the JVM and the library classes it calls still allocate objects there).
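Here is a minimal sketch of what "placed in heap memory" means in practice. HeapDemo and Point are names invented for this example. The variables hold references, while the Point object itself lives on the heap; Runtime.maxMemory() reports roughly how large the JVM will let the heap grow.

```java
class HeapDemo {
    // A tiny class whose instances live on the heap.
    static class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // Upper limit, in bytes, of the memory the JVM will try to use for the heap.
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2); // the object is allocated on the heap
        Point q = p;               // a second reference to the same object, not a copy
        System.out.println(p == q); // prints true
        System.out.println("max heap bytes: " + maxHeapBytes());
    }
}
```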

Important
  • The method area and the heap are shared among threads; any thread can access the resources inside those two memory areas.
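A quick way to convince yourself that the heap is shared is to let two threads update the same object. In this sketch (SharedHeapDemo is a made-up name) both threads increment one AtomicInteger, which I chose so that the concurrent updates are safe.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedHeapDemo {
    // Two threads increment the same heap object; returns the final count.
    static int countWithTwoThreads(int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger(); // a single object on the heap
        Runnable work = () -> {
            for (int i = 0; i < incrementsPerThread; i++) {
                counter.incrementAndGet();
            }
        };
        Thread first = new Thread(work);
        Thread second = new Thread(work);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(1000)); // prints 2000
    }
}
```

Each thread has its own loop variable i on its own stack, but both see the same counter object because it lives on the shared heap.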

Stack:
The JVM is built on a stack-based architecture; the stack keeps track of the method invocations made by a particular thread. A stack is a data structure with the LIFO property: push and pop operations on the stack both respect it. LIFO means Last In, First Out: you can only pop (get) the most recently pushed element, which sits at the top of the stack. Both operations are done from the top of the stack. This is the representation of a stack.
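The push and pop behaviour is easy to try out in code. This sketch uses java.util.ArrayDeque as the stack; LifoDemo is a name I picked for the example, and it only illustrates the data structure, not the JVM's internal stack.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class LifoDemo {
    // Pushes every item, then pops them all and records the order they come out in.
    static String pushThenPopAll(String... items) {
        Deque<String> stack = new ArrayDeque<>();
        for (String item : items) {
            stack.push(item); // goes on top of the stack
        }
        StringBuilder order = new StringBuilder();
        while (!stack.isEmpty()) {
            order.append(stack.pop()); // always removes the current top element
        }
        return order.toString();
    }

    public static void main(String[] args) {
        System.out.println(pushThenPopAll("A", "B", "C")); // prints CBA
    }
}
```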


The same thing happens inside the JVM. Each and every thread has its own stack. A stack holds many elements called stack frames. A stack frame contains the state of one method invocation. When a thread invokes a method, the JVM pushes a stack frame onto that thread's stack. When the method invocation completes, the JVM simply pops the stack frame and discards it. The state of a method invocation consists of the operand stack, the array of local variables of the method, the parameters to the method, intermediate values of computations and a reference to the runtime constant pool of the current method's class. Conceptually, a stack frame looks like this.


When it comes to our HelloWorld program, we have only one thread: the main thread. So we have only one stack, and the main method is the one to be executed. While main itself is executing, that stack most probably holds a single frame for our program.
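You can watch frames being pushed by asking for a stack trace at different call depths. This is a rough sketch under the assumption that the JVM reports every frame in the trace (the common JVMs do, although the API allows frames to be omitted). StackFrameDemo and its method names are invented for this example.

```java
class StackFrameDemo {
    // Number of frames on the calling thread's stack, as reported by a stack trace.
    static int currentDepth() {
        return new Throwable().getStackTrace().length;
    }

    // One extra method call between the caller and currentDepth().
    static int oneLevel() {
        return currentDepth();
    }

    // Two extra method calls between the caller and currentDepth().
    static int twoLevels() {
        return oneLevel();
    }

    public static void main(String[] args) {
        // The second number is one larger: the extra call pushed one more frame.
        System.out.println(oneLevel());
        System.out.println(twoLevels());
    }
}
```

When each method returns, its frame is popped again, which is why calling oneLevel() a second time reports the same depth as before.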

PC Register:
Simply put, we call it the program counter. Each thread has its own program counter. Its purpose is to point to the next instruction to be executed in the current method. The .class file contains the byte code, and byte codes are instructions to the JVM. As I said before, the byte code is held in the method area. If you open the .class file in a hexadecimal editor, you will see the byte code instructions. This figure shows the class file viewed in a hexadecimal editor.



You have another option for viewing the byte code instructions: the javap.exe command line tool, also known as the Java disassembler, with the -c option. You are free to use the -verbose option with it as well. When you disassemble the class, you will see the actual instructions in a readable form. This is the command that does the disassembling: javap -c HelloWorld > HelloWorld.bc. HelloWorld.bc will be created in the directory where the HelloWorld.class file resides. This is the output you get when you disassemble the HelloWorld.class file.



Inside the main() method you will see the instructions that the JVM executes. I want to point this out: the program counter points to each instruction to be executed for a particular method invocation in a particular thread. Finally, this is a compact representation of the whole runtime data area in a particular JVM implementation.
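To make the program counter idea concrete, here is a toy interpreter loop. It is a sketch only: the opcodes PUSH, ADD and HALT are hypothetical and are not real JVM byte codes, and a real execution engine is far more involved. What it does show is a pc variable stepping through instructions while an operand stack holds intermediate values.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class ToyInterpreter {
    // Hypothetical opcodes for this sketch; they are not real JVM byte codes.
    static final int PUSH = 0; // followed by one operand: the value to push
    static final int ADD = 1;  // pops two values and pushes their sum
    static final int HALT = 2; // stops and returns the value on top of the stack

    static int run(int[] code) {
        Deque<Integer> operands = new ArrayDeque<>();
        int pc = 0; // program counter: index of the next instruction to execute
        while (true) {
            int opcode = code[pc++];
            switch (opcode) {
                case PUSH:
                    operands.push(code[pc++]); // read the operand, move pc past it
                    break;
                case ADD:
                    operands.push(operands.pop() + operands.pop());
                    break;
                case HALT:
                    return operands.pop();
                default:
                    throw new IllegalArgumentException("unknown opcode " + opcode);
            }
        }
    }

    public static void main(String[] args) {
        int[] program = {PUSH, 2, PUSH, 3, ADD, HALT};
        System.out.println(run(program)); // prints 5
    }
}
```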


I don't want you to get bored while you read this article, but I can't help you all the way. This is all theoretical material that you need to be familiar with. Now you have seen what happens behind the scenes when you run the simple HelloWorld program. A lot of things happen. Still, the journey isn't finished.


This is the wrap-up session. Now we all know the components of the JVM and how the JVM works. All that remains is to apply these concepts to our HelloWorld program. It will be easier if I explain this step by step.

Class loading mechanism

When the java.exe launcher command is executed, the class loader subsystem looks for the HelloWorld.class file and the core Java library classes. All of them are loaded into the method area, and the JVM then generates the internal representation of each class. (See the method area.)
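You can ask any class which class loader brought it in. In this sketch (ClassLoaderDemo is a made-up name) I rely on the documented convention that getClassLoader() returns null for classes loaded by the bootstrap class loader, which is the case for core library classes such as String on the common JVMs. Our own classes are normally loaded by a different, non-null loader.

```java
class ClassLoaderDemo {
    // True when the class was loaded by the bootstrap class loader,
    // which the API represents as null.
    static boolean loadedByBootstrap(Class<?> type) {
        return type.getClassLoader() == null;
    }

    public static void main(String[] args) {
        System.out.println("String:          " + loadedByBootstrap(String.class));
        System.out.println("ClassLoaderDemo: " + loadedByBootstrap(ClassLoaderDemo.class));
        // The loader of our own class; its exact name depends on the JVM version.
        System.out.println(ClassLoaderDemo.class.getClassLoader());
    }
}
```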



Execution mechanism

1. All Java code runs inside a thread. The main method also runs inside its own thread, known as the main thread.
2. In the HelloWorld program our own code doesn't create any instances, so it makes no explicit use of heap memory.
3. The JVM looks for the main method, which is the starting point of the program. When it is found, the JVM creates a separate thread for the main method.
4. As you already know, a separate thread means the thread gets its own stack and PC register. Inside the main method, we simply print a string value to the console.
5. Inside the main method we invoke println("Hello World Java"). When this method is invoked, a stack frame is pushed onto the stack. You have seen the conceptual structure of a stack frame in the 'Stack' section; it has all the details needed to complete the method invocation. When println() has executed, its stack frame is popped, and then the main thread completes its job.
6. Our simple HelloWorld program does its job, and the main thread dies once it has completed. The main thread is the only thread that runs our code in this HelloWorld program.
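For reference, this is roughly the program the steps above describe. It is reconstructed from the println("Hello World Java") call mentioned in step 5, so treat the exact layout as my assumption; the comments map each part back to the steps.

```java
class HelloWorld {
    // Step 3: the JVM looks for this method and runs it in the main thread.
    // Step 4: the main thread has its own stack and PC register.
    public static void main(String[] args) {
        // Step 5: invoking println pushes a new stack frame;
        // the frame is popped again when println returns.
        System.out.println("Hello World Java");
        // Step 6: main returns here and the main thread finishes.
    }
}
```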

Finally, we are 100% done. I believe this article helps you gain in-depth knowledge of the JVM, a really interesting area that you should drill down into yourself. Now it's your turn. Learn something new every day and keep sharing with others.


© Nuwan Arambage-"transcending verge of life"
