JVM Reverse Engineering

Learn Reverse Engineering for Java Virtual Machine bytecode

medium

75 min

Task 1 Introduction

When Java applications are compiled, they are turned into an intermediate form of machine code known as bytecode. While Java source code is designed to be easy for humans to read, bytecode is designed to be easy for machines to read.

When you execute a compiled Java application, the class file is read and interpreted by a Java Virtual Machine (JVM). The JVM is like a custom virtual CPU that runs on top of your existing CPU and follows a different instruction set, the JVM instruction set.

Java bytecode is a stack-based language. This means that temporary values are stored on a stack, whereas x86 stores them in registers. Stacks are like buckets: when you add a value to the stack, you put it at the top of the bucket, and when you remove or use a value, you take the one at the top. Attempting to retrieve a value from an empty stack is known as a stack underflow. Adding so many values that the stack reaches its memory limit is known as a stack overflow (think of a bucket overflowing from too many items).
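To make the push and pop behaviour concrete, here is a minimal sketch that mimics a stack machine using java.util.ArrayDeque. It is only an illustration: the class name OperandStackDemo and the arithmetic are made up for this example, the instruction names in the comments are the real opcodes the steps loosely correspond to, and a real JVM rejects bytecode that would underflow during verification rather than at run time.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.NoSuchElementException;

class OperandStackDemo {
    // Evaluates (2 + 5) * 4 the way a stack machine would:
    // operands are pushed, and each operation pops its inputs
    // and pushes its result.
    static int evaluate() {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(2);                          // roughly iconst_2
        stack.push(5);                          // roughly iconst_5
        stack.push(stack.pop() + stack.pop());  // roughly iadd: pops 5 and 2, pushes 7
        stack.push(4);                          // roughly iconst_4
        stack.push(stack.pop() * stack.pop());  // roughly imul: pops 4 and 7, pushes 28
        return stack.pop();
    }

    // Popping from an empty stack is the "stack underflow" case.
    static boolean underflows() {
        Deque<Integer> empty = new ArrayDeque<>();
        try {
            empty.pop();
            return false;
        } catch (NoSuchElementException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(evaluate());   // prints 28
        System.out.println(underflows()); // prints true
    }
}
```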

The Java bytecode to print "Hello World" to the console is shown below:

getstatic java/lang/System.out:Ljava/io/PrintStream; // Retrieve the static variable "out" in the System class and store it on the stack

ldc "Hello World" // Load the string "Hello World" onto the stack

invokevirtual java/io/PrintStream.println:(Ljava/lang/String;)V // Invoke the "println" function on the System.out variable using the string at the top of the stack as an argument
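For comparison, the Java source below is the kind of program that would be expected to compile to the three instructions above, followed by a return. The class name HelloWorld is an assumption chosen for illustration, and the exact output of a given compiler version may differ in detail.

```java
class HelloWorld {
    public static void main(String[] args) {
        // getstatic pushes System.out, ldc pushes the string literal,
        // and invokevirtual pops both to call println.
        System.out.println("Hello World");
    }
}
```

Compiling this with javac and disassembling the resulting class file is a quick way to check the correspondence yourself.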
 

For more information on the JVM instruction set I highly recommend https://en.wikipedia.org/wiki/Java_bytecode_instruction_listings.


Because JVM bytecode is a high-level representation of the original source code, constructs such as methods, fields and classes are still visible.

Classes are compiled into .class files, one class per file. These can reference other classes, which will be linked by the JVM at runtime. By using a parser such as javap, we can see the methods and fields present in a class. Each has a name and a descriptor. The descriptor encodes the parameter types and return type of a method, or the type of a field.

The following method:

void main(String[] args, int i)

would produce this descriptor and name:

main([Ljava/lang/String;I)V

The argument types are enclosed in parentheses. A [ denotes an array. An object type is represented by its fully qualified internal name, prefixed with an L and terminated with a ;. The I represents an int, and the V at the end represents the void return type. A full write-up on descriptors can be seen here: https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html#jvms-4.3.
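If you want to check a descriptor without writing it by hand, the standard library can generate one. The sketch below uses java.lang.invoke.MethodType to build the descriptor for the example method above; the class name DescriptorDemo and the helper internalName are hypothetical names introduced for this example.

```java
import java.lang.invoke.MethodType;

class DescriptorDemo {
    // Descriptor for: void main(String[] args, int i)
    // methodType takes the return type first, then the parameter types in order.
    static String mainDescriptor() {
        return MethodType.methodType(void.class, String[].class, int.class)
                         .toMethodDescriptorString();
    }

    // Internal names use '/' instead of '.' as the package separator.
    static String internalName(Class<?> c) {
        return c.getName().replace('.', '/');
    }

    public static void main(String[] args) {
        System.out.println(mainDescriptor());           // prints ([Ljava/lang/String;I)V
        System.out.println(internalName(String.class)); // prints java/lang/String
    }
}
```

Note that the descriptor itself does not include the method name; tools such as javap print the name and descriptor side by side.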


javap is a tool bundled with JDK releases that can disassemble compiled classes. Example usage:

(-p = show all members, including private ones; -v = verbose)

javap -v -p HelloWorld.class

Answer the questions below

Read the above text

Consider the following bytecode:

LDC 0

LDC 3

SWAP

POP

INEG

Which value is now at the top of the stack?

Which opcode is used to get the XOR of two longs? (answer in lowercase)

What does the -v flag on javap stand for? (answer in lowercase)

Complete the following challenges.

Answer the questions below

Find the name of the file that this class was compiled from (AKA Source File)

What is the super class of the Main class? (Using internal name format, i.e. /)

What is the value of the local variable in slot 1 when the method returns? (In decimal format)

The given class file takes a password as a parameter. You need to find the correct one. Tools like javap will be sufficient.

Answer the questions below

What is the correct password?

Like the previous task, this program takes a password as an argument, and outputs whether or not it is correct. This time the string is not directly present in the class file, and you will need to use either a decompiler, bytecode analysis or virtualisation to find it.
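As an illustration of why a string might not show up in a disassembly, here is a minimal sketch of one common trick: storing the value as XOR-encoded bytes and decoding it at run time. This is a hypothetical example, not the scheme used by the challenge file; the class name, the key 0x5A and the word "secret" are all made up.

```java
import java.nio.charset.StandardCharsets;

class HiddenStringDemo {
    // Hypothetical single-byte XOR key.
    private static final byte KEY = 0x5A;

    // The bytes of "secret", each XORed with KEY. Because this is a byte
    // array rather than a String literal, the plain text does not appear
    // as a string constant in the class file.
    private static final byte[] ENCODED = {0x29, 0x3F, 0x39, 0x28, 0x3F, 0x2E};

    // Reverses the XOR to recover the original text at run time.
    static String decode() {
        byte[] out = new byte[ENCODED.length];
        for (int i = 0; i < ENCODED.length; i++) {
            out[i] = (byte) (ENCODED[i] ^ KEY);
        }
        return new String(out, StandardCharsets.US_ASCII);
    }

    static boolean check(String candidate) {
        return decode().equals(candidate);
    }

    public static void main(String[] args) {
        System.out.println(check("secret")); // prints true
        System.out.println(check("guess"));  // prints false
    }
}
```

Faced with something like this, you can either follow the bytecode of the decoding routine by hand or run that routine yourself and observe the result.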

Answer the questions below

What is the correct password?

ASM is a powerful open source library for manipulating bytecode. It gives a high-level representation of bytecode that is easy to parse and modify.

You can use ASM to programmatically remove obfuscation in Java applications. Java Deobfuscator is an open source project that aims to use ASM to remove common obfuscation. It provides ready-made transformers, as well as the ability to write your own. A simple way to solve advanced crackmes like the one below is to virtualise method calls, for example the calls to the methods that decrypt strings. Java Deobfuscator provides the necessary tools to do this, and there are prewritten examples that you can adapt to any program.
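The idea of calling a program's own decryption routine, rather than reimplementing it, can be sketched with plain reflection. This is not Java Deobfuscator's API and it is not true virtualisation: it runs the target method on the real JVM, which is only appropriate for code you trust. The Obfuscated class and its decrypt method are hypothetical stand-ins for whatever an obfuscator generated.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for an obfuscated class with a private
// string decryption routine.
class Obfuscated {
    private static String decrypt(String s, int key) {
        char[] cs = s.toCharArray();
        for (int i = 0; i < cs.length; i++) {
            cs[i] = (char) (cs[i] ^ key);
        }
        return new String(cs);
    }
}

class DecryptHarness {
    // Looks up the private static method by name and descriptor-equivalent
    // parameter types, makes it accessible, and invokes it.
    // For a real target you would load the class by name through a class loader.
    static String callDecrypt(Class<?> target, String encrypted, int key) {
        try {
            Method m = target.getDeclaredMethod("decrypt", String.class, int.class);
            m.setAccessible(true);
            return (String) m.invoke(null, encrypted, key);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not invoke decrypt", e);
        }
    }

    public static void main(String[] args) {
        // "on" XORed with key 7 decodes to "hi".
        System.out.println(callDecrypt(Obfuscated.class, "on", 7)); // prints hi
    }
}
```

Running untrusted code this way executes it with your privileges, which is one reason tools prefer to emulate the method in a sandboxed interpreter instead.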

Answer the questions below

Read the above

This program follows the same logic as the previous task, but it has custom obfuscation layered on top. You might require a decompiler for this, as well as custom tools. It also uses anti-virtualisation techniques, so be warned.

Answer the questions below

Find the correct password

This final jar has nearly every exploit I know packed into it. I don't know of any decompilers that will work for it. You will have to use custom tools and bytecode analysis to pick this one apart.

Same format as the previous tasks: it takes one argument as the password.

Answer the questions below

What is the correct password?
