Java was initially designed as a safe managed environment.
Nevertheless, the Java HotSpot VM contains a “backdoor” that provides a
number of low-level operations to manipulate memory and threads
directly. This backdoor, sun.misc.Unsafe, is widely used by the JDK itself in packages like java.nio and java.util.concurrent.
It is hard to imagine a Java developer using this backdoor in
regular development, because the API is extremely dangerous, not
portable, and subject to change between releases. Nevertheless, Unsafe provides an easy
way to look into HotSpot JVM internals and do some tricks. Sometimes it
is simply fun, sometimes it can be used to study VM internals without
debugging C++ code, and sometimes it can be leveraged for profiling and
development tools.
Obtaining Unsafe
The sun.misc.Unsafe class is so unsafe that the JDK developers
added special checks to restrict access to it. Its constructor is
private, and the caller of the factory method getUnsafe() must be loaded by the bootstrap class loader (i.e. the caller must itself be part of the JDK):
public final class Unsafe {
    // ...
    private static final Unsafe theUnsafe = new Unsafe();
    // ...
    public static Unsafe getUnsafe() {
        // Only callers loaded by the bootstrap class loader (null) are allowed.
        Class cc = sun.reflect.Reflection.getCallerClass(2);
        if (cc.getClassLoader() != null)
            throw new SecurityException("Unsafe");
        return theUnsafe;
    }
}
Fortunately, there is a theUnsafe field that can be used to retrieve the Unsafe instance. We can easily write a helper method that does this via reflection:
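// Requires: import java.lang.reflect.Field; import sun.misc.Unsafe;
public static Unsafe getUnsafe() {
    try {
        Field f = Unsafe.class.getDeclaredField("theUnsafe");
        f.setAccessible(true);   // the field is private
        return (Unsafe) f.get(null);
    } catch (Exception e) { throw new IllegalStateException(e); }
}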
In the next sections we will study several tricks that become possible thanks to the following methods of Unsafe:
- long getAddress(long address) and void putAddress(long address, long x), which read and write native pointers (dwords on a 32-bit JVM) directly in memory.
- int getInt(Object o, long offset), void putInt(Object o, long offset, int x), and other similar methods, which read and write data directly in the C structure that represents a Java object.
- long allocateMemory(long bytes), which can be considered a wrapper for C’s malloc().
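To make these methods concrete, here is a minimal sketch that allocates four bytes of native memory, writes an int into it, and reads it back. The class name OffHeapIntDemo and the method roundTrip are made up for illustration. The sketch uses the absolute-address overloads putInt(long, int) and getInt(long) rather than the object-relative ones listed above, and it assumes a JDK where sun.misc.Unsafe is still reachable via reflection.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

class OffHeapIntDemo {

    // Same reflection trick as the getUnsafe() helper in the text.
    static Unsafe getUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unsafe is not accessible", e);
        }
    }

    // Writes a value into freshly allocated native memory and reads it back.
    static int roundTrip(int value) {
        Unsafe unsafe = getUnsafe();
        long address = unsafe.allocateMemory(4);   // 4 bytes outside the Java heap
        try {
            unsafe.putInt(address, value);         // absolute-address write
            return unsafe.getInt(address);         // absolute-address read
        } finally {
            unsafe.freeMemory(address);            // not GC-managed, so free explicitly
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(12345));
    }
}
```

The try/finally is there because nothing else will reclaim the block if the read or write throws.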
sizeof() Function
The first trick we will do is a C-like sizeof() function, i.e. a function
that returns the shallow size of an object in bytes. Inspecting the JVM sources of
JDK6 and JDK7, in particular src/share/vm/oops/oop.hpp and src/share/vm/oops/klass.hpp, and reading the comments in the code, we can see that the size of a class instance is stored in _layout_helper, which is the fourth field in the C structure that represents a Java class. Similarly, src/share/vm/oops/oop.hpp shows
that each instance (i.e. object) stores a pointer to its class structure in
its second field. For a 32-bit JVM this means that we can first read the
class structure address from bytes 4-8 of the object structure and then
shift by 3×4 = 12 bytes inside the class structure to reach the _layout_helper field, which holds the instance size in bytes. These structures are shown in the picture below:
Thus, we can implement sizeof() as follows:
public static long sizeOf(Object object) {
    Unsafe unsafe = getUnsafe();
    // Offset 4: class pointer inside the object; offset 12: _layout_helper inside the class.
    return unsafe.getAddress(normalize(unsafe.getInt(object, 4L)) + 12L);
}

// Reinterprets a signed 32-bit value as an unsigned address.
public static long normalize(int value) {
    if (value >= 0) return value;
    return (~0L >>> 32) & value;
}
We need the normalize() function because addresses between
2^31 and 2^32 are automatically converted to negative integers, i.e.
stored in two’s complement form. Let’s test it on a 32-bit JVM (JDK 6 or 7):
// Two variants of the test class passed to sizeOf(new MyStructure()):
class MyStructure { int x; }
class MyStructure { int x; int y; }
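The sign problem that normalize() works around can be checked in isolation, without touching Unsafe at all. The sketch below repeats the normalize() logic from the text and applies it to an arbitrary sample value above 2^31. The class name NormalizeDemo and the sample value are chosen only for illustration.

```java
class NormalizeDemo {

    // Same logic as normalize() in the text: keep the low 32 bits, drop the sign extension.
    static long normalize(int value) {
        if (value >= 0) return value;
        return (~0L >>> 32) & value;
    }

    public static void main(String[] args) {
        int raw = 0xF0000000;                 // a 32-bit address above 2^31
        System.out.println(raw);              // signed view:   -268435456
        System.out.println(normalize(raw));   // unsigned view: 4026531840
        // Since Java 8 the standard library offers the same conversion.
        System.out.println(Integer.toUnsignedLong(raw) == normalize(raw));   // true
    }
}
```

On Java 8 and later, Integer.toUnsignedLong(int) can replace the hand-written helper.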
This function will not work for array objects, because the _layout_helper field has a different meaning in that case. It is still possible, however, to generalize sizeOf() to support arrays.
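The hard-coded offsets above are tied to 32-bit JDK 6/7 builds. As a rough alternative that does not read VM structures, the shallow size can be estimated from field offsets. This is a different technique from the _layout_helper approach: it takes the end of the last instance field reported by Unsafe.objectFieldOffset and rounds it up. The sketch assumes 8-byte object alignment and that a reference field is as wide as an Object[] element. It returns 0 for classes without instance fields, because the header size is not known to it. The names FieldOffsetSizeOf, OneInt and TwoInts are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import sun.misc.Unsafe;

class FieldOffsetSizeOf {

    static class OneInt { int x; }
    static class TwoInts { int x; int y; }

    static Unsafe getUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Unsafe is not accessible", e);
        }
    }

    // Bytes occupied by one field of the given declared type.
    static int fieldSize(Class<?> type, Unsafe unsafe) {
        if (type == long.class || type == double.class) return 8;
        if (type == int.class || type == float.class) return 4;
        if (type == short.class || type == char.class) return 2;
        if (type == byte.class || type == boolean.class) return 1;
        return unsafe.arrayIndexScale(Object[].class);   // assumed size of a reference slot
    }

    // End of the last instance field, rounded up to the assumed 8-byte alignment.
    static long sizeOf(Class<?> type) {
        Unsafe unsafe = getUnsafe();
        long end = 0;
        for (Class<?> c = type; c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (Modifier.isStatic(f.getModifiers())) continue;   // statics live elsewhere
                long fieldEnd = unsafe.objectFieldOffset(f) + fieldSize(f.getType(), unsafe);
                end = Math.max(end, fieldEnd);
            }
        }
        return (end + 7) / 8 * 8;
    }

    public static void main(String[] args) {
        System.out.println(sizeOf(OneInt.class));
        System.out.println(sizeOf(TwoInts.class));
    }
}
```

The printed values depend on the JVM (bitness, compressed pointers, header layout), so no expected output is given here.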
Direct Memory Management
Unsafe allows memory to be allocated and deallocated explicitly via the allocateMemory and freeMemory methods.
The allocated memory is not under GC control and is not limited by the maximum JVM
heap size. In general, such functionality is safely available via NIO’s
off-heap buffers. The interesting thing, however, is that it is possible to
map a standard Java reference to off-heap memory:
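MyStructure structure = new MyStructure();   // test object on the heap

long size = sizeOf(structure);
long offheapPointer = getUnsafe().allocateMemory(size);
getUnsafe().copyMemory(
        structure,        // source object
        0,                // source offset: copy the object from its start
        null,             // destination is an absolute address, so no base object
        offheapPointer,   // destination address
        size);            // the test object is now copied off-heap

// Pointer is a holder class with a single field: Object pointer;
Pointer p = new Pointer();
long pointerOffset = getUnsafe().objectFieldOffset(Pointer.class.getDeclaredField("pointer"));
getUnsafe().putLong(p, pointerOffset, offheapPointer);   // point p.pointer at the off-heap copy
System.out.println(((MyStructure) p.pointer).x);         // reads x from the off-heap copy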
So it is virtually possible to manually allocate and deallocate real
objects, not only byte buffers. Of course, it is an open question what may
happen with the GC after such cheats.
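For comparison, the safe NIO route mentioned at the start of this section could look like the sketch below. It stores an int in a direct (off-heap) ByteBuffer and reads it back. The class name DirectBufferDemo and the buffer size are arbitrary choices for illustration.

```java
import java.nio.ByteBuffer;

class DirectBufferDemo {

    // Stores a value in native memory through a direct buffer and reads it back.
    static int roundTrip(int value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(16);   // backed by off-heap memory
        buffer.putInt(0, value);    // absolute write at byte index 0
        return buffer.getInt(0);    // absolute read from the same index
    }

    public static void main(String[] args) {
        System.out.println(ByteBuffer.allocateDirect(16).isDirect());
        System.out.println(roundTrip(42));
    }
}
```

Unlike allocateMemory, the buffer is bounds-checked and its native memory is released once the buffer object becomes unreachable, but it only exposes bytes, not Java objects.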
Inheritance from Final Class and void*
Imagine a situation where one has a method that takes a string as an
argument, but it is necessary to pass some extra payload. There are at
least two standard ways to do this in Java: put the payload into a thread local or
use a static field. With Unsafe, two more possibilities
appear: pass the payload address as a string, or make the payload class inherit from
the String class. The first approach is pretty close to what we saw in the
previous section: one just needs to obtain the payload address using Pointer
and create a new Pointer to the payload inside the called method. In other
words, any argument that can carry an address can be used as an analog of
void* in C. To explore the second approach, we start with the
following code, which compiles but obviously throws a
ClassCastException at run time:
Carrier carrier = new Carrier();   // Carrier is a plain class with a field named secret
String message = (String)(Object)carrier;   // throws ClassCastException
handler(message);

void handler(String message) {
    System.out.println(((Carrier)(Object)message).secret);
}
To make it work, one needs to modify the Carrier class to simulate
inheritance from String. The list of superclasses is stored in the Carrier
class structure starting from position 28, as shown in the figure.
The pointer to Object comes first (at position 28) and the pointer to Carrier itself follows it
(at position 32), since Carrier inherits directly from Object. In
principle, it is enough to add the following code before the line that casts
Carrier to String:
long carrierClassAddress = normalize(unsafe.getInt(carrier, 4L));
long stringClassAddress = normalize(unsafe.getInt("", 4L));
unsafe.putAddress(carrierClassAddress + 32, stringClassAddress);
Now the cast works fine. Nevertheless, this transformation is not correct
and violates VM contracts. A more careful approach should include more
steps:
- Position 32 in the Carrier class actually contains the pointer to the Carrier
class itself, so this pointer should be shifted to position 36, not
simply overwritten by the pointer to the String class.
- Since Carrier now inherits from String, the final markers in the String class should be removed.
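For contrast with the Unsafe-based trick, the standard thread-local approach mentioned at the start of this section could be sketched as follows. Only the Carrier class and the handler(String) shape come from the text. The PAYLOAD field, the send method, and the return value of handler are hypothetical and added for illustration.

```java
class ThreadLocalPayloadDemo {

    static class Carrier { int secret; }

    // Hypothetical side channel: holds the payload for the current thread only.
    static final ThreadLocal<Carrier> PAYLOAD = new ThreadLocal<>();

    // The method signature still accepts only a String; the payload arrives out of band.
    static int handler(String message) {
        Carrier carrier = PAYLOAD.get();
        return carrier.secret;
    }

    static int send(String message, int secret) {
        Carrier carrier = new Carrier();
        carrier.secret = secret;
        PAYLOAD.set(carrier);
        try {
            return handler(message);
        } finally {
            PAYLOAD.remove();   // avoid leaking the payload to later calls on this thread
        }
    }

    public static void main(String[] args) {
        System.out.println(send("hello", 5));
    }
}
```

This keeps the VM's class structures untouched, at the cost of an implicit dependency between caller and callee.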
Conclusion
sun.misc.Unsafe provides almost unlimited capabilities for
exploring and modifying the VM’s runtime data structures. Although
these capabilities are almost inapplicable in everyday Java development,
Unsafe is a great tool for anyone who wants to study the HotSpot VM
without debugging C++ code or needs to create ad hoc profiling
instruments.
Source: http://highlyscalable.wordpress.com/2012/02/02/direct-memory-access-in-java/