A Brief Explanation of Memory Management in NatJ
NatJ is the Java library that MOE uses to interoperate with native APIs: it lets you create Java bindings for Objective-C classes and call C APIs directly from Java without writing JNI wrappers yourself. One critical problem NatJ has to handle is the difference in how Java and native APIs manage memory: Java uses garbage collection (GC), Objective-C uses ARC, and C requires manual memory management with explicit calls to free().
C
When interoperating with C APIs, the rules are simple:
- When the native object (anything you allocate with malloc etc.) has the same lifecycle as its Java representative object (org.moe.natj.general.Pointer), i.e., the native object is OWNED by your Java code, the native memory is free()-ed automatically (by CRuntime.strongReleaser) once the Pointer instance is GC-ed.
- When the native object is not owned by the Java side, NatJ won’t do anything for you. You need to free it yourself when appropriate.
These two rules correspond to the owned parameter of the CRuntime.createStrongPointer method.
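The difference between the two rules can be sketched in plain Java. This is a minimal model only, not NatJ's implementation: FakeHeap, ModelPointer and simulateCollected are hypothetical names invented for the illustration, the "native heap" is a set of fake addresses, and java.lang.ref.Cleaner stands in for whatever mechanism NatJ actually uses to run its releaser.

```java
import java.lang.ref.Cleaner;
import java.util.HashSet;
import java.util.Set;

/** Stand-in for the native heap: tracks which fake addresses are still allocated. */
final class FakeHeap {
    private static final Set<Long> LIVE = new HashSet<>();
    private static long next = 0x1000;

    static synchronized long malloc() {
        long address = next;
        next += 16;
        LIVE.add(address);
        return address;
    }

    static synchronized void free(long address) {
        LIVE.remove(address);
    }

    static synchronized boolean isLive(long address) {
        return LIVE.contains(address);
    }
}

/** Hypothetical pointer wrapper (NOT org.moe.natj.general.Pointer). */
final class ModelPointer {
    private static final Cleaner CLEANER = Cleaner.create();

    final long peer;
    private final Cleaner.Cleanable cleanable; // null when the memory is not owned

    ModelPointer(long peer, boolean owned) {
        this.peer = peer;
        // The lambda captures the constructor parameter, not `this`;
        // capturing `this` would keep the wrapper reachable forever.
        this.cleanable = owned ? CLEANER.register(this, () -> FakeHeap.free(peer)) : null;
    }

    /** Runs the release action now, as the cleaner would after GC. Demo only. */
    void simulateCollected() {
        if (cleanable != null) {
            cleanable.clean();
        }
    }
}

final class OwnershipDemo {
    public static void main(String[] args) {
        long a = FakeHeap.malloc();
        long b = FakeHeap.malloc();
        ModelPointer owned = new ModelPointer(a, true);
        ModelPointer borrowed = new ModelPointer(b, false);

        owned.simulateCollected();
        borrowed.simulateCollected();

        System.out.println(FakeHeap.isLive(a)); // false: owned memory was released
        System.out.println(FakeHeap.isLive(b)); // true: not owned, caller must free it
        FakeHeap.free(b);
    }
}
```

In the sketch the release is triggered by hand so that the outcome is deterministic; with a real garbage collector the moment of release is not predictable, which is one reason to free non-owned or large allocations explicitly.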
Objective-C
Objective-C uses ARC. Every OC object has a reference count, and the object is deallocated as soon as that count reaches 0. This imposes two restrictions on using OC objects in the Java world (and vice versa):
- Restriction 1: When a Java instance is passed to the native side by mapping it to an OC object (i.e., a Java object of the hybrid or inherited mode), and that OC object is retained on the native side (e.g., retain is called on it), the Java GC must not collect the Java object.
- Restriction 2: While a Java representative object is still in use by Java code (i.e., it has not been finalized by the GC thread), the corresponding native object must not be deallocated (i.e., its reference count must not reach 0).
NatJ distinguishes two major kinds of relationship between a Java object and an OC object, based on whether the Java object stores any extra state:
- A Binding class, which simply represents an existing Objective-C object: ALL of the state is stored in the OC object, and ALL methods are implemented on the native side;
- A class that stores part (or all) of the state in the Java object and/or implements part (or all) of the methods in Java, which is further divided into two types:
  - An Inherited class, in which ALL state is stored in the Java object and ALL methods are implemented in Java;
  - A Hybrid class, which sits in between.
Binding Class
The classes are split into these two major types according to whether restriction 1 affects them.
When the state is stored purely on the native side, there is no need to keep a permanent 1-to-1 mapping between the native object and the Java object, and the lifespan of the Java object can be shorter than that of the native object. In other words, you can have multiple Java objects that point to the same native object, and those Java bindings can be GC-ed at any time BEFORE the native object is deallocated.
For these reasons, a Binding class works much like ARC: when a binding object is created, the corresponding native object is retained (i.e., reference count + 1), and when the binding object is finalized by GC, the corresponding native object is released. This guarantees restriction 2.
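The retain-on-create, release-on-finalize behaviour can be modelled with a simulated reference count. The sketch below is an illustration under assumptions, not NatJ code: FakeObjCObject, ModelBinding and simulateFinalize are hypothetical names, and finalization is triggered by hand instead of by the GC.

```java
/** Stand-in for an Objective-C object with a manual reference count. */
final class FakeObjCObject {
    private int retainCount = 1; // +1 held by whoever created the object
    private boolean deallocated = false;

    void retain() {
        retainCount++;
    }

    void release() {
        if (--retainCount == 0) {
            deallocated = true; // the count reached 0, so the object goes away
        }
    }

    int retainCount() {
        return retainCount;
    }

    boolean isDeallocated() {
        return deallocated;
    }
}

/** Hypothetical binding wrapper: no state of its own, only a reference to the peer. */
final class ModelBinding {
    private final FakeObjCObject peer;
    private boolean finalized = false;

    ModelBinding(FakeObjCObject peer) {
        this.peer = peer;
        peer.retain(); // creating a binding retains the native object
    }

    /** What this sketch assumes happens when the GC finalizes the wrapper. */
    void simulateFinalize() {
        if (!finalized) {
            finalized = true;
            peer.release(); // the binding gives back its single retain
        }
    }
}

final class BindingDemo {
    public static void main(String[] args) {
        FakeObjCObject nativeObject = new FakeObjCObject(); // count = 1
        ModelBinding first = new ModelBinding(nativeObject);  // count = 2
        ModelBinding second = new ModelBinding(nativeObject); // count = 3

        nativeObject.release();  // native creator lets go, count = 2
        first.simulateFinalize(); // count = 1
        System.out.println(nativeObject.isDeallocated()); // false: one binding remains

        second.simulateFinalize(); // count = 0
        System.out.println(nativeObject.isDeallocated()); // true
    }
}
```

Because every binding holds exactly one retain, the native object cannot be deallocated while any binding is alive, yet several bindings can come and go independently.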
Inherited & Hybrid Class
On the other hand, for an inherited/hybrid class, it’s important to make sure the Java object has the same lifetime as the native object.
To meet restriction 1, NatJ needs a way to tell whether the native object is held (retained) by any native code. Normally, when an OC object is not held by anything, its reference count is 0, and that is when the object is deallocated.
However, this violates restriction 2, because the corresponding Java object has not been freed by GC yet. NatJ uses a clever trick: when the OC object is created, NatJ immediately calls retain on it and then uses the object normally. Now, whenever the reference count drops to 1 instead of 0, we know the object is not held by anything other than NatJ's initial retain. When the Java object is finalized, NatJ calls the final release on the OC object to free it.
The problem now is how to stop the Java object from being freed by GC while the reference count is greater than 1, and how to allow it once the reference count drops to 1. The answer is simple: while the reference count is greater than 1, the corresponding Java object is stored in a global static map (the strong reference map), so GC will never collect it; when the reference count drops to 1, the Java object is removed from that map (and added to another map that holds it through a WeakReference, the weak reference map), which allows GC to do its work.
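The two-map scheme can be sketched as follows. This is a simplified model written for illustration, not NatJ's actual data structures: HybridModel, isPinned and isCollectable are hypothetical names, the peer is a plain long, and the count starts at 1 to represent NatJ's initial retain.

```java
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of an inherited/hybrid object and its simulated native count. */
final class HybridModel {
    // Strong map: keeps Java objects reachable while native code holds the peer.
    private static final Map<Long, HybridModel> STRONG = new HashMap<>();
    // Weak map: still lets the peer be looked up, but does not block the GC.
    private static final Map<Long, WeakReference<HybridModel>> WEAK = new HashMap<>();

    private final long peer;
    private int retainCount = 1; // the initial retain assumed to be made on creation

    HybridModel(long peer) {
        this.peer = peer;
        WEAK.put(peer, new WeakReference<>(this)); // nobody else holds it yet
    }

    /** Native code retains the object: from count 2 upwards it must be pinned. */
    void retain() {
        if (++retainCount == 2) {
            WEAK.remove(peer);
            STRONG.put(peer, this);
        }
    }

    /** Native code releases the object: back at count 1 it becomes collectable. */
    void release() {
        if (retainCount <= 1) {
            throw new IllegalStateException("the final release belongs to finalization");
        }
        if (--retainCount == 1) {
            STRONG.remove(peer);
            WEAK.put(peer, new WeakReference<>(this));
        }
    }

    static boolean isPinned(long peer) {
        return STRONG.containsKey(peer);
    }

    static boolean isCollectable(long peer) {
        return WEAK.containsKey(peer);
    }
}

final class HybridDemo {
    public static void main(String[] args) {
        HybridModel object = new HybridModel(1L);
        System.out.println(HybridModel.isPinned(1L)); // false: count is 1

        object.retain(); // native side now holds the object, count = 2
        System.out.println(HybridModel.isPinned(1L)); // true

        object.release(); // native side lets go, count = 1
        System.out.println(HybridModel.isCollectable(1L)); // true
    }
}
```

The sketch is single-threaded for brevity; a real implementation would have to make the count update and the move between the maps atomic, since retain and release can be called from any thread.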