Babbel Bytes

Insights from the Babbel engineering team

I've trusted you! You promised no null pointer exceptions!

Frederico Gonçalves

So you’ve just switched to Kotlin and thought it would be great to have all your API entities written as data classes with proper nullability rules. You set up your Gson object, deserialize your API response and, surprise! Your non-nullable fields are actually null…

This post explains what’s happening and why this is still an issue even if you specify default values for your fields.


Kotlin gave us what is, in my personal opinion, one of the most powerful features a language can have: the ability to specify whether a type is nullable or not.

By making null part of the type system, we can now benefit from compile-time checks. In short, code that used to fail at runtime is now rejected before it ever runs. Take the following example:

void displayName(@NonNull User user) {
  someTextView.setText(user.getName());
}

void someOtherMethod() {
  displayName(null);
}

This code compiles in Java, but when run it will throw a NullPointerException. Granted, the example is a bit silly, since hardly anyone would write this, but the point is that all we get is a hint from the IDE saying that user is marked as @NonNull. The compiler doesn’t know about this, and therefore the code compiles fine.

Take the same code in Kotlin:

fun displayName(user: User) {
  someTextView.setText(user.name)
}

fun someOtherMethod() {
  displayName(null)
}

This time the code doesn’t even compile. The reason is that user is of type User, meaning it cannot be null. Since this information is now part of the type, the compiler actually knows this and seeing that you’re passing null it forces you to deal with the problem. You can either change the argument passed to displayName making it not null or you can change the type of the argument:

fun displayName(user: User?) {
  someTextView.setText(user?.name)
}

fun someOtherMethod() {
  displayName(null)
}

Now that the type is nullable we can pass in null and since the method TextView.setText can deal with nulls, we’re all set.

This is really good, because the Java version will crash in the users’ hands, while the Kotlin version will not even build, forcing us to deal with the problem right away.

Great! So wouldn’t it make sense to make all our types null safe? Personally, I think so and we did this here at Babbel. However, once it came to the API entities we had some surprises.

The step by step approach

Once I learned about null being part of the type system, I quickly tried to take advantage of it. The next examples use Kotlin 1.2.10 and Gson 2.7. Naively, I thought the following code would break:

data class User(
    val email: String,
    val firstName: String)

fun main(args: Array<String>) {
  val json = "{}"
  val gson = Gson()

  println(gson.fromJson(json, User::class.java).email)
}

The code reads an empty JSON into a data class that has 2 fields that cannot be null. Yet once you run it the output is:

null

That’s quite strange. Let’s try and explicitly break it:

fun main(args: Array<String>) {
  val json = """{
    "email": null
    }"""
  val gson = Gson()

  println(gson.fromJson(json, User::class.java).email)
}

Now we set the email field in the JSON to null and run the code again and it still prints null. In fact, changing the print statement to println(gson.fromJson(json, User::class.java).email == null) results in the IDE telling you that this condition is always false yet it prints true.

Of course, trying to use the email field results in a NullPointerException. Now we begin to feel a bit betrayed. We were promised no more NullPointerExceptions and moreover, we were promised that once a type is not-nullable, then either the compiler will complain or at runtime, we will get an IllegalStateException with a detailed message about which fields are null and should not be.

So what’s happening here? My next attempt was to add a default value for one of the fields. I thought that if you specify a default value, maybe Gson will use it when the JSON doesn’t set the field explicitly.

data class User(
    val email: String,
    val firstName: String = "<EMPTY>")

fun main(args: Array<String>) {
  val json = """{
    }"""
  val gson = Gson()

  println(gson.fromJson(json, User::class.java))
}

Yet the code above prints:

User(email=null, firstName=null)

Everything is null again. I thought maybe there’s actually some consistency to this. Maybe it doesn’t matter what the data class specifies, Gson takes the JSON as the source of truth, meaning if the fields are not present in the JSON then they should also not be present in the data class. So if we assign default values to everything, it should still come out all as nulls. The proof is in the following code:

data class User(
    val email: String = "<EMPTY>",
    val firstName: String = "<EMPTY>")

fun main(args: Array<String>) {
  val json = """{
    }"""
  val gson = Gson()

  println(gson.fromJson(json, User::class.java))
}

Which prints:

User(email=<EMPTY>, firstName=<EMPTY>)

Wait… what? So now all fields have the default value?

I dug a bit into the source code of Gson to find out what’s happening. The answer is not as trivial as I expected, but it’s not very hard to grasp either. It comes down to how Kotlin generates constructors for data classes and how Gson’s default deserializers use them.

Gson and Kotlin classes

Let’s jump into the code in Gson that causes the problematic behavior. You can find the whole class, ConstructorConstructor, in the Gson source, but I’ll paste the relevant part:

public <T> ObjectConstructor<T> get(TypeToken<T> typeToken) {
    final Type type = typeToken.getType();
    final Class<? super T> rawType = typeToken.getRawType();

    final InstanceCreator<T> typeCreator = (InstanceCreator<T>) instanceCreators.get(type);
    if (typeCreator != null) {
      return new ObjectConstructor<T>() {
        @Override public T construct() {
          return typeCreator.createInstance(type);
        }
      };
    }

    final InstanceCreator<T> rawTypeCreator =
        (InstanceCreator<T>) instanceCreators.get(rawType);
    if (rawTypeCreator != null) {
      return new ObjectConstructor<T>() {
        @Override public T construct() {
          return rawTypeCreator.createInstance(type);
        }
      };
    }

    ObjectConstructor<T> defaultConstructor = newDefaultConstructor(rawType);
    if (defaultConstructor != null) {
      return defaultConstructor;
    }

    ObjectConstructor<T> defaultImplementation = newDefaultImplementationConstructor(type, rawType);
    if (defaultImplementation != null) {
      return defaultImplementation;
    }

    return newUnsafeAllocator(type, rawType);
}

I’ve suppressed most of the comments and annotations for clarity

Let’s break it into chunks.

final InstanceCreator<T> typeCreator = (InstanceCreator<T>) instanceCreators.get(type);
if (typeCreator != null) {
  return new ObjectConstructor<T>() {
    @Override public T construct() {
      return typeCreator.createInstance(type);
    }
  };
}

final InstanceCreator<T> rawTypeCreator =
    (InstanceCreator<T>) instanceCreators.get(rawType);
if (rawTypeCreator != null) {
  return new ObjectConstructor<T>() {
    @Override public T construct() {
      return rawTypeCreator.createInstance(type);
    }
  };
}

The first thing Gson does is check whether you’ve registered an instance creator. If you’re not familiar with these, don’t worry, because they’re not essential to this blog post. Think of them as classes that you register with your gson object to specify how an object should be built. We didn’t register one, so the condition is false and the code carries on to the next if. The second if is quite similar, but works on the raw type (the type without generics info). This condition is also false, and the code carries on to:

ObjectConstructor<T> defaultConstructor = newDefaultConstructor(rawType);
if (defaultConstructor != null) {
  return defaultConstructor;
}

If your class has a defaultConstructor - meaning a no-args constructor - then Gson will use it to build your objects.

ObjectConstructor<T> defaultImplementation = newDefaultImplementationConstructor(type, rawType);
if (defaultImplementation != null) {
  return defaultImplementation;
}

Last but not least, the final if creates a “default” implementation, which Gson uses for interface types such as collections and maps and which doesn’t apply to our data class. What concerns us is the no-args constructor check above and the fallback at the very end, newUnsafeAllocator, and these are where I’m going to focus the rest of the blog post. It turns out that if you decompile the data classes in Kotlin to Java, you’ll notice that sometimes you have a no-args constructor and sometimes you don’t.

When all your fields have default values Kotlin will generate a no-args constructor for you. When there’s at least one field that has no default value, then the no-args constructor is omitted.

The null checks happen in the constructors - in the initialization.
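To make the two shapes concrete, here is a rough Java approximation of what the compiler emits for each variant. It’s a hand-written sketch, not real decompiler output: the class names are made up for illustration, the generated equals/hashCode/copy members are left out, and Objects.requireNonNull stands in for the parameter check that Kotlin inserts.

```java
import java.util.Objects;

final class ConstructorShapes {
    // True if the class declares a no-args constructor.
    static boolean hasNoArgsConstructor(Class<?> type) {
        try {
            type.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasNoArgsConstructor(UserNoDefaults.class));  // false
        System.out.println(hasNoArgsConstructor(UserAllDefaults.class)); // true
        System.out.println(new UserAllDefaults().getEmail());            // <EMPTY>
    }
}

// Roughly: data class UserNoDefaults(val email: String, val firstName: String)
final class UserNoDefaults {
    private final String email;
    private final String firstName;

    // The only constructor. The null checks live here, so they only run
    // when somebody actually calls it.
    UserNoDefaults(String email, String firstName) {
        this.email = Objects.requireNonNull(email, "email");
        this.firstName = Objects.requireNonNull(firstName, "firstName");
    }

    String getEmail() { return email; }
    String getFirstName() { return firstName; }
}

// Roughly: data class UserAllDefaults(
//     val email: String = "<EMPTY>", val firstName: String = "<EMPTY>")
final class UserAllDefaults {
    private final String email;
    private final String firstName;

    UserAllDefaults(String email, String firstName) {
        this.email = Objects.requireNonNull(email, "email");
        this.firstName = Objects.requireNonNull(firstName, "firstName");
    }

    // Extra no-args constructor, present because every parameter has a default.
    UserAllDefaults() {
        this("<EMPTY>", "<EMPTY>");
    }

    String getEmail() { return email; }
    String getFirstName() { return firstName; }
}
```

The point to take away is that the checks are ordinary code inside the constructors. Anything that creates the object without running a constructor never reaches them.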

It’s fairly easy to understand how the object is created by Gson when using the no-args constructor. It’s a matter of calling it using reflection and we’re done. The code inside the constructor will initialize the fields to the correct default values.

The tricky part is understanding how Gson can create instances of classes that have no no-args constructor and completely ignore the initialization process. The fact is that Gson uses Unsafe.

Believe it or not Unsafe is unsafe. It enables bypassing the object initialization effectively avoiding all the null checks that are generated in the constructor.

Here’s a streamlined example in Java that shows how Unsafe can bypass initialization.

import java.lang.reflect.Field;
import sun.misc.Unsafe;

class User {
  private final String email;

  public User() {
    this.email = "<EMPTY>";
  }

  public String getEmail() {
    return email;
  }
}

public class Main {
  public static void main(String[] args)
      throws Exception {
    // Use simple reflection to get the no-args constructor and instantiate the class.
    // Prints <EMPTY>
    System.out.println(User.class.newInstance().getEmail());

    // Use the unsafe class to bypass initialization and create instances of a class
    Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
    Field f = unsafeClass.getDeclaredField("theUnsafe");
    f.setAccessible(true);
    Unsafe unsafe = (Unsafe) f.get(null);
    User user = (User) unsafe.allocateInstance(User.class);
    // Prints null
    System.out.println(user.getEmail());
  }
}

Unfortunately, you need to do some dirty reflection work to even be able to use the unsafe instance.

As you see allocateInstance allows you to bypass the initialization process.

Since Kotlin does the null checks in the initialization process allocateInstance will effectively be bypassing them.

It’s as a last resort that Gson uses the Unsafe approach to create your objects.
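The lookup order described above can be modelled in a few lines of plain Java. This is a simplified sketch and not Gson’s implementation: the CREATORS map is a hypothetical stand-in for registered instance creators, the Profile class is invented for the example, and the default-implementation step for collections and maps is skipped.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Supplier;
import sun.misc.Unsafe;

final class ConstructionOrderDemo {
    // Hypothetical registry standing in for registered instance creators.
    static final Map<Class<?>, Supplier<?>> CREATORS = new HashMap<>();

    // Simplified fallback order: creator, then no-args constructor, then Unsafe.
    static <T> T construct(Class<T> type) {
        try {
            Supplier<?> creator = CREATORS.get(type);
            if (creator != null) {
                return type.cast(creator.get());
            }
            try {
                Constructor<T> noArgs = type.getDeclaredConstructor();
                noArgs.setAccessible(true);
                return noArgs.newInstance();
            } catch (NoSuchMethodException e) {
                // No no-args constructor: fall through to the last resort.
            }
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            Unsafe unsafe = (Unsafe) field.get(null);
            // Allocates the object without running any constructor.
            return type.cast(unsafe.allocateInstance(type));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // No creator and no no-args constructor: the null check is skipped.
        System.out.println(construct(Profile.class).getEmail()); // null

        // With a creator registered, the constructor runs again.
        CREATORS.put(Profile.class, () -> new Profile("<EMPTY>"));
        System.out.println(construct(Profile.class).getEmail()); // <EMPTY>
    }
}

// Invented for the example: one checked field and no no-args constructor.
final class Profile {
    private final String email;

    Profile(String email) {
        this.email = Objects.requireNonNull(email, "email");
    }

    String getEmail() { return email; }
}
```

As long as one of the first two steps succeeds, a real constructor runs and its checks with it. Only the last step produces an object whose fields were never initialized.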

How do we avoid this?

At the time of writing this document, Google has a proposal to add a method disableUnsafe to the GsonBuilder which will enforce the usage of no-args constructors or instance creators. Although this is not enough to play well with null-safety, it’s a really good start. For now, we’re left without this feature and have to take our own measures.

The straightforward answer is to always make sure the classes you’re serializing and deserializing have no-args constructors or instance creators registered. The more interesting part is having this play out well with Kotlin’s null-safety. Personally, I haven’t found a way to avoid having my default fields set to null, without custom type adapters.

For example, if we have the class:

data class User(
    val email: String = "<EMPTY>",
    val firstName: String = "<EMPTY>")

Parsing the JSON:

{"email": null}

Will result in a user object with the field email set to null even though there’s a no-args constructor. What happens here is that Gson uses the no-args constructor to build the user object. The default values are assigned and the null checks pass, but then Gson uses reflection to set the email field to null. Here there’s no null check.
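The same effect can be reproduced without Gson. The sketch below uses an invented Account class and plain java.lang.reflect to mimic the two steps: construct through the no-args constructor, then write the field reflectively. It assumes a JDK that still lets reflection write final instance fields after setAccessible(true); newer releases may warn about this.

```java
import java.lang.reflect.Field;
import java.util.Objects;

final class ReflectiveNullDemo {
    // Builds an Account through its constructor, then overwrites the field
    // reflectively, roughly what a reflection-based deserializer does.
    static String emailAfterReflectiveWrite(String newValue) {
        try {
            Account account = new Account();   // null check passes here
            Field field = Account.class.getDeclaredField("email");
            field.setAccessible(true);         // allows writing the private final field
            field.set(account, newValue);      // no null check on this path
            return account.getEmail();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(new Account().getEmail());        // <EMPTY>
        System.out.println(emailAfterReflectiveWrite(null)); // null
    }
}

// Invented for the example; mirrors a Kotlin class with one defaulted val.
final class Account {
    private final String email;

    Account() {
        this("<EMPTY>");
    }

    Account(String email) {
        this.email = Objects.requireNonNull(email, "email");
    }

    String getEmail() { return email; }
}
```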

A solution with custom type adapters could be the following:

import com.google.gson.GsonBuilder
import com.google.gson.TypeAdapter
import com.google.gson.stream.JsonReader
import com.google.gson.stream.JsonToken
import com.google.gson.stream.JsonWriter

data class User(
    val email: String = "<EMPTY>",
    val firstName: String = "<EMPTY>")

class UserTypeAdapter : TypeAdapter<User>() {
    override fun read(reader: JsonReader?): User {
        var email: String? = null
        var firstName: String? = null

        reader?.beginObject()
        while (reader?.hasNext() == true) {
            val name = reader.nextName()

            if (reader.peek() == JsonToken.NULL) {
                reader.nextNull()
                continue
            }

            when (name) {
                "email" -> email = reader.nextString()
                "firstName" -> firstName = reader.nextString()
                // Consume values of unknown fields so the reader stays in sync.
                else -> reader.skipValue()
            }
        }
        reader?.endObject()

        return when {
            email == null && firstName != null -> User(firstName = firstName)
            email != null && firstName == null -> User(email = email)
            email != null && firstName != null -> User(email, firstName)
            else -> User()
        }
    }

    override fun write(out: JsonWriter?, value: User?) {
        out?.apply {
            beginObject()
            value?.let {
                name("email").value(it.email)
                name("firstName").value(it.firstName)
            }
            endObject()
        }
    }
}

fun main(args: Array<String>) {
    val json = """{
        "email": null,
        "firstName": "My first name"
    }"""
    val gson = GsonBuilder()
        .registerTypeAdapter(User::class.java, UserTypeAdapter())
        .create()

    println(gson.fromJson(json, User::class.java))
    println()
}

There are quite a few downsides to this approach. The first and most obvious one is that we would have to write a type adapter for each class we have in our models. Personally, I also believe the code is not that easy to follow because of the JsonReader API. Finally, this approach requires that the models have default values.

One can also throw an exception in the event of an illegal null value instead of constructing the user object. Of course, if you changed the type from String to String? you’d have to update your type adapter too. Not very elegant.

All things considered, we decided to keep all our API models with nullable fields. This at least forces us to handle them with care before putting them into the users’ hands.
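A different way to get the “throw on illegal null” behavior, sketched here in plain Java, is to check the object after it has been deserialized instead of inside each adapter. This is not what the adapter above does, and it isn’t a Gson feature: nullFieldNames, requireNoNullFields and the Customer class are all hypothetical, and the check assumes that every field of the model is meant to be non-null.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

final class NullFieldCheckDemo {
    // Returns the names of all non-static reference fields that are null.
    static List<String> nullFieldNames(Object target) {
        List<String> names = new ArrayList<>();
        for (Field field : target.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.getType().isPrimitive()) {
                continue;
            }
            field.setAccessible(true);
            try {
                if (field.get(target) == null) {
                    names.add(field.getName());
                }
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return names;
    }

    // Returns the object unchanged, or throws and names the offending fields.
    static <T> T requireNoNullFields(T target) {
        List<String> names = nullFieldNames(target);
        if (!names.isEmpty()) {
            throw new IllegalStateException(
                target.getClass().getSimpleName() + " has null fields: " + names);
        }
        return target;
    }

    public static void main(String[] args) {
        // Stands in for an object that came out of a deserializer.
        Customer broken = new Customer(null, "Ada");
        System.out.println(nullFieldNames(broken)); // [email]
        try {
            requireNoNullFields(broken);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // Customer has null fields: [email]
        }
    }
}

// Invented model class; the constructor deliberately performs no checks so
// the example can build an object in the "already broken" state.
final class Customer {
    private final String email;
    private final String firstName;

    Customer(String email, String firstName) {
        this.email = email;
        this.firstName = firstName;
    }
}
```

The trade-off is the same as with the adapter: once some fields are allowed to be null, the check has to know which ones, so it needs updating whenever the model changes.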

Summary

In this post, we saw how easy it is to break null-safety in Kotlin using Gson. Essentially, a field marked as non-nullable can still end up null if the parsed JSON contains this field with a null value, or if we built our models with no no-args constructors or instance creators. This leads to a false safety net where we think the fields of our models cannot be null, when in fact they can.

We can leverage type adapters to prevent this situation, but it’s an approach that can easily lead to a lot of custom deserialization code that is hard to follow. In big projects, this is quite undesirable.

Perhaps the safest approach is to keep everything nullable in the models and deal with this on each access to the fields.
