I'm writing a client for a third-party REST API whose JSON uses a variety of stand-in values instead of a proper null (or omitting the property entirely). Depending on the entity, or even the individual property, null can be represented by null, "", "0" or 0.

It's easy enough to make a custom serializer, e.g. something like this works fine:


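import kotlinx.serialization.KSerializer
import kotlinx.serialization.Serializable
import kotlinx.serialization.builtins.nullable
import kotlinx.serialization.builtins.serializer
import kotlinx.serialization.decodeFromString
import kotlinx.serialization.descriptors.PrimitiveKind
import kotlinx.serialization.descriptors.PrimitiveSerialDescriptor
import kotlinx.serialization.descriptors.SerialDescriptor
import kotlinx.serialization.encoding.Decoder
import kotlinx.serialization.encoding.Encoder
import kotlinx.serialization.json.Json

@Serializable
data class Task(
    val id: String,

    @Serializable(with = EmptyStringAsNullSerializer::class)
    val parentID: String?
)

// Maps "" in the JSON to null in Kotlin, and null back to "" when encoding.
object EmptyStringAsNullSerializer : KSerializer<String?> {

    private val delegate = String.serializer().nullable

    override val descriptor: SerialDescriptor =
        PrimitiveSerialDescriptor("EmptyStringAsNull", PrimitiveKind.STRING)

    override fun serialize(encoder: Encoder, value: String?) {
        encoder.encodeString(value ?: "")
    }

    override fun deserialize(decoder: Decoder): String? {
        // Both a real JSON null and "" end up as null.
        return delegate.deserialize(decoder)?.takeIf { it.isNotEmpty() }
    }
}

fun main() {
    val json = """
        {
            "id": "37883993",
            "parentID": ""
        }
    """.trimIndent()
    val task = Json.decodeFromString<Task>(json)
    println(task)
}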

But annotating many properties like this is a bit ugly/noisy. And I'd also like to use inline/value classes for strong typing, like this:

@Serializable
data class Task(
    val id: ID,
    val parentID: ID?
    /* .... */
) {

    @JvmInline
    @Serializable
    value class ID(val value: String)
}

This means that in addition to annotating these properties I also need a custom serializer for each of them. I tried some generic/parameters-based solution that can work for all cases like this:

open class BoxedNullAsAlternativeValue<T, V>(
    private val delegate: KSerializer<T>,
    private val boxedNullValue: T,
    private val unboxer: (T) -> V
) : KSerializer<T> {

    private val unboxedNullValue by lazy { unboxer.invoke(boxedNullValue) }

    override val descriptor: SerialDescriptor =
        PrimitiveSerialDescriptor(this::class.simpleName!!, PrimitiveKind.STRING)

    override fun serialize(encoder: Encoder, value: T) {
        when (value) {
            null -> delegate.serialize(encoder, boxedNullValue)
            else -> delegate.serialize(encoder, value)
        }
    }

    override fun deserialize(decoder: Decoder): T {
        @Suppress("UNCHECKED_CAST")
        return when (val boxedValue = delegate.deserialize(decoder)) {
            boxedNullValue -> null as T
            else -> boxedValue
        }
    }
}

But that doesn't work because @Serializable(with = ...) expects a class reference as its argument, so I can't pass constructor parameters or type arguments there. That means I'd still need a concrete object for each inline/value type:

@Serializable
data class Task(
    val id: ID,  // <-- missing serializer because custom serializer is of type ID? for parentID
    val parentID: ID?
) {

    @JvmInline
    @Serializable(with = IDSerializer::class)
    value class ID(val value: String)
}

internal object IDSerializer : BoxedNullAsAlternativeValue<Task.ID?, String>(
        delegate = Task.ID.serializer().nullable,   // <--- circular reference
        boxedNullValue = Task.ID(""),
        unboxer = { it.value }
    )

That doesn't work either: there is no longer a generic delegate like String.serializer(), and Task.ID.serializer() would return the custom serializer itself, which is a circular reference. It also fails to compile because ID is used both as a nullable and as a non-nullable property, so I would need nullable and non-nullable variants of the custom serializer and would have to annotate each property individually again, which is noisy.

I tried writing a JsonTransformingSerializer, but those need to be passed at the use site where encoding/decoding happens. That means I'd need to write one for the entire Task class, e.g. Json.decodeFromString(TaskJsonTransformingSerializer, json), and then another for every other entity of the API.

I found this feature request for handling empty strings as null, but it doesn't appear to be implemented and I need it for other values like 0 and "0" too.

Question

Using kotlinx.serialization and, if necessary, ktor 2, how can I deserialize values like "", "0" and 0 as null for inline/value classes, considering that:

  • Properties of the same (value) type can be nullable and non-nullable in the same class, but I'd like to avoid having to annotate each property individually
  • I'd like a solution that is as generic as possible, i.e. not needing a concrete serializer for each value class
  • It needs to work both ways, i.e. deserializing and serializing

I read in the documentation that serialization happens in two distinct phases: breaking a complex object down into its constituent primitives (serializing), then writing those primitives as JSON or any other format (encoding). In reverse: decoding, then deserializing.

Ideally I'd let the compiler generate serializers for each value class, but annotate each of them with a reference to one of three value transformers (one each for "", "0" and 0) that sit in between the two phases, inspect the primitive value and replace it when necessary.
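
For illustration only, here is a minimal sketch of one way to approximate such an in-between step with the JsonElement tree from kotlinx.serialization.json. It is not a built-in feature of the library. The nullTokens map, the helper names and the ownerID/categoryID properties are hypothetical, and the sketch assumes it is acceptable to decide per property name which raw value stands in for null. It only handles top-level properties, and it is lossy: a genuine "0" is turned into null as well.

```kotlin
import kotlinx.serialization.json.Json
import kotlinx.serialization.json.JsonElement
import kotlinx.serialization.json.JsonNull
import kotlinx.serialization.json.JsonObject
import kotlinx.serialization.json.JsonPrimitive
import kotlinx.serialization.json.jsonObject

// Hypothetical configuration: which raw value stands in for null, per property name.
val nullTokens: Map<String, JsonPrimitive> = mapOf(
    "parentID" to JsonPrimitive(""),
    "ownerID" to JsonPrimitive("0"),
    "categoryID" to JsonPrimitive(0),
)

// True when `value` is the stand-in configured for a property ("0" and 0 are kept apart).
private fun isNullToken(value: JsonElement, token: JsonPrimitive?): Boolean =
    token != null &&
        value is JsonPrimitive &&
        value !is JsonNull &&
        value.isString == token.isString &&
        value.content == token.content

// Decoding direction: replace each configured stand-in with a real JSON null.
fun JsonObject.alternativesToNull(tokens: Map<String, JsonPrimitive>): JsonObject =
    JsonObject(mapValues { (key, value) ->
        if (isNullToken(value, tokens[key])) JsonNull else value
    })

// Encoding direction: replace JSON null with the stand-in configured for that property.
fun JsonObject.nullToAlternatives(tokens: Map<String, JsonPrimitive>): JsonObject =
    JsonObject(mapValues { (key, value) ->
        if (value is JsonNull) tokens[key] ?: value else value
    })

fun main() {
    val raw = Json.parseToJsonElement(
        """{"id": "37883993", "parentID": "", "categoryID": 0}"""
    ).jsonObject

    val cleaned = raw.alternativesToNull(nullTokens)
    println(cleaned)                                // parentID and categoryID are now null
    println(cleaned.nullToAlternatives(nullTokens)) // stand-ins restored for sending
}
```

The cleaned tree could then be handed to Json.decodeFromJsonElement, and the result of Json.encodeToJsonElement could be passed through nullToAlternatives before sending, so the compiler-generated serializers for the value classes stay untouched. The price is an extra pass over every payload and a table of property names to maintain.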

I've been at this for quite some time, so any suggestions would be much appreciated.

Rolf W.
  • By "it needs to work both ways", do you mean to say that if "0" comes in on deserialization, "0" needs to come out again, and `null` would be invalid? At a glance, I think you'll need to end up using a custom type with custom serializer wherever you want this behavior. It is impossible to say, "all strings should be serialized like this" with the normal `Json` encoder/decoder: writing a custom encoder seems overkill. – Steven Jeuris Aug 07 '22 at 07:52
  • Indeed, that's what I'm after. Specifically: I want the Kotlin code to have proper null safety despite receiving alternate values from fetched entities, and when sending entities I want null transparently converted to the appropriate alternate value again. What do you mean by a custom type, do you mean a custom Kotlin type? Because I have that already. If so, how do you suggest I go from there? – Rolf W. Aug 10 '22 at 15:30
  • "I want null transparently converted to the appropriate alternate value again" This means you will necessarily need to store the original input in your received data types. You can't have 'lossy' deserialization. – Steven Jeuris Aug 10 '22 at 15:46
  • By "custom type" I mean you will need to use something like a `OptionalString`, which contains the original JSON value which upon serialization is returned. And, which then has a `.toNullableString()` method to derive `String?` while the deserialized data is in memory. – Steven Jeuris Aug 10 '22 at 15:51
  • I think I see what you mean, so something like this: each entity of the API would have either `value class ID(val value: NullAsZeroString)`, `value class ID(val value: NullAsZeroInt)` or `value class ID(val value: NullAsEmptyString)` as a type definition for its ID field, depending on whichever is the case for a particular entity, and then write a custom serializer for each special case `NullAsEmptyString`, `NullAsZeroString` and `NullAsZeroInt`, right? That would certainly work but it leaks into the Kotlin API, which is unfortunate. It also involves double boxing if I want full type-safety. – Rolf W. Aug 10 '22 at 17:27
  • When you say "it leaks into the Kotlin API", you seem to be implying that your domain objects and data transfer objects (DTOs) are one and the same. That's definitely a possibility with a powerful serializer such as `kotlinx.serialization`, _but_ with extremely esoteric requirements as you seem to have, I think splitting DTO from domain object may be best. Or, you could expose a common interface on both. – Steven Jeuris Aug 11 '22 at 11:56
  • You won't get around to using `String?` and `Int?` directly, because that would be 'lossy'. Upon deserialization, there is no place to hold whether the original was "null", "0", or "". You could have a generic `WeirdNull`, if you are saying that 0 for ID should be a `null` for integers. The nullable T could be exposed as a `.value` property. But, if those are the real reqs, it might also be time to push back to whoever is serializing things in such a random/ill-defined way. What _are_ the reqs? If they are the same for all types, you can have a generic implementation; otherwise not. – Steven Jeuris Aug 11 '22 at 11:57
  • Now, if the system you are communicating with is at least as flexible as the requirements they seem to be imposing upon you, and it is okay to serialize an originally received "0" as `null`, _then_ you can maintain `String?` on your objects, but you would still have to specify a custom serializer on each property which requires this custom, more "flexible" (but definitely error-prone), behavior. There are good reasons "0" is different from `null`. – Steven Jeuris Aug 11 '22 at 12:09
  • Agreed, but the API is not within my control so this is the data I have to work with. These are indeed DTOs and the code I wrote to work around these inconsistencies manually is internal to the component, so it's not a big deal. It's ugly and more error-prone though, as it defeats the compile-time safety Kotlin provides. I occasionally encounter APIs with similar issues (especially empty string for null), so I was wondering if there is an easy way to handle such things transparently at the deserializer level. – Rolf W. Aug 12 '22 at 10:52
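
The lossless wrapper discussed in the comments above could look roughly like the following sketch. The name `OptionalString` and the `toNullableString()` method come from the comment. The `nullToken` parameter and the `of` factory are assumptions added for illustration, and the custom serializer that would read and write `raw` is deliberately left out.

```kotlin
// Keeps the raw token the API sent, so the same token can be written back unchanged.
data class OptionalString(val raw: String, val nullToken: String = "") {

    // Null-safe view for use inside the Kotlin code.
    fun toNullableString(): String? = raw.takeUnless { it == nullToken }

    companion object {
        // Hypothetical factory for outgoing values: null becomes the stand-in token.
        fun of(value: String?, nullToken: String = ""): OptionalString =
            OptionalString(value ?: nullToken, nullToken)
    }
}

fun main() {
    val received = OptionalString(raw = "0", nullToken = "0")
    println(received.toNullableString())                  // null
    println(received.raw)                                 // 0
    println(OptionalString("42", "0").toNullableString()) // 42
    println(OptionalString.of(null, "0").raw)             // 0
}
```

This keeps the round trip exact, at the cost of the wrapper type showing up in every DTO that uses it, which is the leak into the Kotlin API mentioned above.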
