Typing tricks to reduce ambiguity

One of the fundamentals of programming is defining interfaces, or contracts between parts of systems. Interfaces create layers of abstraction and allow scaling code beyond the complexity a single person can hold in their head. They appear at every level, from function signatures, to dependency signatures, up to service-to-service network calls.

There are almost always grey areas within an individual interface: areas where behavior isn’t explicitly defined. This causes a couple of problems. Interpretations of behavior can differ between people or over time, leading to subtle bugs as behavior drifts between modules. It also requires overhead and more error handling in each consumer, as you’ll see below.

In a strongly typed language there are some not-immediately-obvious techniques to reduce this ambiguity and create a more intuitive design.

I’ll provide code examples throughout this post. Where available, each one appears in TypeScript, Kotlin, Go, and OpenAPI YAML, in that order.

The basics: Comments

As the author, the easiest way to avoid ambiguity is to document your code.

interface Foo {
  type: string; // describes how foo should be represented
}

interface Bar {
  discount?: number;
  discountApplied?: boolean; // indicates if `discount` is possible, or actually applied, not present if no discount
}
data class Foo(
  val type: String // describes how foo should be represented
)

data class Bar(
  val discount: String?,
  val discountApplied: Boolean? // indicates if `discount` is possible, or actually applied, not present if no discount
)
type Foo struct {
	Type string // describes how foo should be represented
}

type Bar struct {
	Discount        *string
	DiscountApplied *bool // indicates if `discount` is possible, or actually applied, not present if no discount
}
components:
  schemas:
    Foo:
      type: object
      properties:
        type:
          type: string
          description: describes how foo should be represented
    Bar:
      type: object
      properties:
        discount:
          type: string
          nullable: true
        discountApplied:
          type: boolean
          nullable: true
          description: indicates if `discount` is possible, or actually applied, not present if no discount

This helps, but consumers need additional editor tooling to see the comments without reading the source. Comments don’t protect against human issues, such as misinterpretation or plain laziness, and conformance can’t be verified automatically by build tooling.

Taking advantage of typing

Enums

A simple pattern is to use enumeration values. Explicit enumerations can reduce the need for inline error handling and allow for self-documentation through good naming. This sounds simple, but I often see people forget about enums. The key question to ask is:

Do I care about how the consumer uses this value?

If the answer is yes, a string type doesn’t capture the correct nuance.

Before

interface Foo {
  type: string;
}

function convertFooTypeToX(x: Foo): string {
  switch (x.type) {
    case "a":
      return "1";
    case "b":
      return "2";
    case "c":
      return "3";
    default:
      throw new Error(`unknown type: ${x.type}`);
  }
}

After

enum FooType {
  a = "a",
  b = "b",
  c = "c"
}

interface Foo {
  type: FooType;
}

function convertFooTypeToX(x: Foo): string {
  switch (x.type) {
    case FooType.a:
      return "1";
    case FooType.b:
      return "2";
    case FooType.c:
      return "3";
    // no default is required to satisfy type checking
  }
}

Before

data class Foo(
    var type: String
)

fun convertFooTypeToX(x: Foo): String = when (x.type) {
    "a" -> "1"
    "b" -> "2"
    "c" -> "3"
    else -> throw Exception("unknown type: ${x.type}")
}

After

enum class FooType {
    a,
    b,
    c,
}

data class Foo(
    val type: FooType
)

fun convertFooTypeToX(x: Foo): String = when (x.type) {
    FooType.a -> "1"
    FooType.b -> "2"
    FooType.c -> "3"
    // no else is required to satisfy type checking
}

Before

type Foo struct {
	Type string
}

func ConvertFooTypeToX(x Foo) string {
	switch x.Type {
	case "a":
		return "1"
	case "b":
		return "2"
	case "c":
		return "3"
	}
	panic(fmt.Sprintf("unknown type: %s", x.Type))
}

After

type FooType string

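const (
	// each constant needs the explicit type; a bare `= "b"` would be an untyped string
	FooTypeA FooType = "a"
	FooTypeB FooType = "b"
	FooTypeC FooType = "c"
)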

type Foo struct {
	Type FooType
}

func ConvertFooTypeToX(x Foo) string {
	switch x.Type {
	case FooTypeA:
		return "1"
	case FooTypeB:
		return "2"
	case FooTypeC:
		return "3"
	}
	// unfortunately, without real enums in Go, we still need this
	panic(fmt.Sprintf("unknown type: %s", x.Type))
}

Before

components:
  schemas:
    Foo:
      type: object
      properties:
        type:
          type: string

After

components:
  schemas:
    Foo:
      type: object
      properties:
        type:
          type: string
          enum: [a, b, c]
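
The enum version also lets the compiler point at every switch that needs updating when a value is added. Here is a minimal TypeScript sketch of that idea, assuming strict compiler settings. The assertNever helper is hypothetical and not part of the examples above.

```typescript
enum FooType {
  a = "a",
  b = "b",
  c = "c",
}

// Hypothetical helper: it only type-checks when every case has been handled,
// because `value` must have been narrowed to `never` by then.
function assertNever(value: never): never {
  throw new Error(`unhandled value: ${String(value)}`);
}

function convertFooTypeToX(type: FooType): string {
  switch (type) {
    case FooType.a:
      return "1";
    case FooType.b:
      return "2";
    case FooType.c:
      return "3";
    default:
      // Adding `d = "d"` to FooType makes this line a compile error
      // until a matching case is added above.
      return assertNever(type);
  }
}
```

The default branch costs one line, and it also catches values that slip in at runtime from untyped sources.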

String formats

Enums work well, but only for explicit sets of values. They can’t be applied if I don’t know the full set of values, and even if I do, adding a new value later breaks backwards compatibility.

To get around this, I can use a type alias to provide more semantic information.

type HtmlType = string;
typealias HtmlType = String
type HtmlType string
components:
  schemas:
    Foo:
      type: string
      format: html

There’s a loophole: The consumer could create their own string and pass it in place of HtmlType. Depending on the serialization library and language, it can be possible to discourage this with custom deserialization of a non-constructable type.

type HtmlType = string & { __t: "html_type" }

function read(response: Response): HtmlType {
  return response.body as HtmlType;
}
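import com.fasterxml.jackson.core.JsonParser
import com.fasterxml.jackson.databind.DeserializationContext
import com.fasterxml.jackson.databind.JsonDeserializer
import com.fasterxml.jackson.databind.annotation.JsonDeserialize

@JsonDeserialize(using = HtmlType.Deserializer::class)
class HtmlType private constructor(private val content: String) {
  // nested so it can reach the private constructor; other code can't build an HtmlType
  class Deserializer : JsonDeserializer<HtmlType>() {
    override fun deserialize(jp: JsonParser, ctx: DeserializationContext): HtmlType {
      // read the string value from the parser
      return HtmlType(jp.valueAsString)
    }
  }
}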
// no example in this language
# no example in this language
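
To show how the branded type closes the loophole, here is a minimal TypeScript sketch. The toHtmlType and render functions are hypothetical names, and the check inside toHtmlType is only a placeholder assumption, not real HTML validation.

```typescript
// Branded string: the `__t` property exists only at the type level.
type HtmlType = string & { __t: "html_type" };

// Hypothetical single entry point that produces an HtmlType.
// The check is a placeholder; a real one depends on the use case.
function toHtmlType(raw: string): HtmlType {
  if (raw.trim().charAt(0) !== "<") {
    throw new Error("expected an HTML fragment");
  }
  return raw as HtmlType;
}

// Hypothetical consumer that only accepts the branded type.
function render(html: HtmlType): string {
  return `<div>${html}</div>`;
}

const html = toHtmlType("<b>hello</b>");
const rendered = render(html);

// render("<b>hello</b>"); // compile error: string is not assignable to HtmlType
```

A determined consumer can still write their own cast, so this discourages misuse rather than preventing it.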

In the real world, it’s risky to migrate from an existing enumeration to a format. There’s often logic in the client that depends on knowledge of a specific value, and all that logic needs to be transformed into an API-driven model. It’s tempting to keep it in the client, but that makes removing the value a backwards-incompatible and undocumented change.

Nested nullables

A common pattern I see when designing API responses is to use a flat structure like the following.

interface Workout {
  swim_id: string | null;
  swim_miles: number | null;
  swim_time: Time | null;
  bike_id: string | null;
  bike_miles: number | null;
  bike_time: Time | null;
  run_id: string | null;
  run_miles: number | null;
  run_time: Time | null;
}
import kotlin.time.Duration;

data class Workout(
    var swim_id: String?,
    var swim_miles: Double?,
    var swim_time: Duration?,
    var bike_id: String?,
    var bike_miles: Double?,
    var bike_time: Duration?,
    var run_id: String?,
    var run_miles: Double?,
    var run_time: Duration?
)
type Workout struct {
	SwimID    *string
	SwimMiles *float64
	SwimTime  *time.Duration
	BikeID    *string
	BikeMiles *float64
	BikeTime  *time.Duration
	RunID     *string
	RunMiles  *float64
	RunTime   *time.Duration
}
components:
  schemas:
    Workout:
      type: object
      properties:
        swim_id:
          type: string
          nullable: true
        swim_miles:
          type: number
          nullable: true
        swim_time:
          type: number
          nullable: true
        bike_id:
          type: string
          nullable: true
        bike_miles:
          type: number
          nullable: true
        bike_time:
          type: number
          nullable: true
        run_id:
          type: string
          nullable: true
        run_miles:
          type: number
          nullable: true
        run_time:
          type: number
          nullable: true

This leads to open questions: What happens if a mix of id, miles, and time is present? Should I always expect miles to be present if id is present? How does time relate to miles?

By breaking up the flat structure these can be answered.

Here, it’s clear that an id is always paired with miles and time, and each type of activity might be missing:

interface Segment {
  id: string;
  miles: number;
  time: Time;
}

interface Workout {
  swim: Segment | null;
  bike: Segment | null;
  run: Segment | null;
}
import kotlin.time.Duration;

data class Segment(
    val id: String,
    val miles: Double,
    val time: Duration
)

data class Workout(
    var swim: Segment?,
    var bike: Segment?,
    var run: Segment?
)
type Segment struct {
	ID    string
	Miles float64
	Time  time.Duration
}

type Workout struct {
	Swim *Segment
	Bike *Segment
	Run  *Segment
}
components:
  schemas:
    Segment:
      type: object
      properties:
        id:
          type: string
        miles:
          type: number
        time:
          type: number
    Workout:
      type: object
      properties:
        swim:
          $ref: "#/components/schemas/Segment"
          nullable: true
        bike:
          $ref: "#/components/schemas/Segment"
          nullable: true
        run:
          $ref: "#/components/schemas/Segment"
          nullable: true

With this alternate structure, I can see that the other fields might be missing:

interface Segment {
  id: string;
  miles: number | null;
  time: Time | null;
}

interface Workout {
  swim: Segment | null;
  bike: Segment | null;
  run: Segment | null;
}
import kotlin.time.Duration;

data class Segment(
    val id: String,
    val miles: Double?,
    val time: Duration?
)

data class Workout(
    var swim: Segment?,
    var bike: Segment?,
    var run: Segment?
)
type Segment struct {
	ID    string
	Miles *float64
	Time  *time.Duration
}

type Workout struct {
	Swim *Segment
	Bike *Segment
	Run  *Segment
}
components:
  schemas:
    Segment:
      type: object
      properties:
        id:
          type: string
        miles:
          type: number
          nullable: true
        time:
          type: number
          nullable: true
    Workout:
      type: object
      properties:
        swim:
          $ref: "#/components/schemas/Segment"
          nullable: true
        bike:
          $ref: "#/components/schemas/Segment"
          nullable: true
        run:
          $ref: "#/components/schemas/Segment"
          nullable: true

And here, I understand that the statistics might be missing, but are tied together:

interface Segment {
  id: string;
  stats: {
    miles: number;
    time: Time;
  } | null;
}

interface Workout {
  swim: Segment | null;
  bike: Segment | null;
  run: Segment | null;
}
import kotlin.time.Duration;

data class Stats(
    val miles: Double,
    val time: Duration
)

data class Segment(
    val id: String,
    val stats: Stats?
)

data class Workout(
    var swim: Segment?,
    var bike: Segment?,
    var run: Segment?
)
type Stats struct {
	Miles float64
	Time  time.Duration
}

type Segment struct {
	ID    string
	Stats *Stats
}

type Workout struct {
	Swim *Segment
	Bike *Segment
	Run  *Segment
}
components:
  schemas:
    Stats:
      type: object
      properties:
        miles:
          type: number
        time:
          type: number
    Segment:
      type: object
      properties:
        id:
          type: string
        stats:
          $ref: "#/components/schemas/Stats"
          nullable: true
    Workout:
      type: object
      properties:
        swim:
          $ref: "#/components/schemas/Segment"
          nullable: true
        bike:
          $ref: "#/components/schemas/Segment"
          nullable: true
        run:
          $ref: "#/components/schemas/Segment"
          nullable: true

This avoids type assertions and null pointer errors, such as workout.swim_id! in TypeScript, workout.swim_id!! in Kotlin, or “panic: runtime error: invalid memory address or nil pointer dereference” in Go.
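
As a rough illustration of what consumer code looks like against the last structure, here is a TypeScript sketch. The totalMiles function is hypothetical, and the sketch assumes time is a number of minutes rather than the Time type used above.

```typescript
// Assumption for this sketch: time is expressed in minutes.
interface Stats {
  miles: number;
  minutes: number;
}

interface Segment {
  id: string;
  stats: Stats | null;
}

interface Workout {
  swim: Segment | null;
  bike: Segment | null;
  run: Segment | null;
}

// Hypothetical consumer: total miles across the segments that have stats.
function totalMiles(workout: Workout): number {
  let total = 0;
  for (const segment of [workout.swim, workout.bike, workout.run]) {
    // One check per level of nesting; no `!` assertions are needed.
    if (segment !== null && segment.stats !== null) {
      total += segment.stats.miles;
    }
  }
  return total;
}

const workout: Workout = {
  swim: null,
  bike: { id: "b1", stats: { miles: 20, minutes: 65 } },
  run: { id: "r1", stats: null },
};
const miles = totalMiles(workout);
```

Each null check maps to one question the structure answers: did the segment happen, and does it have statistics?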

Type unions

Another type structure that can reduce ambiguity is a union type. Union types allow representing “exclusive or” in the interface and avoid locking in an inheritance structure. They’re a first-class feature in TypeScript, and Kotlin’s sealed classes are a good way to implement them.

For example, say I want to represent a figure on my site.

interface Figure {
  source: string;
  attr: string;
  height?: number;
  width?: number;
}
data class Figure(
    val source: String,
    val attr: String,
    val height: Int?,
    val width: Int?
)
type Figure struct {
	Source string
	Attr   string
	Height *int
	Width  *int
}
components:
  schemas:
    Figure:
      type: object
      properties:
        source:
          type: string
        attr:
          type: string
        height:
          type: number
          nullable: true
        width:
          type: number
          nullable: true

This works, but allows for accidentally distorting the image by specifying a height and width that don’t match the aspect ratio. A type union can prevent this by allowing at most one of the two.

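// `never` marks the property that must be absent, so passing both is a compile error
// (with a plain `{}` member, an object with both height and width would still type-check)
type Size =
  | { height?: never; width?: never }
  | { height: number; width?: never }
  | { width: number; height?: never };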

interface UnsizedFigure {
  source: string;
  attr: string;
}

type Figure = UnsizedFigure & Size;
sealed class Figure(
    open val source: String,
    open val attr: String
)

class UnsizedFigure(
    override val source: String,
    override val attr: String
): Figure(source, attr)

class HeightSizedFigure(
    override val source: String,
    override val attr: String,
    val height: Int
): Figure(source, attr)

class WidthSizedFigure(
    override val source: String,
    override val attr: String,
    val width: Int
): Figure(source, attr)
type UnsizedFigure struct {
	Source string
	Attr   string
}

type HeightSizedFigure struct {
	UnsizedFigure
	Height int
}

type WidthSizedFigure struct {
	UnsizedFigure
	Width int
}

type Figure interface{}
components:
  schemas:
    UnsizedFigure:
      type: object
      properties:
        source:
          type: string
        attr:
          type: string
    Size:
      oneOf:
        - type: object
          properties:
            height:
              type: number
        - type: object
          properties:
            width:
              type: number
    SizedFigure:
      allOf:
        - $ref: "#/components/schemas/UnsizedFigure"
        - $ref: "#/components/schemas/Size"
    Figure:
      oneOf:
        - $ref: "#/components/schemas/UnsizedFigure"
        - $ref: "#/components/schemas/SizedFigure"

Type unions force explicit checking of which type is used. They’re possible in many languages, but not all.

function isHeightSized(x: Size): x is { height: number } {
  return (x as { height?: number }).height != null;
}

function isWidthSized(x: Size): x is { width: number } {
  return (x as { width?: number }).width != null;
}
fun eval(figure: Figure): Int = when (figure) {
    is UnsizedFigure -> 0
    is HeightSizedFigure -> 1
    is WidthSizedFigure -> 2
}
func eval(figure Figure) int {
	switch t := figure.(type) {
	case UnsizedFigure:
		return 0
	case HeightSizedFigure:
		return 1
	case WidthSizedFigure:
		return 2
	default:
		panic(fmt.Sprintf("Unknown type: %v", t))
	}
}
# See other languages for reference implementations.
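
To make the explicit checking concrete, here is a TypeScript sketch of a consumer of the Figure union. The imgAttributes function is hypothetical, and the sketch assumes Size marks the excluded property as never so that the two dimensions can’t be combined.

```typescript
type Size =
  | { height?: never; width?: never }
  | { height: number; width?: never }
  | { width: number; height?: never };

interface UnsizedFigure {
  source: string;
  attr: string;
}

type Figure = UnsizedFigure & Size;

// Hypothetical consumer: builds the attributes for an <img> tag.
function imgAttributes(figure: Figure): string {
  const base = `src="${figure.source}"`;
  if (figure.height !== undefined) {
    return `${base} height="${figure.height}"`;
  }
  if (figure.width !== undefined) {
    return `${base} width="${figure.width}"`;
  }
  return base;
}

const byHeight = imgAttributes({ source: "a.png", attr: "me", height: 10 });
const byWidth = imgAttributes({ source: "a.png", attr: "me", width: 20 });
const unsized = imgAttributes({ source: "a.png", attr: "me" });

// Compile error, since no member of the union allows both dimensions:
// imgAttributes({ source: "a.png", attr: "me", height: 10, width: 20 });
```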

In the real world

These techniques can give great build-time safety when applied within a single codebase, but it’s harder to apply that safety across different codebases (e.g. calling an API, or using a TypeScript dependency in a JavaScript codebase). The consumer has been absolved of responsibility for conforming to the contract, and that responsibility has shifted explicitly to the producer. Since there’s less incentive for explicit protection through validation in client code, errors caused by backwards incompatibility are harder to track down. Documenting changes with techniques like semver makes these errors easier to avoid.
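
Where a value does cross a codebase boundary, the types are only a claim until something checks them at runtime. Here is a minimal TypeScript sketch of such a check, reusing the FooType enum from earlier. The isFooType and parseFoo functions are hypothetical names for illustration.

```typescript
enum FooType {
  a = "a",
  b = "b",
  c = "c",
}

interface Foo {
  type: FooType;
}

const FOO_TYPES: readonly string[] = [FooType.a, FooType.b, FooType.c];

// Type guard: checks an untyped value against the known enum values.
function isFooType(value: unknown): value is FooType {
  return typeof value === "string" && FOO_TYPES.indexOf(value) !== -1;
}

// Hypothetical parser for a JSON payload received from another codebase.
function parseFoo(json: string): Foo {
  const raw: unknown = JSON.parse(json);
  if (typeof raw !== "object" || raw === null) {
    throw new Error("expected a JSON object");
  }
  const type = (raw as { type?: unknown }).type;
  if (!isFooType(type)) {
    throw new Error(`unknown type: ${String(type)}`);
  }
  return { type };
}
```

A producer that adds a new value without telling anyone then fails loudly in parseFoo, rather than somewhere deep in consumer logic.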

Introducing into an existing system

In a well-established environment it’s sometimes not possible to apply these patterns in the producer, especially when dealing with legacy systems or other teams. In these cases the patterns can be introduced in the middle of the stack, either within a fronting service or in a module within the consuming client. This gives an explicit place to add validation and to audit that the contract is being followed.
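
Such a middle layer could look like the following TypeScript sketch, which adapts the flat workout shape from earlier into the nested one. The LegacyWorkout and adaptWorkout names are hypothetical. For brevity the sketch covers only the run fields and assumes time is a number of minutes.

```typescript
// Flat shape as sent by a hypothetical legacy producer.
interface LegacyWorkout {
  run_id: string | null;
  run_miles: number | null;
  run_minutes: number | null;
}

interface Segment {
  id: string;
  miles: number;
  minutes: number;
}

interface Workout {
  run: Segment | null;
}

// Adapter living in a fronting service or a client-side module.
function adaptWorkout(legacy: LegacyWorkout): Workout {
  const { run_id, run_miles, run_minutes } = legacy;
  if (run_id === null && run_miles === null && run_minutes === null) {
    return { run: null };
  }
  if (run_id === null || run_miles === null || run_minutes === null) {
    // The one explicit place where a contract violation is detected.
    throw new Error("run segment is only partially populated");
  }
  return { run: { id: run_id, miles: run_miles, minutes: run_minutes } };
}
```

Whether a partially populated segment should throw, be dropped, or be logged is a judgment call. The point is that the decision is made once, here, instead of in every consumer.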

 

To summarize: Introducing more complexity into your interface, be it a function call, library, or API, makes it harder for consumers to make the wrong assumptions, without asking them to think about logic outside of their business domain.