io7m | archive (zip, signature)

1 Algebraic Types On The JVM

This document is a survey of the various implementations of algebraic data types on the JVM.

In order to provide the requisite background for discussion, the document attempts to describe the theory behind so-called algebraic types, which can be thought of as the general form of the restricted case classes typically implemented on the JVM. It makes some attempt to describe the motivation behind having the types, but does not attempt to spend much time advocating for them.

Then, an examination is made of existing JVM languages that have implemented analogous versions of the types. In addition, an attempt is made to examine implementations of the types as libraries for the Java language.

Finally, a simple pattern is described that attempts to combine the best aspects of each implementation in Java.

1.1 Algebraic Data Types

An algebraic data type in most typed functional languages is defined as a closed type with a fixed set of injective and disjoint constructors. In those languages, a constructor (more precisely, a data constructor) of type T is simply a function that takes zero or more arguments and yields a value of type T. Informally, the injective property of constructors means if two values of T are considered equal, then they must have been produced using the same constructor applied to the same arguments. The disjoint property means that if two values of T were constructed with different constructors, then they are not equal. Accordingly, it is assumed that the only way to produce a value of type T is by applying arguments to one of T's constructors. The term closed is taken to mean that all of the constructors of the type are introduced in the type definition itself. In other words, the compiler knows all of the possible constructors of a type by looking at one definition. An open type, in contrast, would be somewhat analogous to Java's classes: New subclasses of a class can be added at any time and it is not possible for the compiler to know all of the existing subclasses. The property of being closed allows the compiler to perform useful static checks, as will be shown shortly.

The classic example for an algebraic type is the modelling of simple arithmetic expressions, using Haskell syntax:

data Expression =
    Constant Integer
  | Add Expression Expression
  | Multiply Expression Expression

For those unfamiliar with Haskell syntax, the preceding definition defines a new type called Expression. The type has three data constructors: Constant, Add, and Multiply. Each constructor is a function that takes values of the specified types and yields a value of type Expression. For example, the Add constructor takes two values of type Expression. The Expression type is closed: There is no way to add additional constructors later without modifying the type. The compiler can see all of the possible ways to construct a value of type Expression and makes use of this information as will shortly be demonstrated.

Astute readers will notice that at no point were names specified for any of the parameters. The reason for this is that functional languages such as Haskell rely on pattern matching to extract the values of fields specified in types. Pattern matching is achieved using case expressions 1, as used here to implement a trivial evaluator for arithmetic expressions:

evaluate :: Expression -> Integer
evaluate x =
  case x of
    (Constant n)     -> n
    (Add e0 e1)      -> (evaluate e0) + (evaluate e1)
    (Multiply e0 e1) -> (evaluate e0) * (evaluate e1)

The above code defines a new (recursive) function called evaluate that takes a single argument that we name x. The type signature Expression -> Integer indicates that the function takes a value of type Expression and returns a value of type Integer. For those unfamilar with pattern matching, the case expression can seem quite surprising. In order to correctly read the expression, one should note that the left hand side of each branch of the case expression is a pattern that destructures a given value, optionally binding names to each field within the destructured value. The discriminee of the case expression (here, x), is matched against each of the patterns from top to bottom, stopping at the first pattern that matches.

For example, the (Add e0 e1) pattern matches all Expression values that were constructed using the Add constructor. It binds the name e0 to the first Expression argument, and the name e1 to the second. These names are only in scope on the right hand side of that branch. In this case, they are passed recursively to the evaluate function.

Evaluating the function in the ghci command-line interpreter gives the expected results:

ghci> evaluate (Add (Multiply (Constant 5) (Constant 10)) (Constant 14))
64

In Haskell, patterns can essentially be nested to provide extremely specific matches. For example, the following is a valid (although in this case, not very useful) pattern:

  (Add e0@(Add k1 k2) (Multiply k3 k4)) -> ...

The pattern matches Expression values that were constructed by passing two Expressions that were themselves constructed using the Add and Multiply constructors, respectively, to the Add constructor. The e0@(...) syntax binds the name e0 to the given subexpression so that it can be used in the right hand side of the branch, if necessary.

The compiler is capable of detecting inexhaustive patterns. For example, the following definition would be rejected at compile-time:

evaluate :: Expression -> Integer
evaluate x =
  case x of
    (Constant n) -> n
    (Add e0 e1)  -> (evaluate e0) + (evaluate e1)

The programmer has forgotten to handle the Multiply case and so the match is inexhaustive. This specific feature is often championed by users of languages such as Haskell and OCaml as one of the reasons why algebraic types are instrumental in making and keeping programs correct. When a new constructor is added to the Expression type, the compiler immediately notifies the programmer of all of the places where their code is now incomplete! In the words of Yaron Minksy in his Effective ML 2 talk:

Case analysis is a deep, ingrained part of programming. You see it everywhere. Having basic checks like the fact that your case analysis is exhaustive is very powerful. The key benefit you get from it over and over again is as a refactoring tool. As your code evolves, you can use these exhaustiveness checks to figure out where you need to fix things in response to changes in the codebase.

It's hard for me to overemphasize how central this is. Like... Every day that I program, I use this! It is one of the key things that keeps me sane. You can take lots of other features of the language away from me but with my dying breath, I will keep my exhaustiveness checks. They are, I think, the single best feature of the language. This kind of automated checking for what is just a really important part of the day-to-day part of the pragmatics of programming.

Overlapping cases are handled similarly. The following program is rejected at compile-time:

evaluate :: Expression -> Integer
evaluate x =
  case x of
    (Constant n) -> n
    (Add e0 e1)  -> (evaluate e0) + (evaluate e1)
    (Add e0 e1)  -> (evaluate e0) + (evaluate e1)

This point is somewhat less important for the purposes of enforcing program correctness, but is included here for completeness.

Finally, patterns are checked to ensure that they match the arity of the mentioned constructors. The following program is rejected at compile-time:

evaluate :: Expression -> Integer
evaluate x =
  case x of
    (Constant n)        -> n
    (Add e0 e1)         -> (evaluate e0) + (evaluate e1)
    (Multiply e0 e1 e2) -> (evaluate e0) * (evaluate e1)

The programmer has accidentally assumed that the Multiply constructor takes three arguments instead of two. This is also useful for the purposes of refactoring: If the number of arguments to a constructor changes, the compiler will highlight every case expression that now needs to be updated.

The point that should be taken from the above examples is that algebraic types and pattern matching provide two main features: destructuring and case analysis. Destructuring is the process of examining the actual structure of a value, and Case analysis is the process of of statically distinguishing between a fixed set of possible cases: "Is this expression a constant, an addition, or a multiplication?".

It is the position of the author of this document that for a language such as Java, destructuring is far less important than case analysis for the purposes of producing correct programs, and for the purposes of getting the compiler to assist during refactoring. This is particularly true when balanced against the intrusive changes that would be required to the language to support destructuring (such as the introduction of some form of pattern language to describe the structure of Java classes). In Java, we simply access fields of a given class by name (or by accessor method). In the experience of the author, the addition or removal of cases to/from a type tends to indicate a more fundamental change to the program than the addition or removal of a single field in a type, and therefore it is much more important that the programmer be told about all of the places where changes to the code are now required.

1.2 Existing JVM Language Implementations

1.2.1 Scala

The Scala language is an impure, strict, hybrid functional/object-oriented programming language for the JVM.

The language provides algebraic types via so-called case classes. It implements full structural pattern matching with exhaustiveness checking. The Scala equivalent of the Haskell Expression example given earlier is as follows:

sealed trait Expression
case class Constant (n : Int)                          extends Expression
case class Add      (e0 : Expression, e1 : Expression) extends Expression
case class Multiply (e0 : Expression, e1 : Expression) extends Expression

This example demonstrates a theme common to all of the implementations discussed here: In contrast to Haskell, where data constructors are represented by functions, all of the JVM implementations of algebraic types represent the constructors as classes. The sealed keyword indicates that the Expression class can only be extended by classes defined in the same file. This essentially closes the type and allows the compiler to perform exhaustiveness checking when pattern matching on a value of type Expression. Scala also supports non-sealed case classes, but in the opinion of the author, their utility is limited due to lacking exhaustiveness checks. As an additional note, a case class in Scala also contains automatically generated equals, hashCode and toString methods.

Pattern matching in Scala is very similar to Haskell, and full structural matching is supported:

object Expression {
  def evaluate (e : Expression) : Integer = {
    e match {
      case Constant (n)      => n
      case Add (e0, e1)      => evaluate (e0) + evaluate (e1)
      case Multiply (e0, e1) => evaluate (e0) * evaluate (e1)
    }
  }
}

The object keyword introduces a new singleton object. This is used here simply because the language does not directly support static methods. Examination of the bytecode produced by compilation shows that the above match expression is simply transformed to a chain if instanceof checks. Use of the CFR 3 tool makes this clear:

/*
 * Decompiled with CFR 0_110.
 */

public Integer evaluate(Expression e) {
  Expression expression = e;
  if (expression instanceof Constant) {
    Constant constant = (Constant)expression;
    int n = constant.n();
    return Predef..MODULE$.int2Integer(n);
  }
  if (expression instanceof Add) {
    Add add = (Add)expression;
    Expression e0 = add.e0();
    Expression e1 = add.e1();
    return Predef..MODULE$.int2Integer(Predef..MODULE$.Integer2int(this.evaluate(e0)) + Predef..MODULE$.Integer2int(this.evaluate(e1)));
  }
  if (!(expression instanceof Multiply)) throw new MatchError((Object)expression);
  Multiply multiply = (Multiply)expression;
  Expression e0 = multiply.e0();
  Expression e1 = multiply.e1();
  return Predef..MODULE$.int2Integer(Predef..MODULE$.Integer2int(this.evaluate(e0)) * Predef..MODULE$.Integer2int(this.evaluate(e1)));
}

1.2.2 Ceylon

The Ceylon language is, like Scala, an impure, strict, hybrid functional/object oriented programming language for the JVM.

Ceylon does not implement full algebraic types and pattern matching. It instead implements a subset of the types that it refers to as enumerated types. The Ceylon equivalent of the Haskell the Expression example given earlier is as follows:

abstract class Expression () of Constant | Add | Multiply { }

class Constant (n) extends Expression () {
  shared Integer n;
}

class Add (e0, e1) extends Expression () {
  shared Expression e0;
  shared Expression e1;
}

class Multiply (e0, e1) extends Expression () {
  shared Expression e0;
  shared Expression e1;
}

The main point to note is that the Expression class declaration explicitly enumerates its subclasses. This allows the compiler to perform exhaustiveness checking when matching on values of type Expression. Ceylon does not implement full pattern matching - only case analysis:

object expressions {
  Integer evaluate (Expression e) {
    switch (e)
    case (is Constant) { return e.n; }
    case (is Add)      { return evaluate (e.e0) + evaluate (e.e1); }
    case (is Multiply) { return evaluate (e.e0) * evaluate (e.e1); }
  }
}

In the above code, each case of the switch expression specifies an assertion that e is an instance of one of the subclasses of Expression. The right hand side of each case branch is written as if e had been cast to an instance of the specified subclass. Exhaustiveness checks work as expected:

object expressions {
  Integer evaluate (Expression e) {
    switch (e)
    case (is Constant) { return e.n; }
    case (is Add)      { return evaluate (e.e0) + evaluate (e.e1); }
  }
}
error: case types must cover all cases of the switch type or an else clause must appear: 'Constant|Add' does not cover 'Expression'
    switch (e)
    ^
1 error

Inspection of the bytecode demonstrates that the above code is transformed to a simple series of instanceof checks. The CFR tool produces the following Java code when executed on the bytecode produced by the Ceylon compiler:

/*
 * Decompiled with CFR 0_110.
 */

@TypeInfo(value="ceylon.language::Integer")
private final long evaluate$priv$(@Name(value="e") @TypeInfo(value="Expression") Expression e) {
  Expression sel$0 = e;
  if (sel$0 instanceof Constant) {
    Constant e$3 = (Constant)sel$0;
    return e$3.getN();
  }
  if (sel$0 instanceof Add) {
    Add e$2 = (Add)sel$0;
    return this.evaluate$priv$(e$2.getE0()) + this.evaluate$priv$(e$2.getE1());
  }
  if (sel$0 instanceof Multiply) {
    Multiply e$1 = (Multiply)sel$0;
    return this.evaluate$priv$(e$1.getE0()) * this.evaluate$priv$(e$1.getE1());
  }
  throw new EnumeratedTypeError("Supposedly exhaustive switch was not exhaustive");
}

1.2.3 Kotlin

The Kotlin language is a continuation of the theme of impure, strict, hybrid functional/object oriented programming languages for the JVM.

Like Ceylon, the Kotlin language implements only case analysis. A class can be marked as sealed in a similar manner to Scala and this fact can then be used for exhaustiveness checking in the language's analogue of pattern matching. The Kotlin equivalent of the Haskell the Expression example given earlier is as follows:

sealed class Expression {
  class Constant (val n: Int)                             : Expression ()
  class Add      (val e0: Expression, val e1: Expression) : Expression ()
  class Multiply (val e0: Expression, val e1: Expression) : Expression ()
}

The evaluate function is similarly unsurprising:

object Expressions {
  fun evaluate (e: Expression): Int =
    when (e) {
      is Expression.Constant -> e.n
      is Expression.Add      -> evaluate (e.e0) + evaluate (e.e1)
      is Expression.Multiply -> evaluate (e.e0) * evaluate (e.e1)
    }
}

The definitions are almost identical to those in Ceylon. The generated bytecode is equally unsurprising:

/*
 * Decompiled with CFR 0_110.
 */

public final int evaluate(@NotNull Expression e) {
  int n;
  Intrinsics.checkParameterIsNotNull((Object)e, (String)"e");
  Expression expression = e;
  if (expression instanceof Expression.Constant) {
    n = ((Expression.Constant)e).getN();
  } else if (expression instanceof Expression.Add) {
    n = this.evaluate(((Expression.Add)e).getE0()) + this.evaluate(((Expression.Add)e).getE1());
  } else if (expression instanceof Expression.Multiply) {
    n = this.evaluate(((Expression.Multiply)e).getE0()) * this.evaluate(((Expression.Multiply)e).getE1());
  } else {
    throw new NoWhenBranchMatchedException();
  }
  return n;
}

1.2.4 Frege

The Frege language is a lazily-evaluated, pure-functional programming language for the JVM, heavily inspired by Haskell.

As Frege more or less aspires to be a direct implementation of Haskell for the JVM, the existing Haskell Expression example compiles without any changes beyond the addition of an explicit module name:

module Expressions where

data Expression =
    Constant Integer
  | Add Expression Expression
  | Multiply Expression Expression

evaluate :: Expression -> Integer
evaluate x =
  case x of
    (Constant n)     -> n
    (Add e0 e1)      -> (evaluate e0) + (evaluate e1)
    (Multiply e0 e1) -> (evaluate e0) * (evaluate e1)

Although the Frege language follows Haskell in representing data constructors as functions in the language itself, it actually transforms the constructors to individual classes in a similar manner to all of the languages surveyed so far. The evaluate function is transformed into the following bytecode:

/*
 * Decompiled with CFR 0_110.
 */

public static final BigInteger evaluate(TExpression tExpression) {
  TExpression.DConstant dConstant = tExpression._Constant();
  if (dConstant != null) {
    BigInteger bigInteger = (BigInteger)Delayed.forced((Object)dConstant.mem1);
    return bigInteger;
  }
  TExpression.DAdd dAdd = tExpression._Add();
  if (dAdd != null) {
    TExpression tExpression2 = (TExpression)dAdd.mem2.forced();
    TExpression tExpression3 = (TExpression)dAdd.mem1.forced();
    return Expressions.evaluate(tExpression3).add(Expressions.evaluate(tExpression2));
  }
  TExpression.DMultiply dMultiply = tExpression._Multiply();
  assert (dMultiply != null);
  TExpression tExpression4 = (TExpression)dMultiply.mem2.forced();
  TExpression tExpression5 = (TExpression)dMultiply.mem1.forced();
  return Expressions.evaluate(tExpression5).multiply(Expressions.evaluate(tExpression4));
}

In this case, a somewhat different representation is used than a chain of instanceof checks: The TExpression case contains one method per data constructor, and each data constructor class implements this interface. For example, the generated DConstant class (the class generated for the Constant constructor) is as follows:

/*
 * Decompiled with CFR 0_110.
 */

public static final class DConstant extends Algebraic implements TExpression {
  public final Object mem1;

  private DConstant(Object object) {
    this.mem1 = object;
  }

  public final int _constructor() {
    return 0;
  }

  public static final TExpression mk(Object object) {
    return new DConstant(object);
  }

  @Override
  public final DConstant _Constant() {
    return this;
  }

  @Override
  public final DAdd _Add() {
    return null;
  }

  @Override
  public final DMultiply _Multiply() {
    return null;
  }
}

The DConstant class only returns a reference to itself via the _Constant method. The other classes work similarly. In effect, a chain of instanceof checks is transformed into a chain of non-null checks.

1.3 Java Representations

Various attempts have been made to retrofit algebraic types into the Java language. The main requirements appear to be that pattern matching should:

  • Not allocate memory 4.

  • Operate in O(1) time, if possible.

  • Become a compile-time error if cases are added or removed from the discriminee type.

1.3.1 Enumerations and Casts

One representation of algebraic types in Java attempts to re-use the language's enum types. First, the Expression type:

import java.util.Objects;

abstract class Expression
{
  enum Type
  {
    TYPE_CONSTANT,
    TYPE_ADD,
    TYPE_MULTIPLY
  }

  private final Type type;

  private Expression(final Type t)
  {
    this.type = Objects.requireNonNull(t);
  }

  public final Type type()
  {
    return this.type;
  }

  public final class Constant extends Expression
  {
    public final int n;

    public Constant(int i)
    {
      super(Type.TYPE_CONSTANT);
      this.n = i;
    }
  }

  public final class Add extends Expression
  {
    public final Expression e0;
    public final Expression e1;

    public Add(final Expression in_e0, final Expression in_e1)
    {
      super(Type.TYPE_ADD);
      this.e0 = Objects.requireNonNull(in_e0);
      this.e1 = Objects.requireNonNull(in_e1);
    }
  }

  public final class Multiply extends Expression
  {
    public final Expression e0;
    public final Expression e1;

    public Multiply(final Expression in_e0, final Expression in_e1)
    {
      super(Type.TYPE_MULTIPLY);
      this.e0 = Objects.requireNonNull(in_e0);
      this.e1 = Objects.requireNonNull(in_e1);
    }
  }
}

The code assigns one enum value to each class extending the base Expression type. The code attempts to make the Expression type closed by making the constructor private. The compiler does not understand that the type is supposed to be a closed type, it merely happens as a side effect of the language's visibility rules.

The evaluate method requires dangerous casts:

public class Expressions
{
  public static int evaluate(Expression e)
  {
    switch (e.type())
    {
      case TYPE_CONSTANT:
      {
        Expression.Constant c = (Expression.Constant) e;
        return c.n;
      }
      case TYPE_ADD:
      {
        Expression.Add a = (Expression.Add) e;
        return evaluate (a.e0) + evaluate (a.e1);
      }
      case TYPE_MULTIPLY:
      {
        Expression.Multiply m = (Expression.Multiply) e;
        return evaluate (m.e0) * evaluate (m.e1);
      }
    }

    throw new IllegalStateException("Unreachable code");
  }
}

Advantages of this approach:

  • The generated bytecode is very efficient due to the compiler's handling of enum types: The code is transformed to a tableswitch instruction, which operates on integers. At no point are any allocations performed, and the performance should be O(1) if the tableswitch instruction is compiled to code that is O(1).

  • Some degree of exhaustiveness checking is performed: If a case is added to or removed from the enum type, each switch statement that examines it will become a compilation error.

  • The switch statement is fairly syntactically light.

Disadvantages of this approach:

  • The programmer is expected to know that each enum case corresponds to a particular Expression subclass and must manually cast each time.

  • Even for enum types, the Java compiler cannot prove that no statements beyond each of the switch branches are reachable. The programmer must manually add a non-terminating statement such as a throw to indicate code that is assumed to be unreachable.

  • The actual Expression declaration contains a lot of boilerplate with the potential for programmers to assign the wrong enum value in the constructor of each case class.

1.3.2 Generic Visitors

Another approach uses a so-called generic visitor type 5. A given type is paired with a visitor type (typically just an interface) and case analysis is achieved by passing an instance of this visitor type to a match method declared on each expression class. There are mutual dependencies between the expression class, the visitor type, and each expression case class, so they are presented here in one code segment:

import java.util.Objects;

interface Expression
{
  <A> A match(ExpressionVisitor<A> m);
}

interface ExpressionVisitor<A>
{
  A onConstant(Constant e);

  A onAdd(Add e);

  A onMultiply(Multiply e);
}

final class Constant implements Expression
{
  public final int n;

  Constant(int i)
  {
    this.n = i;
  }

  public <A> A match(ExpressionVisitor<A> m)
  {
    return m.onConstant(this);
  }
}

final class Add implements Expression
{
  public final Expression e0;
  public final Expression e1;

  Add(final Expression ie0, final Expression ie1)
  {
    this.e0 = Objects.requireNonNull(ie0);
    this.e1 = Objects.requireNonNull(ie1);
  }

  public <A> A match(ExpressionVisitor<A> m)
  {
    return m.onAdd(this);
  }
}

final class Multiply implements Expression
{
  public final Expression e0;
  public final Expression e1;

  Multiply(final Expression ie0, final Expression ie1)
  {
    this.e0 = Objects.requireNonNull(ie0);
    this.e1 = Objects.requireNonNull(ie1);
  }

  public <A> A match(ExpressionVisitor<A> m)
  {
    return m.onMultiply(this);
  }
}

The definition of the Expression interface is such that subclasses are required to implement the match method. The definition of the ExpressionVisitor interface is such that a method is provided for each possible subclass and no others. The only way for a (terminating) match method to produce a value of an arbitrary type A is to call one of the methods on the given ExpressionVisitor type, passing the implementing class as an argument.

Pattern matching involves constructing a (possibly anonymous) instance of type ExpressionVisitor and passing it to the match method of a given Expression. This is how the evaluate function is implemented:

public class Expressions
{
  public static int evaluate (Expression e)
  {
    return e.match(new ExpressionVisitor<Integer>() {
      @Override public Integer onConstant(Constant c)
      {
        return c.n;
      }

      @Override public Integer onAdd(Add a)
      {
        return evaluate (a.e0) + evaluate (a.e1);
      }

      @Override public Integer onMultiply(Multiply a)
      {
        return evaluate (a.e0) * evaluate (a.e1);
      }
    });
  }
}

Advantages of this approach:

  • There are fewer opportunities for mistakes. The type system requires the programmer to implement certain methods, and barring non-termination, it is easier for the programmer to implement them correctly than incorrectly.

  • Upon adding a case to the Expression class, all instances of type ExpressionVisitor now need to be updated. Essentially, this provides the desired exhaustiveness checking.

  • It is O(1) in terms of time. Each match statement compiles down to a simple invokeinterface instruction. The evaluate method compiles down to a new instruction followed by an invokeinterface instruction.

Disadvantages of this approach:

  • Every pattern match incurs an allocation. Whether the JIT may or may not be able to eliminate those allocations is entirely JVM-specific 6. In low-latency code that is trying to produce as little garbage as possible, this can mean that the types are outright unusable.

  • The anonymous class instance syntax is verbose. At the very minimum, the programmer is expected to write out the full definition of the ExpressionVisitor class every time pattern matching is performed.

  • If code inside a pattern match needs to throw a checked exception - it cannot! The visitor type must be extended with another type parameter E with an upper bound of Throwable or Exception, and each visitor method must be annotated with throws E.

An issue that is either an advantage or a disadvantage depending on your point of view is that nothing prevents a non-final case class from being added. This can actually be used to useful effect, as described in Type Tricks 3 - Generic Visitors 7, but could also result in unintended side effects if the programmer designing the API forgets to add a final keyword to one or more classes.

1.3.3 Catamorphisms

Catamorphism is a term from category theory8. Without attempting a long-winded and tedious description of the exact meaning, it can informally be considered as a higher order function that accepts one function-typed parameter for each constructor of an algebraic type. The only practical difference between this formulation and the generic visitor approach described above is the use of individual functions as opposed to anonymous class instances containing sets of methods. The two approaches are morally equivalent.

import java.util.Objects;
import java.util.function.Function;

public interface Expression
{
  <A> A match (
    Function<Constant, A> constant,
    Function<Add, A> add,
    Function<Multiply, A> multiply
  );
}

final class Constant implements Expression
{
  public final int n;

  Constant(int i)
  {
    this.n = i;
  }

  public <A> A match(
    Function<Constant, A> constant,
    Function<Add, A> add,
    Function<Multiply, A> multiply)
  {
    return constant.apply(this);
  }
}

final class Add implements Expression
{
  public final Expression e0;
  public final Expression e1;

  Add(final Expression ie0, final Expression ie1)
  {
    this.e0 = Objects.requireNonNull(ie0);
    this.e1 = Objects.requireNonNull(ie1);
  }

  public <A> A match(
    Function<Constant, A> constant,
    Function<Add, A> add,
    Function<Multiply, A> multiply)
  {
    return add.apply(this);
  }
}

final class Multiply implements Expression
{
  public final Expression e0;
  public final Expression e1;

  Multiply(final Expression ie0, final Expression ie1)
  {
    this.e0 = Objects.requireNonNull(ie0);
    this.e1 = Objects.requireNonNull(ie1);
  }

  public <A> A match(
    Function<Constant, A> constant,
    Function<Add, A> add,
    Function<Multiply, A> multiply)
  {
    return multiply.apply(this);
  }
}

The evaluate method is somewhat refreshingly terse 9:

public class Expressions
{
  public static int evaluate (Expression e)
  {
    return e.match(
      (Constant constant) -> constant.n,
      (Add add)           -> evaluate (add.e0) + evaluate (add.e1),
      (Multiply multiply) -> evaluate (multiply.e0) * evaluate (multiply.e1));
  }
}

Advantages of this approach:

  • As with the generic visitor approach, there are few opportunities for mistakes. The type system requires the programmer to implement certain methods, and barring non-termination, it is easier for the programmer to implement them correctly than incorrectly.

  • It is O(1) in terms of time. Each match call appears to be compiled down to three invokedynamic calls followed by a single invokeinterface call. The invokedynamic calls are due to the current implementation of lambda functions on the JVM 10.

  • It appears to be better in terms of allocations than the generic visitor approach. The evaluate method is transformed into a form that appears to avoid allocations by using static lambda instances. Recent work on the JVM has supposedly been focused on having the JIT compiler aggressively inline and cache lambdas for performance reasons, and this approach directly benefits from that work.

Disadvantages of this approach:

  • If a case class is added or removed, every single case class must now be updated in addition to every match call, due to the fact that the match method mentions all of the possible cases.

  • The Java language specification limits methods to 256 parameters. This effectively limits algebraic types to 256 cases or fewer. Whether or not this is likely to be a problem is debatable.

  • If code inside a pattern match needs to throw a checked exception - it cannot! The passed-in function types must be extended with another type parameter E with an upper bound of Throwable or Exception, and the match method must be annotated with throws E.

Finally, this approach is somewhat less useful on versions of Java prior to Java 8. Without syntax support for function instances, calls to match typically end up becoming more verbose than the generic visitor approach.

1.4 Existing Java Library Implementations

1.4.1 adt4j

The adt4j package is an implementation of algebraic types in Java.

It is difficult to give a description of the package without simply duplicating the documentation presented in the adt4j repository. Suffice to say, it is an annotation processor that transforms abstract classes or interfaces into a set of classes using the generic visitor formulation for pattern matching. The adt4j documentation provides an example analogous to the Expression type used in this document so far:

@GenerateValueClassForVisitor
@Visitor(resultVariableName="R", selfReferenceVariableName="E")
interface ExpressionVisitor<E, R> {
  @GeneratePredicate(name = "isLiteral");
  R lit(int i);

  R sum(@Getter(name = "leftOperand") E e1, @Getter(name = "rightOperand") E e2);
  R mul(@Getter(name = "leftOperand") E e1, @Getter(name = "rightOperand") E e2);
}

An example of the evaluate function is also provided:

int value = e.accept(new ExpressionVisitor<Expression, Integer>() {
  @Override
  public Integer int(int i) {
    return i;
  }
  @Override
  public Integer sum(Expression e1, Expression e2) {
    return e1.accept(this) + e2.accept(this);
  }
  @Override
  public Integer mul(Expression e1, Expression e2) {
    return e1.accept(this) * e2.accept(this);
  }
});

The package protects the programmer from some of the verbosity of the type declarations required to implement the generic visitor approach, but otherwise has the same advantages and disadvantages as documented.

1.4.2 derive4j

The derive4j package is another implementation of algebraic types in Java.

The package has the programmer define a container class with a generic visitor interface called Cases:

@Data
public abstract class Expression {
  interface Cases<R> {
    R Constant (Integer value);
    R Add (Expression left, Expression right);
    R Multiply (Expression left, Expression right);
  }

  public abstract <R> R match(Cases<R> cases);
}

The package then generates a rather alarming amount of code 11 that defines a so-called fluent interface that allows the programmer to build, for example, the evaluate function:

private static Function<Expression, Integer> eval = Expressions.cases()
  .Constant (value         -> value)
  .Add      ((left, right) -> eval (left) + eval (right))
  .Mult     ((left, right) -> eval (left) * eval (right));

public static Integer evaluate (Expression expression)
{
  return eval.apply(expression);
}

Suffice to say, the generated code uses aspects of both the generic visitor and catamorphism approaches. An inspection of the generated code seems to suggest that every match call will entail not only the allocation of multiple anonymous Cases class instances, but a set of lambda instances too. The example code appears to work around this issue by declaring static instances and reusing them (as the evaluate example above demonstrates).

Update: The author of derive4j, JB Giraudeau, kindly contacted me to clarify a few aspects of the above:

Firstly, catamorphisms are only defined for non-recursive types, and derive4j derives catamorphisms as shown in the following lines:

https://gist.github.com/jbgi/3904e696fb27a2e33ae1#file-expressions-java-L61

Additionally, derive4j supports both the visitor and catamorphism forms as user-visible matching options, so users may pick which approach they want to use. The derive4j implementation generates 2N classes for a type with N cases, and supports more advanced features like (safe) partial pattern matches. Checked exceptions are not supported because a typical functional programming approach will instead use structures such as the Either12 monad for error handling.

Finally, derive4j does avoid allocations on visitor-based matching by storing static references to visitors - the allocation is performed once and then is not repeated for subsequent matches.

1.4.3 javagadt

The javagadt package is another implementation of algebraic types in Java.

Rather than relying on annotated abstract classes, it defines a description language that is used to generate Java classes directly:

data Expression
  = Constant (n  : Integer)
  | Add      (e0 : Expression, e1 : Expression)
  | Multiply (e0 : Expression, e1 : Expression)

The package uses the generic visitor representation for the generated classes, with an unfortunate use of method overloading for the generated visitor:

public interface Matcher<A> {
  A match(Expression.Constant c);
  A match(Expression.Add a);
  A match(Expression.Multiply m);
}

int r = e.match(new Matcher<Integer>() {
@Override
  public Integer match(Expression.Constant c) {
    return c.n();
  }

  @Override
  public Integer match(Expression.Add a) {
    return a.e0() + a.e1();
  }

  @Override
  public Integer match(Expression.Multiply m) {
    return m.e0() + m.e1();
  }
});

As expected, the package has the same advantages and disadvantages inherent in the generic visitor representation.

1.4.4 Weaknesses

The library implementations of algebraic types in Java all suffer from the same weaknesses due to lack of support two concepts in Java: The ability to make a set of classes closed, and the ability to efficiently select one class from a closed set in O(1) time and without allocating memory.

1.5 Implementation

A Java implementation that combines as many of the advantages (with as few of the disadvantages as possible) of each possible encoding of algebraic types is fairly straightforward. The catamorphism encoding is used as a base, with a slight modification that should guarantee (on current JVM implementations) that pattern matching will never incur an allocation.

On current JVM implementations, a lambda expression is only guaranteed not to incur an allocation if it does not capture any local variables or fields. It is possible in the majority of cases to avoid capturing variables by threading an explicit context value through the lambda applications. Code that previously looked like this:

interface Producer
{
  int apply();
}

static int runner(Producer f)
{
  return f.apply();
}

private String data;

int f(int x)
{
  return runner(() -> x + Integer.parseInt(this.data));
}

Is rewritten to allow passing of contextual values explicitly:

interface Producer<T>
{
  int apply(T context);
}

static <T> int runner(T context, Producer<T> f)
{
  return f.apply(context);
}

private String data;
private int z;

int f(int x)
{
  this.z = x;
  return runner(this, (t) -> t.z + Integer.parseInt(t.data));
}

The lambda expression passed to runner no longer captures any variables, and an inspection of the bytecode shows a simple transformation to a static method call:

private String data;
private int z;

static <T> int runner(T t, Producer<T> producer) {
  return producer.apply(t);
}

int f(int n) {
  this.z = n;
  return Test3.runner(this, (Producer<Test3>)LambdaMetafactory.metafactory(null, null, null, (Ljava/lang/Object;)I, lambda$f$0(Test3 ), (LTest3;)I)());
}

private static /* synthetic */ int lambda$f$0(Test3 test3) {
  return test3.z + Integer.parseInt(test3.data);
}

static interface Producer<T> {
  public int apply(T var1);
}

The first part of the implementation therefore requires the creation of a function type used for matching:

interface Matcher<T, C, A, E extends Exception>
{
  A apply(T case_type, C context) throws E;
}

The T parameter corresponds to the type of a specific case. The C parameter corresponds to the type of context values. The A parameter denotes the type of returned values, and the E parameter denotes the type of exceptions that may be raised.

Then, for a type K with three cases C0, C1, and C2:

interface K
{
  <C, A, E extends Exception> A match(
    C context,
    Matcher<C0, C, A, E> on_c0,
    Matcher<C1, C, A, E> on_c1,
    Matcher<C2, C, A, E> on_c2)
    throws E;
}

class C0 implements K
{
  private final String data;

  C0(String x)
  {
    this.data = x;
  }

  public <C, A, E extends Exception> A match(
    C context,
    Matcher<C0, C, A, E> on_c0,
    Matcher<C1, C, A, E> on_c1,
    Matcher<C2, C, A, E> on_c2)
    throws E
  {
    return on_c0.apply(this, context);
  }
}

class C1 implements K
{
  private final Integer data;

  C1(Integer x)
  {
    this.data = x;
  }

  public <C, A, E extends Exception> A match(
    C context,
    Matcher<C0, C, A, E> on_c0,
    Matcher<C1, C, A, E> on_c1,
    Matcher<C2, C, A, E> on_c2)
    throws E
  {
    return on_c1.apply(this, context);
  }
}

class C2 implements K
{
  private final Boolean data;

  C2(Boolean x)
  {
    this.data = x;
  }

  public <C, A, E extends Exception> A match(
    C context,
    Matcher<C0, C, A, E> on_c0,
    Matcher<C1, C, A, E> on_c1,
    Matcher<C2, C, A, E> on_c2)
    throws E
  {
    return on_c2.apply(this, context);
  }
}

Then, for matching on a value of type K:

private final String extra;

String doSomething(K k)
  throws IOException
{
  return k.match (this,
    (c0, t) -> c0.data + t.extra,
    (c1, t) -> { throw new IOException("Ouch!"); },
    (c2, t) -> c2.data + t.extra);
}

In summary, the advantages of this encoding are:

  • As with the generic visitor approach, there are few opportunities for mistakes. The type system requires the programmer to implement certain methods, and barring non-termination, it is easier for the programmer to implement them correctly than incorrectly.

  • It is O(1) in terms of time. Each match call appears to be compiled down to three invokedynamic calls followed by a single invokeinterface call. The invokedynamic calls are due to the current implementation of lambda functions on the JVM 13.

  • With an explicit context parameter and non-capturing lambdas, the code is guaranteed to be better in terms of allocations than any of the other Java encodings. The use of lambdas over anonymous class instances ensures that this code benefits from any future improvements to the JVM infrastructure used to support them.

  • Functions passed to the match call can raise checked exceptions via the E type parameter.

  • The encoding has the most terse matching syntax out of any of the encodings.

The disadvantages of this encoding are:

  • If a case class is added or removed, every single case class must now be updated in addition to every match call, due to the fact that the match method mentions all of the possible cases.

  • The Java language specification limits methods to 256 parameters. This effectively limits algebraic types to 256 cases or fewer. Whether or not this is likely to be a problem is debatable.


  1. This is not the only way to perform pattern matching in the Haskell language, but for the sake of example, we assume that it is.

  2. An absolutely critical talk for any programmer interested in producing code that is correct: https://blogs.janestreet.com/effective-ml-video/

  3. http://www.benf.org/other/cfr/

  4. Consider what would happen if every day control structures such as conditionals and loops required creating new objects on the heap just to function...

  5. http://io7m.com/documents/tt3-genvis/

  6. Consider what would happen if every day control structures such as conditionals and loops required creating new objects on the heap just to function...

  7. http://io7m.com/documents/tt3-genvis/

  8. https://en.wikipedia.org/wiki/Category_theory

  9. Note that the left hand side of each "match branch" could be reduced further (at the risk of making the code potentially incomprehensible) due to type inference for lambda parameters.

  10. "From Lambdas to Bytecode", Brian Goetz: http://wiki.jvmlangsummit.com/images/1/1e/2011_Goetz_Lambda.pdf

  11. Around 400 lines: https://gist.github.com/jbgi/3904e696fb27a2e33ae1

  12. A reasonably readable description of the Either monad (amongst others) is given here: http://book.realworldhaskell.org/read/error-handling.html

  13. "From Lambdas to Bytecode", Brian Goetz: http://wiki.jvmlangsummit.com/images/1/1e/2011_Goetz_Lambda.pdf