The Type Zoo

- 5.1. Overview
- 5.2. Unit
- 5.3. Boolean
- 5.4. Enumeration
- 5.5. Option
- 5.6. Choice
- 5.7. Pairs
- 5.8. Naturals (inductive)
- 5.9. Lists
- 5.10. Trees
- 5.11. Hiding
- 5.12. Arrays

Overview

As stated previously, the combination of algebraic data types and pattern
matching is extremely powerful. Whereas languages that lack them have to
keep various type system features completely distinct (in both concept
and implementation), the language of algebraic types is expressive enough
to support a wide range of structures with little or no "magic". Many
structures that require explicit language support in other systems can
simply be expressed in terms of algebraic types (possibly with extra
syntax support).

Unit

The Unit type is the simplest possible algebraic
type. It has one data constructor that takes no values. In impure languages
like OCaml,
the Unit type is often given as the return value
of functions that are only of interest for their side effects, such as
print_string (which is of type
string -> unit, and simply prints its argument
to standard out and returns unit). In Haskell,
the Unit type is called ()
and has a single data constructor, also called ()
[4].

Boolean

The Boolean type is an algebraic type with
two data constructors:

module Boolean where data Boolean = True | False deriving Show

This definition of Boolean type allows languages
to implement the usual if ..
then .. else
construct in terms of pattern matching, simplifying the language definition
and semantics.

Enumeration

As may or may not be obvious, it's possible to achieve the normal
enumeration types present in other languages. The data constructors of the
type are simply declared as not taking any values:

module Enumeration where data Color = Red | Blue | Green | Yellow deriving Show

Option

The Option type is
the first polymorphic type that has
been described so far. The definition of Option
is as follows:

module Option where data Option a = Some a | None deriving Show

The definition introduces a type Option, which
is parameterized by a type variable arbitrarily named
a. It defines two data constructors,
Some and None,
the first of which takes a value of type a. The
type encodes the concept of an optional value:
The programmer is required to pattern match on values of type
Option to determine whether or not a value is
present.

module OptionPresent where import Option present :: Option a -> IO () present o = case o of Some _ -> print "present" None -> print "not present"

Typically, most programs represent the above as a nullable reference type. This
leads to endless bugs involving accidentally dereferencing null references.
Programs written using Option types are
immune to this class of bugs, as it's statically impossible to get access
to a nonexistent value.

Note that as Option is polymorphic, it may
be instantiated with any other type, including other Option
types:

*Option> :type None None :: Option a *Option> :type Some Some :: a -> Option a *Option> :type Some True Some True :: Option Bool *Option> :type Some (23 :: Integer) Some (23 :: Integer) :: Option Integer *Option> :type Some (Some (23 :: Integer)) Some (Some (23 :: Integer)) :: Option (Option Integer) *Option> :type Some None Some None :: Option (Option a)

Choice

The Choice type is a logical continuation
of the Option type. It represents a choice
between values of two types:

module Choice where data Choice a b = ChoiceLeft a | ChoiceRight b deriving Show

As with Option, it may be instantiated with any
other types:

*Choice> :type ChoiceLeft ChoiceLeft :: a -> Choice a b *Choice> :type ChoiceRight ChoiceRight :: b -> Choice a b *Choice> :type ChoiceLeft True ChoiceLeft True :: Choice Bool b *Choice> :type ChoiceRight True ChoiceRight True :: Choice a Bool

Pairs

Where the Choice type represents a choice between
two possible values, the Pair type represents
a structure where values of two (possibly different) types will be present
simultaneously. The Pair type is equivalent
to the tuple type in other languages:

module Pair where data Pair a b = MakePair a b deriving Show

Naturally, Pair types can be nested, forming
arbitrarily complex structures:

*Pair> :type MakePair MakePair :: a -> b -> Pair a b *Pair> :type MakePair True MakePair True :: b -> Pair Bool b *Pair> :type MakePair True (23 :: Integer) MakePair True (23 :: Integer) :: Pair Bool Integer *Pair> :type MakePair (MakePair (23 :: Integer) True) (23 :: Integer) MakePair (MakePair (23 :: Integer) True) (23 :: Integer) :: Pair (Pair Integer Bool) Integer *Pair> :type MakePair (23 :: Integer) (MakePair (MakePair False True) (MakePair True False)) MakePair (23 :: Integer) (MakePair (MakePair False True) (MakePair True False)) :: Pair Integer (Pair (Pair Bool Bool) (Pair Bool Bool))

Most languages have shorthand syntax for constructing values of some standard library
pair type: "(x, y)".

Naturals (inductive)

Many functional languages (particularly those that act as proof assistants)
use the following definition for natural numbers:

module NaturalInd where data Natural = Z | S Natural deriving Show

This definition is taken directly from axioms 1 and 6 of the
Peano axioms,
and is probably the first type encountered by programmers upon being
introduced to most proof assistants. The type is the first recursive type
described so far; that is, a type that refers to itself in its own
definition.

*NaturalInd> :type Z Z :: Natural *NaturalInd> :type S S :: Natural -> Natural -- One *NaturalInd> :type S Z S Z :: Natural -- Two *NaturalInd> :type S (S Z) S (S Z) :: Natural -- Three *NaturalInd> :type S (S (S Z)) S (S (S Z)) :: Natural

It's also possible to copy the definition of, for example,
plus, almost directly:

module NaturalIndPlus where import NaturalInd plus :: Natural -> Natural -> Natural plus x y = case (x, y) of (n, Z) -> n (n, S m) -> S (plus n m)

-- 0 + 0 = 0 ghci> plus Z Z Z -- 0 + 1 = 1 ghci> plus Z (S Z) S Z -- 1 + 1 = 2 ghci> plus (S Z) (S Z) S (S Z) -- 2 + 2 = 4 ghci> plus (S (S Z)) (S (S Z)) S (S (S (S Z)))

Lists

The List type is a recursive type found in
just about every functional language in existence.
A List is either empty or an
element of some type prepended to another list of elements of that type.
By this definition, lists may contain elements of any type, but each
element of the list has the same type.

module List where data List a = Empty | Cell a (List a) deriving Show

Constructing values of type List works
as expected:

*List> :type Empty Empty :: List a *List> :type Cell Cell :: a -> List a -> List a *List> :type Cell True Cell True :: List Bool -> List Bool *List> :type Cell True Empty Cell True Empty :: List Bool *List> :type Cell True (Cell False Empty) Cell True (Cell False Empty) :: List Bool -- A list of lists! *List> :type Cell (Cell True Empty) Empty Cell (Cell True Empty) Empty :: List (List Bool)

Typically, the programmer writes recursive functions to manipulate
values of recursive types. Using pattern matching, it's
possible to write a function such as length
in a couple of lines:

module ListLength where list_length :: List a -> Integer list_length list = case list of Null -> 0 Cell _ rest -> 1 + (list_length rest)

As with the Pair type, Most languages have shorthand
syntax for constructing and matching on lists:
"[x, y, ...]".

Trees

The definition of a binary tree type is similarly simple (and is both polymorphic
and recursive):

module BinaryTree where data BTree a = Leaf | Tree (BTree a) a (BTree a) deriving Show

Manipulating values of these types should be unsurprising by this point:

*BinaryTree> :type Leaf Leaf :: BTree a *BinaryTree> :type Tree Leaf Tree Leaf :: a -> BTree a -> BTree a *BinaryTree> :type Tree Leaf True Tree Leaf True :: BTree Bool -> BTree Bool *BinaryTree> :type Tree Leaf True Leaf Tree Leaf True Leaf :: BTree Bool *BinaryTree> :type Tree Leaf True (Tree Leaf True Leaf) Tree Leaf True (Tree Leaf True Leaf) :: BTree Bool

Note that, as demonstrated, it's obviously possible to build binary trees
that are unbalanced. In order to preserve invariants such as "the tree must
be balanced", it's often necessary to hide the data constructors of the
given type and then use the module system of the language in question to
expose functions that manipulate values of the tree types internally
[6]. As an example, the Haskell standard library offers high performance
maps and sets based on red-black trees. The programmer is not exposed to the
internal construction of the trees that back the Map and
Set types, but internally they are composed of
nothing more than the algebraic data types described so far.

The BinaryTree type, as with all the types
discussed so far, is an example of a
persistent
data structure. Persistent data structures offer rather
large benefits with regards to both simplification of and confidence in the
correctness of the code that deals with them compared to their mutable counterparts
(particularly in heavily concurrent/parallel programs).
See Purely functional data structures by Chris Okasaki,
Cambridge University Press, 1998, ISBN 0-521-66350-4.

Hiding

As mentioned in the previous section, there may sometimes be additional
restrictions on the creation of values. For example, if the programmer wants
to define a type representing the set of integers greater than or equal to
0, it can't be achieved directly. Most
languages have multiple ways of dealing with this problem. In Haskell, the
simplest way to achieve this is to hide the data constructors of the declared
type, and provide one or more functions that enforce the desired invariants:

module Natural ( Natural, make_natural, from_natural ) where import Option data Natural = MakeNatural Integer deriving Show make_natural :: Integer -> Option Natural make_natural x = if x >= 0 then Some (MakeNatural x) else None from_natural :: Natural -> Integer from_natural n = case n of MakeNatural m -> m

The module declaration exports the Natural type,
but does not export its constructor, MakeNatural.
This effectively means that it's not possible for the programmer to construct
values of type Natural directly! However, the module also
exports a function make_natural which takes
an integer and returns an optional value based on whether the integer falls
within the desired range. It also exports a convenient function,
from_natural, which "unwraps" a
Natural value and returns the Integer
contained within [7].

Essentially, this is a technique to make a given type
abstract. That is, the structure of the type
essentially becomes private. Languages such as
OCaml allow types
to be made abstract using module signatures.
Java and Ada have the private keyword for controlling
visibility of (various parts of) types.

Arrays

Arrays do not have an inductive
structure, unlike all of the data structures described so far,
and are therefore usually provided by the programming language's
standard library as abstract types with associated
get and set
functions for accessing elements. They are mentioned here for completeness.

[4]

Apparently, to purposefully confuse new users of the language.

[5]

See
Lightweight static exceptions
for an exploration of this.

[6]

Languages with dependent types allow
the expression of types that statically guarantee that the resulting trees will
be balanced. See Certified
programming with dependent types for some impressive examples.

[7]

This function is optimized away by the compiler. In fact,
the MakeNatural constructor would be optimized away too, leaving
a plain Integer at run-time, if not for the unfortunate
presence of lazy evaluation in Haskell. Other languages do not share this deficiency.