Introduction to Type Theory
Your favorite programming language has the concept of types. If you’re an undergraduate learning Java or C++, you’ve probably encountered numerous errors relating to types. You’re most likely asking yourself, “What are these types good for? When I was learning Python I didn’t need types.” What I have found surprising is that a lot of programmers can’t identify when they’re using types. After their first Computer Science courses, they still don’t know what type theory is. This article serves as an introduction to type theory for people who have a small amount of programming knowledge under their belt.
The topics we will be covering are:
- Defining Types, Type Systems, & Type Theory
- Understanding how Type Theory is applicable to Programming Languages
Defining our “Terms”… haha get it?
Within Type Theory, there are two (sometimes equal) entities that we talk about: Terms and Types.
Terms
A simple definition for term would be an object that acts as a component to a formula or function.
For example, in the statement x = 2
, we define a term called x
which can be used in later
statements. Its value is an expression containing the term 2
. Here are some examples of
assignments with annotations:
x = 2 term = term y = x * 4 term = term operator term z = f(y, x) term = function(term, term)
Are functions and operators considered terms? In some languages, yes! Higher-order functions are functions that may have some parameters that are other functions. Languages that support this, like Haskell and OCaml, are treating these functions as terms.
Here’s an example in Python:
# Define a function that takes a function
# as a parameter and returns a function
def doTwice(f):
def do(x):
return f(f(x))
return do
# Create a lambda that we will pass
timesTwo = lambda x: x * 2
timesFour = twice(timesTwo)
timesFour(5) # = 20
Types
Types can be thought of as categories for terms. The most common examples would be:
- Numbers:
1
,2
,4
,pi
,e
- Strings:
"test"
- Array:
int[], String[]
- Function Type:
int foo(int a, int b)
If you’ve been writing Java or other C like languages, you’re probably used to the notation of
type term
, like in the function type example above (int a
, int b
). Functional languages
opt for the notation term : type
where :
means “is of type”. We will be using the latter
notation because that is what a lot of academia uses.
The “function” type stores the types of its parameters in the type signature. Haskell fans will
recognize this. foo : Int -> Int -> Int
defines the type of foo
as a function that takes in two
Int
s and returns an Int
(the parameter types are every type in the list except the last type).
We can also say that String : Type
, which is saying “String is a Type” (or “String is of
type Type” but that is of course a little confusing).
Type Constructor
The last examples are interesting because they are types composed of types. This is known as a type constructor. It’s like a function, but instead of taking in values and returning a value, it takes in types and returns a type.
Let’s take a look at List
. It would be appropriate to say List : Type -> Type
which means List
takes in a Type
and returns a new Type
. In Java we may write List<String>
, where String
is
acting as the parameter for the List
constructor. This gives us a new type that is separate from
every other type constructed from that type constructor. For example, List<String> != List<Int>
.
We are also free to define type constructors that take in multiple types. If you’ve dealt with large
generic classes, you’ve probably seen something like Foo<A, B, C, D>
. In a more formal notation,
we’d write this as Foo : Type -> Type -> Type -> Type -> Type
.
Kinds of Type Systems
A type system can be thought of as a set of rules that govern the correctness of programs. More concretely, if the use of types in a piece of code follows the rules of the type system, that code will compile correctly. As a familiar example, if we define variables as one type and set them equal to another type, we will get an error at compile-time.
String s = 2; // ERR
int a = "wow"; // ERR
int b = 2; // OK
String wow = "wow"; // OK
There are many kinds of type system in the wild that are implemented in real programming languages.
Static vs. Dynamic Typing
In Python, we can define a variable without declaring what type of values the variable can
hold, such as x = 2
. There is no dance to change the value that variable stores:
x = "test"
x = 4
x = b"test"
x = 2
Variables in Python can be of any type, and the type of a variable can change during runtime.
We can say the type of x
is dynamic, also known as dynamic typing.
In Java, we are not allowed to do the same. Once we declare a variable, that variable has that type forever.
String s = "test";
s = "otherTest"; // OK
s = 2; // ERR
In this code, s
is declare as a String
. You are not allowed to change the value of the
variable to something that isn’t of the same time. I can assign any String
to s
but as soon
as I try a integer, the compiler will yell at us. We call this static typing because the
type of a variable is unchanging or “static”.
Nominal vs. Structural Typing
C-like languages allow the user to define structures containing other values, usually called
struct
s or class
es. Let’s define two structures with the same values:
struct X {
int x;
int y;
};
struct Y {
int x;
int y;
};
With nominal typing, you are not allowed to use X
where Y
is expected and vice versa.
The compiler will error even if they contain the same fields.
void useX(X x) {
// ...
}
int main(void) {
struct X x = {1, 2};
struct Y y = {3, 4};
useX(x); // OK
useX(y); // ERR
return 0;
}
Some languages like OCaml allow this behaviour. The name doesn’t have to be the same, only the structure of the objects. This is called structural typing.
# let x =
object
val mutable x = 5
method get_x = x
method set_x y = x <- y
end;;
val x : < get_x : int; set_x : int -> unit > = <obj>
# let y =
object
method get_x = 2
method set_x y = Printf.printf "%d\n" y
end;;
val y : < get_x : int; set_x : int -> unit > = <obj>
# let set_to_10 a = a#set_x 10;;
val set_to_10 : < set_x : int -> 'a; .. > -> 'a = <fun>
You can see that the types of these variables are defined by the signatures of the methods
inside it. The function defined at the end, set_to_10
, takes in an object that contains a
method called set_x
. The rest of the object’s internals don’t matter.
Where does the Theory come in?
Now that we’ve looked at a few examples of type systems, you’re probably wondering where type theory comes in. Type theory is the theoretical research of new type systems that could have an impact on real world programming languages.
For example, dependent types are special kinds of types that depend on values (we will discuss this in depth later), and languages like Idris and Agda are general purpose languages that use dependent types to provide an extra layer of safety.
I hope this has served as a good introduction to type theory. In later articles, we will be diving into more complex topics.