Escape Analysis Optimization

December 4, 2017 plt

I’m making a new programming language that includes structs, and I’ve been working on automatically converting heap allocations to stack allocations.

When a struct is allocated and it doesn’t go above the function it is in (meaning it doesn’t get returned), it can be allocated on the stack. It will get freed once the function returns. Heap allocations need to be explicitly freed, meaning we can send them upwards on the call stack.

(Code in this document will be using pseudo-C syntax. new X() is a builtin that allocates X and returns a pointer to it.)

Let’s say I have a struct X that contains an integer.

struct X {
    int x;
}

And I have a function that takes an integer and wraps it in an X (returning a pointer to the X allocated).

X* f(int a) {
    X* x = new X(a);
    return x;
}

x needs to be allocated on the heap so it can be returned with being freed.

Let’s say I have another function that takes an integer, boxes it in an X, and is retrieved out of the box. This is redundant, but it is a valid use case of a stack allocated struct.

int g(int a) {
    X* x = new X(a);
    int b = x.a;
    return b;
}

x can be allocated on the stack because it is not going outside the function.

Now to the good stuff: converting heap allocations to stack allocations. First, we want to start with every allocation being a heap allocation. This gives us some leniency because if we miss a heap to stack conversion, we loose some memory. If we miss a stack to heap conversion, we get a use after free error. The latter is obviously a lot worse, so we’ll stick with the former.

Here’s a replica of the function using an IR I made for my language. In it, stack and heap allocations are explicitly different instructions. It’s kind of similar to LLVM IR, which it is converted to after the optimizations, but it operates using a “value stack”. Instructions like heap alloc generate values that are put on a stack, and then they can be stored in registers (prefixed with %).

struct X { int a }

f (%0 : int) -> X*     ; takes an int and returns a pointer to X
  heap alloc X         ; pushes a pointer to a heap allocated X to the stack
  store %1             ; stores the pointer in %1 : X*
  field set %1, 0, %0  ; sets the 0th field in %1 to %0
  return %1            ; returns the pointer stored in %1

g (%0 : int) -> int    ; takes an int and returns an int
  heap alloc X         ; pushes a pointer to a heap allocated X to the stack
  store %1             ; stores the pointer in %1
  field set %1, 0, %0  ; sets the 0th field in %1 to %0
  field get %1, 0      ; gets the 0th field in %1
  store %2             ; stores the value in %2
  return %2            ; returns the value in %2

First, we want to go through each instruction and keep track of the registers that hold an allocation. Keeping track of the index is necessary because we need to replace the heap allocation instruction with a stack allocation instruction.

(This code is in Kotlin because that is what my compiler is written in. It looks a lot like every other C language, so I’m sure you can infer things that might look strange.)

var lastValue: Pair<Int/*index*/, Instruction>? = null

instructions.forEachIndexed { index, instruction ->
    if (instruction is HeapAllocationInstruction) {
        lastValue = Pair(index, instruction)
    }
}

We want a map that maps registers to the index of the allocation instruction.

val heapInstructionLocation = mutableMapOf<Int/*register*/, Int/*index*/>()

Now we want to check store instructions to see if they are storing a heap allocation. (This is inside the forEachIndexed block.)

if (instruction is StoreInstruction) {
    if(lastValue != null && lastValue.instruction is HeapAllocationInstruction) {
        // Keep track of the register's allocation instruction index
        heapInstructionLocations.put(instruction.register, lastValue.index)
    }
}

The key thing we need to keep track of is what the allocations are being used for. We have the where, and we can get the what by making a list of allocations that haven’t gone outside the function.

val allocationsNotSeen = mutableListOf<Int/*reg*/>()

We want to add all stored allocations into this map. We can simply add them with the last check.

if (instruction is StoreInstruction) {
    if (lastValue != null && lastValue.instruction is HeapAllocationInstruction) {
        // Keep track of the register's allocation instruction index
        heapInstructionLocations.put(instruction.register, lastValue.index)

        // Add the register to the allocation list
        allocationsNotSeen.add(instruction.register)
    }
}

Now we have a list of all the allocations in the function. Next, we want to filter out the allocations that need to be heap allocated. For now, only allocations that are returned outside the functinon need to be heap allocated, so we can add a check for those in our loop.

The return instruction provides the register that holds the value we want to return. registerTypes is a map that, as you can guess, holds the types of the registers.

if (instruction is ReturnInstruction) {
    if (registerTypes[instruction.register] is Struct) {
        allocationsNotSeen.remove(instruction.register)
    }
}

Now we have a list of all allocations in our function that can be stack allocated. All we need to do now is change the heap instructions to stack instructions.

instructions is a mutable list containing the instructions in the function.

allocationsNotSeen.forEach { register : Int ->
    // Get the index from the map we created earlier
    val instructionIndex = heapInstructionLocation[register]

    // Get the instruction from the index
    val currentInstruction = instructions[instructionIndex]

    // If the instruction is a Heap Allocation, change it to a Stack Allocation
    if (currentInstruction is HeapAllocateInstruction) {
        val type = currentInstruction.type
        instruction[instructionIndex] = StackAllocateInstruction(type)
    }
}

And there we have it! This particular code won’t work for structs containing pointers to other structs, but it will properly translate heap allocations to stack allocations for structs containing primitives. Here’s what the final IR looks like after running it through the pass:

struct X { int a }

f (%0 : int) -> X*     ; takes an int and returns a pointer to X
  heap alloc X         ; pushes a pointer to a heap allocated X to the stack
  store %1             ; stores the pointer in %1 : X*
  field set %1, 0, %0  ; sets the 0th field in %1 to %0
  return %1            ; returns the pointer stored in %1

g (%0 : int) -> int    ; takes an int and returns an int
  stack alloc X        ; pushes a pointer to a stack allocated X to the stack
  store %1             ; stores the pointer in %1
  field set %1, 0, %0  ; sets the 0th field in %1 to %0
  field get %1, 0      ; gets the 0th field in %1
  store %2             ; stores the value in %2
  return %2            ; returns the value in %2

The heap allocation has correctly been identified and replaced!