crafting interpreters 25.2 – compiler upvalues

the purpose of closure objects is to hold references to closed-over variables. but how does a closure find those variables when they may or may not still be on the stack? we can't rely on the exact mechanism of local resolution, because that mechanism assumes locals are always on the stack during a function's execution – and a closure can outlive the function that declared its variables!

Since local variables are lexically scoped in Lox, we have enough knowledge at compile time to resolve which surrounding local variables a function accesses and where those locals are declared. That, in turn, means we know how many upvalues a closure needs, which variables they capture, and which stack slots contain those variables in the declaring function’s stack window.

the new abstraction introduced here is the upvalue. an upvalue is what the compiler sees as a closed-over variable. what bob is saying above is that we can figure out exactly what our upvalues are at compile time and make sure that at runtime those variables are reachable from the vm stack

exactly how that is done is a bit more complicated, and it's not immediately clear when reading the section on upvalues how the implementation supports the eventual runtime variable-capturing behavior. in a way he's basically saying: here's how we want to compile upvalues – trust me, we'll need this information at runtime when we create closures.

one of the first questions i had reading this section was: how does the vm at runtime, given these upvalue indices, differentiate between locals and upvalues? we know that locals get pushed onto the vm stack when they're referenced by other expressions, and OP_GET_LOCAL indexes into the stack relative to the current call frame's window. but what about upvalues? not all upvalues are necessarily on the stack.

this wasn't answered until later, when he adds an array of pointers to upvalues (ObjUpvalue** upvalues;) to closure objects. the indices we're building at compile time are going to index into that array in our closures. since these are pointers, they could be pointing either at captured variables that are still on the stack or at ones that bob eventually moves onto the heap.
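to make the pointer indirection concrete, here's a tiny standalone sketch of the idea. the types and names here are my own simplification, not clox's actual ObjUpvalue, but the shape is the same: the closure always reads through a pointer, and that pointer can target a live stack slot or a copy hoisted out of the stack.

```c
#include <assert.h>

// simplified sketch, not clox's real types: an upvalue reads its variable
// through a pointer. while the variable lives on the vm stack, location
// points at the stack slot; once the variable must outlive its frame, the
// value is copied into the upvalue and location is repointed at that copy.
typedef double Value;

typedef struct {
  Value* location; // where the variable currently lives
  Value closed;    // storage for the value once it leaves the stack
} Upvalue;

static void captureUpvalue(Upvalue* uv, Value* stackSlot) {
  uv->location = stackSlot; // "open" upvalue: aliases the stack slot
}

static void closeUpvalue(Upvalue* uv) {
  uv->closed = *uv->location; // hoist the value out of the stack
  uv->location = &uv->closed; // reads still go through the same pointer
}
```

either way, the closure dereferences `location` without caring where the variable lives – that's the whole point of the indirection.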

from objfuncs to objclosures

at compile time, at the end of a function's body compilation, we now emit a new instruction, OP_CLOSURE, that the VM will use at runtime to wrap our function object in a new closure object. the idea is that we're going to use this closure object to store references to closed-over variables (upvalues).

as a refresher, each time we create a compiler instance per function declaration, we also create a new function object via newFunction.

static void function(FunctionType type) {
  Compiler compiler;
  initCompiler(&compiler, type);
  beginScope();
  ...
  ObjFunction* function = endCompiler();
  // emit a closure instruction!
  emitBytes(OP_CLOSURE, makeConstant(OBJ_VAL(function)));
}

...

static void initCompiler(Compiler* compiler, FunctionType type) {
  compiler->enclosing = current;
  compiler->function = NULL;
  compiler->type = type;
  compiler->localCount = 0;
  compiler->scopeDepth = 0;
  compiler->function = newFunction();
  current = compiler;
  if (type != TYPE_SCRIPT) {
    current->function->name = copyString(parser.previous.start,
                                         parser.previous.length);
  }
  ...
}

now, at the end of the function compilation, we make sure to emit an OP_CLOSURE so that at runtime the vm wraps the raw ObjFunction in a closure and pushes it onto the stack.

below is the disassembly of fun foo() { fun bar(){} }, followed by its execution trace

> fun foo() { fun bar(){} }
== bar ==
0000    1 OP_NIL
0001    | OP_RETURN
== foo ==
0000    1 OP_CLOSURE          0 <fn bar>
0002    | OP_NIL
0003    | OP_RETURN
== <script> ==
0000    1 OP_CLOSURE          1 <fn foo>
0002    | OP_DEFINE_GLOBAL    0 'foo'
0004    2 OP_NIL
0005    | OP_RETURN
          [ <script> ]
0000    1 OP_CLOSURE          1 <fn foo>
          [ <script> ][ <fn foo> ]
0002    | OP_DEFINE_GLOBAL    0 'foo'
          [ <script> ]
0004    2 OP_NIL
          [ <script> ][ nil ]
0005    | OP_RETURN

there are a couple of interesting things about this design choice

  • every function, regardless of whether it closes over variables, is treated like a closure at runtime. this adds both overhead (an extra object allocation per function) and indirection
  • closed-over values are stored on the closure instead of the function, which nicely reflects the reality that we may have multiple different closures of the same function!

calls and functions and why fixed stack locals don’t work

so far we had only been writing statements at the top level of the program; there was no notion of a callable chunk of code. with the introduction of functions in chapter 24, all the previously top-level state like the compiler, locals, and chunks of instructions moves into function objects

previously, with locals, we were operating in a single-function world, which meant that all locals were effectively allocated at the start of one global call stack. now that each function has its own local environment, the author introduces an early idea, implemented by fortran, where each function had its own fixed set of slots for its locals

this works if there’s no recursion and i’ll demo an example that shows why fixed, separate slots break down once you start to recurse:

fun factorial(n) {
    if (n <= 1) return 1;
    return n * factorial(n - 1);
}

factorial(3);

assume we give factorial its own fixed set of stack slots

Slot 0: parameter n
Slot 1: temporary result for the multiplication (the value in slot 0 * factorial(slot 0 - 1))

factorial(3) calls factorial(2). this produces slot 0 = 2 and slot 1 = 2 * factorial(1)

then factorial(2) calls factorial(1). this produces slot 0 = 1

OH CRAP, that just overwrote the slot 0 = 2 which we need to compute 2 * factorial(1) from the previous call. when factorial(1) returns, the pending multiplication reads slot 0 and computes 1 * factorial(1) instead of 2 * factorial(1), and the entire expression is screwed up

bob notes that fortran was able to get away with fixed stack slots simply because they didn’t support recursion!
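the breakdown is easy to reproduce in C. below, factFixed (a name i made up for this demo) stores n in a single fixed slot shared by every call, fortran-style, while factStack relies on the C call stack to give each invocation its own n:

```c
#include <assert.h>

// demo: one fixed storage slot shared by every call, fortran-style
static int slot_n;

static int factFixed(int n) {
  slot_n = n;                      // clobbers any caller's n
  if (slot_n <= 1) return 1;
  int sub = factFixed(slot_n - 1); // the recursive call overwrites slot_n
  return slot_n * sub;             // reads the clobbered value: always 1
}

// per-call locals: each invocation gets its own n on the call stack
static int factStack(int n) {
  if (n <= 1) return 1;
  return n * factStack(n - 1);
}
```

factStack(3) gives 6 as expected; factFixed(3) gives 1, because by the time each pending multiplication runs, slot_n has been overwritten down to 1.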

lox vm local variables visualization

in chapter 22 of crafting interpreters, bob nystrom walks through the implementation of local variables. it makes efficient use of memory by tracking each local variable's position and scope metadata during the compilation phase and leveraging that to locate the correct value at a known offset in the execution stack (where we expect all local variables to end up, unlike globals, which are late bound and may be defined far away from where they're actually used).

what i found most complicated about this chapter is the amount of state you need to track and hold to understand how the compile and runtime stages work together. it helped me to write down a few essential pieces of state while trying to understand it, so i figured i'd translate those notes into some sort of visualization because i think it might help others too

here's a visualization of the compile phase, where we're converting the tokens into a bytecode instruction sequence (a chunk). the arrow indicates the parse position where the compiler is pointing into the source code, and the variables on the right represent the state at that point.

side note: i didn't bother going character by character – i moved the arrow to positions where there are actual side effects, since not all tokens produce the side effects i care about for this demo.

and here is the runtime execution of the resulting bytecode sequence. as you can see, the first thing that happens is that the literal number 13 is pushed onto the stack. in this example the declaration's initializer is a literal, so its value is already known at compile time.

however, notice that there is no information about what the name of that constant is. is 13 the value of “foo”? or something else? what's cool about this implementation is that it doesn't matter at this point: during the compilation phase, we've already figured out where that local is going to live on the stack for the variable foo. based on the information about locals and offsets from the previous phase, it's going to be at offset 0, per the metadata from the locals array that was constructed at compile time.
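the lookup the compiler does can be sketched like this – a simplified stand-in for clox's resolveLocal, using plain C strings instead of tokens. walking the locals array backwards means inner declarations shadow outer ones, and the array index it returns is exactly the stack slot the vm will use:

```c
#include <assert.h>
#include <string.h>

// simplified sketch of clox's locals array and resolveLocal (not the
// real types): the index of a local in this array doubles as its stack
// slot at runtime
#define MAX_LOCALS 8

typedef struct { const char* name; int depth; } Local;
typedef struct { Local locals[MAX_LOCALS]; int localCount; } Compiler;

static void addLocal(Compiler* c, const char* name, int depth) {
  c->locals[c->localCount].name = name;
  c->locals[c->localCount].depth = depth;
  c->localCount++;
}

static int resolveLocal(Compiler* c, const char* name) {
  // walk backwards so the innermost declaration shadows outer ones
  for (int i = c->localCount - 1; i >= 0; i--) {
    if (strcmp(c->locals[i].name, name) == 0) return i; // slot index
  }
  return -1; // not a local; clox falls back to a global lookup
}
```

so by the time we emit OP_GET_LOCAL, the name is gone and only the slot index survives into the bytecode.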

crafting interpreters chapter 17 notes – infix parsing with pratt parser

there’s a saying that all problems in computer science / programming can be solved by another level of indirection. in this chapter the pratt parser is a great example of that when it comes to parsing expressions such as

  • simple numeric literals e.g. 1 or 2
  • single operand / prefix expressions like -1
  • binary expressions like 1 * 2 involving numeric, equality, comparison, or logical operators
  • any complex combination of the above with groupings

back in jlox, expression parsing was based on recursive descent. in this chapter, the parse sequence is driven by a special function called parsePrecedence. two new abstractions (the parse rule table and the rule lookup function) come together in parsePrecedence, which becomes the new entry point to expression parsing

static void parsePrecedence(Precedence precedence) {
  advance();
  ParseFn prefixRule = getRule(parser.previous.type)->prefix;
  if (prefixRule == NULL) {
    error("Expect expression.");
    return;
  }

  bool canAssign = precedence <= PREC_ASSIGNMENT;
  prefixRule(canAssign);

  while (precedence <= getRule(parser.current.type)->precedence) {
    advance();
    ParseFn infixRule = getRule(parser.previous.type)->infix;
    infixRule(canAssign);
  }

  if (canAssign && match(TOKEN_EQUAL)) {
    error("Invalid assignment target.");
  }
}

here's a truncated example of some parse rules in our parse table. it's a mapping from token types to a group of metadata (prefix parse function, infix parse function, and precedence level)

ParseRule rules[] = {
  [TOKEN_LEFT_PAREN]    = {grouping, call,   PREC_CALL},
  [TOKEN_RIGHT_PAREN]   = {NULL,     NULL,   PREC_NONE},
  [TOKEN_MINUS]         = {unary,    binary, PREC_TERM},
  [TOKEN_PLUS]          = {NULL,     binary, PREC_TERM},
  [TOKEN_NUMBER]        = {number,   NULL,   PREC_NONE},
};

for the minus token, unary is the prefix parse function, binary is the infix parse function, and PREC_TERM is the precedence level. getRule is the function that, given a token type, retrieves that metadata

static ParseRule* getRule(TokenType type) {
  return &rules[type];
}

what’s unique about this approach is

  • the relevant parse function for a given token consumed via advance is fetched dynamically from the parse rule table. so given a token type of NUMBER for parser.previous.type, the first thing parsePrecedence attempts to do is locate the prefix function for that token
    • other prefix functions may themselves call back to parsePrecedence such as grouping if a left parenthesis is encountered
  • for chained expressions involving infix operators e.g. 1 + 2 + 3, the current precedence level is used to continue consuming the following expressions in a left-associative manner. so parsing 1 + 2 + 3 becomes ((1 + 2) + 3)
  • adding new tokens involves setting a new rule for those tokens and their metadata (prefix parse function, infix parse function if it applies, and precedence level). the parsePrecedence function automatically obeys the precedence levels during parsing. in jlox, parsing precedence had to be carefully managed by ensuring that it was reflected in the call sequence (top-down execution where lower precedence parse functions call higher precedence ones)

unlike recursive descent parsers, where the code's shape reflects both the grammar and the precedence order (lower precedence parse targets always invoke higher precedence ones), it's harder to visualize the call sequence in a pratt parser: the exact sequence only becomes apparent at runtime through the calls to parsePrecedence (which decides how far to parse at the current precedence). nevertheless, this seems like a more extensible / configurable way to manage expression rules
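the core loop is small enough to boil down to a runnable toy. this is my own miniature version (names and simplifications are mine, not clox's): single-digit literals as the only prefix rule, '+' and '*' as infix operators with precedence levels 1 and 2, and the same "parse the right operand one level higher" trick that makes the result left-associative:

```c
#include <assert.h>

// hypothetical miniature pratt parser: digits, '+' (prec 1), '*' (prec 2)
static const char* src;

static int precOf(char op) {
  if (op == '+') return 1;
  if (op == '*') return 2;
  return 0; // not an infix operator: stop the loop
}

// the only prefix rule: a single-digit literal
static int number(void) { return *src++ - '0'; }

static int parsePrec(int prec) {
  int left = number(); // prefix position must be a literal here
  // keep consuming infix operators at or above the requested precedence
  while (precOf(*src) >= prec) {
    char op = *src++;
    // right operand parsed one level higher -> left associativity
    int right = parsePrec(precOf(op) + 1);
    left = (op == '+') ? left + right : left * right;
  }
  return left;
}

static int eval(const char* s) { src = s; return parsePrec(1); }
```

evaluating instead of emitting bytecode keeps the demo short, but the control flow is the same shape as parsePrecedence: one prefix call, then a loop over infix rules gated by precedence.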

lox vm scanner

in chapter 16, the lox vm's scanner implementation takes a completely different approach compared to jlox. in jlox, the scanner did a full scan of the source file and then created all the tokens in memory for the parsing phase

in the C implementation, the file is still read in full, but we don't create a separate list of all the tokens. instead the scanner refers directly to the source, and we only create as many tokens as necessary (no more than two at a time, since lox's grammar only requires a single token of lookahead to decide how to parse). this is a lazier and more memory efficient approach.

for example, here’s the scanner struct and how it’s initialized

 typedef struct {
   const char* start;
   const char* current;
   int line;
 } Scanner;

 Scanner scanner;

 void initScanner(const char* source) {
   scanner.start = source;
   scanner.current = source;
   scanner.line = 1;
 }
  • start refers to the beginning of a lexeme (say, an identifier)
  • current is the current character being scanned
  • there’s also some additional metadata like line number for debugging support

and this is the Token struct for representing a complete lexeme

typedef struct {
  TokenType type;
  const char* start;
  int length;
  int line;
} Token;
  • start is a pointer to the source – again we’re not allocating additional memory to hold token information
  • type is our TokenType enum, with values like TOKEN_IDENTIFIER
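the two structs combine into a zero-copy scan: start marks where a lexeme begins, current advances past it, and the finished token is just a (pointer, length) view into the source. here's a minimal sketch of that two-pointer dance (MiniToken and scanIdentifier are my names, not clox's):

```c
#include <assert.h>
#include <ctype.h>

// a token is a view into the source: no characters are copied
typedef struct { const char* start; int length; } MiniToken;

// scan one identifier: mark the start, advance past the letters
static MiniToken scanIdentifier(const char** cursor) {
  const char* start = *cursor; // scanner.start: beginning of the lexeme
  while (isalpha((unsigned char)**cursor)) (*cursor)++; // scanner.current
  MiniToken t = { start, (int)(*cursor - start) };
  return t;
}
```

after the call, the cursor sits on the first character past the lexeme, ready for the next token – same as scanner.current in clox.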

with the scanner and the token structs in place, the compiler drives the actual changes to these objects as it scans as much of the source code as it needs (and constructs tokens) to emit byte code sequences

ObjFunction* compile(const char* source) {
  initScanner(source);
  Compiler compiler;
  initCompiler(&compiler, TYPE_SCRIPT);

  parser.hadError = false;
  parser.panicMode = false;

  advance();

  while (!match(TOKEN_EOF)) {
    declaration();
  }

  ObjFunction* function = endCompiler();
  return parser.hadError ? NULL : function;
}

calls to advance and declaration will both eventually call scanToken, which uses the scanner to read and construct the next token. for example, if the token is a number, the compiler emits two bytes via a call to emitConstant(NUMBER_VAL(value)); – the OP_CONSTANT opcode followed by the constant's index
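a sketch of what those two bytes look like (MiniChunk and friends are hypothetical stand-ins for clox's chunk machinery, which also tracks capacity and line info): the value goes into a constants table, and the bytecode stream gets the opcode followed by the value's index in that table:

```c
#include <assert.h>
#include <stdint.h>

enum { OP_CONSTANT = 0 };

// hypothetical mini chunk: fixed-size, unlike clox's growable arrays
typedef struct {
  uint8_t code[16];     int count;
  double  constants[16]; int constantCount;
} MiniChunk;

static void emitConstant(MiniChunk* c, double value) {
  c->constants[c->constantCount] = value;            // constants table
  c->code[c->count++] = OP_CONSTANT;                 // opcode byte
  c->code[c->count++] = (uint8_t)c->constantCount++; // operand: index
}
```

at runtime the vm reads the opcode, reads the index byte, looks the value up in the constants table, and pushes it onto the stack.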

the entire bytecode sequence is built this way, with the compiler driving the scanner forward and emitting bytecode on the fly.