Interpreter Exception Handling, Part 3

Today we tackle the finally block. The trick with the finally block is that it must always run. That’s not a surprise in and of itself, but remember that this doesn’t just apply to the exception handler that’s currently in scope: it applies to every exception handler with a finally block that sits between the point where the exception is thrown and the point where it is eventually handled.

This post is part 3 of 3 of Interpreter Exception Handling – here is part 1 and part 2.

So we need to know which exception handlers currently in scope have a finally block attached. We already know which handlers are in scope for each stack frame, but we haven’t made any allowance for finally blocks. Let’s fix that. First we need to detect the syntax, so here are the usual scanner and parse-table changes for a new keyword:

--- a/c/compiler.c
+++ b/c/compiler.c
@@ -956,6 +956,7 @@ ParseRule rules[] = {
 //> Types of Values table-false
   [TOKEN_FALSE]         = {literal,  NULL,   PREC_NONE},
 //< Types of Values table-false
+  [TOKEN_FINALLY]       = {NULL,     NULL,   PREC_NONE},
   [TOKEN_FOR]           = {NULL,     NULL,   PREC_NONE},
   [TOKEN_FUN]           = {NULL,     NULL,   PREC_NONE},
   [TOKEN_IF]            = {NULL,     NULL,   PREC_NONE},
--- a/c/scanner.c
+++ b/c/scanner.c
@@ -162,6 +162,7 @@ static TokenType identifierType() {
       if (scanner.current - scanner.start > 1) {
         switch (scanner.start[1]) {
           case 'a': return checkKeyword(2, 3, "lse", TOKEN_FALSE);
+          case 'i': return checkKeyword(2, 5, "nally", TOKEN_FINALLY);
           case 'o': return checkKeyword(2, 1, "r", TOKEN_FOR);
           case 'u': return checkKeyword(2, 1, "n", TOKEN_FUN);
         }
--- a/c/scanner.h
+++ b/c/scanner.h
@@ -21,7 +21,7 @@ typedef enum {
 
   // Keywords.
   TOKEN_AND, TOKEN_AS, TOKEN_CATCH, TOKEN_CLASS, TOKEN_ELSE, TOKEN_FALSE,
-  TOKEN_FOR, TOKEN_FUN, TOKEN_IF, TOKEN_NIL, TOKEN_OR,
+  TOKEN_FINALLY, TOKEN_FOR, TOKEN_FUN, TOKEN_IF, TOKEN_NIL, TOKEN_OR,
   TOKEN_PRINT, TOKEN_RETURN, TOKEN_SUPER, TOKEN_THIS,
   TOKEN_THROW, TOKEN_TRUE, TOKEN_TRY, TOKEN_VAR, TOKEN_WHILE,
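
If you don't have the scanner open, the new case reads like this: identifierType() has already seen the leading 'f', switches on the second character, and checkKeyword() compares the rest of the lexeme. Here is a rough standalone sketch of that check. The names matchesKeyword and isFinally, and the explicit lexeme parameters, are my own stand-ins for the scanner's global state, not the real clox signatures.

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical standalone version of checkKeyword(): true when the lexeme is
// exactly start + length characters long and its tail equals rest.
static bool matchesKeyword(const char *lexeme, int lexemeLength,
                           int start, int length, const char *rest) {
  return lexemeLength == start + length &&
         memcmp(lexeme + start, rest, (size_t)length) == 0;
}

// Mirrors only the "fi..." path through identifierType().
static bool isFinally(const char *lexeme) {
  int length = (int)strlen(lexeme);
  return length > 1 && lexeme[0] == 'f' && lexeme[1] == 'i' &&
         matchesKeyword(lexeme, length, 2, 5, "nally");
}
```

The length comparison comes first so that a shorter identifier such as final is rejected before any bytes are compared.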

Previously I said that the propagate exception function would come in handy during this post, and it turns out to be pivotal to the finally block. As we move through the stack frames, we need to check for a finally block, execute it and (here’s the catch, pun intended) continue unwinding the stack immediately after the finally block, even if there is more code in the function. To make that happen, I’m going to introduce an instruction to continue the exception propagation. Unsurprisingly, I will call it OP_PROPAGATE_EXCEPTION:

--- a/c/chunk.h
+++ b/c/chunk.h
@@ -103,6 +103,7 @@ typedef enum {
   OP_THROW,
   OP_PUSH_EXCEPTION_HANDLER,
   OP_POP_EXCEPTION_HANDLER,
+  OP_PROPAGATE_EXCEPTION,
 } OpCode;
 //< op-enum
 //> chunk-struct
--- a/c/debug.c
+++ b/c/debug.c
@@ -68,8 +68,10 @@ static int exceptionHandlerInstruction(const char *name, Chunk *chunk, int offse
     uint8_t type = chunk->code[offset + 1];
     uint16_t handlerAddress = (uint16_t)(chunk->code[offset + 2] << 8);
     handlerAddress |= chunk->code[offset + 3];
-    printf("%-16s %4d -> %d\n", name, type, handlerAddress);
-    return offset + 4;
+    uint16_t finallyAddress = (uint16_t)(chunk->code[offset + 4] << 8);
+    finallyAddress |= chunk->code[offset + 5];
+    printf("%-16s %4d -> %d, %d\n", name, type, handlerAddress, finallyAddress);
+    return offset + 6;
 }
 //> disassemble-instruction
 int disassembleInstruction(Chunk* chunk, int offset) {
@@ -234,6 +236,8 @@ int disassembleInstruction(Chunk* chunk, int offset) {
       return exceptionHandlerInstruction("OP_PUSH_EXCEPTION_HANDLER", chunk, offset);
     case OP_POP_EXCEPTION_HANDLER:
       return simpleInstruction("OP_POP_EXCEPTION_HANDLER", offset);
+    case OP_PROPAGATE_EXCEPTION:
+      return simpleInstruction("OP_PROPAGATE_EXCEPTION", offset);
     default:
       printf("Unknown opcode %d\n", instruction);
       return offset + 1;
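
The disassembler change implies that OP_PUSH_EXCEPTION_HANDLER now occupies six bytes: the opcode, a one-byte operand for the type, then two big-endian 16-bit addresses. As a minimal sketch of that decoding, assuming nothing beyond what the diff shows (DecodedHandler, readU16 and decodeHandler are hypothetical names of mine):

```c
#include <stdint.h>

// Hypothetical view of one decoded OP_PUSH_EXCEPTION_HANDLER instruction.
typedef struct {
  uint8_t type;             // one-byte operand following the opcode
  uint16_t handlerAddress;  // where the catch block starts
  uint16_t finallyAddress;  // where the finally block starts
  int nextOffset;           // offset of the following instruction
} DecodedHandler;

// Reads a big-endian 16-bit operand, high byte first.
static uint16_t readU16(const uint8_t *code, int offset) {
  return (uint16_t)((code[offset] << 8) | code[offset + 1]);
}

static DecodedHandler decodeHandler(const uint8_t *code, int offset) {
  DecodedHandler decoded;
  decoded.type = code[offset + 1];
  decoded.handlerAddress = readU16(code, offset + 2);
  decoded.finallyAddress = readU16(code, offset + 4);
  decoded.nextOffset = offset + 6;
  return decoded;
}
```

A finally address that still reads 0xffff is the compiler's unpatched placeholder, which the runtime treats as "no finally block".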

Then in the compiler, we first find the finally block and make sure the runtime knows where to find it, in case it needs to be executed outside of the normal flow. There’s a little trick in there that means we only execute OP_PROPAGATE_EXCEPTION under the right conditions, which depend on how we entered the block: the normal path pushes false just before the block, whereas the exception path jumps straight to the finally address with true already pushed by the runtime. Then comes the body of the finally block, which can be either a single statement or a whole block. Crucially, a statement doesn’t leave a result on the stack (unlike an expression), so we know that once it has executed, the top of the stack holds the boolean we placed there earlier.

--- a/c/compiler.c
+++ b/c/compiler.c
@@ -1404,6 +1405,8 @@ static void tryCatchStatement() {
   emitByte(0xff);
   int handlerAddress = currentChunk(current)->count;
   emitBytes(0xff, 0xff);
+  int finallyAddress = currentChunk(current)->count;
+  emitBytes(0xff, 0xff);
 
   statement();
 
@@ -1419,14 +1422,34 @@ static void tryCatchStatement() {
       emitByte(OP_POP_EXCEPTION_HANDLER);
       statement();
   }
   patchJump(successJump);
+
+  if (match(TOKEN_FINALLY))
+  {
+    // If we arrive here from either the try or handler blocks, then we don't
+    // want to continue propagating the exception
+    emitByte(OP_FALSE);
+
+    patchAddress(finallyAddress);
+    statement();
+
+    int continueExecution = emitJump(OP_JUMP_IF_FALSE);
+    emitByte(OP_POP); // Pop the bool off the stack
+    emitByte(OP_PROPAGATE_EXCEPTION);
+    patchJump(continueExecution);
+    emitByte(OP_POP);
+  }
 }
 
 //> Global Variables synchronize
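
To convince yourself that the flag trick works, it can help to model just the finally epilogue. The toy interpreter below is a sketch under my own assumptions: the MINI_ opcodes, FinallyResult and runFinally are hypothetical, jump operands are folded into fixed indices, and the finally body is reduced to a single marker instruction. What it does keep is the detail that the conditional jump peeks at the flag without popping it, which is why the compiler emits an OP_POP on both paths.

```c
#include <stdbool.h>

typedef enum {
  MINI_FALSE, MINI_BODY, MINI_JUMP_IF_FALSE, MINI_POP, MINI_PROPAGATE, MINI_END
} MiniOp;

// Shape of the code emitted for a finally clause (operands simplified away).
static const MiniOp finallyCode[] = {
  MINI_FALSE,          // 0: normal entry pushes the "don't propagate" flag
  MINI_BODY,           // 1: finallyAddress, the finally statement itself
  MINI_JUMP_IF_FALSE,  // 2: jumps to 5 when the flag is false, without popping
  MINI_POP,            // 3: discard the true flag
  MINI_PROPAGATE,      // 4: resume unwinding
  MINI_POP,            // 5: discard the false flag
  MINI_END             // 6: carry on with the code after the try statement
};

typedef struct {
  bool bodyRan;
  bool propagated;
  int stackDepth;  // flags left on the stack when the sequence finishes
} FinallyResult;

static FinallyResult runFinally(bool enteredByException) {
  bool stack[4];
  int top = 0;
  int ip = 0;
  FinallyResult result = {false, false, 0};
  if (enteredByException) {
    stack[top++] = true;  // what the runtime pushes before jumping in
    ip = 1;               // execution starts at finallyAddress
  }
  for (;;) {
    switch (finallyCode[ip++]) {
      case MINI_FALSE: stack[top++] = false; break;
      case MINI_BODY: result.bodyRan = true; break;
      case MINI_JUMP_IF_FALSE: if (!stack[top - 1]) ip = 5; break;
      case MINI_POP: top--; break;
      case MINI_PROPAGATE:
        result.propagated = true;
        result.stackDepth = top;
        return result;
      case MINI_END:
        result.stackDepth = top;
        return result;
    }
  }
}
```

Either way in, the body runs and the flag is gone by the end; only the exception path reaches the propagate instruction.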

And now we hook it all together. We teach the runtime to store the finally address on the exception handler stack, and then add the very humble OP_PROPAGATE_EXCEPTION implementation, which essentially just pops the current handler and calls the propagateException() function.

--- a/c/vm.h
+++ b/c/vm.h
@@ -29,6 +29,7 @@
 
 typedef struct {
     uint16_t handlerAddress;
+    uint16_t finallyAddress;
     Value klass;
 } ExceptionHandler;
 
--- a/c/vm.c
+++ b/c/vm.c
@@ -197,6 +197,7 @@ bool instanceof(ObjInstance *instance, Value klass)
 }
 
 bool propagateException(void) {
+  #define PLACEHOLDER_ADDRESS 0xffff
   ObjInstance *exception = AS_INSTANCE(peek(0));
   while (vm.frameCount > 0) {
     CallFrame *frame = &vm.frames[vm.frameCount - 1];
@@ -206,6 +207,12 @@ bool propagateException(void) {
         frame->ip = &frame->closure->function->chunk.code[handler.handlerAddress];
         return true;
       }
+      else if (handler.finallyAddress != PLACEHOLDER_ADDRESS)
+      {
+        push(TRUE_VAL); // continue propagating once the finally block completes
+        frame->ip = &frame->closure->function->chunk.code[handler.finallyAddress];
+        return true;
+      }
     }
     vm.frameCount--;
   }
@@ -218,13 +225,14 @@ bool propagateException(void) {
   return false;
 }
 
-void pushExceptionHandler(Value type, uint16_t handlerAddress) {
+void pushExceptionHandler(Value type, uint16_t handlerAddress, uint16_t finallyAddress) {
   CallFrame *frame = &vm.frames[vm.frameCount - 1];
   if (frame->handlerCount == MAX_HANDLER_FRAMES) {
     runtimeError("Too many nested exception handlers in one function");
     return;
   }
   frame->handlerStack[frame->handlerCount].handlerAddress = handlerAddress;
+  frame->handlerStack[frame->handlerCount].finallyAddress = finallyAddress;
   frame->handlerStack[frame->handlerCount].klass = type;
   frame->handlerCount++;
 }
@@ -962,19 +970,29 @@ static InterpretResult run() {
       case OP_PUSH_EXCEPTION_HANDLER: {
         ObjString *typeName = READ_STRING();
         uint16_t handlerAddress = READ_SHORT();
+        uint16_t finallyAddress = READ_SHORT();
         Value value;
         if (!tableGet(&vm.globals, typeName, &value) || !IS_CLASS(value))
         {
             runtimeError("'%s' is not a type to catch", typeName->chars);
             return INTERPRET_RUNTIME_ERROR;
         }
-        pushExceptionHandler(value, handlerAddress);
+        pushExceptionHandler(value, handlerAddress, finallyAddress);
         break;
       }
       case OP_POP_EXCEPTION_HANDLER: {
         frame->handlerCount--;
         break;
       }
+      case OP_PROPAGATE_EXCEPTION: {
+        frame->handlerCount--;
+        if (propagateException())
+        {
+          frame = &vm.frames[vm.frameCount - 1];
+          break;
+        }
+        return INTERPRET_RUNTIME_ERROR;
+      }
     }
   }
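
To see the two-step unwind in isolation, here is a toy model of the walk. It is only a sketch: ToyHandler, ToyFrame, ToyVM, Landing, propagate and resumeAfterFinally are hypothetical names, the matches field stands in for the instanceof() check, and I am assuming each frame's handlers are searched innermost first, since the diff doesn't show that loop.

```c
#include <stdbool.h>
#include <stdint.h>

#define NO_FINALLY 0xffff  // unpatched placeholder: this handler has no finally
#define MAX_HANDLERS 4

typedef struct {
  bool matches;            // stand-in for instanceof(exception, handler.klass)
  uint16_t handlerAddress;
  uint16_t finallyAddress;
} ToyHandler;

typedef struct {
  ToyHandler handlers[MAX_HANDLERS];
  int handlerCount;
} ToyFrame;

typedef struct {
  ToyFrame *frames;
  int frameCount;
} ToyVM;

typedef enum { LAND_NONE, LAND_CATCH, LAND_FINALLY } Landing;

// Walks the call stack from the newest frame, discarding frames that have
// neither a matching catch nor a finally block to run.
static Landing propagate(ToyVM *vm, uint16_t *address) {
  while (vm->frameCount > 0) {
    ToyFrame *frame = &vm->frames[vm->frameCount - 1];
    for (int i = frame->handlerCount; i > 0; i--) {
      ToyHandler handler = frame->handlers[i - 1];
      if (handler.matches) {
        *address = handler.handlerAddress;
        return LAND_CATCH;
      }
      if (handler.finallyAddress != NO_FINALLY) {
        *address = handler.finallyAddress;
        return LAND_FINALLY;
      }
    }
    vm->frameCount--;
  }
  return LAND_NONE;  // nothing left to run: the exception is unhandled
}

// Models OP_PROPAGATE_EXCEPTION: drop the handler whose finally block has
// just run, then carry on unwinding from the same frame.
static Landing resumeAfterFinally(ToyVM *vm, uint16_t *address) {
  vm->frames[vm->frameCount - 1].handlerCount--;
  return propagate(vm, address);
}
```

With a callee that only has a finally block and a caller that has a matching catch, the first call lands on the callee's finally address and leaves the frame in place; resuming pops that handler, discards the callee's frame and lands on the caller's catch.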
 

So that’s it. Deceptively simple, but it took me a bunch of work and rework to get there. I hope you found this helpful, and a big congratulations to anyone who made it this far. If you have any questions, or feel like I didn’t explain things well enough, feel free to comment below.

About Michael Malone
30-something web dev, self-confessed Linux lover, Ruby enthusiast, and obsessed with programming. Former embedded C and desktop .NET developer.

4 Responses to Interpreter Exception Handling, Part 3

  1. Wizard says:

    Hello, I realize it’s been about two years since this post was published, but I wanted to thank you for sharing these three informative articles on exception handling in Lox from the Crafting Interpreters book. I found the implementation you presented to be a solid foundation for further development, and I appreciate the attention you gave to a subject that’s often ignored.

    I did have a question about the possibility of using try blocks without catch or finally blocks. As it stands, it seems to result in a segmentation fault. Could you clarify whether this is expected behavior, or if there’s a way to use try blocks on their own?

    Thank you again for taking the time to write these articles and share your knowledge!

    • Hi – thanks for reading! Honestly, this is presented as me “knowing” but I am presenting my experiments in figuring it out as I go. I’ve been developing my own language (cometlang.com) and it too suffers the problem you mentioned, so thanks for pointing it out! No, it is definitely not intended behaviour.

      My code definitely always assumes that there will be an exception handler present and it segfaults because it starts looking for the Exception type, except that the type name was never part of the constants in the chunk, so it’s probably trying to get a string value from undefined memory.

      I don’t know what using a try block on its own without an exception handler would do semantically – its purpose is to define an area where an exception might be thrown and then how we react afterwards. Of course, my code doesn’t work if there’s a try-finally block, which makes total sense.

      There are a couple of things that need addressing – I need to be more wary of placeholder values in the OP_PUSH_EXCEPTION_HANDLER part of the runtime loop / deal with there only being a finally jump address and I need to improve the parser to detect the situations that won’t work and let the user know that this code won’t be accepted.

      Honestly, I will need to think a bit harder about how this will work and most of this is supposition, because I haven’t got a working solution, yet! Perhaps I will try and solve this and another bug and see if I can write an errata or something 🙂

      • Wizard says:

        Hello again, and thank you for your response! After looking at other languages like Python and JavaScript, I noticed that they require a catch or a finally block when dealing with exceptions. I guess it makes sense, since the user may want to catch an exception or continue with the code execution.

        One last thing, I was thinking about how it could be possible to make all the other runtime errors in the VM part be caught as exceptions too. I’m not totally sure how it would work, maybe modify the current exception frame or something similar? Thanks again!

  2. > One last thing, I was thinking about how it could be possible to make all the other runtime errors in the VM part be caught as exceptions too. I’m not totally sure how it would work, maybe modify the current exception frame or something similar?

    It depends. For strictly runtime errors, then it’s absolutely possible. The real thing you need to consider is whether the error _should_ be catchable. If it’s something like “out of memory” then it doesn’t make sense to be able to catch that. But maybe there is a reason that a user would want to be able to catch an error to do with a method missing on an object.

    It comes down to philosophy of the language. For example, Python encourages users to “ask for forgiveness, not permission” whereas in Comet, I make the method list of an object available and “method not found” isn’t catchable.

    I’ve absolutely created a function to throw an exception from the native code in comet, but I didn’t include it, because it requires too much other code and foundational changes. It creates an instance of the named class, sets the exception message, sets the stack trace and then calls ‘propagateException’ with it.

    The code is here if you’re interested: https://github.com/cometlang/comet/blob/main/vmlib/vm.c#L158-L194
