Mutation, CSC430, Fall 2023
1 Goal
2 Guidelines
2.1 Handling Errors
2.2 Progress Toward Goal comment
3 The Assignment
3.1 New Forms
3.2 Order of Evaluation
3.3 The null value
3.4 Array Values
3.5 Allocation
3.6 Syntax of PAIG5
4 Suggested Implementation Strategy
4.1 Adding the Store
5 Nice Big Test Cases
5.1 Quicksort (optional)
6 Interface
8.10.900

Mutation, CSC430, Fall 2023

1 Goal

Extend the interpreter to handle mutable arrays.

2 Guidelines

For this and all remaining assignments, every function you develop must come with the following things:

For this assignment, you must develop your solutions using the typed/racket language. If you haven’t seen them, you might be interested in these Hints on Using Typed Racket in CPE 430.

Your test cases must use the check-equal?, check-=, or check-exn forms.

Your solution should take the form of a single file.

Hand in your solution using the handin server. For help with the handin server, please see the course web page.

2.1 Handling Errors

All of your error messages must contain the string "PAIG". Essentially, this allows my test cases to distinguish errors correctly signaled by your implementation from errors in your implementation. To be more specific: any error message that doesn’t contain the string "PAIG" will be considered to be an error in your implementation.

Additionally, your error messages should be actually helpful. Since you are the primary consumer of your own error messages, making these error messages good in the first place should reduce your overall development time. There are two parts to this: first, the error message should include text that actually indicates what the programmer did wrong. Second, include the text of the user’s program, so they (actually you) can figure out how to fix it. See lab 3 for an example of how to do this. (Apologies in advance if I renumber the labs and fail to update this paragraph....)

2.2 Progress Toward Goal comment

Graders are happier when they know what to expect. Your final submission should start with a short one- or two-line comment indicating how far you got through the project. Ideally, this would just be: “Full project implemented.” But if you only implemented, say, squazz and blotz, and didn’t get to frob or dringo, please indicate this in the comment, so that we don’t spend all our time searching for bits that aren’t there.

3 The Assignment

For this assignment, you must extend Assignment 5 by implementing mutable arrays. These arrays are instead of boxes. Just as the previous assignments asked you to generalize the book’s solution from one argument to many arguments, this assignment generalizes from a single box to an array. Unlike the book, however, we will not require an implementation of Store Passing Style (SPS). Instead you will be permitted one use of Racket mutability to implement a mutable store.

In addition, we’ll be adding a simple substring operation.

3.1 New Forms

Most of these new forms can be implemented as primitive functions, bound in a top-level environment. For details on this transformation, see the roadmap below.

There’s one more form, and this one can’t be implemented as a primitive function:

In order to allow mutable bindings and arrays, you’ll need to add a store. As a simpler alternative to Store Passing Style, which is the purely functional approach to supporting mutation, you should instead implement your store as a single mutable vector (NB: vector is the term Racket uses for arrays, variable-length data structures indexed by integers).

Note that arrays are not typed; it’s fine to have an array that contains a mixture of numbers and booleans.

In order to allow mutable bindings, you will have to change the type of the environment; rather than mapping names to values, it will map names to locations. That is, "every" binding will be a reference to the store.

3.2 Order of Evaluation

In a language with mutation, programmers can observe the order of evaluation of function call arguments. For this language, all forms must perform left-to-right evaluation.

3.3 The null value

Add a value that represents the null value. It will be used as the result of mutation operations. The serialize function should produce "null" when called with a null value. The null value should be equal to itself, but not to any other value.

3.4 Array Values

PAIG5 programmers can now use arrays, which (as you might expect) can be created at any length the programmer desires, and can be accessed using indexes.

The equal? function must now accept arrays. It should return true for arrays only when its two arguments evaluate to the same array; that is, two arrays pointing to the same region of memory.

The serialize function must handle arrays. It should simply return the string "#<array>" for an array.

One possible point of confusion: PAIG5 has arrays. Racket has vectors. Your interpreter will use exactly one racket vector to hold the full contents of the running program’s memory; this means that all of the arrays (and all of the bindings) of the running PAIG5 program will be stored in that one big Racket vector.

3.5 Allocation

Creating new bindings and arrays will require identifying new free areas of memory (that is, locations within the vector representing memory) in which to store those values. The simplest way to do this is to use the first element of the vector to store the index of the first free address. So, for instance, if the first element of the memory vector currently contains 122, that would mean that the next block of memory allocated should start at index location 122. If you need to allocate (say) 15 memory locations, you would then increase this value to 137, so that the next allocation doesn’t trample this one.

All values can be stored in a single vector cell; this includes strings, closures, and arrays. Note that this doesn’t mean that the cells of the array, just the array value itself (what you might think of as "the pointer" in another language).

This assignment does not require (or really even invite) you to implement any kind of storage reclamation (a.k.a. memory management or garbage collection); we’ll talk about that later....

You can definitely run out of memory, if the programmer tries to allocate a great big array. You should signal an error if there is not enough space available in the program’s memory.

3.6 Syntax of PAIG5

The concrete syntax of the PAIG5 language with these additional features is captured by the following EBNF:

  expr = number
  | string
  | id
  | {id := expr}
  | {expr ? expr else: expr}
  | {with [expr as id] ... : expr}
  | {blam (id ...) expr}
  | {expr expr ...}

  top-level-constants = true
  | false
  | null

  top-level-functions = +
  | -
  | *
  | /
  | equal?
  | <=
  | array
  | make-array
  | aref
  | aset!
  | do
  | substring
  | error

... where an id is not :=, ?, else:, with, as, :, or blam.

4 Suggested Implementation Strategy

4.1 Adding the Store

First thing, strip down the language. In the interp function, comment out everything except the evaluation of numbers and applications of primitive functions. Add an other rule that signals an "unimplemented" error for all other forms. Make sure that your test cases for binops still work.

Time to add the store! As mentioned earlier, you will use a single mutable vector to represent memory. The store is now passed as an argument to interp. Unlike the SPS implementation in the book, there is no need to change the return type of interp. Change the Env type to map names to locations. Update your test cases so that they pass a store in.

With luck and some effort, you should be able to get those binop programs working again.

Next, I would add variables and function definitions; these are both one-line changes. Now you can re-enable your tests involving these items.

Next, I would add the mutation operations. Note that I’m advising you to add these before adding your other language forms (if, application of closures) back in. First off, I think I’d add an allocate helper function that accepts a store and a list of values to place in the next set of sequential locations, and returns the base location.

Design your store so that the "next allocated" location is derived directly from the store. It could be a separate counter that’s part of a define-type, or it could be a function that just scans the store to find a new address. Don’t use the new-loc defined by the book; it makes testing quite painful.

Next, I would add a new arrayV value to the set of values. This will require a bunch of extra clauses in various places (serialize, for instance). Note that the representation of arrays is up to you, but your representation had better be rich enough to enable you to determine where an array cell is stored in memory, and whether a programmer’s reference is in-range or not.

Following this, I would add the new-array operation. At this point, you should be able to create arrays, and the result of interpretation should be that the store includes lots of newly allocated locations.

At this point, I would go back and add test cases that create arrays as subexpressions of the equal? binop; check that the allocations happen in the right order. As you go forward, you’ll want to use this technique to check order of evaluation for all of your forms.

At this point, it starts to make less difference what order you add language forms in. I think I would probably wait on applications, just because there will be lots of opportunities for mistakes.

5 Nice Big Test Cases

When you think you have everything working, develop a while function (that is, an PAIG5 program, using its concrete syntax) that accepts a guard function and a body function and keeps running the body until the guard returns false. This function will have to be recursive; build a recursive function using mutation. Here’s an example of a factorial function written in this way:

{with ["bogus" as fact]
 :
 {do {fact := {blam (n) {{<= n 0} ? 1 else: {* n {fact {- n 1}}}}}}
   {fact 12}}}

Then, use your while to develop the in-order function that accepts an array of numbers and its size and returns true if the array is in strictly increasing order.

Note that handin does in fact contain tests for the existence and correct functioning of these functions.

5.1 Quicksort (optional)

This is not a good early test case, but you might want to see if you can write an in-place quicksort routine. You can model recursion by using the technique shown in the factorial example here.

Did I mention that this is optional?

6 Interface

Make sure that you include the following functions, and that they match this interface.

procedure

(parse s)  ExprC

  s : s-expression
Parses an expression.

procedure

(interp e env sto)  Value

  e : ExprC
  env : Environment
  sto : Store
Interprets an expression, with a given environment and store.

procedure

(top-interp s memsize)  string

  s : s-expression
  memsize : integer
Combines parsing and evaluation, creates a memory vector of the given size. You should probably use this definition for this function:

(define (top-interp [s : s-expression] [memsize : Natural]) : string
  (serialize (interp (parse s) initial-env (make-initial-store memsize))))

value

while : s-expression?

an s-expression representing the while function, described above, implemented in PAIG5 .

value

in-order : s-expression?

an s-expression representing the in-order function, described above, implemented in PAIG5 . It’s okay for this s-expression to assume that while is already bound.