Mutation, CSC430, Fall 2024
1 Goal
Extend the interpreter to handle mutable arrays.
2 Guidelines
For this and all remaining assignments, every function you develop must come with the following things:
A commented header line that expresses the result of the function in terms of its inputs, written in English. Be as precise as you can within the space of a line or two.
A type declaration (possibly inline), specifying the input and output types.
Test cases. A function without test cases is incomplete. Write the test cases first, please.
For this assignment, you must develop your solutions using the typed/racket language. If you haven’t seen them, you might be interested in these Hints on Using Typed Racket in CPE 430.
Your test cases must use the check-equal?, check-=, or check-exn forms.
Your solution should take the form of a single file.
Hand in your solution using the handin server. For help with the handin server, please see the course web page.
2.1 Handling Errors
All of your error messages must contain the string "AAQZ". Essentially, this allows my test cases to distinguish errors correctly signaled by your implementation from errors in your implementation. To be more specific: any error message that doesn’t contain the string "AAQZ" will be considered to be an error in your implementation.
Additionally, your error messages should be actually helpful. Since you are the primary consumer of your own error messages, making these error messages good in the first place should reduce your overall development time. There are two parts to this: first, the error message should include text that actually indicates what the programmer did wrong. Second, include the text of the user’s program, so they (actually you) can figure out how to fix it. See lab 3 for an example of how to do this. (Apologies in advance if I renumber the labs and fail to update this paragraph....)
2.2 Progress Toward Goal comment
Graders are happier when they know what to expect. Your final submission should start with a short one- or two-line comment indicating how far you got through the project. Ideally, this would just be: “Full project implemented.” But if you only implemented, say, squazz and blotz, and didn’t get to frob or dringo, please indicate this in the comment, so that we don’t spend all our time searching for bits that aren’t there.
3 The Assignment
For this assignment, you must extend Assignment 4 by implementing mutable arrays. These arrays are instead of boxes. Just as the previous assignments asked you to generalize the book’s solution from one argument to many arguments, this assignment generalizes from a single box to an array. Unlike the book, however, we will not require an implementation of Store Passing Style (SPS). Instead you will be permitted one use of Racket mutability to implement a mutable store.
In addition, we’ll be adding a simple substring operation.
3.1 New Forms
Most of these new forms can be implemented as primitive functions, bound in a top-level environment. For details on this transformation, see the roadmap below.
make-array: creates a fresh array of the given size, with all cells filled with the given value. So, for instance, {make-array 34 0.0} would create an array of 34 cells, all containing 0.0. It is illegal to create an array with fewer than one cell.
array: creates a fresh array containg the given values. So, for instance, {array 3 14 false 5} would create an array of length four whose second element is 14. It’s illegal to create an array with fewer than one cell.
aref : returns an element of an array. So, for instance, {aref p 15} would return the contents of cell 15 of the array named by p. If the array does not have that many elements, you must signal an error. The first element has index 0. It is illegal to refer to an index below zero or as large or larger than the size of the array.
aset! : the aset! form is for arrays. So, for instance, {aset! p 15 {f 6}} would set element 15 of the array named by p to be the result of calling f with 6. It must return nullV. The first element of an array has index 0. As with reference, it is illegal to try to set an element whose index is smaller than zero or equal to or greater than the size of the array
seq : evaluates a sequence of expressions, returning the last one. So, for instance, {seq {f 9} p} would evaluate {f 9}, then return the value of p. It’s illegal to call seq with fewer than one argument. note that this primitive accepts a variable number of arguments.
substring : accepts a string and a start and end position, and returns the corresponding substring. Use Racket’s substring to implement this function.
There’s one more form, and this one can’t be implemented as a primitive function:
:= : the := form is used to mutate bindings. So, for instance, {l := 9} will change the binding of l to contain 9. It must return nullV. It is an error to mutate a variable that is not already bound.
In order to allow mutable bindings and arrays, you’ll need to add a store. As a simpler alternative to Store Passing Style, which is the purely functional approach to supporting mutation, you should instead implement your store as a single mutable vector (NB: vector is the term Racket uses for arrays, variable-length data structures indexed by integers).
Note that arrays are not typed; it’s fine to have an array that contains a mixture of numbers and booleans.
In order to allow mutable bindings, you will have to change the type of the environment; rather than mapping names to values, it will map names to locations. That is, every binding will be a reference to the store.
3.2 Order of Evaluation
In a language with mutation, programmers can observe the order of evaluation of function call arguments. For this language, all forms must perform left-to-right evaluation.
3.3 The null value
Add a value that represents the null value. It will be used as the result of mutation operations. The serialize function should produce "null" when called with a null value. The null value should be equal to itself, but not to any other value.
3.4 Array Values
AAQZ6 programmers can now use arrays, which (as you might expect) can be created at any length the programmer desires, and can be accessed using indexes.
The equal? function must now accept arrays. It should return true for arrays only when its two arguments evaluate to the same array; that is, two arrays pointing to the same region of memory.
The serialize function must handle arrays. It should simply return the string "#<array>" for an array.
One possible point of confusion: AAQZ6 has arrays. Racket has vectors. Your interpreter will use exactly one racket vector to hold the full contents of the running program’s memory; this means that all of the arrays (and all of the bindings) of the running AAQZ6 program will be stored in that one big Racket vector.
3.5 Allocation
Creating new bindings and arrays will require identifying new free areas of memory (that is, locations within the vector representing memory) in which to store those values. The simplest way to do this is to use the first element of the vector to store the index of the first free address. So, for instance, if the first element of the memory vector currently contains 122, that would mean that the next block of memory allocated should start at index location 122. If you need to allocate (say) 15 memory locations, you would then increase this value to 137, so that the next allocation doesn’t trample this one.
All values can be stored in a single vector cell; this includes strings, closures, and arrays. Note that this doesn’t mean that the cells of the array, just the array value itself (what you might think of as "the pointer" in another language).
This assignment does not require (or really even invite) you to implement any kind of storage reclamation (a.k.a. memory management or garbage collection); we’ll talk about that later....
You can definitely run out of memory, if the programmer tries to allocate a great big array. You should signal an error if there is not enough space available in the program’s memory.
3.6 Syntax of AAQZ6
The concrete syntax of the AAQZ6 language with these additional features is captured by the following EBNF:
| ‹expr› | ::= | ‹num› |
|
| | | ‹id› |
|
| | | { ‹id› := ‹expr› } |
|
| | | ‹string› |
|
| | | { if ‹expr› ‹expr› ‹expr› } |
|
| | | { bind ‹clause›* ‹expr› } |
|
| | | { (‹id›*) => ‹expr› } |
|
| | | { ‹expr› ‹expr›* } |
| ‹clause› | ::= | [ ‹id› = ‹expr› ] |
| ‹top-level-constants› | ::= | true |
|
| | | false |
|
| | | null |
| ‹top-level-functions› | ::= | + |
|
| | | - |
|
| | | * |
|
| | | / |
|
| | | equal? |
|
| | | <= |
|
| | | array |
|
| | | make-array |
|
| | | aref |
|
| | | aset! |
|
| | | seq |
|
| | | substring |
|
| | | error |
... where an id is not if, =>, bind, :=, or =.
4 Suggested Implementation Strategy
4.1 Adding the Store
First thing, strip down the language. In the interp function, comment out everything except the evaluation of numbers and applications of primitive functions. Add an other rule that signals an "unimplemented" error for all other forms. Make sure that your test cases for binops still work.
Time to add the store! As mentioned earlier, you will use a single mutable vector to represent memory. The store is now passed as an argument to interp. Unlike the SPS implementation in the book, there is no need to change the return type of interp. Change the Env type to map names to locations. Update your test cases so that they pass a store in.
With luck and some effort, you should be able to get those binop programs working again.
Next, I would add variables and function definitions; these are both one-line changes. Now you can re-enable your tests involving these items.
Next, I would add the mutation operations. Note that I’m advising you to add these before adding your other language forms (if, application of closures) back in. First off, I think I’d add an allocate helper function that accepts a store and a list of values to place in the next set of sequential locations, and returns the base location.
Next, I would add a new arrayV value to the set of values. This will require a bunch of extra clauses in various places (serialize, for instance). Note that the representation of arrays is up to you, but your representation had better be rich enough to enable you to determine where an array cell is stored in memory, and whether a programmer’s reference is in-range or not.
Following this, I would add the new-array operation. At this point, you should be able to create arrays, and the result of interpretation should be that the store includes lots of newly allocated locations.
At this point, I would go back and add test cases that create arrays as subexpressions of the equal? binop; check that the allocations happen in the right order. As you go forward, you’ll want to use this technique to check order of evaluation for all of your forms.
At this point, it starts to make less difference what order you add language forms in. I think I would probably wait on applications, just because there will be lots of opportunities for mistakes.
5 Nice Big Test Cases
When you think you have everything working, develop a while function (that is, an AAQZ6 program, using its concrete syntax) that accepts a guard function and a body function and keeps running the body until the guard returns false. The while function will have to be recursive; build a recursive function using mutation. Here’s an example of a factorial function written in this way:
{bind [fact = "bogus"] {seq {fact := {(x) => {if {equal? x 0} 1 {* x {fact {- x 1}}}}}} {fact 12}}}
Then, use your while to develop the in-order function that accepts an array of numbers and its size and returns true if the array is in strictly increasing order.
Note that handin does in fact contain tests for the existence and correct functioning of these functions.
5.1 Quicksort (optional)
This is not a good early test case, but you might want to see if you can write an in-place quicksort routine. You can model recursion by using the technique shown in the factorial example here.
Did I mention that this is optional?
6 Interface
Make sure that you include the following functions, and that they match this interface.
procedure
(parse s) → ExprC
s : s-expression
procedure
(interp e env sto) → Value
e : ExprC env : Environment sto : Store
procedure
(top-interp s memsize) → string
s : s-expression memsize : integer
(define (top-interp [s : s-expression] [memsize : Natural]) : string (serialize (interp (parse s) initial-env (make-initial-store memsize))))
value
while : s-expression?
value
in-order : s-expression?