Assignment 4, CSC430, Winter 2017

6.8.0.2

Assignment 4, CSC430, Winter 2017

1 Goal

Extend the interpreter to handle mutable arrays.

2 Guidelines

For this and all remaining assignments, every function you develop must come with the following things:

A commented header line that expresses the result of the function in terms of its inputs, written in English. Be as precise as you can within the space of a line or two.
A type declaration, specifying the input and output types.
Test cases. A function without test cases is incomplete. Write the test cases first, please.

For this assignment, you must develop your solutions using the typed/racket language. If you still haven’t seen them, you might be interested in these Hints on Using Typed Racket in CPE 430.

Your test cases must use the check-equal?, check-=, or check-exn forms.

Your solution should take the form of a single file. Solve each problem separately, ands make sure that each solution appears in a separate part of the file, with comments separating each problem’s solution.

Hand in your solution using the handin server. For help with the handin server, please see the course web page.

2.1 Handling Errors

All of your error messages must start with the string "PHYM: ". Essentially, this allows my test cases to distinguish errors correctly signaled by your implementation from errors in your implementation. To be more specific: any error message that doesn’t contain the string "PHYM" will be considered to be an error in your implementation.

2.2 Using Monads

Want to use the store monad? Here’s some sample code:

monad-example.rkt

3 The Assignment

For this assignment, you must implement mutable arrays. These arrays are instead of boxes. Just as the previous assignments asked you to generalize the book’s solution from one argument to many arguments, this assignment generalizes from a single box to an array.

In addition, we’ll be adding strings and a simple substring operation.

As we’ve discussed in class, this code will use no side effects in its implementation; that is, it’s implementing state without using it. The key technology here is SPS, Store Passing Style.

3.1 Top-Level environment

This assignment makes us pay in sweat and tears for every form in the language.

In order to simplify the language, then, we’re going to remove a whole bunch of language forms, without removing the corresponding functionality. How is this possible? The answer is that many of the existing forms are unnecessary. Specifically, the binops can all be represented as function applications. Also, the primitive true and false forms can simply be represented as references to top-level-variables. The key is the specification of a top-level environment that is used in calls to top-interp.

This is also a good moment to appreciate the fact that the var form is expressed as syntactic sugar.

3.2 New Values

Add a value that represents the null value. It will be used as the result of mutation operations. The serialize function should produce "null" when called with a null value. The null value should be equal to itself, but not to any other value.

Also, we need to add strings to the language. Calling serialize on a string should produce a string surrounded with double-quotes, and with any contained double-quotes quoted. Use the racket code (format "~s" val) for this. Strings should be considered equal when they contain the same characters. Use the racket primitive equal? or string=? for this.

The array value is also a new kind of array, but I’m going to delay discussing it for a few paragraphs.

3.3 New Forms

Implement these new forms as primitive bindings in a top-level environment. For details on this transformation, see the roadmap below.

new-array: creates a fresh array of the given size, with all cells filled with the given value. So, for instance, {new-array 34 0.0} would create an array of 34 cells, all containing 0.0. It is illegal to create an array with fewer than one cell.
array: creates a fresh array containg the given values. So, for instance, {array 3 14 false 5} would create an array of length four whose second element is 14. It’s illegal to create an array with fewer than one cell.
aref : returns an element of an array. So, for instance, {aref p 15} would return the contents of cell 15 of the array named by p. If the array does not have that many elements, you must signal an error. The first element has index 0.
aset! : the aset! form is for arrays. So, for instance, {aset! p 15 {f 6}} would set element 15 of the array named by p to be the result of calling f with 6. It must return nullV. The first element of an array has index 0.
begin : evaluates a sequence of expressions, returning the last one. So, for instance, {begin {f 9} p} would evaluate {f 9}, then return the value of p. It’s illegal to call begin with fewer than one argument.
substring : accepts a string and a start and end position, and returns the corresponding substring. Use substring to implement this function.

In order to allow mutable bindings and arrays, you’ll need to add a store, and rewrite your interpreter in store-passing style, as we did in class. In addition to the storage of memory cells, your store will have to represent the generated output, and the remaining input.

Note that arrays are not typed; it’s fine to have an array that mixes numbers and booleans.

3.4 Order of Evaluation

In a language with mutation, programmers can observe the order of evaluation of function call arguments. For this language, all forms must perform left-to-right evaluation.

3.5 Array Values

The eq? function must now accept arrays. It should return true for arrays only when its two arguments evaluate to the same array; that is, two arrays pointing to the same region of memory.

The serialize function must handle arrays. It should simply return the string "#<array>" for an array.

3.6 Syntax of PHYM4

The concrete syntax of the PHYM4 language with these additional features can be captured with the following EBNF:

expr	=	number
	\|	string
	\|	id
	\|	{if expr expr expr}
	\|	{var {id = expr} ... expr}
	\|	{lam {id ...} expr}
	\|	{expr expr ...}

top-level-constants	=	true
	\|	false
	\|	null

top-level-functions	=	+
	\|	-
	\|	*
	\|	/
	\|	eq?
	\|	<=
	\|	array
	\|	new-array
	\|	aref
	\|	aset!
	\|	begin
	\|	substring

... where an id is not var, if, func, or =.

Note that this syntax is somewhat simpler than that of previous assignments, because binops and the new forms are characterized as members of an initial environment, rather than as language forms. Implementing your code this way should make it simpler, and reduce the work associated with the translation to store-passing style.

4 Suggested Implementation Strategy

Whoo! lots of fun stuff to do here.

4.1 Simplification

The first thing to do is to simplify things by introducing the top-level environment.

Create a top-level environment containing bindings for true and false. Remove the corresponding forms from the parser and the interpreter. Make sure that your code all still works.

Next, choose a representation for primitive function values. Add these to your values. Update serialize to return "#<procedure>" for all of these values.

Add these values to your environment. Remove binop-specific parsing from your parser, so that they all parse as function applications. This should shorten your parser, but will require some rewriting of your parse test cases.

Now, remove binop rules from your evaluator, and update the rule for application so that when the function position evaluates to a function primitive, the application rule calls the corresponding library implementation.

Your test cases should now all work again.

4.2 Strings

Add strings to your language. This should be relatively straightforward.

4.3 Store-passing style

Store-passing style is a bear. Here’s how I’d get started.

First thing, strip down the language. In the interp function, comment out everything except the evaluation of numbers and applications of primitive functions. Add an other rule that signals an "unimplemented" error for all other forms. Make sure that your test cases for binops still work.

Time to add the store! Formulate the define-types and type aliases that you’ll need for stores (just like the book’s, or use hash tables if you prefer). Choose a representation for a Value-combined-with-store, and change the type of the interp function so that it accepts a store and returns a Value-combined-with-store. Update your test cases so that they pass a store in, and expect the v*s including the answer and the store. Rewrite the interp rules for numbers and primitive applications so that they thread the store through the computation as they should.

With luck and some effort, you should be able to get those binop programs working again.

If this takes you a lot of time and effort, don’t worry: this might be the hardest part of the assignment. Once you get the hang of transforming code into store-passing style, it will get easier. Check to make sure that each store is used exactly once (with the exception of the mutation operations you’ll add later).

Next, I would add identifiers and functions; these are both one-line changes. Now you can re-enable your tests involving these items.

Next, I would add the mutation operations. Note that I’m advising you to add these before adding your other language forms (if, application of closures) back in. First off, I think I’d add an allocate helper function that accepts a store, a number of locations to be allocated, and a value to place in all of them, and returns two things; the base location, and an extended store.

Design your store so that the "next allocated" location is derived directly from the store. It could be a separate counter that’s part of a define-type, or it could be a function that just scans the store to find a new address. Don’t use the new-loc defined by the book; it makes testing quite painful.

Next, I would add a new arrayV value to the set of values. This will require a bunch of extra clauses in various places (serialize, for instance). Note that the representation of arrays is up to you, but it had probably better include a location and a length.

Following this, I would add the new-array operation. At this point, you should be able to create arrays, and the result of interpretation should include a store that contains lots of new allocated locations.

At this point, I would go back and add test cases that create arrays as subexpressions of the eq? binop; check that the allocations happen in the right order. As you go forward, you’ll want to use this technique to check order of evaluation for all of your forms.

At this point, it starts to make less difference what order you add language forms in. I think I would probably wait on applications, just because there will be lots of opportunities for mistakes.

4.4 Quicksort (optional)

This is not a good early test case, but you might want to see if you can write an in-place quicksort routine. You can model recursion by using the technique that Shriram describes in the next chapter.

Did I mention that this is optional?

5 Interface

Make sure that you include the following functions, and that they match this interface–actually, I’m giving you a couple of choices: you can return a Result, as Shriram describes, or a pair, from interp. Your definition of top-interp will depend on which of these you choose. In fact, if you use a monadic style, your eval will look different from both of these.

procedure
(parse s) → ExprC
s : s-expression

Parses an expression.

procedure
(interp e env sto) → Result
  e : ExprC
  env : Environment
  sto : Store

Interprets an expression, with a given environment. The Result may be any type you design.

procedure
(top-interp s) → string
s : s-expression

Combines parsing and evaluation. You should probably use this definition for this function:

(define (top-interp [s : s-expression]) : string
(serialize (value-part (interp (parse s) initial-env empty-store))))

Note that I’m assuming the existence of some value-part function that discards the store and returns only the value.

1	Goal
2	Guidelines
3	The Assignment
4	Suggested Implementation Strategy
5	Interface