Implementing Type-Checking, CSC430, Spring 2023
1 Goal
Extend the interpreter to include a type system, binding mutation, and a mutable number array data structure.
2 Guidelines
For this and all remaining assignments, every function you develop must come with the following things:
A commented header line that expresses the result of the function in terms of its inputs, written in English. Be as precise as you can within the space of a line or two.
A type declaration (possibly inline), specifying the input and output types.
Test cases. A function without test cases is incomplete. Write the test cases first, please.
For this assignment, you must develop your solutions using the typed/racket language. If you haven’t seen them, you might be interested in these Hints on Using Typed Racket in CPE 430.
Your test cases must use the check-equal?, check-=, or check-exn forms.
Your solution should take the form of a single file.
Hand in your solution using the handin server. For help with the handin server, please see the course web page.
2.1 Handling Errors
All of your error messages must contain the string "VVQS". Essentially, this allows my test cases to distinguish errors correctly signaled by your implementation from errors in your implementation. To be more specific: any error message that doesn’t contain the string "VVQS" will be considered to be an error in your implementation.
Additionally, your error messages should be actually helpful. Since you are the primary consumer of your own error messages, making these error messages good in the first place should reduce your overall development time. There are two parts to this: first, the error message should include text that actually indicates what the programmer did wrong. Second, include the text of the user’s program, so they (actualy you) can figure out how to fix it. See lab 3 for an example of how to do this. (Apologies in advance if I renumber the labs and fail to update this paragraph....)
2.2 Progress Toward Goal comment
Graders are happier when they know what to expect. Your final submission should start with a short one- or two-line comment indicating how far you got through the project. Ideally, this would just be: “Full project implemented.” But if you only implemented, say, squazz and blotz, and didn’t get to frob or dringo, please indicate this in the comment, so that we don’t spend all our time searching for bits that aren’t there.
3 The Assignment
This assignment will build on the Assignment 5 interpeter.
4 Syntax of VVQS8
A VVQS8 program consists of a single expression.
The concrete syntax of the VVQS8 expressions with these additional features can be captured with the following EBNFs.
Expr | = | Num | ||
| | id | |||
| | String | |||
| | {id <- Expr} | |||
| | {Expr if Expr else Expr} | |||
| | {Expr where {[[id : ty] := Expr] ...}} | |||
| | {letrec {id {[id : ty] ...} : ty => Expr} in Expr} | |||
| | {{[id : ty] ...} => Expr} | |||
| | {begin Expr ...} | |||
| | {makearr Expr ...} | |||
| | {Expr Expr ...} |
ty | = | num | ||
| | bool | |||
| | str | |||
| | void | |||
| | {ty ... -> ty} | |||
| | numarray |
operator | = | + | ||
| | - | |||
| | * | |||
| | / | |||
| | num-eq? | |||
| | str-eq? | |||
| | <= | |||
| | substring | |||
| | arr | |||
| | aref | |||
| | aset | |||
| | alen |
... where an id is not where, letrec, :=, if, else, <-, :, =>, begin, in, or makearr.
4.1 Primitives
procedure
(+ a b) → num
a : num b : num
procedure
(- a b) → num
a : num b : num
procedure
(* a b) → num
a : num b : num
procedure
(/ a b) → num
a : num b : num
procedure
(<= a b) → boolean
a : num b : num
procedure
(num-eq? a b) → boolean
a : num b : num
procedure
(str-eq? a b) → boolean
a : str b : str
procedure
(substring str begin end) → string
str : str begin : num end : num
procedure
(arr size default) → numarray
size : num default : num
procedure
(aref arr idx) → num
arr : numarray idx : num
procedure
(aset arr idx newval) → Void
arr : numarray idx : num newval : num
procedure
(alen arr) → num
arr : numarray
value
true : boolean
value
false : boolean
4.2 Mutation
This assignment includes both mutation of variables and mutable number arrays. We’ll be implementing it in a meta-circular way; that is, we’ll be using racket vectors to implement VVQS8 arrays, and to implement mutable variables, we’ll be adding mutation to our environments.
However, we want to be a bit careful in making the environments mutable; it turns out that it’s not a great idea to use mutable hash tables to represent environments. We want the cells to be mutable, but we still want the nice behavior of immutable hash tables (or, equivalently, lists of bindings). The way to have our cake and eat it too is to create an immutable mapping from names to mutable values. If, for instance, you’re using a Binding structure, you’ll probably want to update it to map a name to a boxed value, written (Boxof Value).
To represent arrays of numbers, just use racket’s mutable vector value. To keep things simple, these will be serialized simply as the string "#<array>".
Also, mutation of a variable will return the new void value, which should be serialized as "#<void>".
4.2.1 Order of Evaluation
Now that we have mutation, order of evaluation becomes observable. So, for instance, consider a program with a call {h {f} {g}} where both the function f and the function g mutate the same variable. Which one happens first?
In languages such as Java, Python, and VVQS8, evaluation is left-to-right; that is, the subterms of every expression are evaluated from left to right, except when the form itself demands a different order of evaluation, as e.g. if. So, to be specific, in an application, the function position should be evaluated first, and then the leftmost argument, the second leftmost argument, and so forth.
4.3 Type Checking
Implement a type checker for your language. Note that since functions must now come annotated with types for arguments, you will need to have a type parser that parses types. You should definitely make a separate function for this. Note that the types of functions are extended to handle multiple arguments. So, for instance, the type {num str -> bool} refers to a function that accepts two arguments, a number and a string, and returns a boolean.
All type rules are standard.
4.3.1 Operators
Type-checking binops is more or less as you might expect. For instance, a + should receive two numbers, and will produce a number. The <= operator will take two numbers, and return a boolean.
The equal? operator is a bit different. Specifically, we don’t have subtyping, and we treat the equality operator as a function in the environment, so it must have a single type. In order to simplify our lives, we split it into two equality operators; one that only works for numbers, called num-eq?, with type {num num -> bool}, and one that only works for strings, called str-eq?, with type {str str -> bool}.
5 Recursion
This assignment will add recursion, using a letrec form. Implement the rec form as described in the book. Note that you will need some kind of mutation in order to create the closure as a cyclic data structure. For this assignment, we’ll keep things simple and just use Typed Racket’s built-in mutation in order to eliminate the need for store-passing style. I recommend designing your (runtime) environment as a mapping from names to boxed values.
Here’s an example of a simple program that defines a recursive function to compute perfect squares in a kind of silly way:
{letrec {square-helper {{[n : num]} : num => {0 if {<= n 0} else {+ n {square-helper {- n 2}}}}}} in {{square 13} where {[[square : {num->num}] := {{[n : num]} => {square-helper {- {* 2 n} 1}}}]}}}
6 Suggested Implementation Strategy
Here are some of the steps that I followed. I wrote test cases for every step before implementing it.
6.1 Type Checking for Existing Language
Add a data definition for Types.
Add a parser for the language of types.
Update your test cases to include types for the function declaration and where forms. At the same time, update your parser to accept the updated forms, parse the types, and just throw them away. This step is an important one, because it suddenly takes you from passing basically no test cases at all to passing a whole bunch of them.
Develop a type checker that only works for constants.
Add a definition for Type Environments, mapping symbols to types.
Extend your type checker to handle variables and if expressions.
Extend your type checker to handle applications. (Since you don’t have types for LamC terms yet, your test cases should just use primitive functions.)
Extend your AST nodes for LamC nodes to include types with the parameters, modify parser to store types in the AST node rather that just throwing them away.
Extend your type checker to handle LamC terms.
Extend your parser and type checker to handle the modified where syntax, update test cases.
Add your type checker to top-interp. Type checking should happen after parsing and before interpretation. The result of type checking is ignored; if it doesn’t signal an error, that’s all you’re looking for.
At this point, you’re in good shape; your assignment should now pass a significant fraction of the test cases.
6.2 Adding Binding Mutation
Add a new void value to your set of values, and a new void type to your set of types. Extend serialize to handle the void value.
Extend your type parser to handle the void type.
Update your environment to map names to boxed values.
Fix breakage by adding boxing and unboxing to environment operations.
Add an ExprC form to represent mutation of a binding, and another to represent the sequencing form, begin.
Add code to parse to parse the binding mutation form and the sequencing form.
Add code to type-check to type-check the binding mutation form and the sequencing form.
Add code to interp to interpret binding mutations and the sequencing form.
Add code to recreate the top-env on every call to top-interp (necessary because each call to interp might damage the existing top-env.
6.3 Adding Arrays
Add a new kind of value to represent integer arrays. Extend serialize to handle the new array value.
Add a new language form, makearr, to construct arrays. Note that this cannot be a primitive function with our existing type checker, because it is of variable arity. Add it to the parser, the type-checker, and the interpreter.
Add the new primitives aref, aset, and alen.]
6.4 Adding recursion (last part!)
Add the letrec form to the set of ExprCs.
Develop the parser clause for RecC. Test cases first!
Develop the interpreter clause for RecC. Test cases first!
7 Interface
Make sure that you include the following functions, and that they match this interface:
procedure
(parse s) → ExprC
s : Sexp
procedure
(parse-type s) → Ty
s : Sexp
procedure
(type-check e env) → Ty
e : ExprC env : TEnv
value
base-tenv : TEnv
procedure
(interp e env) → Value
e : ExprC env : Environment
procedure
(top-interp s) → string
s : s-expression