Assignment 5, CSC430, Winter 2020
1 Goal
2 Guidelines
2.1 Handling Errors
2.2 Progress Toward Goal comment
3 The Assignment
3.1 Recursion and Binding Mutation
3.1.1 The definition of top-env
3.2 Errors
3.3 Classes and Objects
3.3.1 Expression Forms
3.3.2 Top-level form
4 Syntax of DUNQ5
4.1 Primitives
5 Classes, Objects, and Inheritance
6 Desugaring
6.1 Desugaring Class  Defs
6.2 Desugaring expression forms
7 Quasiquote!
8 Suggested Implementation Strategy
9 Interface
7.6.0.17

Assignment 5, CSC430, Winter 2020

1 Goal

Extend the interpreter to handle recursive functions and a simple class system.

2 Guidelines

For this and all remaining assignments, every function you develop must come with the following things:

For this assignment, you must develop your solutions using the typed/racket language. If you haven’t seen them, you might be interested in these Hints on Using Typed Racket in CPE 430.

Your test cases must use the check-equal?, check-=, or check-exn forms.

Your solution should take the form of a single file. Solve each problem separately, and make sure that each solution appears in a separate part of the file, with comments separating each problem’s solution.

Hand in your solution using the handin server. For help with the handin server, please see the course web page.

2.1 Handling Errors

All of your error messages must contain the string "DUNQ". Essentially, this allows my test cases to distinguish errors correctly signaled by your implementation from errors in your implementation. To be more specific: any error message that doesn’t contain the string "DUNQ" will be considered to be an error in your implementation.

2.2 Progress Toward Goal comment

Graders are happier when they know what to expect. Your final submission should start with a short one- or two-line comment indicating how far you got through the project. Ideally, this would just be: “Full project implemented.” But if you only implemented, say, squazz and blotz, and didn’t get to frob or dringo, please indicate this in the comment, so that we don’t spend all our time searching for bits that aren’t there.

3 The Assignment

This assignment will build on Assignment 3, although it is also possible to build on Assignment 4 if you wish.

3.1 Recursion and Binding Mutation

We’re going to need recursion for this assignment, to allow class methods to create instance of the classes that they’re in. This requires support for variable mutation.

One solution to this would be to build our assignment on top of Assignment 4, in which you have implemented both variable and value mutation (of arrays) using support from Racket’s mutation.

It might be easier, however, to go back to Assignment 3 and add a simpler version of mutation than the implementation you needed for Assignment 4.

The rest of this description will assume you are starting with Assignment 3. If you are very comfortable with your Assignment 4 implementation and would like to start with it, it would be good to discuss with your instructor.

Starting with Assignment 3, change the type of environments to map names to Racket boxed values, one box for each binding. Update the appC and idC rules to perform box creation and box deref, then add a variable mutation language form (the <- of the previous assignment) that mutates the specified binding. The value that is the result of a mutation operation is unspecified.

Please be sure to take a look at the note on boxes in Hints on Using Typed Racket in CPE 430.

After this, we can add a rec form to our language. There are a number of different ways of doing this; we could add it using desugaring, as Shriram describes, or we could add it using a new RecC form. Either one works for this assignment, but if you want your solution to work well in the type-checking assignment, we strongly recommend implementing rec as its own language form.

If you do it this way, you’re going to have to figure out what fields your RecC form should contain. Here’s a hint: one of the fields of your RecC should have type LamC. If you can’t see why this would be the case, ask.

Also, your interp function will need a rule for rec that performs these steps:
  1. extend the current environment with the new binding, mapped to a bogus value,

  2. evaluate the right-hand-side using the extended environment,

  3. mutate the binding to refer to the new value, and

  4. use the extended environment to evaluate the body.

3.1.1 The definition of top-env

We’re using Racket’s mutation now, to implement environments. This means that bindings in our environments are now actually mutable. It also means that it’s now a very bad idea to make a single top-env and use it over and over again; if an early test case mutates the value associated with, say, equal?, then later test cases may no longer be able to use equal?.

Instead, you should implement (make-top-env), that generates a fresh top-environment. This probably requires almost no work: just wrap your top-env name in parens, really. And maybe add a type annotation.

3.2 Errors

I’m so tired of using division by zero to signal errors. Really, it’s barbaric. Let’s add an error primitive that accepts a string and signals an error containing the given text. This will help you debug your own code.

3.3 Classes and Objects

Here’s the tricky one. However, one thing makes it simpler: we will not be changing the interp function in any way in order to support objects. Instead, we will be representing objects as closures, following the lead of the book.

For this assignment, we will add two new expression forms, and one new top-level form.

3.3.1 Expression Forms
3.3.2 Top-level form

4 Syntax of DUNQ5

A DUNQ5 program consists of a sequence of class definitions, and a single expression (all wrapped in a single pair of brackets).

The concrete syntax of the ClassDef and DUNQ5 expressions with these additional features can be captured with the following EBNFs.

  Program = {ClassDef ... expr}

  ClassDef = 
{class id extends id {id ...}
 {{{id id ...} expr}
  ...}}

  expr = num
  | string
  | this
  | id
  | {if expr expr expr}
  | {vars {{id expr} ...} expr}
  | {id <- expr}
  | {new id expr ...}
  | {send expr expr expr ...}
  | {lam {id ...} expr}
  | {rec {{id id ...} expr} expr}
  | {expr expr ...}

... where an id is not vars, if, lam, new, send, <-, or rec.

4.1 Primitives

procedure

(+ a b)  real

  a : real
  b : real
Compute a+b.

procedure

(- a b)  real

  a : real
  b : real
Compute a-b

procedure

(* a b)  real

  a : real
  b : real
Compute a*b

procedure

(/ a b)  real

  a : real
  b : real
If b is not zero, compute a/b.

procedure

(<= a b)  boolean

  a : real
  b : real
Return true if a is less than or equal to b

procedure

(equal? a b)  boolean

  a : any
  b : any
Return true if neither value is a closure or a primitive operator and the two values are equal.

procedure

(substring str begin end)  string

  str : string
  begin : int
  end : int
Return the substring of the given string beginning at the character in position begin and ending just before the character in position end.

procedure

(begin a b)  any

  a : any
  b : any
Return b.

procedure

(error msg)  any

  msg : string
Signal an error containing the given string.

value

true : boolean

the literal boolean representing true.

value

false : boolean

the literal boolean representing false.

5 Classes, Objects, and Inheritance

In our system, all classes are declared first. They must declare a superclass, but they may declare the base “Object” superclass. This is a special superclass that does not need to be declared, and has no fields or methods.

Further, classes must be declared in top-down order (actually a partial order), superclasses first.

Each class contains a set of fields, and a set of methods. The methods may refer to the fields in that class. Within the method declarations, the name this will refer to the object on which the method was called.

When an object is created it will also create an object of its superclass. The translation example below may make this more clear.

In order to create an object, you must supply values for all of the fields of its class and for all the fields of its superclasses, topmost first. Here’s an example:

{{class Top extends Object {a} {}}
 
 {class Middle extends Top {b} {}}
 
 {class Bottom extends Middle {c} {}}
 
 {new Bottom 10 11 12}}

In this example, the value 10 would be the a field of the Top object, the value 11 would be the b field of the Middle object, and the value 12 would be the c field of the Bottom object.

In order to keep things simple, our class system has lots of restrictions.

First off, fields are all “private”. They may be accessed only through provided methods.

Here’s an example of a Point class that provides a getter for x and y:

{class Point extends Object
  {x y}
  {{{get-x} x}
   {{get-y} y}}}

Next, the fields of superclasses are visible only to those superclasses. If subclasses want to use those methods, they’ll need to use the getters. Here’s an example of a 3d point as a subclass:

{class 3DPoint extends Point {z}
  {{{get-z} z}
   {{dist-from-zero}
    {vars {{y {send this "get-y"}}}
     {vars {{x {send this "get-x"}}}
      {expt {+ {* z z} {+ {* y y} {* x x}}} 1/2}}}}}}

Note that calls to getters are required in order to get the values of x and y.

Another simplification is that we disallow shared field names between classes and their superclasses.

Also, you cannot declare the same class name twice.

On the other hand, we will be doing dynamic dispatch “correctly”. Here’s a contrived example:

{class Rider extends Object {}
  {{{one-if-by-land}
    {if {equal? {send this "travel-mode"} "by land"}
     1
     2}}}}
 
{class LandRider extends Rider {}
  {{{travel-mode} "by land"}}}
 
{class SeaRider extends Rider {}
  {{{travel-mode} "by sea"}}}

If we call {send my-rider "one-if-by-land"}, it should work for both LandRiders and SeaRiders.

6 Desugaring

We will be implementing the class system using desugaring. That is, the parser will translate the new forms into existing forms.

6.1 Desugaring ClassDefs

The most complex translation is for the class form. In order to illustrate the process, here’s the translation of the 3DPoint class shown above into the existing concrete syntax, as it would appear as a rec binding:

{{3DPoint x y z}
 {vars {{parent {new Point x y}}}
  {lam {mesg}
   {if {equal? mesg "get-z"}
    {lam {this} z}
    {if {equal? mesg "dist-from-zero"}
     {lam {this}
      {vars {{y {send this "get-y"}}}
       {vars {{x {send this "get-x"}}}
        {expt {+ {* z z} {+ {* y y} {* x x}}} 1/2}}}}
     {parent mesg}}}}}}

Note that large chunks of this (especially the body of dist-from-zero) are taken directly from the class declaration. Also note the delegation to the parent in the case that we don’t find a definition for the given method name.

Also, note that this uses the name parent in a way that could cause shadowing, if the programmer used the name parent in a different way. The right way to solve this problem is with macro hygiene. Lacking it, we will simply cross our fingers and hope that the programmer doesn’t use these as identifiers.

Also, note that this concrete syntax still needs to be parsed. I recommend using the parser to rewrite the syntax into another s-expression, and then calling parse on the result.

Another important point concerns the correct computation of the class arguments, and how to pass them on to the superclass correctly. In the example above, the translation must have enough information about the superclass to know that it requires two arguments named x and y. You’ll probably want to make two passes through the ClassDefs; the first to build this table, the second to perform the actual desugaring.

In order to be able to “bottom out”, you’ll need a built-in declaration for Object. Here’s the one I suggest, written as a rec clause:

{{Object}
  {lam {mesg} {error "NO SUCH METHOD"}}}

Calling any methods on something created using this function will simply signal an error.

When you’re done, you’ll have a set of functions, one set for each class. Add these to the outside of the program, as rec bindings. So, for instance, if we declare the Point and 3DPoint classes, our final program would look like:

{rec {{Object} ...}
  {rec {{Point x y} ...}
    {rec {{3DPoint x y z} ...}
      ...}}}

... where the ...’s are replaced with the class definition functions and the body of the program.

6.2 Desugaring expression forms

Next, we need to desugar the new and send expressions. These are much simpler.

The new expression is just translated into an application. Specifically,

{new Point 79 2}

becomes

{Point 79 2}

We could come up with a more elaborate way of separating constructors from applications, but it would just add complexity.

The send expression is a teensy bit more complex, because it has to expand into nested applications. So, for instance,

{send p1 "add-x" 999}

becomes

{vars {{obj p1}} {{obj "add-x"} obj 999}}

The temporary variable obj is necessary in case the target of the send is an expression that performs mutation. So, for instance, if we wrote {send {get-new-socket} bind-to 4421}, we certainly wouldn’t want to create two new sockets.

Also, as with parent, we’re simply crossing our fingers and hoping that the programmer doesn’t use these identifiers. This is a macro hygiene problem.

7 Quasiquote!

There are a number of ways of implementing desugaring for this assignment. You could expand your class forms directly into ExprC’s, for instance. This works fine, but it’s very verbose and painful.

Instead, I advise you to write a desugarer for classes that expand into one big s-expression that is then fed into the parser.

But wait! There’s a problem!

Suppose I want to construct the s-expression

'{vars {{parent {new parent-class-name x y}}} ...}

.. except that where it says parent-class-name, we want to plug in the actual name of the parent class. Put differently, we want an s-expression template. Quasiquote provides that template.

The basic idea is this. When you begin a quoted expression with a ‘ (back-tick), rather than a ’ (forward tick), you can use a comma in order to escape again into the language outside of the quote.

So, for instance, if you wrote

(define test `{+ 3 4})
(define then `{+ p 15})
(define els `{f x})
`{if ,test ,then ,els}

It would evaluate to

`{if {+ 3 4} {+ p 15} {f x}}

You can use “unquoting” (the comma) to insert pieces of data from the enclosing language.

There’s one more form that may be useful to you, “unquote splice”. It’s written using ,@ (comma at-sign), and it “splices” its argument into the context. So, if you had written

(local [(define test `{+ 3 4})
        (define then {list `+ `p `15})
        (define els `{f x})]
  `{if ,test ,@then ,els})

It would produce

`{if {+ 3 4} + p 15 {f x}}

Notice that the elements of the second list are now “spliced” into the surrounding quoted list.

8 Suggested Implementation Strategy

As always, you may feel free to ignore my suggested implementation strategy. What follows is just the implementation strategy that makes the most sense to me. It is based on starting from your Assignment 3. If you choose to start with your Assignment 4, you might check with your instructor.

First, I would make the changes to the environment: change environments to map Symbols to boxed Values, and update the IdC and AppC rules. The IdC must wrap its result in an unbox, and the newly added bindings in AppC’s must be wrapped with a call to box.

Next, add recursion as a new language form.

After this, we need to deal with classes.

Let me reiterate here, that this requires no changes at all to the interp function, or the set of values. Our entire object system will be implemented using desugaring into existing forms.

I think I would start simple, by implementing the new and send desugaring operations. The desugaring for these forms can live in the same place as the desugaring of existing forms such as vars–presumably, in the parser.

Translating the classes will be the big part. I think I might start by translating classes that have no fields, only methods. These are less work because you don’t need to figure out what fields are associated with the superclass. Get these working, and make sure that you can delegate to the superclass when no method with the given name is found.

Make sure that your tests of desugaring are just that–tests of desugaring. That is, your expected results should probably be s-expressions, not Values.

Next, try classes with fields that have superclasses with no fields.

Finally, add support for superclasses with fields. In order to do this, you’ll need to keep track of the field names associated with each superclass.

Now you’re done!

9 Interface

Make sure that you include the following functions, and that they match this interface:

procedure

(parse-prog program)  ExprC

  program : Sexp
Parses a sequence of classdefs and a “main” expression into a single large expression.

procedure

(parse s)  ExprC

  s : Sexp
Parses a single expression, as in prior assignments.

procedure

(interp s e)  Value

  s : ExprC
  e : Env
Interpret a single expression, as in prior assignments.

procedure

(top-interp program)  string

  program : s-expression
Combines parsing and evaluation. Here’s the code you should probably use for this function:

(define (top-interp [program : s-expression]) : string
  (serialize (interp (parse-prog program) (make-top-env))))