The benefits of building a compiler are manifold; here are two important ones. First, you get a much better understanding of how to translate higher-level languages into assembly language. Second, you get to build and maintain a ball of code that’s somewhat larger than you handle in many other courses (between 2K and 10K lines of code, depending on your language of implementation).
As a result, this course lives at the intersection of Programming Languages, Architecture, and Software Engineering. With luck, you’ll learn something about each of these.
Students completing this course should be able to:
implement and maintain a project of up to 10,000 lines of code, including tracking bugs, developing tests, and documenting code,
implement a compiler from a modern, high-level programming language to a low-level assembly language, and
present and explain code and design work.
Students taking this course must be familiar with the core principles of programming languages, including the evaluation of Abstract Syntax Trees, and the implementation of function calls, environments, and closures.
Students must also be comfortable writing larger programs (~2 KLOC), and be able to manage the development process efficiently.
Finally, students must be able to write programs in assembly language.
John Clements, aoeuclements @ brinckerhoff.org
Lecture: 10:10–11:00, MWF, room 2-212
Lab: 11:10–12:00, MWF, room 14-255
See my home page for my calendar. You can add it to your calendar, if that makes your life easier.
Office hours also appear on this calendar; you may find them easier to see if you click on the "week" tab of the calendar.
This is the course web page; its link is https://www.brinckerhoff.org/clements/2198-csc431/.
I think that an interactive and lively classroom is a better learning environment. In particular, I will almost certainly learn everyone’s name, and I’m likely to notice if you’re missing. My experience is that if you come to class reliably, you’re extremely likely to pass the class—there’s a reason that we conduct classes face-to-face; it keeps you engaged, and ensures that you’re connected to the other students in the class.
In addition, I’m likely to call on you at points in the lecture where I want to see whether you’re following what’s going on. If you don’t know, it’s totally fine to say "no, I have no idea." If anything, this is probably evidence that I’m going too fast or not explaining things well. However, I try to respect the wishes of students for whom this technique is disruptive. Please let me know if you don’t want me to call on you.
Finally, my experience standing in front of classes and more especially my experience of sitting behind classes has convinced me that laptops are useful for note-taking in approximately 1% of cases. Essentially, never.
Indeed, there’s now a mountain of evidence indicating that laptops are distracting to students and to those around them, and that even when these distractions are eliminated, taking notes on laptops fails to create learning in effective ways. I’ll just cite this one paper, because it’s got copious references to other sources.
For this reason, I do not allow the use of laptops in class without special dispensation. If you need to use a laptop to take notes, please come and talk to me; otherwise, just put it away and take notes on paper.
You will be able to complete the work in this class in one of a number of different programming languages. One of them is Racket. Others will certainly include C, C++, and Java. If students or teams are interested in using other languages (Python? Rust?) we can discuss extending this list.
This class allows teams of size two. You are not required to work with a partner, but I strongly encourage it. Since your work through the quarter will all be on a single project, you will be working with the same partner for the duration of the class, though it is possible to change partners after the first project.
The work in this class will consist principally of one quarter-long development project (hint: it’ll be a compiler). Milestones are listed on the schedule, but there may be some adjustment of these deadlines.
Programming assignments will be submitted by pushing to a GitHub repo, but they will also be demoed in lab, and the demo will be the principal means of assigning grades. The GitHub submissions are there in case I want to look at your code after the demo. Don’t forget to push!
Late assignments will not be accepted, but it will generally be the case that many if not most of the points on later assignments will be for tasks that were a part of earlier assignments. Don’t stop working on it just because the deadline has passed!
From time to time, we may examine student code in lecture. Try to ensure that the code you submit is something you’d be proud to show to the others in the class.
We’re going to use GitHub Classroom for assignments in this class. The first lab contains an invitation link.
We’re going to be using a textbook from Jeremy Siek and Ryan Newton entitled Essentials of Compilation. The link is at the top of the page.
In addition, there are many classic textbooks that give a broad overview of compilation techniques, including Cooper and Torczon’s “Engineering a Compiler, 2nd edition”. The first chapter is a great overview, and the whole book provides detailed information and (perhaps) a different perspective on the process of building a compiler.
This class will use Piazza. This will be the principal means that I’ll use to notify you of deadlines, organizational updates, and changes to assignments. If you’re not keeping up with the group, you’re going to be missing important information.
It’s also the best way for you to direct questions to me and/or the class. Feel free to e-mail me with personal questions, but use the Piazza group as your main means of communication. It’s possible to post anonymously, if you like.
You should already have received an invitation to the Piazza group; let me know if you need an invite.
Don’t post your code or test cases to the group; anything else is fair game.
Also, please keep in mind that I (and everyone else) judge you based in part on your written communication. Spelling, complete sentences, and evidence of forethought are important in all of your posts & e-mails. One easy rule of thumb: just read over what you’ve written before clicking post or send, and imagine others in the class reading it.
We strongly encourage collaboration among students in almost every aspect of the course. See, for example, the following section on "Labs", which are designed for collaboration.
The exception is for programming assignments: you may not copy another student’s code, not even test cases. You are also responsible for ensuring that no other students see your code, either during this course or afterwards, whether deliberately or through negligence (e.g. via a non-private repo).
A very effective automated tool will review submissions for evidence of copying. Students believed to be cheating, i.e. both parties involved in a transfer of code, will in the first instance typically receive a 0 score for the assignment. We reserve the right to assign a failing grade in the class when appropriate.
I will be grading your code repeatedly in this class. On most assignments, your score will consist of a part (usually 20 points) based on your performance on a set of test cases automatically administered by the handin server, and a part (usually 6 points) based on my opinion of your code’s clarity, organization, and adherence to rules about purpose statements and contracts (in short: you’ve got to have them). As a rule, my "eyeball" score rubric runs something like this:
6 points – I simply can’t find anything wrong with your code.
5 points – some inelegant parts, or one or two purpose statements or contracts missing
4 points – an actual misunderstanding, or a widespread lack of purpose statements and contracts
3 points – a serious misunderstanding: you didn’t understand some major part of the assignment, or I had to work to try to get your code to compile or run without looping
2 points – your program is seriously incomplete, doesn’t compile, or has widespread major problems
1 point – you didn’t make any apparent progress on the program at all.
Finally, please note that I will place comments in some of your submissions indicating errors or stylistic requests. These will all begin with the string ;;> (in Racket) or ##> (in Python), so you can search for these in the e-mail that you get with your final assignment grade.
Here’s the most important thing to know about code in this class:
I do actually read it.
This means that getting your code to work is not the end of the process; after you get your code to work, you have to clean it up, put nice headers on the various parts, collect the test cases, document strange things that you did, and clarify the code.
You should begin with a single-paragraph comment that describes how far you got: did you finish, or did you get stuck on something? If you got stuck, describe what’s done and what’s still left to implement.
As a rule, I like to read code in a "top-down" way. This means that the definition of the top-level, important functions should come first, and the supporting functions should come later. I want to have a good understanding of the big picture before getting into the details. My experience is that if interp makes sense, then add-to-env will probably not present any difficulty.
Another part of cleaning up the code is collecting the test cases in a place that’s sensible and doesn’t interrupt the flow of reading the code. It’s probably best, after you’re done writing the code, to collect the test cases at the bottom of the file (or put them in another file altogether, if appropriate).
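As a rough illustration of the layout described in the last three paragraphs, here is a minimal sketch in Python (used here only for illustration; the same shape applies in any language). It opens with a status comment, puts the top-level function first, follows with supporting functions, and collects the tests at the bottom. The names interp and add_to_env echo the ones mentioned above; the tuple-based expression encoding, the lookup helper, and run_tests are assumptions made up for this example, not requirements of the course.

```python
# Status: finished. This toy evaluator handles numbers, variables,
# addition, and let-bindings. Nothing is known to be missing.
# (Hypothetical status paragraph, for illustration only.)


# --- top-level function first ---

def interp(expr, env):
    """Evaluate expr (a number, a variable name, or a tagged tuple) in env."""
    if isinstance(expr, (int, float)):
        return expr
    if isinstance(expr, str):
        return lookup(expr, env)
    tag = expr[0]
    if tag == "+":
        return interp(expr[1], env) + interp(expr[2], env)
    if tag == "let":
        _, name, rhs, body = expr
        return interp(body, add_to_env(env, name, interp(rhs, env)))
    raise ValueError(f"unknown expression: {expr!r}")


# --- supporting functions afterwards ---

def add_to_env(env, name, value):
    """Return a new environment that extends env with name bound to value."""
    new_env = dict(env)  # copy, so the caller's environment is unchanged
    new_env[name] = value
    return new_env


def lookup(name, env):
    """Return the value bound to name, or raise if it is unbound."""
    if name not in env:
        raise NameError(f"unbound variable: {name}")
    return env[name]


# --- test cases, collected at the bottom ---

def run_tests():
    assert interp(3, {}) == 3
    assert interp(("+", 1, 2), {}) == 3
    assert interp(("let", "x", 5, ("+", "x", 1)), {}) == 6
    assert add_to_env({"x": 1}, "y", 2) == {"x": 1, "y": 2}


if __name__ == "__main__":
    run_tests()
    print("all tests passed")
```

A reader who starts at the top sees the status comment and interp before any helper, and the tests do not interrupt the code they exercise.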
Whatever language you use, it’s likely to have a style guide. Here’s the one for Racket. You’re not required to follow any style guide, but anything that makes your code hard to read could hurt your score.
Finally: dead code is misleading and makes code hard to read. Delete it.
I reserve the right to assign bad scores to programs that work correctly; if I don’t think you’re doing a good job of programming, then you won’t receive a good score. "It works" isn’t a defense for bad code.
Good code is easy to read. I reserve the right to allocate a fixed period of time to grading a program submission. Don’t be surprised to see comments like "ran out of grading time here."
Naturally, all grades contain an element of subjectivity.
You definitely do not have to use Racket in this course. If you do, though, you might want to take a look at this example of how to break up your program into smaller files. Also, I should add material on separate compilation, sigh.
My experience suggests that frequent quizzes are a good way to ensure that you’re understanding what I’m teaching, and that I’m teaching things that you understand.
This class will have Wednesday quizzes every other week, starting in the second week. These quizzes will probably be fifteen minutes long, and will probably take place during lab.
Grades will be determined by performance on programming projects, your final submission, and the quizzes. The breakdown of the grade is as follows:
Final Submission: 25%
I like to share your current grade in the class with you. It usually takes me a few weeks to get this set up, but you should eventually be able to check your student grade at
That’s not a clickable link, because you’ll have to edit it to add your login. So, for instance, if your name is Annette Czernikoff and your login is email@example.com, you’d put aczernik in the login spot.