Posts tagged Teaching

On the relationship between mathematical functions and program functions

:: Programming, Teaching

By: John Clements

NOTE: this text is a brief overview intended for instructors of a CS course that we’re developing; it is not technical. It tries to make the case that our early courses should steer students toward understanding problems in terms of pure functions. If you have suggestions or feedback, I’d love to hear them/it.


This section of the course introduces functions, a crucial topic in the field of computer science AND in the field of math.

Programmers and mathematicians sometimes think about the term “function” somewhat differently. Furthermore, some people who are familiar with both fields assign different meanings to the word “function” in the two fields.

The definition of a function in the mathematical domain is fairly well specified, though of course things get a little fuzzy around the edges. We’re going to define functions as the arrows in the category Set, more or less (if that’s not helpful, ignore it). That is, a function has a specified domain and a specified codomain, and it maps every element of the domain to a particular element in the codomain. There’s no requirement that it map every element of the domain to a different element of the codomain (one-to-one) OR that there be some element of the domain that maps to any chosen element of the codomain (onto). This (I claim) is the standard notion of “function” in math.[*]

Programmers also use and love functions. Nearly every programming language has a notion of functions. Of course, they’re sometimes called “procedures” or even “paragraphs” (I believe that’s COBOL. Yikes.). In programming, functions are often thought of as being elements of abstraction that are designed to allow repetition. And so they are. But it turns out that they also, in most modern programming languages, can be thought of as mathematical functions. Well, some of them can.

For many functions, this is totally obvious. If I consider the function from numbers to numbers that we might write in math class as f(x) = 14x + 2, then I can write that as a function in most programming languages. (If you disagree with me, hold that thought.)

But… things aren’t always so clear. What about a function that doesn’t return at all? What about a function that takes input, or produces output? What about a function that mutates an external variable, or reads a value from a mutable value? What about a function that signals an error? All of these present problems, some more substantial than others. None of these have a totally obvious mapping to mathematical functions.

There certainly are ways to fit these functions into mathematical models, but in general, the clearest lesson is that when there is a natural way to express a problem using functions that map directly to mathematical functions, we should. These are generally called “pure” or “purely functional” functions.

So, why should it matter whether our functions are pure? What benefits do we gain when we express functions in purely functional ways?

The clearest one is predictability, also known as debuggability and testability. When I write a pure function that maps the input “savage speeders” to 17, then I know that it will always map that string to 17; I don’t need to worry that it will work differently when the global foo counter is less than zero, or when it’s run in parallel, or on Wednesday, or when the value in memory location 0x3342a7 is less than the value in memory location 0x3342a8.

Put differently, pure functions allow me to reliably decompose problems into sub-pieces. When I’m debugging and testing, I don’t need to worry about setup and teardown to establish external conditions.

Another way to understand this is to dip into functions that use mutation. If we want to model these as mathematical functions, we need to understand that in addition to their stated inputs, they take additional hidden inputs. In the simplest case, this may be the value of a global counter. Things get much more complex when we allow mutation of data structures; now we need to worry about whether two values are the “same” value; that is, whether mutating one of them will change the other. Worse still, mutating certain values may affect the evaluation of other concurrent procedures.

For these reasons and others like them, pure functions are vastly easier to reason about, debug, and maintain. Over time, many of our programming domains and paradigms are migrating toward primarily-pure settings. Examples include the spread of the popular map-reduce frameworks, and the wild explosion of popularity in deep learning networks. In both cases, the purity spreads downward from a mathematical framework.

Note that it is not always the case that pure approaches are the most natural “first choice” for programmers, especially introductory programmers, for whom programs are often imagined as a “sequence of changes”; do this, then do this, then do this, then you’re done. In this model, the program is performing a series of mutations on a larger world. Helping introductory programmers move to a purer model is a challenge, but one with substantial payoff.

For this reason, this section focuses directly on pure functions, and invites students to conceive of programs using the models that they’ve been taught in elementary and secondary school, most particularly tables mapping inputs to outputs.

[*] The only reason I mention the category Set is to draw attention to the distinction between “codomain” and “range”; every function has a named codomain, regardless of whether its range covers it. For instance, the “times-two” function from Reals to Reals is a different function from the “times-two” function from integers to integers, and the “times-two” function from integers to reals is yet a third function.

knowing what’s out there

:: Programming, Teaching

By: John Clements

I’m teaching a class to first-year college students. I just had a quick catch-up session with some of the students that had no prior programming experience, and one of them asked a fantastic question: “How do you know what library functions are available?”

In a classroom setting, teachers can work to prevent this kind of question by ensuring that students have seen all of the functions that they will need, or at least that they’ve seen enough library functions to complete the assignment.

But what about when they’re trying to be creative, and do something that might or might not be possible?

Let’s take a concrete example: a student in a music programming question wants to reverse a sound. How can this be done?

ontologies OF programs

:: Programming Languages, Teaching

By: John Clements

Reading Daniel Dennett’s “From Bacteria to Bach and Back” this morning, I came across an interesting section where he extends the notion of ontology—a “system of things that can be known”—to programs. Specifically, he writes about what kinds of things a GPS program might know about: latitudes, longitudes, etc.

I was struck by the connection to the “data definition” part of the design recipe. Specifically, would it help beginning programmers to think about “the kinds of data that their program ‘knows about’”? This personification of programs can be seen as anti-analytical, but it might help students a lot.

Perhaps I’ll try it out this fall and see how it goes.

Okay, that’s all.

restrictive or: notes from the dark side

:: Programming Languages, Teaching

By: John Clements

Okay, it’s week four of data structures in Python. In the past few days, I’ve read a lot of terrible code. Here’s a beautiful, horrible, example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
# An IntList is one of
# - None, or
# - Pair(int, IntList)
class Pair:
    def __init__(self, first, rest):
        self.first = first
        self.rest = rest

    # standard definitions of __eq__ and __repr__ ...

# A Position is one of
# - an int, representing a list index, or
# - None

# IntList int -> Position
# find the position of the sought element in the list, return None if not found.
def search(l, sought):
    if l == None:
        return None
    rest_result = search(l.rest, sought)
    if (l.first == sought or rest_result) != None:
        if l.first == sought:
            return 0
        else:
            return 1 + rest_result

This code works correctly. It searches a list to find the position of a given element. Notice anything interesting about it?

Take a look at the parentheses in the if line. How do you feel about this code now?

(Spoilers after the jump. Figure out why it works before clicking, and how to avoid this problem.)

Not liking Python any better now

:: Programming Languages, Teaching

By: John Clements

It’s much closer to ‘go’ time now with Python, and I must say, getting to know Python better is not making me like it better. I know it’s widely used, but it really has many nasty bits, especially when I look toward using it for teaching. Here’s my old list:

  • Testing framework involves hideous boilerplate.
  • Testing framework has standard problems with floating-point numbers.
  • Scoping was clearly designed by someone who’d never taken (or failed to pay attention in) a programming languages course.
  • The vile ‘return’ appears everywhere.

But wait, now I have many more, and I’m a bit more shouty:

  • Oh dear lord, I’m going to have to force my students to implement their own equality method in order to get test-case-intensional checking. Awful. Discovering this was the moment when I switched from actually writing Python to writing Racket code that generates Python. Bleah.
  • Python’s timing mechanism involves a hideously unhygienic “pass me a string representing a program” mechanism. Totally dreadful. Worse than C macros.
  • Finally, I just finished reading Guido Van Rossum’s piece on tail-calling, and I find his arguments not just unconvincing, not just wrong, but sort of deliberately insulting. His best point is his first: TRE (or TCO or just proper tail-calling) can reduce the utility of stack traces. However, the solution of translating this code to loops destroys the stack traces too! You can argue that you lose stack frames in those instances in which you make tail calls that are not representable as loops, and in that case I guess I’d point you to our work with continuation marks. His next point can be paraphrased as “If we give them nice things, they might come to depend on them.” Well, yes. His third point suggests to me that he’s tired of losing arguments with Scheme programmers. Fourth, and maybe this is the most persuasive, he points out that Python is a poorly designed language and that it’s not easy for a compiler to reliably determine whether a call is in tail position. Actually, it looks like he’s wrong even here; I read it more carefully, and he’s getting hung up on some extremely simple scoping issues. I’m really not impressed by GvR as a language designer.

time spent on CPE430, Spring 2016

:: CPE 430, Teaching

By: John Clements

Earlier this year, I was talking to Kurt Mammen, who’s teaching 357, and he mentioned that he’d surveyed his students to see how much time they were putting into the course. I think that’s an excellent idea, so I did it too.

Specifically, I conducted a quick end-of-course survey in CPE 430, asking students to estimate the number of weekly hours they spent on the class, outside of lab and lecture.

Here are some pictures of the results. For students that specified a range, I simply took the mean of the endpoints of the range as their response.

Density of responses

Density of responses

Then, for those who will complain that a simple histogram is easier to read, a simple histogram of rounded-to-the-nearest-hour responses:

Histogram of responses

Histogram of responses

Finally, in an attempt to squish the results into something more accurately describable as a parameterizable normal curve, I plotted the density of the natural log of the responses. Here it is:

Density of logs of responses

Density of logs of responses

Sure enough, it looks much more normal, with no fat tail to the right. This may just be data hacking, of course. For what it’s worth, the mean of this curve is 2.13, with a standard deviation of 0.49.

(All graphs generated with Racket.)

things I already dislike about Python

:: Programming Languages, Teaching

By: John Clements

I’m just getting started, but already Python is looking like a terrible teaching language, relative to Racket.

  • Testing framework involves hideous boilerplate.
  • Testing framework has standard problems with floating-point numbers.
  • Scoping was clearly designed by someone who’d never taken (or failed to pay attention in) a programming languages course.
  • The vile ‘return’ appears everywhere.

Break it! Confrontational thinking in computer science

:: Programming Languages, Teaching

By: John Clements

So here I am grading another exam. This exam question asks students to imagine what would happen to an interpreter if environments were treated like stores. Then, it asks them to construct a program that would illustrate the difference.

They fail, completely.

(Okay, not completely.)

By and large, it’s pretty easy to characterize the basic failing: these students are unwilling to break the rules.

Is teaching programming like teaching math?

:: Programming, Programming Languages, Teaching

By: John Clements

One of my children is in third grade. As part of a “back-to-school” night this year, I sat in a very small chair while a teacher explained to me the “Math Practices” identified as part of the new Common Core standards for math teaching.

Perhaps the small chair simply made me more receptive, taking me back to third grade myself, but as she ran down the list, I found myself thinking: “gosh, these are exactly the same skills that I want to impart to beginning programmers!”

Here’s the list of Math Practices, a.k.a. “Standards for Mathematical Practice”:

  1. Make sense of problems and persevere in solving them.
  2. Reason abstractly and quantitatively.
  3. Construct viable arguments and critique the reasoning of others.
  4. Model with Mathematics.
  5. Use appropriate tools strategically.
  6. Attend to precision.
  7. Look for and make use of structure.
  8. Look for and express regularity in repeated reasoning.

Holy Moley! Those are incredibly relevant in teaching programming. Furthermore, they sound like they were written by someone intimately familiar with the How To Design Programs or Bootstrap curricula. Indeed, in the remainder of my analysis, I’ll be referring specifically to the steps 1–4 of the design recipe proposed by HtDP (as, e.g., “step 2 of DR”).

Let’s take those apart, one by one:

Too Elegant For September

:: Programming Languages, Teaching

By: John Clements

Being on sabbatical has given me a bit of experience with other systems and languages. Also, my kids are now old enough to “mess around” with programming. Learning from both of these, I’d like to hazard a bit of HtDP heresy: students should learn for i = 1 to 10 before they learn

1
2
3
(define (sum lon)
  (cond [(empty? lon) 0]
        [else (+ (first lon) (sum (rest lon)))]))

To many of you, this may seem obvious. I’m not writing to you. Or maybe you folks can just read along and nod sagely.

HtDP takes this small and very lovely thing—recursive traversals over inductively defined data—and shows how it covers a huge piece of real estate. Really, if students could just understand how to write this class of programs effectively, they would have a vastly easier time with much of the rest of their programming careers, to say nothing of the remainder of their undergraduate tenure. Throw a few twists in there—a bit of mutation for efficiency, some memoization, some dynamic programming—and you’re pretty much done with the programming part of your first four years.

The sad thing is that many, many students make it through an entire four-year curriculum without ever really figuring out how to write a simple recursive traversal of an inductively defined data structure. This makes professors sad.

Among the Very Simple applications of this nice idea is that of “indexes.” That is, the natural numbers can be regarded as an inductively defined set, where a natural number is either 0 or the successor of a natural number. This allows you to regard any kind of indexing loop as simply a special case of … a recursive traversal of an inductively defined data structure.

So here’s the problem: in September, you face a bunch of bright-eyed, enthusiastic, deeply forgiving first-year college students. And you give them the recursive traversal of the inductively defined data structure. A very small number of them get it, and they’re off to the races. The rest of them struggle, and struggle, and finally get their teammates to help them write the code, and really wish they’d taken some other class.

NB: the rest of this makes less sense… even to me. Not finished.

However, another big part of the problem is … well, monads are like burritos.

Let me take a step back.

The notion of repeated action is a visceral and easily-understood one. Here’s what I mean. “A human can multiply a pair of 32-bit integers in about a minute. A computer can multiply 32-bit integers at a rate of several billion per second, or about a hundred billion times as fast as a person.” That’s an easily-understood claim: we understand what it means to the same thing a whole bunch of times really fast.

So, when I write

for i=[1..100] multiply_two_numbers();

It’s pretty easy to understand that I’m doing something one hundred times.