Project 2, CSC 202
1 Music Catalog
1.1 Input
1.2 Menu
1.3 Print catalog
1.4 Song Information
1.5 Sort
1.5.1 Sort Key Comparison Precedence
1.6 Add Songs
2 Required List Operations
2.1 foreach
2.2 sort
3 Empirical Study
4 Measurements
4.1 Linked List
4.2 Array List
4.2.1 Array List vs. Linked List
5 Handin
6 Split Deadline
6.9.0.6

Project 2, CSC 202

Invitation Link for this assignment.

For this project you will build an application that uses your list implementations from lab. To support this application, you will add some additional functionality to your list implementations. Once the implementation has been completed, you will perform a set of experiments to measure the run-time behavior of your list implementations. In addition to submitting the source code for your solution, you will also submit a short paper that outlines your experiments (including discussion of implementation variations) and displays the results of the experiments in various graphs.

The behavior of the client application is explained first and then a small set of required list operations is discussed. You are free to add additional list operations, but only if each is added to both implementations (i.e., to linked list and array list); your application code must not rely directly on the internal implementation of a given list data structure.

Note: For this assignment and every assignment in the class, every function must come with a signature and a purpose statement. You do not need to provide test cases for functions that produce output or consume input, but you must test (directly or indirectly) all functions that do not involve I/O (Input/Output), and you should make an attempt to design (or refactor) your code so that the functions that involve I/O are as small as possible.

1 Music Catalog

For this project you will implement a music catalog program with functionality similar to various media players (e.g., iTunes). Of course, this program will provide much less functionality than an actual media player (i.e., playing music is not supported, there is no graphical interface, playlists are not supported, etc.), but it will serve as a framework for some consideration of the design and implementation of such a system, it will demonstrate some uses for lists, and it will require that you consider operations beyond the initial set provided by your lists.

1.1 Input

Your program must take a single command-line argument. This argument is expected to specify the name of a file from which your program will load the contents of the catalog. Each line of this file is meant to contain information for a single song, specifically, the song’s title, artist, and album (examples given below).

Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits

Grazed Knees--Snow Patrol--Final Straw

The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming

Blah

Stockholm Syndrome--Muse--Absolution

Dust in the Wind--Kansas--Kansas

North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip

 

Nothingman--Pearl Jam--Vitalogy

Not a Song--

Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1)

The Lowering--The Avett Brothers--Four Thieves Gone

Though one might expect that a catalog file would always be properly formatted (since it is likely to be generated by the program itself), it is better practice to verify such. Your program should skip blank lines (ignore them entirely) and report any other issues with the file by printing to the screen (but still load all properly formatted entries). For instance, loading this file might result in the following output.

Catalog input errors:

line 4: malformed song information

line 10: malformed song information

1.2 Menu

Tip: sys.stdin.readline() is a useful technique for reading from the "user" (standard input) and it works in both Python 2.7 and Python 3.

After loading the catalog, the user must be presented with the following menu of options. Each of the first four options is discussed in the following parts. If the user chooses to quit (or enters end-of-file), then your program should terminate. After each option (aside from quitting) is processed, the user is again presented with the menu.

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection:

Invalid Input: If the user enters anything other than one of the supported options, your program should note the error and prompt again as demonstrated below.

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection: burrito

 

Invalid option

 

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection:

1.3 Print catalog

Selecting the "Print Catalog" option should result in the song information for each item in the catalog being printed to the screen in a style similar to that of the input file, but prefixed by the song number and in an order as determined by the most recent sort (see below). Each song is assigned a number as it is added to the list; on load, these numbers correspond to position within the input file (but accounting for blank lines and poorly formatted entries). For the sample file above, one would see the following.

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection: 1

0--Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits

1--Grazed Knees--Snow Patrol--Final Straw

2--The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming

3--Stockholm Syndrome--Muse--Absolution

4--Dust in the Wind--Kansas--Kansas

5--North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip

6--Nothingman--Pearl Jam--Vitalogy

7--Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1)

8--The Lowering--The Avett Brothers--Four Thieves Gone

1.4 Song Information

Selecting the "Song Information" option should result in an additional prompt for the song number. After the song number is given, the program should print the attributes of the selected song as below. An invalid song number should generate an appropriate error message.

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection: 2

Enter song number: 3

 

Song information ...

   Number: 3

   Title: Stockholm Syndrome

   Artist: Muse

   Album: Absolution

 

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection: 2

Enter song number: 8

 

Song information ...

   Number: 8

   Title: The Lowering

   Artist: The Avett Brothers

   Album: Four Thieves Gone

 

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection: 2

Enter song number: 99

 

... Invalid song number

 

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection:

1.5 Sort

Selecting the "Sort" option should prompt the user for the primary sort key and then, if a valid option is given, sort the list using that key (see the discussion of sorting under the required list operations below). The results of the sort will be visible only when the catalog is printed.

Note: Sorting the catalog list itself will complicate the implementation of the "Song Information" menu option (at the very least it will be detrimental to performance). Consider using a separate list for sorting/displaying the catalog.

Tip: Do not write a separate sort for each key. Instead, generalize your sort to take a comparison function as an argument (some, like those working in Java, refer to this as "comparator"). You can then pass the appropriate function to the sort based on the user’s choice and use this function to compare two songs within the sort implementation.

1.5.1 Sort Key Comparison Precedence

Of course, many songs have the same title (crazy, right?) and, obviously, some are by the same artist and on the same album (are there multiple songs with the same title by the same artist on the same album?), so the order of some songs with an equal sort key may vary depending on the sort algorithm used. To reduce this variance, and to make your sort slightly more interesting, we will define a key comparison precedence of artist -> album -> title -> number.

What this means is that, for two songs being compared, if the artists match, then the songs are ordered by the albums. If the albums match, then the songs are ordered by the song titles. The song number is used to break the extraordinarily rare tie by using a known unique value. When the user specifies the primary key to use, that key is moved to the front of the key comparison precedence (i.e., sorting by album will give a comparison precedence of album -> artist -> title -> number).

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection: 3

 

Sort songs

   0) By Number

   1) By Title

   2) By Artist

   3) By Album

Sort by: 2

 

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection: 1

2--The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming

4--Dust in the Wind--Kansas--Kansas

5--North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip

3--Stockholm Syndrome--Muse--Absolution

7--Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1)

6--Nothingman--Pearl Jam--Vitalogy

0--Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits

1--Grazed Knees--Snow Patrol--Final Straw

8--The Lowering--The Avett Brothers--Four Thieves Gone

 

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection: 3

 

Sort songs

   0) By Number

   1) By Title

   2) By Artist

   3) By Album

Sort by: title

 

... Invalid sort option

 

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection:

1.6 Add Songs

While a static catalog can be useful, the ability to extend the catalog will make your application more similar to an actual media player. A media player application might support adding songs (or general media) by downloading, by ripping physical media, or by adding existing files from a directory. For this program, you will implement a simpler approach that still mimics the behavior required in terms of your catalog (and display) lists.

Selecting the "Add Songs" option should result in an additional prompt for the name of a song listing file. This file is expected to be formatted the same as the initial catalog file. Your program must read the contents of this file (reporting errors as before) and add each valid song to the catalog (the song numbers for these new song must be contiguous, must be in order of entry in the file, and must continue from the last used number in the existing catalog). The addition of new songs must preserve the sort order most recently selected by the user (i.e., if the user sorted by title before adding new songs, then printing after adding will display the songs in order by title).

If the provided file cannot be opened, print an appropriate error message and return to the main menu.

Note: there are multiple approaches to implementing this feature. You are encouraged to explore different approaches with consideration for run-time complexity. One approach that is explicitly discouraged is reading through the file multiple times (to, for instance, determine the number of entries).

The example that follows adds the following songs. Notice that the song numbers are based on the order of the entries in the file, but that printing the catalog continues to display the songs in order by song title (the most recent sort).

Oceanside--The Decemberists--5 Songs

Shiny--The Decemberists--5 Songs

My Mother Was A Chinese Trapese Artist--The Decemberists--5 Songs

Angel, Won't You Call Me?--The Decemberists--5 Songs

I Don't Mind--The Decemberists--5 Songs

Apology Song--The Decemberists--5 Songs

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection: 3

 

Sort songs

   0) By Number

   1) By Title

   2) By Artist

   3) By Album

Sort by: 1

 

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection: 1

0--Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits

4--Dust in the Wind--Kansas--Kansas

1--Grazed Knees--Snow Patrol--Final Straw

5--North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip

6--Nothingman--Pearl Jam--Vitalogy

3--Stockholm Syndrome--Muse--Absolution

2--The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming

8--The Lowering--The Avett Brothers--Four Thieves Gone

7--Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1)

 

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection: 4

Enter name of file to load: new_songs

 

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection: 1

12--Angel, Won't You Call Me?--The Decemberists--5 Songs

14--Apology Song--The Decemberists--5 Songs

0--Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits

4--Dust in the Wind--Kansas--Kansas

1--Grazed Knees--Snow Patrol--Final Straw

13--I Don't Mind--The Decemberists--5 Songs

11--My Mother Was A Chinese Trapese Artist--The Decemberists--5 Songs

5--North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip

6--Nothingman--Pearl Jam--Vitalogy

9--Oceanside--The Decemberists--5 Songs

10--Shiny--The Decemberists--5 Songs

3--Stockholm Syndrome--Muse--Absolution

2--The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming

8--The Lowering--The Avett Brothers--Four Thieves Gone

7--Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1)

 

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection: 4

Enter name of file to load: fe

 

'fe': No such file or directory

 

Song Catalog

   1) Print Catalog

   2) Song Information

   3) Sort

   4) Add Songs

   0) Quit

Enter selection:

2 Required List Operations

The following operations must be added as functions to your existing list implementations. You must test these operations separately from your music catalog application (i.e., you must write unit tests for these operations). These tests must be placed in the files named linked_list_tests.py and array_list_tests.py.

2.1 foreach

You will discover, or may have already discovered, that accessing and "processing" (e.g., printing) every element in a list is a relatively common operation, but doing so via the get function is particularly inefficient for long linked lists. To alleviate this inefficiency, you will add a foreach function to your list implementations.

This function takes a List and a function as arguments and applies the provided function to the value at each position in the List (from left-to-right). This function does not return a meaningful value (i.e., in Python it returns None).

2.2 sort

This function takes a List and a "less-than" function as arguments and sorts the List such that the elements are in ascending order as determined by the "less-than" function.

This function must return the sorted list. Mutation of the original list is as specified by instructor.

Requirements: The sort for your array list implementation must be either insertion sort or selection sort (you cannot use Python’s builtin sort functions). The sort for your linked list implementation must be an insertion sort into a new list (at least internally; again, mutation is as specified by instructor).

3 Empirical Study

This part of the project requires that you measure the performance of various implementations of list operations. Such measurements could be performed by timing their use in the catalog application, but doing so can skew the results (i.e., measuring the time to build the catalog list will also include the time to read from the catalog file). Instead, you will use the timeit library (an example is given below).

List Lengths: the execution time of each of your experiments will depend on the length of the corresponding lists. You must sample a range of list lengths appropriate to the experiment. For example, you might measure the performance of foreach on lists from length 10 to length 10000 in increments of 100.

Note that in order to allow a recursion depth greater than 1000, you will need to add code (just once) that instructs python to enable this. For instance:

import sys

sys.setrecursionlimit(10100)

The following shows an example use of timeit to measure the time for 10000 executions of the foreach function over a list of 1000 elements.

import timeit

import array_list

import random

 

 

# print out the timing results

def print_timing(desc, iterations, seconds):

   print('{}\n\t{} iterations in {} seconds'.format(desc, iterations, seconds))

 

 

# build a list for this test ... the type of list is passed as module

def build_list(n, module, max=10000):

   list = module.empty_list()

   for pos in range(n):

      list = module.add(list, pos, random.randrange(max))

 

   return list

 

 

# the function passed to foreach must take an argument, but we

# don't really want to do anything with it for this experiment

def noop(value):

   pass

 

 

# timeit expects that the function passed will take no arguments, so

# this function gathers the arguments and returns a new function that

# uses them, but that itself does not take any arguments

def build_operation(list):

   def run_foreach():

      array_list.foreach(list, noop)

 

   return run_foreach

 

 

def run_one_experiment(num_elements, num_iterations, module):

   list = build_list(num_elements, module)

   to_run = build_operation(list)

   seconds = timeit.timeit(to_run, number=num_iterations)

 

   print_timing('{}.foreach identity: {} elements'.format(module.__name__,

      num_elements), num_iterations, seconds)

 

 

# let's try just one experiment for now

run_one_experiment(1000, 10000, array_list)

4 Measurements

4.1 Linked List

4.2 Array List

4.2.1 Array List vs. Linked List

Consider each of these experiments as framed in the context of the music catalog application, but without a requirement that your lists contain songs (i.e., the content type of your list is somewhat irrelevant).
  • Compare "Load" times — compare the time to build lists of each type (plot the time vs. the length for each list type). Use the "best" approach to building an array list based on the earlier experiment. You can build each list in whichever order you feel appropriate (i.e., adding at the front or adding at the end, since it turns out that reversing a list, if needed, is relatively quick (do not include such a reverse in your experiment)).

  • Compare "Sort" times.

    Note that sorting an already sorted list can affect the performance of some sorting algorithms. As such, you will not want to run multiple iterations of sort on the same list. You can either rebuild the list for each iteration (which will, itself, skew the timing because the building time is now measured) or you can rerun the entire experiment multiple times (with the same random seed) and then average the results.

  • Compare "Song Information" times — compare the time required to access various indices (i.e., random access via get) for each type of list. Be sure to use the same access pattern for each list; for instance, if you access 50 elements randomly, then use the same seed for the random number generator for each type of list in your experiments.

  • Compare "Add Songs" times — for a reasonably long initial list (e.g., 50,000 elements), add (in sorted order as in the music catalog application) a new list of songs (varying the lengths over the set of experiments).

    • Measure for a "realistic" list of new songs (i.e., where the new songs are to be added at arbitrary positions within the list)

    • Measure with a more "contrived" example where the new songs will be added at the beginning of the list

5 Handin

You must submit the following. All but the last (study.pdf) are due on the given deadline. The study portion is due on the following Wednesday, by adding a file to your repository.

6 Split Deadline

This assignment has a split deadline. The code for the music application, including all of its libraries, is due on the stated deadline. However, the study (study.pdf) is due on the following Wednesday. Submit it by pushing your repo with the extra file.