Project 2, CSC 202
Invitation Link for this assignment.
For this project you will build an application that uses your list implementations from lab. To support this application, you will add some additional functionality to your list implementations. Once the implementation has been completed, you will perform a set of experiments to measure the run-time behavior of your list implementations. In addition to submitting the source code for your solution, you will also submit a short paper that outlines your experiments (including discussion of implementation variations) and displays the results of the experiments in various graphs.
The behavior of the client application is explained first and then a small set of required list operations is discussed. You are free to add additional list operations, but only if each is added to both implementations (i.e., to linked list and array list); your application code must not rely directly on the internal implementation of a given list data structure.
Note: For this assignment and every assignment in the class, every function must come with a signature and a purpose statement. You do not need to provide test cases for functions that produce output or consume input, but you must test (directly or indirectly) all functions that do not involve I/O (Input/Output), and you should make an attempt to design (or refactor) your code so that the functions that involve I/O are as small as possible.
1 Music Catalog
For this project you will implement a music catalog program with functionality similar to various media players (e.g., iTunes). Of course, this program will provide much less functionality than an actual media player (i.e., playing music is not supported, there is no graphical interface, playlists are not supported, etc.), but it will serve as a framework for some consideration of the design and implementation of such a system, it will demonstrate some uses for lists, and it will require that you consider operations beyond the initial set provided by your lists.
1.1 Input
Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits |
Grazed Knees--Snow Patrol--Final Straw |
The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming |
Blah |
Stockholm Syndrome--Muse--Absolution |
Dust in the Wind--Kansas--Kansas |
North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip |
|
Nothingman--Pearl Jam--Vitalogy |
Not a Song-- |
Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1) |
The Lowering--The Avett Brothers--Four Thieves Gone |
Catalog input errors: |
line 4: malformed song information |
line 10: malformed song information |
1.2 Menu
Tip: sys.stdin.readline() is a useful technique for reading from the "user" (standard input) and it works in both Python 2.7 and Python 3.
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: |
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: burrito |
|
Invalid option |
|
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: |
1.3 Print catalog
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: 1 |
0--Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits |
1--Grazed Knees--Snow Patrol--Final Straw |
2--The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming |
3--Stockholm Syndrome--Muse--Absolution |
4--Dust in the Wind--Kansas--Kansas |
5--North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip |
6--Nothingman--Pearl Jam--Vitalogy |
7--Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1) |
8--The Lowering--The Avett Brothers--Four Thieves Gone |
1.4 Song Information
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: 2 |
Enter song number: 3 |
|
Song information ... |
Number: 3 |
Title: Stockholm Syndrome |
Artist: Muse |
Album: Absolution |
|
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: 2 |
Enter song number: 8 |
|
Song information ... |
Number: 8 |
Title: The Lowering |
Artist: The Avett Brothers |
Album: Four Thieves Gone |
|
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: 2 |
Enter song number: 99 |
|
... Invalid song number |
|
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: |
1.5 Sort
Selecting the "Sort" option should prompt the user for the primary sort key and then, if a valid option is given, sort the list using that key (see the discussion of sorting under the required list operations below). The results of the sort will be visible only when the catalog is printed.
Note: Sorting the catalog list itself will complicate the implementation of the "Song Information" menu option (at the very least it will be detrimental to performance). Consider using a separate list for sorting/displaying the catalog.
Tip: Do not write a separate sort for each key. Instead, generalize your sort to take a comparison function as an argument (some, like those working in Java, refer to this as "comparator"). You can then pass the appropriate function to the sort based on the user’s choice and use this function to compare two songs within the sort implementation.
1.5.1 Sort Key Comparison Precedence
Of course, many songs have the same title (crazy, right?) and, obviously, some are by the same artist and on the same album (are there multiple songs with the same title by the same artist on the same album?), so the order of some songs with an equal sort key may vary depending on the sort algorithm used. To reduce this variance, and to make your sort slightly more interesting, we will define a key comparison precedence of artist -> album -> title -> number.
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: 3 |
|
Sort songs |
0) By Number |
1) By Title |
2) By Artist |
3) By Album |
Sort by: 2 |
|
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: 1 |
2--The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming |
4--Dust in the Wind--Kansas--Kansas |
5--North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip |
3--Stockholm Syndrome--Muse--Absolution |
7--Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1) |
6--Nothingman--Pearl Jam--Vitalogy |
0--Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits |
1--Grazed Knees--Snow Patrol--Final Straw |
8--The Lowering--The Avett Brothers--Four Thieves Gone |
|
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: 3 |
|
Sort songs |
0) By Number |
1) By Title |
2) By Artist |
3) By Album |
Sort by: title |
|
... Invalid sort option |
|
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: |
1.6 Add Songs
While a static catalog can be useful, the ability to extend the catalog will make your application more similar to an actual media player. A media player application might support adding songs (or general media) by downloading, by ripping physical media, or by adding existing files from a directory. For this program, you will implement a simpler approach that still mimics the behavior required in terms of your catalog (and display) lists.
Selecting the "Add Songs" option should result in an additional prompt for the name of a song listing file. This file is expected to be formatted the same as the initial catalog file. Your program must read the contents of this file (reporting errors as before) and add each valid song to the catalog (the song numbers for these new song must be contiguous, must be in order of entry in the file, and must continue from the last used number in the existing catalog). The addition of new songs must preserve the sort order most recently selected by the user (i.e., if the user sorted by title before adding new songs, then printing after adding will display the songs in order by title).
If the provided file cannot be opened, print an appropriate error message and return to the main menu.
Note: there are multiple approaches to implementing this feature. You are encouraged to explore different approaches with consideration for run-time complexity. One approach that is explicitly discouraged is reading through the file multiple times (to, for instance, determine the number of entries).
Oceanside--The Decemberists--5 Songs |
Shiny--The Decemberists--5 Songs |
My Mother Was A Chinese Trapese Artist--The Decemberists--5 Songs |
Angel, Won't You Call Me?--The Decemberists--5 Songs |
I Don't Mind--The Decemberists--5 Songs |
Apology Song--The Decemberists--5 Songs |
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: 3 |
|
Sort songs |
0) By Number |
1) By Title |
2) By Artist |
3) By Album |
Sort by: 1 |
|
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: 1 |
0--Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits |
4--Dust in the Wind--Kansas--Kansas |
1--Grazed Knees--Snow Patrol--Final Straw |
5--North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip |
6--Nothingman--Pearl Jam--Vitalogy |
3--Stockholm Syndrome--Muse--Absolution |
2--The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming |
8--The Lowering--The Avett Brothers--Four Thieves Gone |
7--Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1) |
|
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: 4 |
Enter name of file to load: new_songs |
|
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: 1 |
12--Angel, Won't You Call Me?--The Decemberists--5 Songs |
14--Apology Song--The Decemberists--5 Songs |
0--Disarm--Smashing Pumpkins--Rotten Apples: Greatest Hits |
4--Dust in the Wind--Kansas--Kansas |
1--Grazed Knees--Snow Patrol--Final Straw |
13--I Don't Mind--The Decemberists--5 Songs |
11--My Mother Was A Chinese Trapese Artist--The Decemberists--5 Songs |
5--North Dakotachrome--Lawsuit--Emergency Third Rail Power Trip |
6--Nothingman--Pearl Jam--Vitalogy |
9--Oceanside--The Decemberists--5 Songs |
10--Shiny--The Decemberists--5 Songs |
3--Stockholm Syndrome--Muse--Absolution |
2--The Best Of What's Around--Dave Matthews Band--Under The Table And Dreaming |
8--The Lowering--The Avett Brothers--Four Thieves Gone |
7--Yellow Ledbetter--Pearl Jam--Lost Dogs (Disc 1) |
|
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: 4 |
Enter name of file to load: fe |
|
'fe': No such file or directory |
|
Song Catalog |
1) Print Catalog |
2) Song Information |
3) Sort |
4) Add Songs |
0) Quit |
Enter selection: |
2 Required List Operations
The following operations must be added as functions to your existing list implementations. You must test these operations separately from your music catalog application (i.e., you must write unit tests for these operations). These tests must be placed in the files named linked_list_tests.py and array_list_tests.py.
2.1 foreach
You will discover, or may have already discovered, that accessing and "processing" (e.g., printing) every element in a list is a relatively common operation, but doing so via the get function is particularly inefficient for long linked lists. To alleviate this inefficiency, you will add a foreach function to your list implementations.
This function takes a List and a function as arguments and applies the provided function to the value at each position in the List (from left-to-right). This function does not return a meaningful value (i.e., in Python it returns None).
2.2 sort
This function takes a List and a "less-than" function as arguments and sorts the List such that the elements are in ascending order as determined by the "less-than" function.
This function must return the sorted list. Mutation of the original list is as specified by instructor.
Requirements: The sort for your array list implementation must be either insertion sort or selection sort (you cannot use Python’s builtin sort functions). The sort for your linked list implementation must be an insertion sort into a new list (at least internally; again, mutation is as specified by instructor).
3 Empirical Study
This part of the project requires that you measure the performance of various implementations of list operations. Such measurements could be performed by timing their use in the catalog application, but doing so can skew the results (i.e., measuring the time to build the catalog list will also include the time to read from the catalog file). Instead, you will use the timeit library (an example is given below).
List Lengths: the execution time of each of your experiments will depend on the length of the corresponding lists. You must sample a range of list lengths appropriate to the experiment. For example, you might measure the performance of foreach on lists from length 10 to length 10000 in increments of 100.
Note that in order to allow a recursion depth greater than 1000, you will need to add code (just once) that instructs python to enable this. For instance:
import sys |
sys.setrecursionlimit(10100) |
import timeit |
import array_list |
import random |
|
|
# print out the timing results |
def print_timing(desc, iterations, seconds): |
print('{}\n\t{} iterations in {} seconds'.format(desc, iterations, seconds)) |
|
|
# build a list for this test ... the type of list is passed as module |
def build_list(n, module, max=10000): |
list = module.empty_list() |
for pos in range(n): |
list = module.add(list, pos, random.randrange(max)) |
|
return list |
|
|
# the function passed to foreach must take an argument, but we |
# don't really want to do anything with it for this experiment |
def noop(value): |
pass |
|
|
# timeit expects that the function passed will take no arguments, so |
# this function gathers the arguments and returns a new function that |
# uses them, but that itself does not take any arguments |
def build_operation(list): |
def run_foreach(): |
array_list.foreach(list, noop) |
|
return run_foreach |
|
|
def run_one_experiment(num_elements, num_iterations, module): |
list = build_list(num_elements, module) |
to_run = build_operation(list) |
seconds = timeit.timeit(to_run, number=num_iterations) |
|
print_timing('{}.foreach identity: {} elements'.format(module.__name__, |
num_elements), num_iterations, seconds) |
|
|
# let's try just one experiment for now |
run_one_experiment(1000, 10000, array_list) |
4 Measurements
4.1 Linked List
Measure the use of foreach vs. get to access each element in the linked list. Nothing should be done with the elements (it’s the access time that you want to measure). Vary the list sizes as discussed previously.
4.2 Array List
Measure the time to build an array list under three different growth strategies. One strategy must be to increase the "array" size by one on each add. The other two must amortize the cost of adds by allocating, as needed, more space than necessary to enable the add. Do not trim the "array" to the needed size after building the list and do not pre-allocate the known needed amount.
More specifically, the list should be built by adding new elements at the end of the list. Vary the size of the list to be built.
Your report should include a graph of both the execution time and the total allocated space for the "array" after the list has been built. Your report must explicitly discuss the different growth strategies used.
4.2.1 Array List vs. Linked List
Compare "Load" times — compare the time to build lists of each type (plot the time vs. the length for each list type). Use the "best" approach to building an array list based on the earlier experiment. You can build each list in whichever order you feel appropriate (i.e., adding at the front or adding at the end, since it turns out that reversing a list, if needed, is relatively quick (do not include such a reverse in your experiment)).
Compare "Sort" times.
Note that sorting an already sorted list can affect the performance of some sorting algorithms. As such, you will not want to run multiple iterations of sort on the same list. You can either rebuild the list for each iteration (which will, itself, skew the timing because the building time is now measured) or you can rerun the entire experiment multiple times (with the same random seed) and then average the results.
Compare "Song Information" times — compare the time required to access various indices (i.e., random access via get) for each type of list. Be sure to use the same access pattern for each list; for instance, if you access 50 elements randomly, then use the same seed for the random number generator for each type of list in your experiments.
Compare "Add Songs" times — for a reasonably long initial list (e.g., 50,000 elements), add (in sorted order as in the music catalog application) a new list of songs (varying the lengths over the set of experiments).
Measure for a "realistic" list of new songs (i.e., where the new songs are to be added at arbitrary positions within the list)
Measure with a more "contrived" example where the new songs will be added at the beginning of the list
5 Handin
You must submit the following. All but the last (study.pdf) are due on the given deadline. The study portion is due on the following Wednesday, by adding a file to your repository.
linked_list.py — Linked List implementation
linked_list_tests.py — Linked List test cases
array_list.py — Array List implementation
array_list_tests.py — Array List test cases
music_catalog.py — entry point for Music Catalog application
study.pdf — empirical study report
6 Split Deadline
This assignment has a split deadline. The code for the music application, including all of its libraries, is due on the stated deadline. However, the study (study.pdf) is due on the following Wednesday. Submit it by pushing your repo with the extra file.