Assignment 8: Binary Search Trees

In lab this week we made a binary tree and wrote some recursive traversals. For this assignment, we're going to update that to a binary search tree and compare performance.

Since this is a console application, you will only need to commit the Program.cs file. The repo is here:

https://cssvn.utrgv.edu/svn_etomai/201810_3328/assn08_bst/<username>

Recursive Find

Create a new project and bring over your lab code. You need to be through the How Does Your Garden Grow? part of the lab so that you can recursively create and print your tree.

Now, you're going to add a FindByTitle method to your Node class that does a traversal (just like Print) looking for a book with the given title. If it finds it, return that Book, otherwise return null. Just like with Print, the recursion breaks down into:

  • If my Book has the right title, return it
  • Otherwise, find it in the Left subtree
  • Otherwise, find it in the Right subtree

You'll need to look up how string comparisons work in C#. Same idea as C++, but instead of overloading < and >, it uses explicit comparison methods.

If you completed the Forest for the Trees section of the lab to set up the yield return enumerator, then you can use that method or a LINQ query to implement the finding.

Demonstrate in main that your FindByTitle method works for titles that are and are not in the set of Books in your tree.

Binary Search Tree

Right now our tree doesn't really offer any benefits. However, by converting it to a binary search tree, we can make it much more efficient for adding and finding books. A binary search tree is an ordered data structure. Adding new books to a binary search tree is just like adding them to an ordered list - you can't just put it in the first available spot, you have to put it in the right place.

The rule for inserting into a binary search tree is that given a node, everything to the left must come before, and everthing to the right must come after. This is a valid binary search tree:

Pick any node, and you'll see that all the values in the left subtree are smaller, and all the values in the right subtree are bigger.

To build a binary search tree instead of an unordered tree, all we have to do is change our Add method. Recall that our rules before were:

  • If you have an empty left or right, create a new node there for this Book.
  • Otherwise, ask your left tree to add the book.

The new rules are the rules for a binary search tree, ordered by title:

  • If the new Book title comes before mine, go left
    • If left is null, create a new node for it there
    • Otherwise, ask left to add it
  • Otherwise, go right
    • If right is null, create a new node for it there
    • Otherwise, ask right to add it

OrderedNode

To implement this, create a new class OrderedNode that inherits from Node. Make Add virtual in Node, and override it in OrderedNode to implement the binary search tree add rules.

Look at that picture of the binary search tree above again, and consider that if you print it in-order (left then me then right), then numbers would come out:

1 3 4 6 7 8 10 13 14

Test your new OrderedNode in main by adding Books in random order, and then call Print to display them. They should be printed in alphabetical order by title (since your Print does an in-order traversal).

Speedy

Finally, since the data in a binary search tree is ordered, it should be much faster to find things. The binary search tree implments binary search. Looking at the example again, if I'm looking for the number 13, I start at the root which is 8. Since 13 > 8, I can ignore the whole left side and move right to the 10 node. 13 > 10, so I move right again to the 14 node. 13 < 14, so now I move left and find what I'm looking for.

Override FindByTitle in OrderedNode (remember to make it virtual in Node) so that it uses the binary search strategy:

  • If my Book has the right title, return it
  • Otherwise, if the title is before mine, find it in the left subtree
  • Otherwise, find it in the right subtree

You can't use the yield return traversal here.

Demonstrate in main that your new, ordered FindByTitle method still works for titles that are and are not in the set of Books in your tree.

Prove It

Add a static method to Program that generates a random string (feel free to google).

In main, using random strings and numbers, build an unordered tree with Node and an ordered tree with OrderedNode, each with the same 10000 Books. Keep track of 500 of the random titles in an array.

Search for the 500 saved titles plus 500 more random ones (testing books that aren't in the tree) in the unordered and ordered trees, and time how long each tree takes to try and find all 1000. Try scaling up the unordered tree to 100000 and a 1000000 books to see how that impacts the search. (The unordered tree will have a stack overflow error at those numbers, so don't worry about it). Report the timing results in a comment in your code. Here's how you time code in C#:

Stopwatch timer = Stopwatch.StartNew();

// stuff you want to time

timer.Stop();  
TimeSpan timespan = timer.Elapsed;
Console.WriteLine(timespan.TotalMilliseconds);