Object Oriented Design Course : Exercise 1
Exercise 1 - File System Walker
Design, Code and Unit-Test Using Structural and Traversal Design Patterns
Deadline April 2nd, 2003 at Ross closing time
Description In this exercise you will:
  • Design the first version of a file system walker
  • Produce a set of UML class and sequence diagrams to document your design
  • Experience design patterns in actual C++ or Java code
  • Write a covering set of unit tests

The File System

In our file system there are five types of files you should support: Each file has several properties:
  • Text - name, path, MIME type, size, number of lines. Text files has .txt suffix
  • HTML - name, path, MIME type, size, number of lines, title. HTML files has .html or .htm suffix
  • GIF - name, path, MIME type, size, width, height. GIF files has .gif suffix
  • directory - name, path, number of files. Directories may or may not have a suffix
  • Other - name, path, size. "Other" files are all the files that does not comply with the above definitions.
The path is calculated from the given root node. Treat symbolic links as the files they link to. Here are some tips for working with the files:
  • The title of an HTML file is the text that appears between the <title> and </title> tags. Remember that HTML tags are case insensitive.
  • Java users - you might want taking a look in the class javax.swing.ImageIcon.
    C++ users - you might want to check /usr/include/gif_lib.h (link to /usr/lib/libgif.a or /usr/lib/libgif.so).
  • A list of MIME types can be found here.

Requirements

Create an executable named fswalker who traverses the file system whose root is the given directory, and creates an output file who represents the file system. If you are using Java, either add a shell script or compile using gcj. The command line should be:

fswalker [-dironly] [-nameonly] -d <path> -o <output file>

Where:

  • -d <path> - The path to the file system's root directory.
  • -o <output file> - The path to the output file.
  • -dironly - Optional argument. Means only the directories are written to the XML, but not the files
  • -nameonly - Optional argument. Means only the name of the files is written to the XML.
The -dironly, -nameonly parameters may come in every order.

For each file you have to produce an XML element, where it's properties are mapped to the element's attributes. It looks like:

<file mime-type="MIME Type" size="size" other properties...>
    The file's name
</file>

For Directory, the element will look like this:

<directory name="name" number-of-files="number of files">
    ... child file elements ...
</directory>

The XML structure is as follows:

<directory name="testdir" number-of-files="4">
    <file mime-type="text/plain" size="455" number-of-lines="34">
        aTextFile.txt
    </file>
    <directory name="innerDir" number-of-files="1">
        <file mime-type="text/html" size="6352" number-of-lines="50" title="much to do!!!">
            anHtmlFile.html
        </file>
    </directory>
    <file mime-type="image/gif" size="367" width="20" height="30">
        aGifFile.gif
    </file>
    <file size="774">
        someOtherFile.png
    </file>
</directory>

Notice that since XML uses the characters <, >, & and " at part of it's syntax, you must convert them to the equivalent XML entities &lt;, &gt;, &amp; and &quot;. To know more about XML entities read The relevant section in the XML Specification.

Design

While this exercise can be easily programmed within a single class, this won't work since this fswalker is only a first version, so it is crucial to maintain an open mind with respect to possible future requirements. Consider the following possibilities:
  1. It may be required to produce the output in formats other than text - HTML, PDF, Word or others. It may also be required to write output in several formats at once, for example: xmlp -txt -pdf -html -d ~/www.
  2. It may be required to support other file types - image files (jpg, png), ps, pdf, code source files (c, cpp, h, java), etc., each with it's own properties
  3. It may be required to modify the output document before pretty-printing it: for example, sort elements, rename tag names, change text to uppercase, and so on.
  4. It may be required to read different data formats of hierarchical data besides file systems - for example XML, object graphs, relational tables with foreign keys and others - and to be able to produce the same output for them.
  5. It may be required to assign each file type an icon - same as in current file system browsers such as Windows Explorer or konqueror. Think of where to keep this data, how to obtain the images, etc.
You must design your program so that it is easy to add code that implements the above requirements. Assume that you are the one who will actually have to code it - that's how it usually is in "real life". For each of the above requirements write an explanation in your README file, not more than three sentences long, which explains how it should be coded. For example:

Requirement: It may be required to define filters on which parts of the input document get printed. For example, new command-line arguments can dictate that only the simple (non-hierarchical) elements get printed, that only elements that start with a given string get printed, and so forth.

Solution: Write an Iterator for each kind of filter, whose next() method will move to the next element for which the filter is true. Such iterators are implemented as Decorators of other iterators, which easily enables to dynamically combine different filters and does not require changing or recompiling existing code.

It is important that each solution you present will be at most three sentences long. The intention is to enforce the use of design patterns vocabulary rather than elaborating specific class and object relationships.

Code & Unit Test

This exercise intends you to divide your time equally between actual coding and between design, writing UML diagrams, and answering the above six design questions. You may choose between C++ and Java as the implementation languages; in any language chosen, use the standard libraries to their full extent - the standard streams, strings and data structures. With a proper design, this exercise is quick and simple to code.

It is also required that you submit unit tests to test your work. Organize your unit tests into classes by subject, and write a method for each small test. Each test should be self-validating - that is, know by itself whether it has passed or failed. Writing unit tests should be an integral part of coding, and is essential when code must be changed in newer versions. You will have a chance to estimate the convenience of unit tests in exercise 3. Until then:

  • Read the following article about unit testing as part of coding and the build process.
  • If you write in Java, you must use the JUnit Unit Testing Framework for your unit tests. If you write in C++, you must use the CPPUnit Unit Testing Framework for your unit tests.
  • You must arrange your Makefile so that running the command 'make all' will compile, link (for C++) and run all unit tests automatically.

The code you submit must be built with no compiler warnings, and pass all unit tests.

One metric for measuring the usefulness of a set of unit tests is called coverage, which means the percentage of your code that the unit tests actually run. Coverage of 90% or above is considered good, and you should aim to that goal.

This is an advanced course, so there is no intention to take points for coding style or naming conventions - the emphasis is on proper design. However, you are as always expected to write clear code with a consistent style.

Submission Submit a zip file contains all your sources, a README and the Makefile. If you use Java, submit the shell script as well. The README and the Makefile should be at the root of the submitted zip file.
Resources