Description |
In this exercise you will:
- Design the first version of a file system walker
- Produce a set of UML class and sequence diagrams to document your design
- Experience design patterns in actual C++ or Java code
- Write a covering set of unit tests
The File System
In our file system there are five types of files you should support: Each file
has several properties:
- Text - name, path, MIME type, size, number of lines. Text files has
.txt suffix
- HTML - name, path, MIME type, size, number of lines, title. HTML files has
.html or .htm suffix
- GIF - name, path, MIME type, size, width, height. GIF files has
.gif suffix
- directory - name, path, number of files. Directories may or may not have a suffix
- Other - name, path, size. "Other" files are all the files that does not comply with the above definitions.
The path is calculated from the given root node. Treat symbolic links as the files they
link to. Here are some tips for working with the files:
- The title of an HTML file is the text that appears between the
<title> and
</title> tags. Remember that HTML tags are case insensitive.
- Java users - you might want taking a look in the class
javax.swing.ImageIcon .
C++ users - you might want to check /usr/include/gif_lib.h (link to /usr/lib/libgif.a or
/usr/lib/libgif.so ).
- A list of MIME types can be found here.
Requirements
Create an executable named fswalker who traverses the file system whose root
is the given directory, and creates an output file who represents the
file system. If you are using Java, either add a shell script or compile using gcj.
The command line should be:
fswalker [-dironly] [-nameonly] -d <path> -o <output file>
Where:
-d <path> - The path to the file system's root directory.
-o <output file> - The path to the output file.
-dironly - Optional argument. Means only the directories are written to the XML, but not the files
-nameonly - Optional argument. Means only the name of the files is written to the XML.
The -dironly , -nameonly parameters may come in every order.
For each file you have to produce an XML element, where it's properties are mapped to the element's attributes. It looks like:
<file mime-type="MIME Type" size="size" other properties...>
The file's name
</file>
For Directory, the element will look like this:
<directory name="name" number-of-files="number of files">
... child file elements ...
</directory>
The XML structure is as follows:
<directory name="testdir" number-of-files="4">
<file mime-type="text/plain" size="455" number-of-lines="34">
aTextFile.txt
</file>
<directory name="innerDir" number-of-files="1">
<file mime-type="text/html" size="6352" number-of-lines="50" title="much to do!!!">
anHtmlFile.html
</file>
</directory>
<file mime-type="image/gif" size="367" width="20" height="30">
aGifFile.gif
</file>
<file size="774">
someOtherFile.png
</file>
</directory>
Notice that since XML uses the characters <, >, & and " at part of it's syntax,
you must convert them to the equivalent XML entities <, >, & and ".
To know more about XML entities read The relevant section in
the XML Specification.
Design
While this exercise can be easily programmed within a single class, this won't work since this
fswalker is only a first version, so it is crucial to maintain an open mind with
respect to possible future requirements. Consider the following possibilities:
- It may be required to produce the output in formats other than text - HTML, PDF, Word or
others. It may also be required to write output in several formats at once, for example:
xmlp -txt -pdf -html -d ~/www .
- It may be required to support other file types - image files (jpg, png), ps, pdf, code
source files (c, cpp, h, java), etc., each with it's own properties
- It may be required to modify the output document before pretty-printing it: for example, sort
elements, rename tag names, change text to uppercase, and so on.
- It may be required to read different data formats of hierarchical data besides file systems -
for example XML, object graphs, relational tables with foreign keys and others - and to be
able to produce the same output for them.
- It may be required to assign each file type an icon - same as in current file system browsers
such as Windows Explorer or konqueror. Think of where to keep this data, how to obtain the
images, etc.
You must design your program so that it is easy to add code that implements the above requirements.
Assume that you are the one who will actually have to code it - that's how it usually is in "real life".
For each of the above requirements write an explanation in your README file, not more than three
sentences long, which explains how it should be coded. For example:
Requirement: It may be required to define filters on which parts of the input document get
printed. For example, new command-line arguments can dictate that only the simple (non-hierarchical)
elements get printed, that only elements that start with a given string get printed, and so forth.
Solution: Write an Iterator for each kind of filter, whose next() method will move to the
next element for which the filter is true. Such iterators are implemented as Decorators of other
iterators, which easily enables to dynamically combine different filters and does not require
changing or recompiling existing code.
It is important that each solution you present will be at most three sentences long. The intention is
to enforce the use of design patterns vocabulary rather than elaborating specific class and object
relationships.
Code & Unit Test
This exercise intends you to divide your time equally between actual coding and between design,
writing UML diagrams, and answering the above six design questions. You may choose between C++
and Java as the implementation languages; in any language chosen, use the standard libraries to
their full extent - the standard streams, strings and data structures. With a proper design, this
exercise is quick and simple to code.
It is also required that you submit unit tests to test your work. Organize your unit tests into
classes by subject, and write a method for each small test. Each test should be self-validating -
that is, know by itself whether it has passed or failed. Writing unit tests should be an integral
part of coding, and is essential when code must be changed in newer versions. You will have a chance
to estimate the convenience of unit tests in exercise 3. Until then:
- Read the following article
about unit testing as part of coding and the build process.
- If you write in Java, you must use the JUnit Unit Testing Framework for your unit tests.
If you write in C++, you must use the CPPUnit Unit Testing Framework for your unit tests.
- You must arrange your Makefile so that running the command 'make all' will compile,
link (for C++) and run all unit tests automatically.
The code you submit must be built with no compiler warnings, and pass all unit tests.
One metric for measuring the usefulness of a set of unit tests is called coverage, which means
the percentage of your code that the unit tests actually run. Coverage of 90% or above is
considered good, and you should aim to that goal.
This is an advanced course, so there is no intention to take points for coding style or
naming conventions - the emphasis is on proper design. However, you are as always expected
to write clear code with a consistent style.
|