Summer School on Experimental Computer Science
Computer Science has an established mathematical theory of what can be
computed and at what cost. It also has a well-developed engineering
side, spanning hardware development, software development, and the
crafting of applications ranging from massive search through computer
vision to robotic control. But it has much less to offer in terms of
experimentation as it is commonly done in the natural sciences.
In Computer Science, experimentation is often taken to mean
implementing a prototype or running a simulation. But in the natural
sciences experimental science is more about observation and
measurement of nature. It is this connection to reality that seems to
be most often missing in Computer Science research. Too much work is
based on assumptions that are either mathematically convenient or
merely seem to make sense, without verification that they indeed hold
in practice.
The goal of the proposed summer school is to teach students how to be
good scientists in addition to being good engineers. The idea of
holding such a summer school was raised at the educational roundtable
held at the Workshop on Experimental Computer Science in San Diego, as
part of ACM FCRC 2007. The school is planned as a 5-day event, with
course offerings in the following three major topic areas.
- Understanding Complex Systems.
While computer-based systems are man-made, many of today's systems
are so complex that even their designers cannot claim to fully
understand their operational characteristics. For example, this is
true of the detailed interactions among architectural features of
modern microprocessors, and of the structure and workings of the
Internet. Therefore such systems need to be studied much as natural
systems are studied, by observation and measurement.
Possible courses in this topic area include the following.
- Basics of measurement: active and passive monitoring;
unobtrusive measurement; errors and noise; confidence intervals.
- Measuring the Internet: exploiting the Internet infrastructure for
measurement; probing the Internet structure; effect of different
points of view; Internet traffic.
- Monitoring infrastructure, such as the KernInst project (option for
a hands-on course).
- Workload characterization and modeling: workloads as the input to
system evaluations; distribution fitting; heavy tails and their
implications; correlations in workloads; self-similarity; usage
examples.
- Experimental Engineering.
It is often convenient to think about system construction as a
linear process, in which requirements lead to design and
implementation. But today it is increasingly being recognized that
an iterative and incremental process may be much better, with
feedback from actual usage under realistic conditions guiding the
direction of subsequent developments. This is evident in the
Unified Software Development Process, in agile software development,
and in the practice of companies such as Google, which test new
features on real users before incorporating them into the main
product line.
Possible courses in this topic area include the following.
- Experimental infrastructure: constructing and using large-scale
infrastructure such as PlanetLab (option for a hands-on course).
- Microarchitecture development: benchmark design; assessing the
coverage of benchmarks; assessing the overlap of benchmarks;
architecture-benchmark interactions.
- Reliable evaluations: conducting tournaments; evaluation under
equivalent conditions; bias; standardization vs. innovation;
examples such as TREC or RoboCup.
- Experimental algorithmics: the interaction of experimentation and
theory; experimental analysis of algorithms that resist theoretical
analysis; examples from bin packing; examples from phase
transitions in NP-complete problems.
- Experimenting with Humans.
Most computer systems are built by humans for use by humans.
Therefore one cannot escape the need to understand how humans
interact with systems and think about them. One should be aware that
easily measured metrics do not necessarily correspond to what human
users really care about, that users exhibit a large variety of
behaviors, and that many of these behaviors are surprising.
Possible courses in this topic area include the following.
- Basics of human cognition: human thought processes; human memory
capacity; how humans understand systems and processes; effects of
previous experience.
- Experiments with humans: focus groups; experiment design and
setup; articulating tasks and requirements; using rewards.
- Human-system interaction: usability testing (option for a hands-on
course).
In addition, it is planned to hold a series of open lectures on common
topics such as the following:
- The scientific method: history and development of scientific
thought.
- Handling data: exploratory data analysis; data visualization; data
archiving and sharing.
- Replication: replication vs. repetition; controlled experiments;
ensuring replicability of results.
dgf / 31 Oct 2007