Kelvin Architecture
High-Level Overview
Hadoop Kelvin consists of two main parts: the Statistics Server and the Statistics Client.
The Statistics Server is a program which runs on a single machine in the cluster and serves as a sink for all the traffic reports arriving from the cluster nodes. When a single Statistics Server is used, it typically runs on one of the master machines; alternatively, a subset of slave machines can be used if several such servers are required, or each machine can run its own server for the tasks that run on it. The server operates a set of user-configurable (via XML) data storers (which are write-only), data retrievers (which are read-only) and data manipulators (which provide read-and-write access); measurement data is stored to these handlers, and queries about past measurement data are answered by them. Currently, Hadoop Kelvin provides a log-based information store which records all traffic reports in plain text via a Log4J logger. All Hadoop Kelvin traffic travels over HTTP.
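For illustration, enabling a data storer on the server might involve a configuration entry along the following lines, using Hadoop's standard XML property format. This is only a sketch: the property name and the package of the class are assumptions made here, as the document does not specify Kelvin's actual configuration keys.

    <!-- Hypothetical sketch only: "kelvin.statistics.server.storers" and the
         package "com.example.kelvin" are assumed names, not Kelvin's
         documented configuration keys. -->
    <configuration>
      <property>
        <name>kelvin.statistics.server.storers</name>
        <value>com.example.kelvin.LogStatisticStore</value>
      </property>
    </configuration>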
Data Storers, Data Retrievers and Data Manipulators: Why Hadoop Kelvin is (potentially) more than a Logger
As briefly described above, the system incorporates the notions of a Data Storer, a Data Retriever and a Data Manipulator, which we collectively refer to as Data Handlers. The first two each define a Java interface which can be implemented by anyone seeking to expand Kelvin's functionality, while a Data Manipulator is simply an entity implementing both interfaces at once. Adding such elements does not require recompiling Hadoop: the implementing classes merely need to be located in a JAR file on the classpath and enabled in the XML configuration files. It does, however, require a restart of the Statistics Server(s) so that the classes configured in the XML files are loaded.

The current Kelvin implementation supplies one Data Storer, LogStatisticStore, which logs all traffic reports to a Log4J log file. This is the simplest form of a Data Storer and should be used mainly for debugging or research purposes (we used it for the latter). Its log files tend to grow very large rather quickly, so it is not suited to long-term, constant deployment in a production environment. The interfaces we specified are straightforward to implement for additional storers and/or retrievers, the most obvious example being an SQL database.
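To make these extension points concrete, the following minimal Java sketch shows what the handler interfaces and an SQL-backed storer could look like. The actual interface, type and package names are not given in this document, so DataStorer, DataRetriever, TrafficReport, StatisticsQuery and SqlStatisticStore are placeholder names assumed for the sketch, not Kelvin's real API.

    import java.util.List;

    // Hypothetical placeholders: these stand in for whatever report and
    // query types Kelvin actually defines.
    interface TrafficReport { }
    interface StatisticsQuery { }

    // Write-only: a Data Storer persists incoming traffic reports.
    interface DataStorer {
        void store(TrafficReport report);
    }

    // Read-only: a Data Retriever answers queries about past measurement data.
    interface DataRetriever {
        List<TrafficReport> retrieve(StatisticsQuery query);
    }

    // A Data Manipulator is simply an entity implementing both interfaces.
    interface DataManipulator extends DataStorer, DataRetriever { }

    // An additional storer, e.g. one backed by an SQL database, would only
    // need to implement the storer interface, be packaged in a JAR on the
    // classpath, and be enabled in the XML configuration files.
    class SqlStatisticStore implements DataStorer {
        @Override
        public void store(TrafficReport report) {
            // INSERT the report's fields into a statistics table here.
        }
    }

The split into a write-only and a read-only interface means a handler can be as simple as an append-only sink (like LogStatisticStore) or as capable as a full read-and-write store, without the server needing to distinguish the two cases beyond which interfaces the class implements.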