Difference between revisions of "Hadoop Kelvin"

Revision as of 13:44, 19 July 2013

Hadoop Kelvin

Hadoop Kelvin is a network monitoring system designed for the Hadoop Map-Reduce framework. It monitors data traffic (not control traffic) between Hadoop nodes and is built so that the collected monitoring data can be stored and accessed in multiple ways; the current implementation provides log-based storage. It is designed to be flexible and easily extensible, and to have a minimal effect on the running time of Hadoop jobs.

Method: Hadoop Kelvin collects data about the following data transfers:

• HDFS reads (regardless of who is performing the read).

• HDFS writes (regardless of who is the origin of the data).

• Data transfers between Mappers and Reducers during a Map-Reduce job execution.

The data collected about each transfer includes:

• Source machine.

• Destination machine.

• Starting timestamp.

• Duration of transfer in milliseconds.

• Size of the transferred data, in bytes.
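
The five fields above can be pictured as one record per transfer. The following is only an illustrative sketch, not Hadoop Kelvin's actual code or log format: the class name `TransferRecord`, its field names, the tab-separated line layout, and the use of epoch milliseconds for the starting timestamp are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TransferRecord:
    """Hypothetical record holding the five fields listed above."""
    source: str        # source machine
    destination: str   # destination machine
    start_ms: int      # starting timestamp (assumed: epoch milliseconds)
    duration_ms: int   # duration of the transfer in milliseconds
    size_bytes: int    # size of the transferred data in bytes

    def to_log_line(self) -> str:
        # Assumed layout: one tab-separated line per transfer.
        fields = (self.source, self.destination,
                  self.start_ms, self.duration_ms, self.size_bytes)
        return "\t".join(str(f) for f in fields)

    @classmethod
    def from_log_line(cls, line: str) -> "TransferRecord":
        # Inverse of to_log_line; raises ValueError on malformed input.
        src, dst, start, duration, size = line.rstrip("\n").split("\t")
        return cls(src, dst, int(start), int(duration), int(size))

    def throughput_bytes_per_sec(self) -> Optional[float]:
        # Average rate over the transfer; undefined for a zero duration.
        if self.duration_ms == 0:
            return None
        return self.size_bytes * 1000 / self.duration_ms
```

With a record like this, a 1 MiB transfer that took 250 ms could be written out with `to_log_line()`, read back with `from_log_line()`, and summarized with `throughput_bytes_per_sec()`. A flat line-per-transfer layout is chosen here only because the source says the current implementation uses log-based storage; the real layout may differ.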


 * [https://www.cs.huji.ac.il/wikis/MediaWiki/lawa/index.php/Hook_Points Hadoop HUJI: Measurement hook-points.]
 * [https://www.cs.huji.ac.il/wikis/MediaWiki/lawa/index.php/Scheduler_Hook_Points Hadoop HUJI: Scheduler hook-points]