Difference between revisions of "Hadoop Kelvin"

* [https://www.cs.huji.ac.il/wikis/MediaWiki/lawa/index.php/Hook_Points Hadoop HUJI: Measurement hook-points.]
* [https://www.cs.huji.ac.il/wikis/MediaWiki/lawa/index.php/Scheduler_Hook_Points Hadoop HUJI: Scheduler hook-points]

Revision as of 13:44, 19 July 2013

Hadoop Kelvin

Hadoop Kelvin is a network monitoring system designed for the Hadoop Map-Reduce framework. It monitors data traffic (not control traffic) between Hadoop nodes and is built to support multiple ways of storing and accessing the collected monitoring data; the current implementation provides log-based storage. It is designed to be flexible and easily extensible, and to have minimal effect on the running time of Hadoop jobs.

Method:

Hadoop Kelvin collects data about the following data transfers:

• HDFS reads (regardless of who is performing the read).

• HDFS writes (regardless of who is the origin of the data).

• Data transfers between Mappers and Reducers during a Map-Reduce job execution.


The data collected about each transfer includes:

• Source machine.

• Destination machine.

• Starting timestamp.

• Duration of transfer in milliseconds.

• Size of the transferred data, in bytes.
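
As an illustration of the per-transfer data listed above, the following is a minimal Java sketch of a record holding the five fields, together with a simple log-style store. All names here (TransferRecordSketch, TransferKind, TransferRecord, TransferSink, InMemoryLogSink), the tab-separated line layout, the use of epoch milliseconds for the starting timestamp, and the sample values in main are assumptions made for the example; they are not taken from the Hadoop Kelvin code.

```java
import java.util.ArrayList;
import java.util.List;

public class TransferRecordSketch {

    /** The three kinds of monitored transfer (hypothetical names). */
    enum TransferKind { HDFS_READ, HDFS_WRITE, SHUFFLE }

    /** One monitored transfer; field names are illustrative, not Kelvin's own. */
    static final class TransferRecord {
        final TransferKind kind;
        final String sourceHost;       // source machine
        final String destinationHost;  // destination machine
        final long startMillis;        // starting timestamp (epoch milliseconds is an assumption)
        final long durationMillis;     // duration of transfer in milliseconds
        final long sizeBytes;          // size of the transferred data, in bytes

        TransferRecord(TransferKind kind, String sourceHost, String destinationHost,
                       long startMillis, long durationMillis, long sizeBytes) {
            this.kind = kind;
            this.sourceHost = sourceHost;
            this.destinationHost = destinationHost;
            this.startMillis = startMillis;
            this.durationMillis = durationMillis;
            this.sizeBytes = sizeBytes;
        }

        /** Average throughput in bytes per second; 0 when the duration is zero. */
        double bytesPerSecond() {
            return durationMillis == 0 ? 0.0 : sizeBytes * 1000.0 / durationMillis;
        }

        /** One tab-separated line per transfer (hypothetical log layout). */
        String toLogLine() {
            return String.join("\t",
                    kind.name(), sourceHost, destinationHost,
                    Long.toString(startMillis),
                    Long.toString(durationMillis),
                    Long.toString(sizeBytes));
        }
    }

    /** Storage back end; a log-based store is one possible implementation. */
    interface TransferSink {
        void record(TransferRecord r);
    }

    /** Keeps log lines in memory; a real sink might append them to a file. */
    static final class InMemoryLogSink implements TransferSink {
        final List<String> lines = new ArrayList<>();

        @Override
        public void record(TransferRecord r) {
            lines.add(r.toLogLine());
        }
    }

    public static void main(String[] args) {
        // Sample values are arbitrary and only demonstrate the record layout.
        TransferRecord r = new TransferRecord(
                TransferKind.SHUFFLE, "node-01", "node-02", 1000L, 250L, 1_048_576L);
        InMemoryLogSink sink = new InMemoryLogSink();
        sink.record(r);
        System.out.println(sink.lines.get(0));
        System.out.println(r.bytesPerSecond() + " bytes/s");
    }
}
```

Keeping the sink behind an interface mirrors the stated goal of supporting more than one way to store the monitoring data: another store could be added without changing the code that creates the records.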