Hadoop HUJI: Measurement Hook-Points

Revision as of 17:06, 9 May 2011

Hadoop 0.21 Hook Points

1. Mapper Input:

org.apache.hadoop.hdfs.BlockReader.readChunk()

2. Reducer Input:

org.apache.hadoop.mapreduce.task.reduce.Fetcher.shuffleToMemory()
org.apache.hadoop.mapreduce.task.reduce.Fetcher.shuffleToDisk()

3. HDFS Writes:

org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock()

Hadoop 0.20.2 (CDH3) Hook Points

1. Mapper Input:

hdfs.org.apache.hadoop.hdfs.DFSClient.BlockReader.readChunk()

The input stream is instrumented in the factory method public static BlockReader newBlockReader (same as in Hadoop 0.21).
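
As a rough illustration of what instrumenting an input stream inside a factory method can look like, the sketch below wraps a stream in a byte-counting decorator at creation time. CountingInputStream and newInstrumentedStream are hypothetical names used only for this example; they are not part of Hadoop, and the real newBlockReader takes different arguments and returns a BlockReader.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical decorator (not a Hadoop class): counts bytes read through it. */
class CountingInputStream extends FilterInputStream {
    private final AtomicLong bytesRead = new AtomicLong();

    CountingInputStream(InputStream in) {
        super(in);
    }

    /** Hypothetical factory: every stream handed out is already instrumented. */
    static CountingInputStream newInstrumentedStream(InputStream raw) {
        return new CountingInputStream(raw);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b >= 0) {
            bytesRead.incrementAndGet(); // one byte delivered
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            bytesRead.addAndGet(n); // count only bytes actually delivered
        }
        return n;
    }

    /** Total bytes delivered to the caller so far (skipped bytes are not counted). */
    long getBytesRead() {
        return bytesRead.get();
    }
}

class CountingInputStreamDemo {
    public static void main(String[] args) throws IOException {
        InputStream raw = new ByteArrayInputStream(new byte[1000]);
        CountingInputStream in = CountingInputStream.newInstrumentedStream(raw);
        byte[] buf = new byte[128];
        while (in.read(buf) > 0) {
            // drain the stream; the wrapper does the accounting
        }
        System.out.println("bytes read: " + in.getBytesRead());
    }
}
```

Placing the wrapper in the factory means callers need no changes: any code that obtains its reader through the factory is measured automatically.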

2. Reducer Input:

mapred.org.apache.hadoop.mapred.ReduceTask.ReduceCopier.MapOutputCopier.shuffleInMemory()
mapred.org.apache.hadoop.mapred.ReduceTask.ReduceCopier.MapOutputCopier.shuffleToDisk()

3. HDFS Writes:

hdfs.org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock()
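
Reducer input arrives through two hook points (in memory and to disk), so a measurement usually has to sum over both. The following is a minimal sketch of one way to aggregate byte counts per hook point, assuming each hook calls a shared recorder. HookCounters and the label strings are hypothetical and chosen only for illustration; nothing here is an existing Hadoop API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical thread-safe registry of byte counters, keyed by hook-point label. */
final class HookCounters {
    private final ConcurrentMap<String, AtomicLong> counters =
            new ConcurrentHashMap<String, AtomicLong>();

    /** Adds bytes to the counter for the given hook point, creating it on first use. */
    void record(String hookPoint, long bytes) {
        AtomicLong counter = counters.get(hookPoint);
        if (counter == null) {
            AtomicLong fresh = new AtomicLong();
            counter = counters.putIfAbsent(hookPoint, fresh);
            if (counter == null) {
                counter = fresh; // this thread won the race to create the counter
            }
        }
        counter.addAndGet(bytes);
    }

    /** Bytes recorded for one hook point (0 if it never fired). */
    long total(String hookPoint) {
        AtomicLong counter = counters.get(hookPoint);
        return counter == null ? 0L : counter.get();
    }

    /** Sum over several hook points, e.g. both reducer shuffle paths. */
    long total(String... hookPoints) {
        long sum = 0L;
        for (String hookPoint : hookPoints) {
            sum += total(hookPoint);
        }
        return sum;
    }
}

class HookCountersDemo {
    public static void main(String[] args) {
        HookCounters counters = new HookCounters();
        // Hypothetical labels standing in for the two reducer-input hook points.
        counters.record("reducer.shuffleInMemory", 4096L);
        counters.record("reducer.shuffleToDisk", 1048576L);
        System.out.println("reducer input bytes: "
                + counters.total("reducer.shuffleInMemory", "reducer.shuffleToDisk"));
    }
}
```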