Hadoop HUJI: Measurement Hook-Points
Latest revision as of 13:23, 19 July 2013
== Hadoop 0.21 Hook Points ==
1. Mapper Input:
org.apache.hadoop.hdfs.BlockReader.readChunk()
2. Reducer Input:
org.apache.hadoop.mapreduce.task.reduce.Fetcher.shuffleToMemory()
org.apache.hadoop.mapreduce.task.reduce.Fetcher.shuffleToDisk()
3. HDFS Writes:
org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock()
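A minimal sketch of what the instrumentation at these hook points might look like. <code>StatsHook</code> is a hypothetical helper, not a Hadoop class: each instrumented method reports the number of bytes it has just processed into a shared, thread-safe accumulator.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical accumulator that the instrumented hook points report into.
// Thread-safe, since readChunk(), the shuffle methods, and receiveBlock()
// may all run concurrently in their respective daemons.
public class StatsHook {
    private static final AtomicLong mapInputBytes    = new AtomicLong();
    private static final AtomicLong reduceInputBytes = new AtomicLong();
    private static final AtomicLong hdfsWriteBytes   = new AtomicLong();

    // Would be called from BlockReader.readChunk() after each chunk is read.
    public static void recordMapInput(long bytes)    { mapInputBytes.addAndGet(bytes); }
    // Would be called from Fetcher.shuffleToMemory() / shuffleToDisk().
    public static void recordReduceInput(long bytes) { reduceInputBytes.addAndGet(bytes); }
    // Would be called from BlockReceiver.receiveBlock() after a block lands.
    public static void recordHdfsWrite(long bytes)   { hdfsWriteBytes.addAndGet(bytes); }

    public static long mapInputBytes()    { return mapInputBytes.get(); }
    public static long reduceInputBytes() { return reduceInputBytes.get(); }
    public static long hdfsWriteBytes()   { return hdfsWriteBytes.get(); }
}
```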
== Hadoop 0.20.2 (CDH3) Hook Points ==
1. Mapper Input:
hdfs.org.apache.hadoop.hdfs.DFSClient.BlockReader.readChunk()
Instrumentation of the input stream is performed in the factory method public static BlockReader newBlockReader() (same as in Hadoop 0.21).
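The factory-method approach can be illustrated with a plain java.io wrapper (a sketch under assumed names; <code>CountingInputStream</code> and <code>newInstrumented()</code> are hypothetical, not the Hadoop classes): the factory wraps the raw stream exactly once at creation time, so every subsequent read is measured without touching the read path itself.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical counting wrapper, installed once by a newBlockReader()-style factory.
public class CountingInputStream extends FilterInputStream {
    private long bytesRead = 0;

    public CountingInputStream(InputStream in) { super(in); }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) bytesRead++;      // count single-byte reads
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) bytesRead += n;     // count bulk reads
        return n;
    }

    public long getBytesRead() { return bytesRead; }

    // Factory in the style of newBlockReader(): wrap the stream at creation time,
    // so callers reading through the returned stream are measured transparently.
    public static CountingInputStream newInstrumented(InputStream raw) {
        return new CountingInputStream(raw);
    }
}
```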
2. Reducer Input:
mapred.org.apache.hadoop.mapred.ReduceTask.ReduceCopier.MapOutputCopier.shuffleInMemory()
mapred.org.apache.hadoop.mapred.ReduceTask.ReduceCopier.MapOutputCopier.shuffleToDisk()
3. HDFS Writes:
hdfs.org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock()
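Because the reducer input is split across shuffleInMemory() and shuffleToDisk(), a measurement hook would typically keep the two paths separate. A sketch (with a hypothetical <code>ShuffleStats</code> helper, not part of Hadoop) showing one use of that split, the fraction of shuffle data spilled to disk:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-path shuffle accounting: each fetched map output goes
// either to memory (shuffleInMemory) or to disk (shuffleToDisk).
public class ShuffleStats {
    private static final AtomicLong inMemoryBytes = new AtomicLong();
    private static final AtomicLong onDiskBytes   = new AtomicLong();

    public static void recordInMemory(long bytes) { inMemoryBytes.addAndGet(bytes); }
    public static void recordToDisk(long bytes)   { onDiskBytes.addAndGet(bytes); }

    // Fraction of shuffle data that spilled to disk -- a quick memory-pressure signal.
    public static double diskFraction() {
        long mem = inMemoryBytes.get(), disk = onDiskBytes.get();
        long total = mem + disk;
        return total == 0 ? 0.0 : (double) disk / total;
    }
}
```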
The following diagram illustrates the locations of these hook-points in the execution flow of a job:

[[File:Hadoop-statistics-hookpoints.jpg|1024px]]