Start Scripts

This entry describes the start scripts and their control flow for Hadoop 0.21.0.

System startup (Starting NameNode, DataNode, JobTracker, TaskTracker)

1) Origin scripts (start-dfs.sh / stop-dfs.sh / start-mapred.sh / stop-mapred.sh). These are convenience scripts: the same effect can be achieved by calling hadoop-daemon.sh directly with the appropriate parameters. They parse basic input parameters and invoke the next script with varying parameters to start the various Hadoop processes, as in the sketch below.
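
Below is a minimal sketch of the kind of calls start-dfs.sh makes, assuming the standard 0.21.0 layout; the exact flags and paths in the shipped script may differ slightly.

    # start-dfs.sh (simplified): start each HDFS daemon by delegating to
    # hadoop-daemon.sh (locally) or hadoop-daemons.sh (on all slave nodes)
    bin=$(dirname "$0")
    "$bin"/hadoop-daemon.sh  --config "$HADOOP_CONF_DIR" --script "$bin"/hdfs start namenode
    "$bin"/hadoop-daemons.sh --config "$HADOOP_CONF_DIR" --script "$bin"/hdfs start datanode
    "$bin"/hadoop-daemons.sh --config "$HADOOP_CONF_DIR" --script "$bin"/hdfs start secondarynamenode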


2) hadoop-daemon.sh - This is the "main" of the scripts: it executes the other scripts according to the parameters passed to it by the origin script.

2.1) The first thing hadoop-daemon.sh does is source hadoop-config.sh to set up the environment variables required for Hadoop to run.

2.2) Once configuration and parameter parsing are complete, hadoop-daemon.sh executes the target script specified by the origin script, typically "hdfs" or "mapred" (see the sketch below).
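
The following is a hedged sketch of that dispatch logic, using the 0.21.0 variable names but omitting log rotation, pid-file setup, and niceness handling ($log and $pid are computed earlier in the real script).

    # hadoop-daemon.sh (simplified)
    bin=$(dirname "$0")
    . "$bin"/hadoop-config.sh          # step 2.1: set HADOOP_CONF_DIR, JAVA_HOME, classpath, ...

    hadoopScript="$bin"/hadoop
    if [ "--script" = "$1" ]; then     # origin script may name a target, e.g. --script hdfs
      shift; hadoopScript=$1; shift
    fi
    startStop=$1; shift                # "start" or "stop"
    command=$1;  shift                 # daemon name, e.g. "namenode"

    case $startStop in
      (start)
        # step 2.2: run the target script in the background and record its pid
        nohup "$hadoopScript" --config "$HADOOP_CONF_DIR" "$command" "$@" > "$log" 2>&1 < /dev/null &
        echo $! > "$pid"
        ;;
    esac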


3) Target scripts (hdfs / mapred). These scripts choose which class will serve as the entry point for the application and then launch it. This is the point of interest if you need to run a process so that a remote debugger can connect to it; see the sketch below.
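
A simplified sketch of how the hdfs script maps a command to its entry-point class (the class names are the real ones; the surrounding shell is condensed):

    # hdfs (simplified): pick the entry-point class for the requested command
    if [ "$COMMAND" = "namenode" ]; then
      CLASS=org.apache.hadoop.hdfs.server.namenode.NameNode
      HADOOP_OPTS="$HADOOP_OPTS $HADOOP_NAMENODE_OPTS"
    elif [ "$COMMAND" = "datanode" ]; then
      CLASS=org.apache.hadoop.hdfs.server.datanode.DataNode
      HADOOP_OPTS="$HADOOP_OPTS $HADOOP_DATANODE_OPTS"
    fi
    exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS -classpath "$CLASSPATH" $CLASS "$@"

Because the per-daemon *_OPTS variables end up in HADOOP_OPTS, a remote debugger can be attached by exporting the standard JDWP agent options before starting the daemon (the port below is just an example):

    # make the NameNode JVM wait for a debugger on port 8000
    export HADOOP_NAMENODE_OPTS="-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000"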

Hadoop job execution

The script executed is "hadoop".

1) Source hadoop-config.sh to configure the environment, as in step 2.1 above.

2) If an HDFS or MapReduce command is detected, print a deprecation warning and delegate processing to the "hdfs" or "mapred" script, as sketched below.
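
A hedged sketch of that delegation, assuming the 0.21.0 layout; the command lists and messages are abbreviated:

    # hadoop (simplified): forward project-specific commands to the split scripts
    # ($COMMAND was shifted off the argument list earlier in the script)
    case $COMMAND in
      namenode|datanode|dfs|dfsadmin|fsck|balancer)
        echo "DEPRECATED: Use of this script to execute hdfs command is deprecated." 1>&2
        exec "$bin"/hdfs "$COMMAND" "$@"
        ;;
      jobtracker|tasktracker|job|queue|pipes)
        echo "DEPRECATED: Use of this script to execute mapred command is deprecated." 1>&2
        exec "$bin"/mapred "$COMMAND" "$@"
        ;;
    esac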

3) Select the entry-point class (org.apache.hadoop.util.RunJar for a MapReduce job submitted with "hadoop jar") and execute it.
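
A simplified view of that final step, together with a typical invocation (the example jar name and arguments are illustrative):

    # hadoop (simplified): "hadoop jar" runs a user jar through RunJar,
    # which unpacks the jar and invokes its main class
    if [ "$COMMAND" = "jar" ]; then
      CLASS=org.apache.hadoop.util.RunJar
    fi
    exec "$JAVA" $JAVA_HEAP_MAX $HADOOP_OPTS -classpath "$CLASSPATH" $CLASS "$@"

Typical usage:

    bin/hadoop jar hadoop-mapred-examples-0.21.0.jar wordcount input output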