hdinsight.github.io

How do I analyze the critical path of a Tez Directed Acyclic Graph (DAG) on an HDInsight cluster?

Issue:

You need to analyze Tez DAG information, particularly the critical path, on an HDInsight cluster.

Resolution Steps:

1) Connect to the HDInsight cluster with a Secure Shell (SSH) client (see the Further Reading section below).

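2) Run the following command at the command prompt, replacing <DagId> with the ID of the DAG to analyze and <DagData.zip> with the file that contains that DAG's event data: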

hadoop jar /usr/hdp/current/tez-client/tez-job-analyzer-*.jar CriticalPath --saveResults --dagId <DagId> --eventFileName <DagData.zip> 
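
If the command in step 2 is run often, it can help to wrap it in a small shell function. The following is only a sketch: the jar path and flags are the ones shown in step 2, while the function name run_critical_path and the DRY_RUN switch are hypothetical helpers introduced here for illustration.

```shell
# Sketch: wraps the CriticalPath command from step 2 in a function.
# The jar path and flags come from step 2; the function name and the
# DRY_RUN switch are illustrative assumptions, not part of the Tez tooling.
ANALYZER_JAR='/usr/hdp/current/tez-client/tez-job-analyzer-*.jar'

# run_critical_path <DagId> <DagData.zip>
# Set DRY_RUN=1 to print the command instead of executing it.
run_critical_path() {
  if [ -z "${1:-}" ] || [ -z "${2:-}" ]; then
    echo "usage: run_critical_path <DagId> <DagData.zip>" >&2
    return 1
  fi
  # $ANALYZER_JAR is left unquoted on purpose so the shell expands the wildcard.
  ${DRY_RUN:+echo} hadoop jar $ANALYZER_JAR CriticalPath \
    --saveResults --dagId "$1" --eventFileName "$2"
}
```

For example, `DRY_RUN=1 run_critical_path dag_1490656001509_0007_1 DagData.zip` prints the command that would be run; the DAG ID shown is a made-up placeholder. Dropping DRY_RUN=1 runs the analyzer.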

3) To list the other analyzers that can be used on a Tez DAG, run the jar without a program name. The command prints the valid program names:

hadoop jar /usr/hdp/current/tez-client/tez-job-analyzer-*.jar
An example program must be given as the first argument.
Valid program names are:
  ContainerReuseAnalyzer: Print container reuse details in a DAG
  CriticalPath: Find the critical path of a DAG
  LocalityAnalyzer: Print locality details in a DAG
  ShuffleTimeAnalyzer: Analyze the shuffle time details in a DAG
  SkewAnalyzer: Analyze the skew details in a DAG
  SlowNodeAnalyzer: Print node details in a DAG
  SlowTaskIdentifier: Print slow task details in a DAG
  SlowestVertexAnalyzer: Print slowest vertex details in a DAG
  SpillAnalyzer: Print spill details in a DAG
  TaskConcurrencyAnalyzer: Print the task concurrency details in a DAG
  VertexLevelCriticalPathAnalyzer: Find critical path at vertex level in a DAG
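
Several of the analyzers listed above can be run against the same DAG in one pass. The sketch below assumes that the other analyzers accept the same --saveResults, --dagId and --eventFileName options as CriticalPath; that is an assumption, not something this page confirms, so check each analyzer's usage output first. The function name run_analyzers, the selection of analyzers and the DRY_RUN switch are hypothetical.

```shell
# Sketch: run a few analyzers from the listing against one DAG.
# Assumption: every analyzer takes the same options as CriticalPath in step 2.
ANALYZER_JAR='/usr/hdp/current/tez-client/tez-job-analyzer-*.jar'

# Names copied from the listing; edit this selection as needed.
ANALYZERS="CriticalPath SlowestVertexAnalyzer SkewAnalyzer SpillAnalyzer"

# run_analyzers <DagId> <DagData.zip>
# Set DRY_RUN=1 to print the commands instead of executing them.
run_analyzers() {
  if [ -z "${1:-}" ] || [ -z "${2:-}" ]; then
    echo "usage: run_analyzers <DagId> <DagData.zip>" >&2
    return 1
  fi
  for analyzer in $ANALYZERS; do
    # $ANALYZER_JAR is unquoted on purpose so the shell expands the wildcard.
    ${DRY_RUN:+echo} hadoop jar $ANALYZER_JAR "$analyzer" \
      --saveResults --dagId "$1" --eventFileName "$2" \
      || echo "$analyzer failed" >&2   # keep going with the next analyzer
  done
}
```

Running `DRY_RUN=1 run_analyzers <DagId> <DagData.zip>` first shows the exact commands before anything is executed on the cluster.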

Further Reading:

1) Connect to HDInsight Cluster using SSH