
How do I configure a Spark application through Ambari on HDInsight clusters?

Issue:

You need to configure, through Ambari, the amount of memory and the number of cores that a Spark application can use.

Resolution Steps:

  1. Refer to the topic Why did my Spark application fail with OutOfMemoryError? to determine which Spark configurations need to be set, and to what values; these are typically the executor and driver memory and core settings (an illustrative sketch follows these steps).

  2. Update configurations whose values are already set on the HDInsight Spark cluster with the following steps:

(Screenshots: Ambari UI walkthrough for updating an existing Spark configuration value.)

  3. Add configurations whose values are not already set on the HDInsight Spark cluster with the following steps:

(Screenshots: Ambari UI walkthrough for adding a new Spark configuration value.)
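
As an illustrative sketch, the settings adjusted in the steps above usually include the standard Spark memory and core properties shown below. The property names are standard Spark configuration keys; the values are placeholders, not recommendations for your workload.

```
# Illustrative values only; size them using the OutOfMemoryError guidance referenced above.
spark.driver.memory       4g
spark.executor.memory     4g
spark.executor.cores      2
spark.executor.instances  10
```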

Note: These changes are cluster-wide, but they can be overridden when an individual Spark job is submitted.
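
For example, a job submitted with spark-submit can override the cluster defaults through its command-line options. The application class and jar below are hypothetical; the flags are standard spark-submit options.

```bash
# Hypothetical job submission: --driver-memory, --executor-memory, --executor-cores,
# and --num-executors override the cluster-wide defaults for this job only.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --driver-memory 4g \
  --executor-memory 4g \
  --executor-cores 2 \
  --num-executors 10 \
  --class com.example.MySparkApp \
  my-spark-app.jar
```

Any other Spark property can be overridden the same way with `--conf <key>=<value>`.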

Further Reading:

Spark job submission on HDInsight clusters