hdinsight.github.io

Spark FAQ: Answers to common questions on Spark on Azure HDInsight

Use the Spark FAQ for answers to common questions about Spark on the Azure HDInsight platform.

Basic data collection for Spark Performance

What is the extended Spark history server, and how do I troubleshoot it if there is an issue?

Why did my Spark application fail with OutOfMemoryError?

Why did my Spark application fail with IllegalArgumentException Wrong FS?

How do I configure a Spark application through Ambari on HDInsight clusters?

How do I configure spark-shell on HDInsight clusters?

How do I configure a Spark application through spark-submit on HDInsight clusters?

How do I configure a Spark application through Livy on HDInsight clusters?

How do I configure a Spark application through a Jupyter notebook on HDInsight clusters?

Why did my Spark Streaming Application stop processing data after executing for 24 days?

Why do I start seeing 502 errors frequently when trying to connect to Thrift server exposed by HDInsight cluster?

Why does a Spark job become slow when the destination folder has too many files?

How do I retrieve Livy session details?

How do I increase the Spark history heap-memory configuration?

Getting java.lang.OutOfMemoryError when attempting to restart the Livy server

Spark 1.6 jars are used when a Spark application is launched using Oozie Shell

Jupyter server 404 Not Found error due to blocking of cross-origin API requests

How do I access WASB?

Is the queue name honored in Livy 0.3?

Why am I unable to download or retrieve large files when interacting with SparkSQL using Thrift Server?

What are the options for using spark-submit?

Why do my Spark Streaming jobs take longer than usual to process?

Why did my spark-submit job fail with NoClassDefFoundError?

Long-running Spark Streaming app driver log is filled with the error NativeAzureFileSystem … RequestBodyTooLarge

Spark job fails with InvalidClassException - class version mismatch