hdinsight.github.io

How to re-use an existing container when creating clusters with DataLake as primary storage

When a cluster is created with DataLake as primary storage, a container is configured under which all the cluster-specific files are stored. For instance, if the user specifies the root path for a cluster as ‘/Clusters/clustername’, then ‘clustername’ is the container.
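As a small illustration of the naming rule above, the sketch below takes a root path and returns its last segment as the container name. The function name `container_from_root_path` is hypothetical and not part of any HDInsight tooling; it only mirrors the example in the text.

```python
import posixpath


def container_from_root_path(root_path: str) -> str:
    """Return the last segment of a cluster root path (hypothetical helper)."""
    # Drop trailing slashes so '/Clusters/clustername/' is treated
    # the same as '/Clusters/clustername'.
    normalized = root_path.rstrip("/")
    return posixpath.basename(normalized)


print(container_from_root_path("/Clusters/clustername"))
```

For the root path used in the text, the helper returns the segment after the final slash, which is the container.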

The user may choose to use a new container or re-use an existing one. An existing container can be re-used only after the old cluster that was using it has been deleted. When re-using a cluster container, the user may specify either the service principal that was used to create the container initially or a new service principal.

Possible errors:

Cluster creation fails with InsufficientPermissionsToCopyBlobsToDataLakeContainerErrorCode if the container is not set up with the proper owner. The error occurs because the new service principal has insufficient privileges to change permissions on the ADLS files and folders.

Alternative options: