Azure Data Lake Storage Gen2 agent
Azure Data Lake Storage Gen2
Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built on Azure Blob Storage. It combines the power of a high-performance file system with massive scale and economy to help you speed your transition to the cloud.
Setup Instructions
1. Create an Application (Service Principal) and Secret
- Go to Microsoft Entra ID (formerly Azure AD) → App registrations → New registration
- Enter a name for your application
- Select "Accounts in this organizational directory only" (for single-tenant)
- Click "Register"
- In the app’s overview page, copy the following:
  - Application (client) ID
  - Directory (tenant) ID
- Navigate to "Certificates & secrets" → "New client secret"
- Add a description and select an expiration period
- Click "Add" and immediately copy the client secret value (it will be hidden afterward)
You’ll need these values later:
- Tenant ID
- Client ID
- Client Secret
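As an illustration of how these three values are typically used, the sketch below builds (but does not send) an OAuth 2.0 client-credentials token request for Azure Storage using only the Python standard library. The environment variable names and both function names are assumptions made for this example; the agent itself may read the values differently.

```python
import os
import urllib.parse
import urllib.request

# Microsoft identity platform v2.0 token endpoint and the Azure Storage scope.
TOKEN_URL = "https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
STORAGE_SCOPE = "https://storage.azure.com/.default"


def credentials_from_env(env=os.environ):
    """Read the three values from environment variables (names are an assumption)."""
    names = ("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)


def build_token_request(tenant_id, client_id, client_secret):
    """Build the client-credentials POST request without sending it."""
    body = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": STORAGE_SCOPE,
        }
    ).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL.format(tenant_id=tenant_id),
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would perform the actual token exchange; keep the client secret out of source control and logs.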
2. Create a Storage Account (ADLS Gen2)
- In the Azure Portal, go to "Storage accounts" → "+ Create"
- Select your subscription and resource group
- Enter a globally unique name (3 to 24 characters, lowercase letters and numbers only)
- Select a region
- Performance: Standard (typical)
- Redundancy: Choose LRS or ZRS based on your needs
- In the "Advanced" tab, enable "Hierarchical namespace" (required for ADLS Gen2)
- (Optional) In the "Networking" tab, configure network settings (public endpoint is sufficient for initial setup)
- Review and create the storage account
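Once the account exists, clients reach ADLS Gen2 through the account's `dfs` endpoint. The following minimal sketch checks a candidate account name against the naming rule (3 to 24 lowercase letters and digits) and derives that endpoint. The function name is hypothetical, and the host suffix assumes the public Azure cloud; sovereign clouds use different suffixes.

```python
import re

# Storage account names: 3-24 characters, lowercase letters and digits only.
_ACCOUNT_NAME = re.compile(r"[a-z0-9]{3,24}")


def dfs_endpoint(account_name: str) -> str:
    """Return the ADLS Gen2 (dfs) endpoint URL for a valid account name."""
    if not _ACCOUNT_NAME.fullmatch(account_name):
        raise ValueError(
            f"invalid storage account name {account_name!r}: "
            "use 3-24 lowercase letters and digits"
        )
    # Assumes the public Azure cloud host suffix.
    return f"https://{account_name}.dfs.core.windows.net"
```

For example, `dfs_endpoint("mydatalake01")` yields `https://mydatalake01.dfs.core.windows.net`, while a name such as `My-Lake` is rejected with a `ValueError`.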
3. Assign RBAC Permissions to the Application
- Open your newly created storage account
- Go to "Access Control (IAM)" → "Add" → "Add role assignment"
- Select the appropriate role:
  - For initial setup: "Storage Blob Data Owner" (temporary, for ACL management)
  - For production: "Storage Blob Data Contributor" (read/write access)
- Under "Members", select "User, group, or service principal"
- Click "Select members" and search for your application
- Click "Select" and then "Review + assign"
Note: RBAC alone is not sufficient for ADLS Gen2; POSIX-style ACLs must also be configured on the filesystem.
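A role assignment made from the storage account's "Access Control (IAM)" page is scoped to that account's Azure resource ID. If you later script the same assignment, you will need that scope string; the sketch below shows how it is composed. The function name and the placeholder argument values are hypothetical, and the optional container-level scope is included only for illustration.

```python
from typing import Optional


def storage_scope(
    subscription_id: str,
    resource_group: str,
    account_name: str,
    container: Optional[str] = None,
) -> str:
    """Compose the Azure resource ID used as the scope of a role assignment."""
    scope = (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{account_name}"
    )
    if container is not None:
        # Narrow the scope to a single container instead of the whole account.
        scope += f"/blobServices/default/containers/{container}"
    return scope
```

Scoping the role to a single container rather than the whole account is one way to keep the production assignment narrower.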
4. Create a Container and Configure ACLs
- In your storage account, go to "Storage browser" → "Containers" → "+ Container"
- Enter a name (e.g., "datalake") and click "Create"
- Open the container and click "Manage ACL"
- Add your application as a principal with the following permissions:
  - On the container root (/):
    - r-x (Read + Execute) for listing paths
    - w (Write) if creating/writing is needed
- Check "Propagate to child items"
- (Recommended) Set the same permissions as default ACLs for new files and folders
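The permissions above correspond to short ACL entries of the form `user:<object-id>:r-x`, with a `default:` prefix for entries inherited by new files and folders. The sketch below composes such a string; the function names are hypothetical. It assumes the entry identifies the application by the object ID of its service principal, which is generally a different value from the Application (client) ID copied in step 1.

```python
def permission_string(read: bool, write: bool, execute: bool) -> str:
    """Render POSIX-style permission bits, e.g. 'r-x' or 'rwx'."""
    return (
        ("r" if read else "-")
        + ("w" if write else "-")
        + ("x" if execute else "-")
    )


def acl_entries(object_id: str, *, write: bool = False, include_default: bool = True) -> str:
    """Build a comma-separated ACL string for one service principal.

    Read and execute are always granted (needed for listing paths);
    write is added only when requested.
    """
    perms = permission_string(True, write, True)
    entries = [f"user:{object_id}:{perms}"]
    if include_default:
        # Default entries are inherited by newly created child items.
        entries.append(f"default:user:{object_id}:{perms}")
    return ",".join(entries)
```

A string in this format is what ACL-aware tooling for ADLS Gen2 usually expects; grant write only where the application actually needs to create or modify data.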