In this tutorial, we’ll walk through the steps to create a read-only external table in Azure Databricks using an existing Delta table stored in an Azure Storage Account. This allows you to query the data in the Delta table without needing to copy it into your Databricks cluster.
Prerequisites:
- Access to Azure Databricks instance
- Access to Azure Storage Account where the Delta table is stored
- Appropriate permissions to create and manage resources in both Azure Databricks and Azure Storage Account
Steps:
1. Create Access Connector for Azure Databricks:
- in Azure, search for “Access Connector for Azure Databricks”, you are going to create a managed identity for databricks.
2. Assign Storage Blob Delegator Role:
- In the Azure Portal, navigate to your Storage Account.
- Go to the “Access control (IAM)” section.
- Add a role assignment for the access connector created in step 1, assigning the “Storage Blob Delegator” role.
- Additionally, assign the “Storage Blob Reader” role to the access connector either at the container or storage account level.
3. Create Storage Credential:
- Obtain the connector ID of the access connector created in step 1.
- Use this connector ID to create a storage credential. This can usually be done through the Databricks CLI or UI.
4. Create External Location on Databricks:
- navigate to “external locations”
- create a location either for container or specific delta table path, e,g: abfss://bronze@adl001.dfs.core.windows.net/
- “Limit to read-only use” is optional. If you create a location for a subfolder, you are not able to create location for contianer.
5. Create External Table:
- Once the external location is set up, you can create the external table using standard SQL commands in Databricks. Since we already set the access for container, we do not need to add locations for each delta table underneeth.
CREATE external TABLE unity_catalog.catalog_name.schema_name
USING delta
LOCATION 'abfss://bronze@adl001.dfs.core.windows.net/delta_table_path'
Leave a Reply