Table: databricks_catalog_external_location - Query Databricks Catalog External Locations using SQL
Databricks Catalog External Locations is a feature within Databricks that allows you to manage and organize your data. It provides a centralized way to catalog and manage data across various external sources, making data discovery and governance more efficient. Databricks Catalog External Locations helps you stay informed about the data you have and where it is located.
Table Usage Guide
The databricks_catalog_external_location
table provides insights into the catalog external locations within Databricks. As a data engineer or data scientist, explore specific details about these locations through this table, including the name of the location, the associated database, and the owner. Utilize it to gain a deeper understanding of your data's organization, distribution, and ownership in Databricks.
Examples
Basic info
Explore the metadata of external locations in your Databricks catalog to understand when and by whom they were created, as well as their associated account details. This can help you manage and track the usage of your resources.
select name, comment, created_at, created_by, metastore_id, url, account_idfrom databricks_catalog_external_location;
select name, comment, created_at, created_by, metastore_id, url, account_idfrom databricks_catalog_external_location;
List external locations modified in the last 7 days
Discover the segments that have undergone modification in the external locations during the past week. This can be useful in tracking recent changes and understanding who made them and when.
select name, comment, created_at, created_by, metastore_id, url, account_idfrom databricks_catalog_external_locationwhere updated_at >= now() - interval '7 days';
select name, comment, created_at, created_by, metastore_id, url, account_idfrom databricks_catalog_external_locationwhere updated_at >= datetime('now', '-7 days');
List read only external locations
Discover the segments that are read-only in your external locations. This can be useful to ensure data integrity by preventing accidental changes to these areas.
select name, comment, created_at, created_by, metastore_id, url, account_idfrom databricks_catalog_external_locationwhere read_only;
select name, comment, created_at, created_by, metastore_id, url, account_idfrom databricks_catalog_external_locationwhere read_only = 1;
Get assocated credential for each external location
Determine the associated credentials for each external location in Databricks to understand the correlation between data storage and access permissions. This can aid in managing security and access control across different data repositories.
select l.name, l.comment, l.url, c.name as credential_name, c.id, c.aws_iam_role as credential_aws_iam_role, l.account_idfrom databricks_catalog_external_location l, databricks_catalog_storage_credential cwhere l.credential_id = c.id and l.account_id = c.account_id;
select l.name, l.comment, l.url, c.name as credential_name, c.id, c.aws_iam_role as credential_aws_iam_role, l.account_idfrom databricks_catalog_external_location l, databricks_catalog_storage_credential cwhere l.credential_id = c.id and l.account_id = c.account_id;
Get effective permissions for each external location
Explore which permissions are associated with each external location in your Databricks catalog. This can be particularly useful for understanding access rights and maintaining security compliance.
select name, p ->> 'principal' as principal_name, p ->> 'privileges' as permissionsfrom databricks_catalog_external_location, jsonb_array_elements(external_location_effective_permissions) p;
select name, json_extract(p.value, '$.principal') as principal_name, json_extract(p.value, '$.privileges') as permissionsfrom databricks_catalog_external_location, json_each(external_location_effective_permissions) as p;
Count the number of external locations per account ID
Discover the segments that have varying numbers of external locations linked to them. This is useful to understand the distribution of resources across different account segments.
select account_id, count(*) AS location_countfrom databricks_catalog_external_locationgroup by account_id;
select account_id, count(*) AS location_countfrom databricks_catalog_external_locationgroup by account_id;
List users who created the most external locations
Discover who has been the most active in creating external locations within your Databricks catalog. This is beneficial for understanding usage patterns and identifying key contributors to your data infrastructure.
select created_by, count(*) as location_countfrom databricks_catalog_external_locationgroup by created_byorder by location_count desc;
select created_by, count(*) as location_countfrom databricks_catalog_external_locationgroup by created_byorder by location_count desc;
Schema for databricks_catalog_external_location
Name | Type | Operators | Description |
---|---|---|---|
_ctx | jsonb | Steampipe context in JSON form, e.g. connection_name. | |
account_id | text | The Databricks Account ID in which the resource is located. | |
comment | text | User-provided free-form text description. | |
created_at | timestamp with time zone | Time at which this external location was created. | |
created_by | text | The user who created this external location. | |
credential_id | text | Unique ID of the location's storage credential. | |
credential_name | text | Name of the storage credential used with this location. | |
external_location_effective_permissions | jsonb | The effective permissions for the external location. | |
external_location_permissions | jsonb | The permissions for the external location. | |
metastore_id | text | Unique identifier of metastore hosting the external location. | |
name | text | = | Human readable name that identifies the experiment. |
owner | text | The user who owns this external location. | |
read_only | boolean | Whether this external location is read-only. | |
title | text | The title of the resource. | |
updated_at | timestamp with time zone | Time at which this external location was last updated. | |
updated_by | text | The user who last updated this external location. | |
url | text | The Path URL of the external location. |
Export
This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.
You can download the tarball for your platform from the Releases page, but it is simplest to install them with the steampipe_export_installer.sh
script:
/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- databricks
You can pass the configuration to the command with the --config
argument:
steampipe_export_databricks --config '<your_config>' databricks_catalog_external_location