# Table: databricks_catalog_table - Query Databricks Catalog Tables using SQL
Databricks Catalog is a feature in Databricks that allows users to discover, organize, and manage data across all Databricks workspaces. It provides a unified view of all tables, views, and functions in your Databricks environment. Databricks Catalog Tables are the specific tables within the catalog that hold your data.
## Table Usage Guide
The `databricks_catalog_table` table provides insights into the tables within your Databricks Catalog. As a Data Engineer or Data Scientist, explore table-specific details through this table, including table properties, associated databases, and table types. Utilize it to manage and organize your data effectively, ensuring optimal data discovery and usage within your Databricks environment.
## Examples
### Basic info
Explore the basic information of your Databricks catalog tables to understand their creation details and associated accounts. This can be useful for auditing, tracking changes, and managing access.
```sql+postgres
select
  table_id,
  name,
  comment,
  created_at,
  created_by,
  metastore_id,
  account_id
from
  databricks_catalog_table
where
  catalog_name = 'catalog'
  and schema_name = 'schema';
```

```sql+sqlite
select
  table_id,
  name,
  comment,
  created_at,
  created_by,
  metastore_id,
  account_id
from
  databricks_catalog_table
where
  catalog_name = 'catalog'
  and schema_name = 'schema';
```
### List tables modified in the last 7 days
Explore which tables have been updated in the past week. This is useful for keeping track of recent changes and modifications in your database.
```sql+postgres
select
  table_id,
  name,
  comment,
  created_at,
  created_by,
  metastore_id,
  account_id
from
  databricks_catalog_table
where
  updated_at >= now() - interval '7 days'
  and catalog_name = 'catalog'
  and schema_name = 'schema';
```

```sql+sqlite
select
  table_id,
  name,
  comment,
  created_at,
  created_by,
  metastore_id,
  account_id
from
  databricks_catalog_table
where
  updated_at >= datetime('now', '-7 days')
  and catalog_name = 'catalog'
  and schema_name = 'schema';
```
### List all view type tables
Identify the views in your Databricks catalog, along with their definitions and dependencies. This is useful for understanding how views relate to underlying tables and functions, and for managing your data architecture effectively.
```sql+postgres
select
  table_id,
  name,
  view_definition,
  view_dependencies ->> 'function' as view_dependency_function,
  view_dependencies ->> 'table' as view_dependency_table,
  account_id
from
  databricks_catalog_table
where
  table_type = 'VIEW'
  and catalog_name = 'catalog'
  and schema_name = 'schema';
```

```sql+sqlite
select
  table_id,
  name,
  view_definition,
  json_extract(view_dependencies, '$.function') as view_dependency_function,
  json_extract(view_dependencies, '$.table') as view_dependency_table,
  account_id
from
  databricks_catalog_table
where
  table_type = 'VIEW'
  and catalog_name = 'catalog'
  and schema_name = 'schema';
```
### List all tables with source as CSV
Explore which tables in your Databricks catalog are sourced from CSV files. This can be beneficial in instances where you need to identify and manage your data sources effectively.
```sql+postgres
select
  table_id,
  name,
  comment,
  created_at,
  created_by,
  metastore_id,
  account_id
from
  databricks_catalog_table
where
  data_source_format = 'CSV'
  and catalog_name = 'catalog'
  and schema_name = 'schema';
```

```sql+sqlite
select
  table_id,
  name,
  comment,
  created_at,
  created_by,
  metastore_id,
  account_id
from
  databricks_catalog_table
where
  data_source_format = 'CSV'
  and catalog_name = 'catalog'
  and schema_name = 'schema';
```
### List all external tables
Discover the tables in your Databricks catalog that store their data externally. This is useful for understanding where your data is located and how it is accessed, especially for security and data management purposes.
```sql+postgres
select
  table_id,
  name,
  table_type,
  storage_location,
  storage_credential_name
from
  databricks_catalog_table
where
  table_type = 'EXTERNAL'
  and catalog_name = 'catalog'
  and schema_name = 'schema';
```

```sql+sqlite
select
  table_id,
  name,
  table_type,
  storage_location,
  storage_credential_name
from
  databricks_catalog_table
where
  table_type = 'EXTERNAL'
  and catalog_name = 'catalog'
  and schema_name = 'schema';
```
### Get all table constraints
Discover the tables that have constraints applied to them. This is useful for assessing the integrity and structure of your data, ensuring that relationships between tables are properly enforced.
```sql+postgres
select
  table_id,
  name,
  c ->> 'foreign_key_constraint' as foreign_key_constraint,
  c ->> 'primary_key_constraint' as primary_key_constraint,
  c ->> 'named_table_constraint' as named_table_constraint,
  account_id
from
  databricks_catalog_table,
  jsonb_array_elements(table_constraints -> 'table_constraints') as c
where
  catalog_name = 'catalog'
  and schema_name = 'schema';
```

```sql+sqlite
select
  table_id,
  name,
  json_extract(c.value, '$.foreign_key_constraint') as foreign_key_constraint,
  json_extract(c.value, '$.primary_key_constraint') as primary_key_constraint,
  json_extract(c.value, '$.named_table_constraint') as named_table_constraint,
  account_id
from
  databricks_catalog_table,
  json_each(databricks_catalog_table.table_constraints, '$.table_constraints') as c
where
  catalog_name = 'catalog'
  and schema_name = 'schema';
```
### Get effective permissions for each table
Explore which users have specific permissions for each table in a Databricks catalog, providing a comprehensive view of access rights within the system. This can be particularly useful in managing user access and maintaining security protocols.
```sql+postgres
select
  name,
  p ->> 'principal' as principal_name,
  p ->> 'privileges' as permissions
from
  databricks_catalog_table,
  jsonb_array_elements(table_effective_permissions) as p
where
  catalog_name = 'catalog'
  and schema_name = 'schema';
```

```sql+sqlite
select
  name,
  json_extract(p.value, '$.principal') as principal_name,
  json_extract(p.value, '$.privileges') as permissions
from
  databricks_catalog_table,
  json_each(table_effective_permissions) as p
where
  catalog_name = 'catalog'
  and schema_name = 'schema';
```
### Get details for a specific table
Explore the specifics of a particular table to understand its origin, authorship, and associated comments. This can be useful for auditing purposes or to gain insights into the table's history and purpose.
```sql+postgres
select
  table_id,
  name,
  comment,
  created_at,
  created_by,
  metastore_id,
  account_id
from
  databricks_catalog_table
where
  full_name = '__catalog_name__.__schema_name__.__table_name__';
```

```sql+sqlite
select
  table_id,
  name,
  comment,
  created_at,
  created_by,
  metastore_id,
  account_id
from
  databricks_catalog_table
where
  full_name = '__catalog_name__.__schema_name__.__table_name__';
```
### List details of the parent catalog for a particular table
This query helps in identifying the source catalog details for a specific table in Databricks. It's useful for understanding the origin and ownership of data, thereby aiding in data governance and accountability.
```sql+postgres
select
  t.name as table_name,
  c.name as catalog_name,
  c.catalog_type,
  c.created_at as catalog_create_time,
  c.created_by as catalog_created_by,
  c.metastore_id,
  c.account_id
from
  databricks_catalog_table as t
  left join databricks_catalog as c on t.catalog_name = c.name
where
  full_name = '__catalog_name__.__schema_name__.__table_name__';
```

```sql+sqlite
select
  t.name as table_name,
  c.name as catalog_name,
  c.catalog_type,
  c.created_at as catalog_create_time,
  c.created_by as catalog_created_by,
  c.metastore_id,
  c.account_id
from
  databricks_catalog_table as t
  left join databricks_catalog as c on t.catalog_name = c.name
where
  full_name = '__catalog_name__.__schema_name__.__table_name__';
```
## Schema for databricks_catalog_table
| Name | Type | Operators | Description |
|------|------|-----------|-------------|
| _ctx | jsonb | | Steampipe context in JSON form, e.g. connection_name. |
| account_id | text | | The Databricks Account ID in which the resource is located. |
| catalog_name | text | = | Name of parent catalog. |
| columns | jsonb | | The array of __ColumnInfo__ definitions of the table's columns. |
| comment | text | | User-provided free-form text description. |
| created_at | timestamp with time zone | | Time at which this table was created. |
| created_by | text | | The user who created the table. |
| data_access_configuration_id | text | | Unique ID of the Data Access Configuration to use with the table data. |
| data_source_format | text | | Data source format. |
| deleted_at | timestamp with time zone | | Time at which this table was deleted. |
| delta_runtime_properties_kvpairs | jsonb | | Information pertaining to current state of the delta table. |
| full_name | text | = | Full name of table, in form of __catalog_name__.__schema_name__.__table_name__. |
| metastore_id | text | | Unique identifier of parent metastore. |
| name | text | | Name of table, relative to parent schema. |
| owner | text | | Username of current owner of table. |
| properties | jsonb | | A map of key-value properties attached to the securable. |
| row_filter | jsonb | | The row filter associated with the table. |
| schema_name | text | = | Name of parent schema relative to its parent catalog. |
| sql_path | text | | List of schemes whose objects can be referenced without qualification. |
| storage_credential_name | text | | Name of the storage credential, when a storage credential is configured for use with this table. |
| storage_location | text | | Storage root URL for table. |
| table_constraints | jsonb | | Table constraints associated with the table. |
| table_effective_permissions | jsonb | | The table effective permissions associated with the table. |
| table_id | text | | Name of table, relative to parent schema. |
| table_permissions | jsonb | | The table permissions associated with the table. |
| table_type | text | | The type of the table. |
| title | text | | The title of the resource. |
| updated_at | timestamp with time zone | | Time at which this table was last updated. |
| updated_by | text | | The user who last updated the table. |
| view_definition | text | | The SQL text defining the view. |
| view_dependencies | jsonb | | View dependencies associated with the table. |
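The jsonb columns above can be queried with the same JSON functions used in the examples earlier on this page. As an illustrative sketch (not an official example), the query below unpacks the `columns` field into one row per column; it assumes each array element follows the Databricks __ColumnInfo__ shape with `name` and `type_name` keys. The SQLite form would use `json_each` and `json_extract` in the same way as the other SQLite examples.

```sql+postgres
-- Illustrative sketch: expand the columns jsonb array into one row per column.
-- Assumes each element carries 'name' and 'type_name' keys (Databricks ColumnInfo).
select
  name as table_name,
  col ->> 'name' as column_name,
  col ->> 'type_name' as column_type
from
  databricks_catalog_table,
  jsonb_array_elements(columns) as col
where
  catalog_name = 'catalog'
  and schema_name = 'schema';
```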
## Export
This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.
You can download the tarball for your platform from the Releases page, but it is simplest to install them with the `steampipe_export_installer.sh` script:
```bash
/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- databricks
```
You can pass the configuration to the command with the `--config` argument:
```bash
steampipe_export_databricks --config '<your_config>' databricks_catalog_table
```
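The config string holds the plugin's connection arguments in key="value" form. The following is a hypothetical sketch only: the argument names (account_id, account_token, account_host) are assumed from the Databricks plugin's connection configuration, so verify the exact names and required values in the plugin's configuration docs before use.

```bash
# Hypothetical sketch: argument names and values are placeholders; confirm the
# exact connection arguments in the Databricks plugin's configuration docs.
steampipe_export_databricks \
  --config 'account_id="your-account-id" account_token="your-token" account_host="https://accounts.cloud.databricks.com/"' \
  databricks_catalog_table
```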