# Table: databricks_catalog_system_schema - Query Databricks Catalog System Schemas using SQL
A Databricks Catalog System Schema is a feature within Databricks that provides a unified interface for managing all data objects (tables, views, functions, etc.) and their respective schemas. It allows users to organize their data into databases for better management and querying. The system schema provides system-level details and metadata of the Databricks catalog.
## Table Usage Guide
The `databricks_catalog_system_schema` table provides insights into the system-level details and metadata of the Databricks catalog. As a Data Engineer or Data Scientist, explore specific details through this table, such as the database name, description, location URI, owner name, and more. Utilize it to uncover information about the structure and organization of your Databricks data objects, and to manage and query your data more effectively.
## Examples
### Basic info
Analyze the status of your Databricks system schemas to understand which ones are active within your account. This can help in effectively managing your Databricks resources and ensuring optimal performance.
```sql
select
  metastore_id,
  schema,
  state,
  account_id
from
  databricks_catalog_system_schema;
```
### List all system schemas that are unavailable
Uncover the details of system schemas that are currently unavailable in your Databricks catalog. This can help you identify potential issues with your data management and ensure smooth operation of your systems.
```sql
select
  metastore_id,
  schema,
  state,
  account_id
from
  databricks_catalog_system_schema
where
  state = 'UNAVAILABLE';
```
### Get details of the parent metastore associated with a particular schema
Gain insights into the parent metastore associated with a particular schema in your Databricks catalog. This is useful for understanding the origin and ownership of data, aiding in data governance and management.
```sql
select
  s.title as system_schema_name,
  m.metastore_id,
  m.name as metastore_name,
  m.created_at as metastore_create_time,
  m.owner as metastore_owner,
  m.account_id as metastore_account_id
from
  databricks_catalog_system_schema as s
  left join databricks_catalog_metastore as m on s.metastore_id = m.metastore_id
where
  s.title = 'operational_data';
```
### Find the account with the most schemas
Explore which account has the highest number of schemas, allowing you to identify the account with the most extensive system schema usage. This information can be particularly useful for resource allocation and system optimization efforts.
```sql
select
  account_id,
  count(*) as schema_count
from
  databricks_catalog_system_schema
group by
  account_id
order by
  schema_count desc
limit 1;
```
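### Count system schemas by state

As a variation on the examples above (an illustrative addition, not part of the official example set), you can summarize how many system schemas are in each enablement state. This can help you spot schemas that are still pending opt-in or unavailable.

```sql
select
  state,
  count(*) as schema_count
from
  databricks_catalog_system_schema
group by
  state
order by
  schema_count desc;
```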
## Schema for databricks_catalog_system_schema
| Name | Type | Operators | Description |
|---|---|---|---|
| _ctx | jsonb | | Steampipe context in JSON form. |
| account_id | text | =, !=, ~~, ~~*, !~~, !~~* | The Databricks Account ID in which the resource is located. |
| metastore_id | text | = | Unique identifier of parent metastore. |
| schema | text | | Name of the system schema. |
| sp_connection_name | text | =, !=, ~~, ~~*, !~~, !~~* | Steampipe connection name. |
| sp_ctx | jsonb | | Steampipe context in JSON form. |
| state | text | | The current state of enablement for the system schema. An empty string means the system schema is available and ready for opt-in. |
| title | text | | The title of the resource. |
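The schema above lists pattern operators (`~~`, the Postgres spelling of `like`, and its variants) for the `account_id` column, so `like` filters on that column can be used as qualifiers. A sketch of such a filter, where `'abc%'` is a placeholder account ID prefix:

```sql
select
  metastore_id,
  schema,
  state
from
  databricks_catalog_system_schema
where
  account_id like 'abc%';
```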
## Export
This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.
You can download the tarball for your platform from the Releases page, but it is simplest to install them with the `steampipe_export_installer.sh` script:

```sh
/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- databricks
```
You can pass the configuration to the command with the `--config` argument:

```sh
steampipe_export_databricks --config '<your_config>' databricks_catalog_system_schema
```