# Table: databricks_compute_cluster_policy - Query Databricks Compute Cluster Policies using SQL
A Databricks Compute Cluster Policy is a feature within Databricks that allows administrators to manage the specifications and restrictions for the clusters that users can create. It provides a centralized way to control resource usage, including virtual machines, databases, and more. Databricks Compute Cluster Policies help you manage cost and resource utilization by enforcing predefined conditions for cluster creation.
## Table Usage Guide
The `databricks_compute_cluster_policy` table provides insights into Compute Cluster Policies within Databricks. As a DevOps engineer, explore policy-specific details through this table, including permissions, restrictions, and associated metadata. Utilize it to uncover information about policies, such as those with specific resource restrictions, the relationships between policies and clusters, and the verification of policy conditions.
## Examples
### Basic info
Explore the creation details and associated information of compute cluster policies in Databricks to understand their origin and usage. This is useful for auditing and managing resource allocation policies in your Databricks environment.
```sql+postgres
select
  name,
  policy_id,
  created_at_timestamp,
  creator_user_name,
  description,
  account_id
from
  databricks_compute_cluster_policy;
```

```sql+sqlite
select
  name,
  policy_id,
  created_at_timestamp,
  creator_user_name,
  description,
  account_id
from
  databricks_compute_cluster_policy;
```
### List policies created in the last 7 days
Gain insights into the cluster policies created within the past week to understand their configurations and creators. This can be beneficial for tracking the usage and growth of your Databricks environment.
```sql+postgres
select
  name,
  policy_id,
  created_at_timestamp,
  creator_user_name,
  description,
  account_id
from
  databricks_compute_cluster_policy
where
  created_at_timestamp >= now() - interval '7 days';
```

```sql+sqlite
select
  name,
  policy_id,
  created_at_timestamp,
  creator_user_name,
  description,
  account_id
from
  databricks_compute_cluster_policy
where
  created_at_timestamp >= datetime('now', '-7 days');
```
### List all default policies
Explore which policies are set as default in your Databricks compute cluster. This is useful to understand the standard configurations applied across your account and to identify any potential security or performance implications.
```sql+postgres
select
  name,
  policy_id,
  created_at_timestamp,
  creator_user_name,
  description,
  account_id
from
  databricks_compute_cluster_policy
where
  is_default;
```

```sql+sqlite
select
  name,
  policy_id,
  created_at_timestamp,
  creator_user_name,
  description,
  account_id
from
  databricks_compute_cluster_policy
where
  is_default;
```
### List policies with no limit on the number of active clusters using them
Discover the policies that have no restrictions on the number of active clusters using them. This can be useful in managing resource allocation and identifying potential areas of system overload.
```sql+postgres
select
  name,
  policy_id,
  created_at_timestamp,
  creator_user_name,
  description,
  account_id
from
  databricks_compute_cluster_policy
where
  max_clusters_per_user is null;
```

```sql+sqlite
select
  name,
  policy_id,
  created_at_timestamp,
  creator_user_name,
  description,
  account_id
from
  databricks_compute_cluster_policy
where
  max_clusters_per_user is null;
```
### Get the ACLs for the policies
Explore the access control lists (ACLs) associated with various policies to understand who has what level of permissions. This can be useful for maintaining security and ensuring appropriate access rights within your Databricks compute clusters.
```sql+postgres
select
  name,
  policy_id,
  created_at_timestamp,
  acl ->> 'user_name' as principal_user_name,
  acl ->> 'group_name' as principal_group_name,
  acl ->> 'all_permissions' as permission_level
from
  databricks_compute_cluster_policy,
  jsonb_array_elements(definition -> 'access_control_list') as acl;
```

```sql+sqlite
select
  name,
  policy_id,
  created_at_timestamp,
  json_extract(acl.value, '$.user_name') as principal_user_name,
  json_extract(acl.value, '$.group_name') as principal_group_name,
  json_extract(acl.value, '$.all_permissions') as permission_level
from
  databricks_compute_cluster_policy,
  json_each(definition, '$.access_control_list') as acl;
```
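In the Databricks permissions model, `all_permissions` is typically an array of permission objects rather than a single value. If that shape holds in your `definition` document, you can unnest one level further to list each permission level individually. A sketch, assuming each entry carries a `permission_level` key as in the Databricks permissions API:

```sql+postgres
select
  name,
  acl ->> 'user_name' as principal_user_name,
  perm ->> 'permission_level' as permission_level
from
  databricks_compute_cluster_policy,
  jsonb_array_elements(definition -> 'access_control_list') as acl,
  jsonb_array_elements(acl -> 'all_permissions') as perm;
```

```sql+sqlite
select
  name,
  json_extract(acl.value, '$.user_name') as principal_user_name,
  json_extract(perm.value, '$.permission_level') as permission_level
from
  databricks_compute_cluster_policy,
  json_each(definition, '$.access_control_list') as acl,
  json_each(acl.value, '$.all_permissions') as perm;
```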
### Find the account with the most cluster policies
Discover the account that has the highest number of cluster policies. This query can be used to identify potential areas of policy concentration or overload within an account.
```sql+postgres
select
  account_id,
  count(*) as policy_count
from
  databricks_compute_cluster_policy
group by
  account_id
order by
  policy_count desc
limit 1;
```

```sql+sqlite
select
  account_id,
  count(*) as policy_count
from
  databricks_compute_cluster_policy
group by
  account_id
order by
  policy_count desc
limit 1;
```
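### List clusters governed by each policy
If your connection also provides the plugin's `databricks_compute_cluster` table and it exposes a `policy_id` column (as the Databricks Clusters API does), you can join it with this table to see which clusters each policy governs. A sketch under that assumption:

```sql+postgres
select
  p.name as policy_name,
  c.cluster_name
from
  databricks_compute_cluster_policy as p
  join databricks_compute_cluster as c on c.policy_id = p.policy_id;
```

```sql+sqlite
select
  p.name as policy_name,
  c.cluster_name
from
  databricks_compute_cluster_policy as p
  join databricks_compute_cluster as c on c.policy_id = p.policy_id;
```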
## Schema for databricks_compute_cluster_policy
| Name | Type | Operators | Description |
|---|---|---|---|
| _ctx | jsonb | | Steampipe context in JSON form, e.g. connection_name. |
| account_id | text | | The Databricks Account ID in which the resource is located. |
| created_at_timestamp | timestamp with time zone | | The timestamp when this cluster policy was created. |
| creator_user_name | text | | Creator user name. The field won't be included if the user has already been deleted. |
| definition | jsonb | | Policy definition document expressed in Databricks Cluster Policy Definition Language. |
| description | text | | Additional human-readable description of the cluster policy. |
| is_default | boolean | | If true, the policy is a default policy created and managed by Databricks. |
| max_clusters_per_user | bigint | | Maximum number of clusters per user that can be active using this policy. If not present, there is no limit. |
| name | text | | Cluster policy name requested by the user. |
| policy_family_definition_overrides | text | | Policy definition JSON document expressed in Databricks Policy Definition Language. |
| policy_family_id | text | | ID of the policy family. |
| policy_id | text | = | Canonical unique identifier for the cluster policy. |
| title | text | | The title of the resource. |
## Export
This table is available as a standalone Exporter CLI. Steampipe exporters are standalone binaries that allow you to extract data using Steampipe plugins without a database.

You can download the tarball for your platform from the Releases page, but it is simplest to install it with the `steampipe_export_installer.sh` script:
```sh
/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- databricks
```
You can pass the configuration to the command with the `--config` argument:
```sh
steampipe_export_databricks --config '<your_config>' databricks_compute_cluster_policy
```