turbot/databricks
steampipe plugin install databricks

Table: databricks_compute_cluster_policy - Query Databricks Compute Cluster Policies using SQL

A Databricks Compute Cluster Policy is a feature within Databricks that allows administrators to manage the specifications and restrictions for the clusters that users can create. It provides a centralized way to control resource usage, including virtual machines, databases, and more. Databricks Compute Cluster Policies help you manage cost and resource utilization by enforcing predefined conditions for cluster creation.
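The definition column of this table stores a policy definition document expressed in Databricks Cluster Policy Definition Language, where each key constrains a cluster attribute. A simplified, illustrative sketch (the attribute names and values here are placeholders, not taken from a real workspace):

{
  "spark_version": { "type": "fixed", "value": "13.3.x-scala2.12", "hidden": true },
  "node_type_id": { "type": "allowlist", "values": ["i3.xlarge", "i3.2xlarge"] },
  "autoscale.max_workers": { "type": "range", "maxValue": 10 }
}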

Table Usage Guide

The databricks_compute_cluster_policy table provides insights into Compute Cluster Policies within Databricks. As a DevOps engineer, explore policy-specific details through this table, including permissions, restrictions, and associated metadata. Utilize it to uncover information about policies, such as those with specific resource restrictions, the relationships between policies and clusters, and the verification of policy conditions.

Examples

Basic info

Explore the creation details and associated information of compute cluster policies in Databricks to understand their origin and usage. This is useful for auditing and managing resource allocation policies in your Databricks environment.

select
name,
policy_id,
created_at_timestamp,
creator_user_name,
description,
account_id
from
databricks_compute_cluster_policy;

List policies created in the last 7 days

Gain insights into the cluster policies created within the past week to understand their configurations and creators. This can be beneficial for tracking the usage and growth of your Databricks environment.

PostgreSQL:

select
name,
policy_id,
created_at_timestamp,
creator_user_name,
description,
account_id
from
databricks_compute_cluster_policy
where
created_at_timestamp >= now() - interval '7 days';

SQLite:

select
name,
policy_id,
created_at_timestamp,
creator_user_name,
description,
account_id
from
databricks_compute_cluster_policy
where
created_at_timestamp >= datetime('now', '-7 days');

List all default policies

Explore which policies are set as default in your Databricks compute cluster. This is useful to understand the standard configurations applied across your account and to identify any potential security or performance implications.

select
name,
policy_id,
created_at_timestamp,
creator_user_name,
description,
account_id
from
databricks_compute_cluster_policy
where
is_default;

List policies having no limit on the number of active clusters per user

Discover the policies that place no limit on the number of active clusters each user can run with them. This can be useful for managing resource allocation and identifying potential areas of system overload.

select
name,
policy_id,
created_at_timestamp,
creator_user_name,
description,
account_id
from
databricks_compute_cluster_policy
where
max_clusters_per_user is null;

Get the ACLs for the policies

Explore the access control lists (ACLs) associated with various policies to understand who has what level of permissions. This can be useful for maintaining security and ensuring appropriate access rights within your Databricks environment.

PostgreSQL:

select
name,
policy_id,
created_at_timestamp,
acl ->> 'user_name' as principal_user_name,
acl ->> 'group_name' as principal_group_name,
acl ->> 'all_permissions' as permission_level
from
databricks_compute_cluster_policy,
jsonb_array_elements(definition -> 'access_control_list') as acl;

SQLite:

select
name,
policy_id,
created_at_timestamp,
json_extract(acl.value, '$.user_name') as principal_user_name,
json_extract(acl.value, '$.group_name') as principal_group_name,
json_extract(acl.value, '$.all_permissions') as permission_level
from
databricks_compute_cluster_policy,
json_each(definition, '$.access_control_list') as acl;
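Building on the ACL expansion above, you can also filter to a single principal. This is a sketch (PostgreSQL syntax) assuming the same definition -> 'access_control_list' structure; the group name 'data-engineers' is a placeholder:

select
name,
policy_id
from
databricks_compute_cluster_policy,
jsonb_array_elements(definition -> 'access_control_list') as acl
where
acl ->> 'group_name' = 'data-engineers';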

Find the account with the most cluster policies

Discover the account that has the highest number of cluster policies. This query can be used to identify potential areas of policy concentration or overload within an account.

select
account_id,
count(*) as policy_count
from
databricks_compute_cluster_policy
group by
account_id
order by
policy_count desc
limit
1;

Schema for databricks_compute_cluster_policy

Name (type, operators): Description

_ctx (jsonb): Steampipe context in JSON form, e.g. connection_name.
account_id (text): The Databricks Account ID in which the resource is located.
created_at_timestamp (timestamp with time zone): The timestamp (in milliseconds) when this Cluster Policy was created.
creator_user_name (text): Creator user name. The field won't be included if the user has already been deleted.
definition (jsonb): Policy definition document expressed in Databricks Cluster Policy Definition Language.
description (text): Additional human-readable description of the cluster policy.
is_default (boolean): If true, policy is a default policy created and managed by Databricks.
max_clusters_per_user (bigint): Max number of clusters per user that can be active using this policy. If not present, there is no max limit.
name (text): Cluster Policy name requested by the user.
policy_family_definition_overrides (text): Policy definition JSON document expressed in Databricks Policy Definition Language.
policy_family_id (text): ID of the policy family.
policy_id (text, operators: =): Canonical unique identifier for the Cluster Policy.
title (text): The title of the resource.
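Since policy_id is the only column listed with the = operator, a point lookup on it lets Steampipe push the filter down rather than listing all policies. A minimal sketch; the policy ID below is a placeholder:

select
name,
description,
definition
from
databricks_compute_cluster_policy
where
policy_id = 'ABCD000011112222';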

Export

This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.

You can download the tarball for your platform from the Releases page, but it is simplest to install them with the steampipe_export_installer.sh script:

/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- databricks

You can pass the configuration to the command with the --config argument:

steampipe_export_databricks --config '<your_config>' databricks_compute_cluster_policy