turbot/databricks
steampipe plugin install databricks

Table: databricks_compute_cluster_node_type - Query Databricks Compute Cluster Node Types using SQL

Databricks Compute Cluster Node Types are the units of processing power and memory that Databricks uses to run computations. Each node type has specific attributes, limitations, and capabilities. The node types are designed to optimize the performance of Databricks workloads.

Table Usage Guide

The databricks_compute_cluster_node_type table provides insights into the node types available in Databricks Compute Clusters. As a Data Engineer, explore node type-specific details through this table, including memory, CPU, and storage attributes, as well as any limitations or special capabilities. Utilize it to select the most suitable node type for your specific Databricks workloads, ensuring optimal performance and cost-effectiveness.

Examples

Basic info

Explore the different categories of compute cluster node types in Databricks, understanding their memory and core capacities. This information can help optimize resource allocation and performance across different accounts.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type;

List total node types per category

Explore the distribution of node types across different categories within your account. This allows you to understand your usage patterns and potentially optimize resource allocation.

PostgreSQL:

select
  category,
  count(*) as num_node_types,
  account_id
from
  databricks_compute_cluster_node_type
group by
  category,
  account_id;

SQLite:

select
  category,
  count(*) as num_node_types,
  account_id
from
  databricks_compute_cluster_node_type
group by
  category,
  account_id;

List node types encrypted in transit

Explore which node types within your Databricks compute cluster are encrypted in transit. This can be useful for ensuring security compliance across your data processing infrastructure.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_encrypted_in_transit;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_encrypted_in_transit = 1;

List node types with I/O caching enabled

Explore which node types have I/O caching enabled in your Databricks compute cluster. This can help you determine which nodes could potentially offer enhanced performance due to caching, aiding in efficient resource allocation.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_io_cache_enabled;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_io_cache_enabled;

List node types that support port forwarding

Discover the types of nodes that support port forwarding to understand their characteristics and capabilities. This can help optimize your network configuration by choosing nodes that best meet your port forwarding needs.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  support_port_forwarding;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  support_port_forwarding = 1;

Get node instance type details for each node type

Discover the details of each node type in your system by understanding the specifics of the instance type, such as disk size. This could be useful to manage resources and plan for infrastructure upgrades.

PostgreSQL:

select
  node_type_id,
  node_instance_type ->> 'instance_type_id' as instance_type_id,
  node_instance_type ->> 'local_disk_size_gb' as local_disk_size_gb,
  node_instance_type ->> 'local_disks' as local_disks,
  account_id
from
  databricks_compute_cluster_node_type;

SQLite:

select
  node_type_id,
  json_extract(node_instance_type, '$.instance_type_id') as instance_type_id,
  json_extract(node_instance_type, '$.local_disk_size_gb') as local_disk_size_gb,
  json_extract(node_instance_type, '$.local_disks') as local_disks,
  account_id
from
  databricks_compute_cluster_node_type;
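
You can also filter on these nested values directly. For example, the following PostgreSQL query (the 100 GB threshold is illustrative) casts the extracted disk size to an integer and keeps only node types whose instance type offers at least that much local disk:

select
  node_type_id,
  (node_instance_type ->> 'local_disk_size_gb')::int as local_disk_size_gb,
  account_id
from
  databricks_compute_cluster_node_type
where
  (node_instance_type ->> 'local_disk_size_gb')::int >= 100;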

List hidden node types

Discover the hidden node types within your Databricks compute cluster to better manage resources and understand the configuration of your system. This helps in optimizing the usage of memory and cores, thereby improving overall system efficiency.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_hidden;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_hidden = 1;

List Graviton node types

Analyze the configuration of your Databricks compute cluster to identify instances where Graviton node types are being used. This could be useful for assessing the efficiency and performance of your data processing tasks.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_graviton;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_graviton = 1;

List all non-deprecated node types

Explore the different types of non-deprecated nodes in your Databricks compute cluster to understand their categories, descriptions, and hardware specifications such as memory and core count. This can be useful for optimizing resource allocation and identifying suitable node types for your specific workload requirements.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  not is_deprecated;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_deprecated = 0;

List node types having more than one GPU

Discover the node types that have more than one GPU in your Databricks compute cluster. This can be useful in identifying high-performance node types, thus aiding in resource allocation and optimization.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  num_gpus > 1;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  num_gpus > 1;

List node types that support EBS volumes

Discover the types of nodes that are compatible with EBS volumes in a Databricks computing environment. This can be useful when planning resource allocation or designing data processing tasks.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  support_ebs_volumes;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  support_ebs_volumes = 1;

List node types in order of available memory

Explore the types of nodes in your Databricks compute cluster, ordered by the amount of available memory. This can help prioritize resource allocation and optimize cluster performance.

PostgreSQL:

select
  node_type_id,
  category,
  memory_mb
from
  databricks_compute_cluster_node_type
order by
  memory_mb desc;

SQLite:

select
  node_type_id,
  category,
  memory_mb
from
  databricks_compute_cluster_node_type
order by
  memory_mb desc;
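
The filters above can also be combined to shortlist candidate node types for a workload. For example, this PostgreSQL query (the core count threshold is illustrative) lists non-deprecated, Photon-worker-capable node types with at least 4 cores, ordered by available memory:

select
  node_type_id,
  category,
  memory_mb,
  num_cores,
  num_gpus
from
  databricks_compute_cluster_node_type
where
  not is_deprecated
  and photon_worker_capable
  and num_cores >= 4
order by
  memory_mb desc;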

Schema for databricks_compute_cluster_node_type

Name | Type | Operators | Description
_ctx | jsonb | | Steampipe context in JSON form, e.g. connection_name.
account_id | text | | The Databricks Account ID in which the resource is located.
category | text | | Category of the node type.
description | text | | A string description associated with this node type.
display_order | bigint | | Display order of the node type.
instance_type_id | text | | An identifier for the type of hardware that this node runs on.
is_deprecated | boolean | | Whether the node type is deprecated.
is_encrypted_in_transit | boolean | | AWS specific, whether this instance supports encryption in transit, used for HIPAA and PCI workloads.
is_graviton | boolean | | Whether this instance is a Graviton instance.
is_hidden | boolean | | Whether the node type is hidden.
is_io_cache_enabled | boolean | | Flag indicating whether I/O cache is enabled for the node type.
memory_mb | bigint | | Memory (in MB) available for this node type.
node_info_status | jsonb | | Node info status information.
node_instance_type | jsonb | | Node instance type information.
node_type_id | text | | Unique identifier for this node type.
num_cores | double precision | | Number of cores for the node type.
num_gpus | bigint | | Number of GPUs for the node type.
photon_driver_capable | boolean | | Indicates whether this node type is capable of being a Photon driver.
photon_worker_capable | boolean | | Indicates whether this node type is capable of being a Photon worker.
support_cluster_tags | boolean | | Flag indicating whether the node type supports cluster tags.
support_ebs_volumes | boolean | | Flag indicating whether the node type supports EBS volumes.
support_port_forwarding | boolean | | Flag indicating whether the node type supports port forwarding.
title | text | | The title of the resource.

Export

This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.

You can download the tarball for your platform from the Releases page, but it is simplest to install them with the steampipe_export_installer.sh script:

/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- databricks

You can pass the configuration to the command with the --config argument:

steampipe_export_databricks --config '<your_config>' databricks_compute_cluster_node_type