Table: databricks_compute_cluster_node_type - Query Databricks Compute Cluster Node Types using SQL
Databricks Compute Cluster Node Types are the units of processing power and memory that Databricks uses to run computations. Each node type has specific attributes, limitations, and capabilities. The node types are designed to optimize the performance of Databricks workloads.
Table Usage Guide
The databricks_compute_cluster_node_type table provides insights into the node types available in Databricks Compute Clusters. As a Data Engineer, explore node type-specific details through this table, including memory, CPU, and storage attributes, as well as any limitations or special capabilities. Utilize it to select the most suitable node type for your specific Databricks workloads, ensuring optimal performance and cost-effectiveness.
Examples
Basic info
Explore the different categories of compute cluster node types in Databricks, understanding their memory and core capacities. This information can help optimize resource allocation and performance across different accounts.
```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type;
```
List total node types per category
Explore the distribution of node types across different categories within your account. This allows you to understand your usage patterns and potentially optimize resource allocation.
```sql
select
  category,
  count(*) as num_node_types,
  account_id
from
  databricks_compute_cluster_node_type
group by
  category,
  account_id;
```
List node types encrypted in transit
Explore which node types within your Databricks compute cluster are encrypted in transit. This can be useful for ensuring security compliance across your data processing infrastructure.
```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_encrypted_in_transit;
```

```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_encrypted_in_transit = 1;
```
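The two variants above differ only in how the boolean filter is written: PostgreSQL can test is_encrypted_in_transit directly, while SQLite has no boolean type and stores such flags as integers, hence the explicit `= 1` comparison. A minimal sketch of that difference using Python's built-in sqlite3 module; the table name and rows below are mock data for illustration, not real Databricks output:

```python
import sqlite3

# In-memory SQLite database with a mock node-type table.
conn = sqlite3.connect(":memory:")
conn.execute("""
    create table node_types (
        node_type_id text,
        is_encrypted_in_transit integer  -- SQLite has no boolean type; 0/1 is used
    )
""")
conn.executemany(
    "insert into node_types values (?, ?)",
    [("i3.xlarge", 1), ("m4.large", 0)],
)

# SQLite variant: compare the flag against the integer value explicitly.
rows = conn.execute(
    "select node_type_id from node_types where is_encrypted_in_transit = 1"
).fetchall()
print(rows)  # [('i3.xlarge',)]
```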
List node types with I/O caching enabled
Explore which node types have I/O caching enabled in your Databricks compute cluster. This can help you determine which nodes could potentially offer enhanced performance due to caching, aiding in efficient resource allocation.
```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_io_cache_enabled;
```
List node types that support port forwarding
Discover the types of nodes that support port forwarding to understand their characteristics and capabilities. This can help optimize your network configuration by choosing nodes that best meet your port forwarding needs.
```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  support_port_forwarding;
```

```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  support_port_forwarding = 1;
```
Get node instance type details for each node type
Discover the details of each node type in your system by understanding the specifics of the instance type, such as disk size. This could be useful to manage resources and plan for infrastructure upgrades.
```sql
select
  node_type_id,
  node_instance_type ->> 'instance_type_id' as instance_type_id,
  node_instance_type ->> 'local_disk_size_gb' as local_disk_size_gb,
  node_instance_type ->> 'local_disks' as local_disks,
  account_id
from
  databricks_compute_cluster_node_type;
```

```sql
select
  node_type_id,
  json_extract(node_instance_type, '$.instance_type_id') as instance_type_id,
  json_extract(node_instance_type, '$.local_disk_size_gb') as local_disk_size_gb,
  json_extract(node_instance_type, '$.local_disks') as local_disks,
  account_id
from
  databricks_compute_cluster_node_type;
```
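Because node_instance_type is a JSON column, the two variants use PostgreSQL's ->> operator and SQLite's json_extract function respectively. A small, self-contained illustration of the SQLite side using Python's sqlite3 module; the JSON payload is made up to mirror the shape of the column, not copied from a real API response:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table node_types (node_type_id text, node_instance_type text)"
)

# Mock JSON payload shaped like the node_instance_type column.
payload = {"instance_type_id": "i3.xlarge", "local_disk_size_gb": 0, "local_disks": 1}
conn.execute(
    "insert into node_types values (?, ?)",
    ("i3.xlarge", json.dumps(payload)),
)

# json_extract pulls individual fields out of the stored JSON document.
row = conn.execute(
    """
    select
      node_type_id,
      json_extract(node_instance_type, '$.instance_type_id'),
      json_extract(node_instance_type, '$.local_disk_size_gb')
    from node_types
    """
).fetchone()
print(row)  # ('i3.xlarge', 'i3.xlarge', 0)
```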
List hidden node types
Discover the hidden node types within your Databricks compute cluster to better manage resources and understand the configuration of your system. This helps in optimizing the usage of memory and cores, thereby improving overall system efficiency.
```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_hidden;
```

```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_hidden = 1;
```
List Graviton node types
Analyze the configuration of your Databricks compute cluster to identify instances where Graviton node types are being used. This could be useful for assessing the efficiency and performance of your data processing tasks.
```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_graviton;
```

```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_graviton = 1;
```
List all non-deprecated node types
Explore the different types of non-deprecated nodes in your Databricks compute cluster to understand their categories, descriptions, and hardware specifications such as memory and core count. This can be useful for optimizing resource allocation and identifying suitable node types for your specific workload requirements.
```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  not is_deprecated;
```

```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_deprecated = 0;
```
List node types having more than one GPU
Discover the segments that have more than one GPU within your Databricks compute cluster node types. This can be useful in identifying high-performance node types, thus aiding in resource allocation and optimization.
```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  num_gpus > 1;
```
List node types that support EBS volumes
Discover the types of nodes that are compatible with EBS volumes in a Databricks computing environment. This can be useful when planning resource allocation or designing data processing tasks.
```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  support_ebs_volumes;
```

```sql
select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  support_ebs_volumes = 1;
```
List node types in order of available memory
Explore the types of nodes in your Databricks compute cluster, ordered by the amount of available memory. This can help prioritize resource allocation and optimize cluster performance.
```sql
select
  node_type_id,
  category,
  memory_mb
from
  databricks_compute_cluster_node_type
order by
  memory_mb desc;
```
Schema for databricks_compute_cluster_node_type
Name | Type | Operators | Description |
---|---|---|---|
_ctx | jsonb | | Steampipe context in JSON form, e.g. connection_name. |
account_id | text | | The Databricks Account ID in which the resource is located. |
category | text | | Category of the node type. |
description | text | | A string description associated with this node type. |
display_order | bigint | | Display order of the node type. |
instance_type_id | text | | An identifier for the type of hardware that this node runs on. |
is_deprecated | boolean | | Whether the node type is deprecated. |
is_encrypted_in_transit | boolean | | AWS specific, whether this instance supports encryption in transit, used for HIPAA and PCI workloads. |
is_graviton | boolean | | Whether this instance is a Graviton instance. |
is_hidden | boolean | | Whether the node type is hidden. |
is_io_cache_enabled | boolean | | Flag indicating whether I/O cache is enabled for the node type. |
memory_mb | bigint | | Memory (in MB) available for this node type. |
node_info_status | jsonb | | Node info status information. |
node_instance_type | jsonb | | Node instance type information. |
node_type_id | text | | Unique identifier for this node type. |
num_cores | double precision | | Number of cores for the node type. |
num_gpus | bigint | | Number of GPUs for the node type. |
photon_driver_capable | boolean | | Indicates whether this node type is capable of being a Photon driver. |
photon_worker_capable | boolean | | Indicates whether this node type is capable of being a Photon worker. |
support_cluster_tags | boolean | | Flag indicating whether the node type supports cluster tags. |
support_ebs_volumes | boolean | | Flag indicating whether the node type supports EBS volumes. |
support_port_forwarding | boolean | | Flag indicating whether the node type supports port forwarding. |
title | text | | The title of the resource. |
Export
This table is available as a standalone Exporter CLI. Steampipe exporters are standalone binaries that allow you to extract data using Steampipe plugins without a database.
You can download the tarball for your platform from the Releases page, but it is simplest to install them with the steampipe_export_installer.sh script:
```shell
/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- databricks
```
You can pass the configuration to the command with the --config argument:

```shell
steampipe_export_databricks --config '<your_config>' databricks_compute_cluster_node_type
```