turbot/databricks
steampipe plugin install databricks

Table: databricks_compute_cluster_node_type - Query Databricks Compute Cluster Node Types using SQL

Databricks Compute Cluster Node Types are the units of processing power and memory that Databricks uses to run computations. Each node type has specific attributes, limitations, and capabilities. The node types are designed to optimize the performance of Databricks workloads.

Table Usage Guide

The databricks_compute_cluster_node_type table provides insights into the node types available in Databricks Compute Clusters. As a Data Engineer, explore node type-specific details through this table, including memory, CPU, and storage attributes, as well as any limitations or special capabilities. Utilize it to select the most suitable node type for your specific Databricks workloads, ensuring optimal performance and cost-effectiveness.

Examples

Basic info

Explore the different categories of compute cluster node types in Databricks, understanding their memory and core capacities. This information can help optimize resource allocation and performance across different accounts.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type;

List total node types per category

Explore the distribution of node types across different categories within your account. This allows you to understand your usage patterns and potentially optimize resource allocation.

PostgreSQL:

select
  category,
  count(*) as num_node_types,
  account_id
from
  databricks_compute_cluster_node_type
group by
  category,
  account_id;

SQLite:

select
  category,
  count(*) as num_node_types,
  account_id
from
  databricks_compute_cluster_node_type
group by
  category,
  account_id;

List node types encrypted in transit

Explore which node types within your Databricks compute cluster are encrypted in transit. This can be useful for ensuring security compliance across your data processing infrastructure.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_encrypted_in_transit;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_encrypted_in_transit = 1;

List node types with I/O caching enabled

Explore which node types have I/O caching enabled in your Databricks compute cluster. This can help you determine which nodes could potentially offer enhanced performance due to caching, aiding in efficient resource allocation.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_io_cache_enabled;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_io_cache_enabled;

List node types that support port forwarding

Discover the types of nodes that support port forwarding to understand their characteristics and capabilities. This can help optimize your network configuration by choosing nodes that best meet your port forwarding needs.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  support_port_forwarding;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  support_port_forwarding = 1;

Get node instance type details for each node type

Discover the details of each node type in your system by understanding the specifics of the instance type, such as disk size. This could be useful to manage resources and plan for infrastructure upgrades.

PostgreSQL:

select
  node_type_id,
  node_instance_type ->> 'instance_type_id' as instance_type_id,
  node_instance_type ->> 'local_disk_size_gb' as local_disk_size_gb,
  node_instance_type ->> 'local_disks' as local_disks,
  account_id
from
  databricks_compute_cluster_node_type;

SQLite:

select
  node_type_id,
  json_extract(node_instance_type, '$.instance_type_id') as instance_type_id,
  json_extract(node_instance_type, '$.local_disk_size_gb') as local_disk_size_gb,
  json_extract(node_instance_type, '$.local_disks') as local_disks,
  account_id
from
  databricks_compute_cluster_node_type;
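
You can also filter on these nested values directly. For example, the following PostgreSQL query (the 100 GB threshold is illustrative) casts the extracted disk size to an integer and keeps only node types whose instance type offers at least that much local disk:

select
  node_type_id,
  (node_instance_type ->> 'local_disk_size_gb')::int as local_disk_size_gb,
  account_id
from
  databricks_compute_cluster_node_type
where
  (node_instance_type ->> 'local_disk_size_gb')::int >= 100;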

List hidden node types

Discover the hidden node types within your Databricks compute cluster to better manage resources and understand the configuration of your system. This helps in optimizing the usage of memory and cores, thereby improving overall system efficiency.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_hidden;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_hidden = 1;

List Graviton node types

Analyze the configuration of your Databricks compute cluster to identify instances where Graviton node types are being used. This could be useful for assessing the efficiency and performance of your data processing tasks.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_graviton;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_graviton = 1;

List all non-deprecated node types

Explore the different types of non-deprecated nodes in your Databricks compute cluster to understand their categories, descriptions, and hardware specifications such as memory and core count. This can be useful for optimizing resource allocation and identifying suitable node types for your specific workload requirements.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  not is_deprecated;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  is_deprecated = 0;

List node types having more than one GPU

Discover the node types that have more than one GPU in your Databricks compute cluster. This can be useful in identifying high-performance node types, thus aiding in resource allocation and optimization.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  num_gpus > 1;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  num_gpus > 1;

List node types that support EBS volumes

Discover the types of nodes that are compatible with EBS volumes in a Databricks computing environment. This can be useful when planning resource allocation or designing data processing tasks.

PostgreSQL:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  support_ebs_volumes;

SQLite:

select
  node_type_id,
  category,
  description,
  memory_mb,
  num_cores,
  account_id
from
  databricks_compute_cluster_node_type
where
  support_ebs_volumes = 1;

List node types in order of available memory

Explore the types of nodes in your Databricks compute cluster, ordered by the amount of available memory. This can help prioritize resource allocation and optimize cluster performance.

PostgreSQL:

select
  node_type_id,
  category,
  memory_mb
from
  databricks_compute_cluster_node_type
order by
  memory_mb desc;

SQLite:

select
  node_type_id,
  category,
  memory_mb
from
  databricks_compute_cluster_node_type
order by
  memory_mb desc;
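
The filters above can also be combined to shortlist candidate node types for a workload. For example, this PostgreSQL query (the core count threshold is illustrative) lists non-deprecated, Photon-worker-capable node types with at least 4 cores, ordered by available memory:

select
  node_type_id,
  category,
  memory_mb,
  num_cores,
  num_gpus
from
  databricks_compute_cluster_node_type
where
  not is_deprecated
  and photon_worker_capable
  and num_cores >= 4
order by
  memory_mb desc;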

Schema for databricks_compute_cluster_node_type

Name | Type | Operators | Description
_ctx | jsonb | | Steampipe context in JSON form, e.g. connection_name.
account_id | text | | The Databricks Account ID in which the resource is located.
category | text | | Category of the node type.
description | text | | A string description associated with this node type.
display_order | bigint | | Display order of the node type.
instance_type_id | text | | An identifier for the type of hardware that this node runs on.
is_deprecated | boolean | | Whether the node type is deprecated.
is_encrypted_in_transit | boolean | | AWS specific, whether this instance supports encryption in transit, used for HIPAA and PCI workloads.
is_graviton | boolean | | Whether this instance is a Graviton instance.
is_hidden | boolean | | Whether the node type is hidden.
is_io_cache_enabled | boolean | | Flag indicating whether I/O cache is enabled for the node type.
memory_mb | bigint | | Memory (in MB) available for this node type.
node_info_status | jsonb | | Node info status information.
node_instance_type | jsonb | | Node instance type information.
node_type_id | text | | Unique identifier for this node type.
num_cores | double precision | | Number of cores for the node type.
num_gpus | bigint | | Number of GPUs for the node type.
photon_driver_capable | boolean | | Indicates whether this node type is capable of being a Photon driver.
photon_worker_capable | boolean | | Indicates whether this node type is capable of being a Photon worker.
support_cluster_tags | boolean | | Flag indicating whether the node type supports cluster tags.
support_ebs_volumes | boolean | | Flag indicating whether the node type supports EBS volumes.
support_port_forwarding | boolean | | Flag indicating whether the node type supports port forwarding.
title | text | | The title of the resource.

Export

This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.

You can download the tarball for your platform from the Releases page, but it is simplest to install them with the steampipe_export_installer.sh script:

/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- databricks

You can pass the configuration to the command with the --config argument:

steampipe_export_databricks --config '<your_config>' databricks_compute_cluster_node_type