Table: azure_hdinsight_cluster - Query Azure HDInsight Clusters using SQL
Azure HDInsight is a fully managed, open-source analytics service for enterprises. It provides big data cloud offerings and is built on Hadoop, Spark, R, and Hive, among others. It enables processing massive amounts of data and running big data workloads in the cloud.
Table Usage Guide
The azure_hdinsight_cluster
table provides insights into HDInsight clusters within Azure. As a data engineer or data scientist, you can use this table to explore cluster-specific details, including properties, configurations, and states. This table can be utilized to uncover information about clusters, such as their health, location, provisioning state, and type.
Examples
Basic info
Determine the status and details of your Azure HDInsight clusters to manage and optimize your big data analytics. This query is useful in understanding the configuration and operational state of your clusters, including their version details and creation date.
select name, id, provisioning_state, type, cluster_hdp_version, cluster_id, cluster_state, cluster_version, created_datefrom azure_hdinsight_cluster;
select name, id, provisioning_state, type, cluster_hdp_version, cluster_id, cluster_state, cluster_version, created_datefrom azure_hdinsight_cluster;
List clusters with encryption in transit enabled
Determine the clusters that have enhanced security measures in place, specifically, the encryption during data transit. This is useful for auditing and ensuring compliance with data security standards.
select name, id, encryption_in_transit_properties -> 'isEncryptionInTransitEnabled' as is_encryption_in_transit_enabledfrom azure_hdinsight_clusterwhere ( encryption_in_transit_properties ->> 'isEncryptionInTransitEnabled' ) :: boolean;
select name, id, json_extract( encryption_in_transit_properties, '$.isEncryptionInTransitEnabled' ) as is_encryption_in_transit_enabledfrom azure_hdinsight_clusterwhere json_extract( encryption_in_transit_properties, '$.isEncryptionInTransitEnabled' ) = 'true';
List disk encryption details
Explore the encryption details of your Azure HDInsight clusters. This can help you understand your security setup and ensure that the right encryption measures are in place.
select name, id, disk_encryption_properties ->> 'encryptionAlgorithm' as encryption_algorithm, disk_encryption_properties -> 'encryptionAtHost' as encryption_at_host, disk_encryption_properties ->> 'keyName' as key_name, disk_encryption_properties ->> 'keyVersion' as key_version, disk_encryption_properties ->> 'msiResourceId' as msi_resource_id, disk_encryption_properties ->> 'vaultUri' as vault_urifrom azure_hdinsight_cluster;
select name, id, json_extract( disk_encryption_properties, '$.encryptionAlgorithm' ) as encryption_algorithm, json_extract(disk_encryption_properties, '$.encryptionAtHost') as encryption_at_host, json_extract(disk_encryption_properties, '$.keyName') as key_name, json_extract(disk_encryption_properties, '$.keyVersion') as key_version, json_extract(disk_encryption_properties, '$.msiResourceId') as msi_resource_id, json_extract(disk_encryption_properties, '$.vaultUri') as vault_urifrom azure_hdinsight_cluster;
List connectivity endpoint details
Explore the connectivity details of your HDInsight clusters in Azure. This query helps to understand the location, name, port, protocol, and private IP address of each endpoint, allowing for efficient cluster management and troubleshooting.
select name, id, endpoint ->> 'location' as endpoint_location, endpoint ->> 'name' as endpoint_name, endpoint -> 'port' as endpoint_port, endpoint ->> 'protocol' as endpoint_protocol, endpoint ->> 'privateIpAddress' as endpoint_private_ip_addressfrom azure_hdinsight_cluster, jsonb_array_elements(connectivity_endpoints) as endpoint;
select name, c.id, json_extract(endpoint.value, '$.location') as endpoint_location, json_extract(endpoint.value, '$.name') as endpoint_name, json_extract(endpoint.value, '$.port') as endpoint_port, json_extract(endpoint.value, '$.protocol') as endpoint_protocol, json_extract(endpoint.value, '$.privateIpAddress') as endpoint_private_ip_addressfrom azure_hdinsight_cluster as c, json_each(connectivity_endpoints) as endpoint;
Control examples
Schema for azure_hdinsight_cluster
Name | Type | Operators | Description |
---|---|---|---|
_ctx | jsonb | Steampipe context in JSON form. | |
akas | jsonb | Array of globally unique identifier strings (also known as) for the resource. | |
cloud_environment | text | The Azure Cloud Environment. | |
cluster_definition | jsonb | The cluster definition. | |
cluster_hdp_version | text | The hdp version of the cluster. | |
cluster_id | text | The cluster id. | |
cluster_state | text | The state of the cluster. | |
cluster_version | text | The version of the cluster. | |
compute_isolation_properties | jsonb | The compute isolation properties of the cluster. | |
compute_profile | jsonb | The complete profile of the cluster. | |
connectivity_endpoints | jsonb | The list of connectivity endpoints. | |
created_date | text | The date on which the cluster was created. | |
diagnostic_settings | jsonb | A list of active diagnostic settings for the cluster. | |
disk_encryption_properties | jsonb | The disk encryption properties of the cluster. | |
encryption_in_transit_properties | jsonb | The encryption-in-transit properties of the cluster. | |
errors | jsonb | The list of errors. | |
etag | text | The ETag for the resource. | |
excluded_services_config | jsonb | The excluded services config of the cluster. | |
id | text | Fully qualified resource Id for the resource. | |
identity | jsonb | The identity of the cluster, if configured. | |
kafka_rest_properties | jsonb | The cluster kafka rest proxy configuration. | |
min_supported_tls_version | text | The minimal supported tls version of the cluster. | |
name | text | = | The name of the resource. |
network_properties | jsonb | The network properties of the cluster. | |
os_type | text | The type of operating system. Possible values include: 'Windows', 'Linux'. | |
provisioning_state | text | The provisioning state, which only appears in the response. Possible values include: 'InProgress', 'Failed', 'Succeeded', 'Canceled', 'Deleting'. | |
quota_info | jsonb | The quota information of the cluster. | |
region | text | The Azure region/location in which the resource is located. | |
resource_group | text | = | The resource group which holds this resource. |
security_profile | jsonb | The security profile of the cluster. | |
sp_connection_name | text | =, !=, ~~, ~~*, !~~, !~~* | Steampipe connection name. |
sp_ctx | jsonb | Steampipe context in JSON form. | |
storage_profile | jsonb | The storage profile of the cluster. | |
subscription_id | text | =, !=, ~~, ~~*, !~~, !~~* | The Azure Subscription ID in which the resource is located. |
tags | jsonb | A map of tags for the resource. | |
tier | text | The cluster tier. Possible values include: 'Standard', 'Premium'. | |
title | text | Title of the resource. | |
type | text | The type of the resource. |
Export
This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.
You can download the tarball for your platform from the Releases page, but it is simplest to install them with the steampipe_export_installer.sh
script:
/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- azure
You can pass the configuration to the command with the --config
argument:
steampipe_export_azure --config '<your_config>' azure_hdinsight_cluster