steampipe plugin install gcp

Table: gcp_dataproc_cluster - Query Google Cloud Platform Dataproc Clusters using SQL

Google Cloud Dataproc is a fast, easy-to-use, fully managed cloud service for running Apache Spark and Apache Hadoop clusters in a simpler, more cost-efficient way. Operations that used to take hours or days take seconds or minutes instead, and you pay only for the resources you use. Dataproc also easily integrates with other Google Cloud services, giving you a powerful and complete data processing platform.

Table Usage Guide

The gcp_dataproc_cluster table provides insights into Dataproc Clusters within Google Cloud Platform. As a data engineer, you can explore cluster-specific details through this table, including configurations, status, and associated metadata. Use it to find clusters with specific configurations, check the operational status of clusters, and verify their associated metadata.

Examples

Basic info

Explore the configuration and status of your Google Cloud Platform Dataproc clusters. This can help you assess the current state and settings of your clusters for better resource management and optimization.

select
  cluster_name,
  cluster_uuid,
  config,
  state,
  tags
from
  gcp_dataproc_cluster;

List the clusters which are in error state

Identify which clusters are in an error state so that you can troubleshoot and resolve issues promptly. This matters whenever applications and services depend on the health of those clusters.

select
  cluster_name,
  cluster_uuid,
  state
from
  gcp_dataproc_cluster
where
  state = 'ERROR';
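
As a complement to filtering on a single state, a summary by state can show how many clusters are unhealthy at a glance. The following is a minimal sketch that uses only columns listed in the schema for this table; the alias cluster_count is introduced here for illustration.

```sql
-- Count clusters per state, most common state first.
-- Uses only the state column from gcp_dataproc_cluster.
select
  state,
  count(*) as cluster_count
from
  gcp_dataproc_cluster
group by
  state
order by
  cluster_count desc;
```

The query avoids dialect-specific syntax, so it should behave the same in Postgres and SQLite.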

Get config details of a cluster

Explore the configuration details of a specific cluster, such as its endpoint configuration, bucket configuration, shielded instance configuration, and master configuration. This can be particularly useful for understanding and managing the cluster's settings.

-- Postgres
select
  cluster_name,
  config -> 'endpointConfig' as endpoint_config,
  config -> 'configBucket' as config_bucket,
  config -> 'shieldedInstanceConfig' as shielded_instance_config,
  config -> 'masterConfig' as master_config
from
  gcp_dataproc_cluster
where
  cluster_name = 'cluster-5824';

-- SQLite
select
  cluster_name,
  json_extract(config, '$.endpointConfig') as endpoint_config,
  json_extract(config, '$.configBucket') as config_bucket,
  json_extract(config, '$.shieldedInstanceConfig') as shielded_instance_config,
  json_extract(config, '$.masterConfig') as master_config
from
  gcp_dataproc_cluster
where
  cluster_name = 'cluster-5824';
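
The same JSON operators can reach one level deeper to pull individual values out of the config column. The sketch below uses Postgres syntax and assumes the nested keys follow the Dataproc API's instance group naming (machineTypeUri, numInstances) and that a workerConfig object is present alongside masterConfig. Neither assumption is confirmed by this page, so inspect the config column of your own clusters before relying on these paths.

```sql
-- Postgres: extract scalar values from nested config objects.
-- Assumed keys (not shown on this page): machineTypeUri, numInstances, workerConfig.
select
  cluster_name,
  config -> 'masterConfig' ->> 'machineTypeUri' as master_machine_type,
  (config -> 'masterConfig' ->> 'numInstances')::int as master_instances,
  (config -> 'workerConfig' ->> 'numInstances')::int as worker_instances
from
  gcp_dataproc_cluster;
```

The -> operator returns JSON, while ->> returns text, which is why the instance counts are cast to int. If a key is missing, the corresponding column is simply null.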

Schema for gcp_dataproc_cluster

| Name | Type | Operators | Description |
|------|------|-----------|-------------|
| _ctx | jsonb | | Steampipe context in JSON form. |
| akas | jsonb | | Array of globally unique identifier strings (also known as) for the resource. |
| cluster_name | text | = | The cluster name. |
| cluster_uuid | text | | A cluster UUID (Unique Universal Identifier). Dataproc generates this value when it creates the cluster. |
| config | jsonb | | The cluster config. |
| labels | jsonb | | The labels to associate with this cluster. |
| location | text | | The GCP multi-region, region, or zone in which the resource is located. |
| metrics | jsonb | | Contains cluster daemon metrics such as HDFS and YARN stats. |
| project | text | =, !=, ~~, ~~*, !~~, !~~* | The GCP Project in which the resource is located. |
| self_link | text | | Server-defined URL for the resource. |
| sp_connection_name | text | =, !=, ~~, ~~*, !~~, !~~* | Steampipe connection name. |
| sp_ctx | jsonb | | Steampipe context in JSON form. |
| state | text | = | The cluster's state. |
| status | jsonb | | Cluster status. |
| status_history | jsonb | | The previous cluster status. |
| tags | jsonb | | A map of tags for the resource. |
| title | text | | Title of the resource. |
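
The Operators column appears to list the comparison operators each column accepts as a qualifier in a where clause. In PostgreSQL, ~~ and ~~* are the operator spellings of like and ilike, and !~~ and !~~* are their negations. The sketch below combines a pattern match on project with an equality match on state. The project prefix 'prod-%' is a hypothetical value chosen for illustration, and 'RUNNING' is assumed to be one of the state values your clusters report.

```sql
-- Postgres: filter on columns that list operators in the schema.
-- 'prod-%' is a hypothetical project naming pattern.
select
  cluster_name,
  project,
  location,
  state,
  labels
from
  gcp_dataproc_cluster
where
  project like 'prod-%'
  and state = 'RUNNING';
```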

Export

This table is available as a standalone Exporter CLI. Steampipe exporters are standalone binaries that allow you to extract data using Steampipe plugins without a database.

You can download the tarball for your platform from the Releases page, but it is simplest to install the exporter with the steampipe_export_installer.sh script:

/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- gcp

You can pass the configuration to the command with the --config argument:

steampipe_export_gcp --config '<your_config>' gcp_dataproc_cluster