steampipe plugin install gcp

Table: gcp_dataplex_lake - Query GCP Dataplex Lakes using SQL

GCP Dataplex Lakes are managed data lakes that provide unified analytics and governance for data at scale. Dataplex automates the discovery, organization, and management of data across various storage systems.

Table Usage Guide

The gcp_dataplex_lake table allows data engineers and cloud administrators to query Dataplex Lakes within their GCP environment. You can retrieve each lake's configuration, state, associated metastore, and more. This table is useful for monitoring the state and metadata of Dataplex Lakes.

Examples

Basic info

Retrieve a list of all Dataplex Lakes in your GCP account to get an overview of your managed data lakes.

select
  display_name,
  name,
  state,
  create_time,
  service_account
from
  gcp_dataplex_lake;

Dataplex Lakes by location

Explore which regions have the most Dataplex Lakes to understand your data infrastructure distribution better.

select
  location,
  count(*)
from
  gcp_dataplex_lake
group by
  location;
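
The description above asks which regions have the most lakes, but the query returns the groups in no particular order. A minimal sketch that names the count and sorts by it; the lake_count alias is introduced here for illustration only:

```sql
-- Count lakes per location, largest first.
-- lake_count is an illustrative alias, not a column of the table.
select
  location,
  count(*) as lake_count
from
  gcp_dataplex_lake
group by
  location
order by
  lake_count desc;
```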

Get details of lakes with a specific state

Retrieve Dataplex Lakes in a specific state (e.g., ACTIVE) to monitor their status.

select
  name,
  state,
  create_time,
  update_time
from
  gcp_dataplex_lake
where
  state = 'ACTIVE';
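
For monitoring, the inverse filter may be more useful: listing lakes that are not ACTIVE. This is a sketch assuming that any other state deserves a closer look; this page does not list the possible state values, so the query does not name them.

```sql
-- Lakes in any state other than ACTIVE, most recently updated first.
-- Assumption: non-ACTIVE states are the ones worth reviewing.
select
  name,
  state,
  location,
  update_time
from
  gcp_dataplex_lake
where
  state <> 'ACTIVE'
order by
  update_time desc;
```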

Get Dataplex Lakes with the associated metastore settings

List all Dataplex Lakes that have an associated Dataproc Metastore, including their metastore settings and status.

Postgres:

select
  name,
  metastore ->> 'service' as metastore_service,
  metastore_status ->> 'state' as metastore_state,
  location
from
  gcp_dataplex_lake
where
  metastore is not null;

SQLite:

select
  name,
  json_extract(metastore, '$.service') as metastore_service,
  json_extract(metastore_status, '$.state') as metastore_state,
  location
from
  gcp_dataplex_lake
where
  metastore is not null;

Schema for gcp_dataplex_lake

| Name | Type | Operators | Description |
|---|---|---|---|
| _ctx | jsonb | | Steampipe context in JSON form. |
| akas | jsonb | | Array of globally unique identifier strings (also known as) for the resource. |
| asset_status | jsonb | | Aggregated status of the underlying assets of the lake. |
| create_time | timestamp with time zone | | The time when the lake was created. |
| description | text | | Description of the lake. |
| display_name | text | `=` | User friendly display name. |
| location | text | | The GCP multi-region, region, or zone in which the resource is located. |
| metastore | jsonb | | Settings to manage lake and Dataproc Metastore service instance association. |
| metastore_status | jsonb | | Metastore status of the lake. |
| name | text | `=` | The relative resource name of the lake. |
| project | text | `=`, `!=`, `~~`, `~~*`, `!~~`, `!~~*` | The GCP Project in which the resource is located. |
| self_link | text | | Server-defined URL for the resource. |
| service_account | text | | Service account associated with this lake. This service account must be authorized to access or operate on resources managed by the lake. |
| sp_connection_name | text | `=`, `!=`, `~~`, `~~*`, `!~~`, `!~~*` | Steampipe connection name. |
| sp_ctx | jsonb | | Steampipe context in JSON form. |
| state | text | `=` | Current state of the lake. |
| tags | jsonb | | A map of tags for the resource. |
| title | text | | Title of the resource. |
| uid | text | | System generated globally unique ID for the lake. |
| update_time | timestamp with time zone | | The time when the lake was last updated. |
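
The Operators column appears to list the comparison operators supported for filtering on each column. Treating that reading as an assumption, a query that filters on two such columns might look like the sketch below. The project ID my-project is hypothetical.

```sql
-- Restrict the query to one project and one state.
-- 'my-project' is a hypothetical project ID; substitute your own.
select
  name,
  display_name,
  state,
  location
from
  gcp_dataplex_lake
where
  project = 'my-project'
  and state = 'ACTIVE';
```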

Export

This table is available as a standalone Exporter CLI. Steampipe exporters are standalone binaries that let you extract data using Steampipe plugins without a database.

You can download the tarball for your platform from the Releases page, but it is simplest to install the exporter with the steampipe_export_installer.sh script:

/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- gcp

You can pass the configuration to the command with the --config argument:

steampipe_export_gcp --config '<your_config>' gcp_dataplex_lake