Table: gcp_dataplex_lake - Query GCP Dataplex Lakes using SQL
GCP Dataplex Lakes are managed data lakes that provide unified analytics and governance for data at scale. Dataplex simplifies data management by automating discovery, organization, and management of data across various storage systems.
Table Usage Guide
The gcp_dataplex_lake table allows data engineers and cloud administrators to query Dataplex Lakes within their GCP environment. You can retrieve each lake's configuration, status, associated metastore, and more. This table is useful for monitoring the state and metadata of Dataplex Lakes.
Examples
Basic info
Retrieve a list of all Dataplex Lakes in your GCP account to get an overview of your managed data lakes.
select
  display_name,
  name,
  state,
  create_time,
  service_account
from
  gcp_dataplex_lake;
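If the overview should surface the newest lakes first, the same columns can be sorted by creation time. This is a minimal sketch that uses only columns listed in the schema; the limit of 10 is an arbitrary choice for illustration.

```sql
-- Ten most recently created lakes (the limit is illustrative)
select
  display_name,
  name,
  state,
  create_time
from
  gcp_dataplex_lake
order by
  create_time desc
limit 10;
```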
Dataplex Lakes by location
Explore which regions have the most Dataplex Lakes to understand your data infrastructure distribution better.
select
  location,
  count(*)
from
  gcp_dataplex_lake
group by
  location;
Get details of lakes with a specific state
Retrieve Dataplex Lakes in a specific state (e.g., ACTIVE) to monitor their status.
select
  name,
  state,
  create_time,
  update_time
from
  gcp_dataplex_lake
where
  state = 'ACTIVE';
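To see how lakes are spread across all states rather than filtering for one, a grouped count may be useful. The query below is a sketch built only from the state column in the schema; the lake_count alias is an illustrative name.

```sql
-- Number of lakes in each state, most common state first
select
  state,
  count(*) as lake_count
from
  gcp_dataplex_lake
group by
  state
order by
  lake_count desc;
```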
Get Dataplex Lakes with the associated metastore settings
List all Dataplex Lakes that have an associated Dataproc Metastore, including their metastore settings and status.
select
  name,
  metastore ->> 'service' as metastore_service,
  metastore_status ->> 'state' as metastore_state,
  location
from
  gcp_dataplex_lake
where
  metastore is not null;

The same query for SQLite:

select
  name,
  json_extract(metastore, '$.service') as metastore_service,
  json_extract(metastore_status, '$.state') as metastore_state,
  location
from
  gcp_dataplex_lake
where
  metastore is not null;
Schema for gcp_dataplex_lake
Name | Type | Operators | Description |
---|---|---|---|
_ctx | jsonb | | Steampipe context in JSON form. |
akas | jsonb | | Array of globally unique identifier strings (also known as) for the resource. |
asset_status | jsonb | | Aggregated status of the underlying assets of the lake. |
create_time | timestamp with time zone | | The time when the lake was created. |
description | text | | Description of the lake. |
display_name | text | = | User friendly display name. |
location | text | | The GCP multi-region, region, or zone in which the resource is located. |
metastore | jsonb | | Settings to manage lake and Dataproc Metastore service instance association. |
metastore_status | jsonb | | Metastore status of the lake. |
name | text | = | The relative resource name of the lake. |
project | text | =, !=, ~~, ~~*, !~~, !~~* | The GCP Project in which the resource is located. |
self_link | text | | Server-defined URL for the resource. |
service_account | text | | Service account associated with this lake. This service account must be authorized to access or operate on resources managed by the lake. |
sp_connection_name | text | =, !=, ~~, ~~*, !~~, !~~* | Steampipe connection name. |
sp_ctx | jsonb | | Steampipe context in JSON form. |
state | text | = | Current state of the lake. |
tags | jsonb | | A map of tags for the resource. |
title | text | | Title of the resource. |
uid | text | | System generated globally unique ID for the lake. |
update_time | timestamp with time zone | | The time when the lake was last updated. |
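The schema lists operators for a few columns (display_name, name, project, sp_connection_name, and state), and these are natural candidates for a where clause. The sketch below filters on two of them; the project ID 'my-project' is a hypothetical placeholder, not a value from this page.

```sql
-- Active lakes in one project; 'my-project' is a hypothetical project ID
select
  name,
  display_name,
  state,
  location
from
  gcp_dataplex_lake
where
  project = 'my-project'
  and state = 'ACTIVE';
```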
Export
This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.
You can download the tarball for your platform from the Releases page, but it is simplest to install them with the steampipe_export_installer.sh script:
/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- gcp
You can pass the configuration to the command with the --config argument:
steampipe_export_gcp --config '<your_config>' gcp_dataplex_lake