steampipe plugin install gcp

Table: gcp_bigquery_dataset - Query Google Cloud Platform BigQuery Datasets using SQL

Google Cloud Platform's BigQuery is a serverless, highly scalable, and cost-effective multi-cloud data warehouse designed for business agility. BigQuery Datasets are top-level containers that are used to organize and control access to tables and views. They provide a mechanism for grouping tables and views, and setting permissions at a group level.

Table Usage Guide

The gcp_bigquery_dataset table provides insights into BigQuery Datasets within Google Cloud Platform. As a data analyst or data engineer, explore dataset-specific details through this table, including access controls, locations, and associated metadata. Utilize it to uncover information about datasets, such as their default partition type, default table expiration settings, and the labels applied to them.

Examples

Basic info

Explore the creation times and locations of your Google Cloud Platform BigQuery datasets. This is useful for understanding when and where your data resources were established, which can aid in resource management and optimization.

select
dataset_id,
self_link,
creation_time,
location
from
gcp_bigquery_dataset;
select
dataset_id,
self_link,
creation_time,
location
from
gcp_bigquery_dataset;

List datasets which are not encrypted using CMK

Identify instances where your datasets are not encrypted using a Customer-Managed Key (CMK). This helps in enhancing the security of your data by ensuring that all datasets are encrypted with a key that you manage and control.

select
dataset_id,
location,
kms_key_name
from
gcp_bigquery_dataset
where
kms_key_name is null;
select
dataset_id,
location,
kms_key_name
from
gcp_bigquery_dataset
where
kms_key_name is null;

List publicly accessible datasets

Discover the segments that are publicly accessible for data analysis and manipulation. This is useful in identifying potential data privacy risks or for making data publicly available for collaboration.

select
dataset_id,
location,
ls as access_policy
from
gcp_bigquery_dataset,
jsonb_array_elements(access) as ls
where
ls ->> 'specialGroup' = 'allAuthenticatedUsers'
or ls ->> 'iamMember' = 'allUsers';
select
dataset_id,
location,
ls.value as access_policy
from
gcp_bigquery_dataset,
json_each(access) as ls
where
json_extract(ls.value, '$.specialGroup') = 'allAuthenticatedUsers'
or json_extract(ls.value, '$.iamMember') = 'allUsers';

List datasets which do not have owner tag key

Discover datasets that lack an assigned owner within the Google Cloud Platform's BigQuery service. This query can be useful to identify potential gaps in data ownership and accountability.

select
dataset_id,
location
from
gcp_bigquery_dataset
where
tags -> 'owner' is null;
select
dataset_id,
location
from
gcp_bigquery_dataset
where
json_extract(tags, '$.owner') is null;

Schema for gcp_bigquery_dataset

NameTypeOperatorsDescription
_ctxjsonbSteampipe context in JSON form, e.g. connection_name.
accessjsonbAn array of objects that define dataset access for one or more entities.
akasjsonbArray of globally unique identifier strings (also known as) for the resource.
creation_timetimestamp with time zoneThe time when this dataset was created.
dataset_idtext=The ID of the dataset resource.
default_partition_expiration_msbigintThe default partition expiration for all partitioned tables in the dataset, in milliseconds.
default_table_expiration_msbigintThe default lifetime of all tables in the dataset, in milliseconds.
descriptiontextA user-friendly description of the dataset.
etagtextA hash of the resource.
idtextThe fully-qualified, unique, opaque ID of the dataset.
kindtextThe type of the resource. This property always returns the value 'bigquery#dataset'.
kms_key_nametextDescribes the Cloud KMS encryption key that will be used to protect destination BigQuery table.
labelsjsonbA set of labels associated with this dataset.
last_modified_timetimestamp with time zoneThe date when this dataset or any of its tables was last modified.
locationtextThe GCP multi-region, region, or zone in which the resource is located.
nametextA descriptive name for the dataset, if one exists.
projecttextThe GCP Project in which the resource is located.
self_linktextAn URL that can be used to access the resource again.
tagsjsonbA map of tags for the resource.
titletextTitle of the resource.

Export

This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.

You can download the tarball for your platform from the Releases page, but it is simplest to install them with the steampipe_export_installer.sh script:

/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- gcp

You can pass the configuration to the command with the --config argument:

steampipe_export_gcp --config '<your_config>' gcp_bigquery_dataset