steampipe plugin install aws

Table: aws_glue_catalog_table - Query AWS Glue Catalog Tables using SQL

The AWS Glue Catalog Tables are a part of AWS Glue, a fully managed extract, transform, and load (ETL) service that makes it easy for users to prepare and load their data for analytics. AWS Glue Catalog Tables store metadata related to data sources, transformations, and targets, allowing users to discover and manage their data. AWS Glue automatically generates the schema for your data and stores it as tables in its data catalog, which you can use for ETL jobs.

Table Usage Guide

The aws_glue_catalog_table table in Steampipe provides you with information about AWS Glue Catalog Tables. It allows you, whether you're a DevOps engineer, data engineer, or another technical professional, to query table-specific details, including table names, database names, owner information, creation time, and associated metadata. You can utilize this table to gather insights on tables, such as their storage descriptors, partition keys, and table parameters. The schema outlines the various attributes of the AWS Glue Catalog Table for you, including the catalog ID, database name, table type, storage descriptor, and associated tags.

Examples

Basic info

Explore the basic information of your AWS Glue Catalog Tables to understand when and why they were created. This can help in managing your resources and planning future database schemas.

select
name,
catalog_id,
create_time,
description,
database_name
from
aws_glue_catalog_table;
select
name,
catalog_id,
create_time,
description,
database_name
from
aws_glue_catalog_table;

Count the number of tables per catalog

Analyze the distribution of tables across different catalogs in AWS Glue service. This helps in understanding the organization of your data assets and can assist in optimizing data management strategies.

select
catalog_id,
count(name) as table_count
from
aws_glue_catalog_table
group by
catalog_id;
select
catalog_id,
count(name) as table_count
from
aws_glue_catalog_table
group by
catalog_id;

List tables with retention period less than 30 days

Explore which AWS Glue tables have a retention period of less than 30 days. This can be useful in identifying tables that may require a more extended retention period for data backup and disaster recovery purposes.

select
name,
catalog_id,
create_time,
description,
retention
from
aws_glue_catalog_table
where
retention < 30;
select
name,
catalog_id,
create_time,
description,
retention
from
aws_glue_catalog_table
where
retention < 30;

Schema for aws_glue_catalog_table

NameTypeOperatorsDescription
_ctxjsonbSteampipe context in JSON form.
account_idtext=, !=, ~~, ~~*, !~~, !~~*The AWS Account ID in which the resource is located.
akasjsonbArray of globally unique identifier strings (also known as) for the resource.
catalog_idtext=The ID of the Data Catalog in which the table resides.
create_timetimestamp with time zoneThe time when the table definition was created in the data catalog.
created_bytextThe person or entity who created the table.
database_nametext=The name of the database where the table metadata resides.
descriptiontextA description of the table.
federated_tablejsonbA FederatedTable structure that references an entity outside the Glue Data Catalog.
is_registered_with_lake_formationbooleanIndicates whether the table has been registered with lake formation.
last_access_timetimestamp with time zoneThe last time that the table was accessed. This is usually taken from HDFS, and might not be reliable.
last_analyzed_timetimestamp with time zoneThe last time that column statistics were computed for this table.
lf_tagsjsonbLF-Tags assigned to the table by AWS Lake Formation.
nametext=The table name.
ownertextThe owner of the table.
parametersjsonbThese key-value pairs define properties associated with the table.
partitiontextThe AWS partition in which the resource is located (aws, aws-cn, or aws-us-gov).
partition_keysjsonbA list of columns by which the table is partitioned.
regiontextThe AWS Region in which the resource is located.
retentionbigintThe retention time for this table.
sp_connection_nametext=, !=, ~~, ~~*, !~~, !~~*Steampipe connection name.
sp_ctxjsonbSteampipe context in JSON form.
storage_descriptorjsonbA storage descriptor containing information about the physical storage of this table.
table_typetextThe type of this table (EXTERNAL_TABLE, VIRTUAL_VIEW, etc.).
target_tablejsonbA TableIdentifier structure that describes a target table for resource linking.
titletextTitle of the resource.
update_timetimestamp with time zoneThe last time that the table was updated.
version_idtextThe ID of the table version.
view_expanded_texttextIf the table is a view, the expanded text of the view otherwise null.
view_original_texttextIf the table is a view, the original text of the view otherwise null.

Export

This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.

You can download the tarball for your platform from the Releases page, but it is simplest to install them with the steampipe_export_installer.sh script:

/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- aws

You can pass the configuration to the command with the --config argument:

steampipe_export_aws --config '<your_config>' aws_glue_catalog_table