Table: aws_glue_catalog_table - Query AWS Glue Catalog Tables using SQL
The AWS Glue Catalog Tables are a part of AWS Glue, a fully managed extract, transform, and load (ETL) service that makes it easy for users to prepare and load their data for analytics. AWS Glue Catalog Tables store metadata related to data sources, transformations, and targets, allowing users to discover and manage their data. AWS Glue automatically generates the schema for your data and stores it as tables in its data catalog, which you can use for ETL jobs.
Table Usage Guide
The aws_glue_catalog_table
table in Steampipe provides you with information about AWS Glue Catalog Tables. It allows you, whether you're a DevOps engineer, data engineer, or another technical professional, to query table-specific details, including table names, database names, owner information, creation time, and associated metadata. You can utilize this table to gather insights on tables, such as their storage descriptors, partition keys, and table parameters. The schema outlines the various attributes of the AWS Glue Catalog Table for you, including the catalog ID, database name, table type, storage descriptor, and associated tags.
Examples
Basic info
Explore the basic information of your AWS Glue Catalog Tables to understand when and why they were created. This can help in managing your resources and planning future database schemas.
select name, catalog_id, create_time, description, database_namefrom aws_glue_catalog_table;
select name, catalog_id, create_time, description, database_namefrom aws_glue_catalog_table;
Count the number of tables per catalog
Analyze the distribution of tables across different catalogs in AWS Glue service. This helps in understanding the organization of your data assets and can assist in optimizing data management strategies.
select catalog_id, count(name) as table_countfrom aws_glue_catalog_tablegroup by catalog_id;
select catalog_id, count(name) as table_countfrom aws_glue_catalog_tablegroup by catalog_id;
List tables with retention period less than 30 days
Explore which AWS Glue tables have a retention period of less than 30 days. This can be useful in identifying tables that may require a more extended retention period for data backup and disaster recovery purposes.
select name, catalog_id, create_time, description, retentionfrom aws_glue_catalog_tablewhere retention < 30;
select name, catalog_id, create_time, description, retentionfrom aws_glue_catalog_tablewhere retention < 30;
Schema for aws_glue_catalog_table
Name | Type | Operators | Description |
---|---|---|---|
_ctx | jsonb | Steampipe context in JSON form. | |
account_id | text | =, !=, ~~, ~~*, !~~, !~~* | The AWS Account ID in which the resource is located. |
akas | jsonb | Array of globally unique identifier strings (also known as) for the resource. | |
catalog_id | text | = | The ID of the Data Catalog in which the table resides. |
create_time | timestamp with time zone | The time when the table definition was created in the data catalog. | |
created_by | text | The person or entity who created the table. | |
database_name | text | = | The name of the database where the table metadata resides. |
description | text | A description of the table. | |
federated_table | jsonb | A FederatedTable structure that references an entity outside the Glue Data Catalog. | |
is_registered_with_lake_formation | boolean | Indicates whether the table has been registered with lake formation. | |
last_access_time | timestamp with time zone | The last time that the table was accessed. This is usually taken from HDFS, and might not be reliable. | |
last_analyzed_time | timestamp with time zone | The last time that column statistics were computed for this table. | |
lf_tags | jsonb | LF-Tags assigned to the table by AWS Lake Formation. | |
name | text | = | The table name. |
owner | text | The owner of the table. | |
parameters | jsonb | These key-value pairs define properties associated with the table. | |
partition | text | The AWS partition in which the resource is located (aws, aws-cn, or aws-us-gov). | |
partition_keys | jsonb | A list of columns by which the table is partitioned. | |
region | text | The AWS Region in which the resource is located. | |
retention | bigint | The retention time for this table. | |
sp_connection_name | text | =, !=, ~~, ~~*, !~~, !~~* | Steampipe connection name. |
sp_ctx | jsonb | Steampipe context in JSON form. | |
storage_descriptor | jsonb | A storage descriptor containing information about the physical storage of this table. | |
table_type | text | The type of this table (EXTERNAL_TABLE, VIRTUAL_VIEW, etc.). | |
target_table | jsonb | A TableIdentifier structure that describes a target table for resource linking. | |
title | text | Title of the resource. | |
update_time | timestamp with time zone | The last time that the table was updated. | |
version_id | text | The ID of the table version. | |
view_expanded_text | text | If the table is a view, the expanded text of the view otherwise null. | |
view_original_text | text | If the table is a view, the original text of the view otherwise null. |
Export
This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.
You can download the tarball for your platform from the Releases page, but it is simplest to install them with the steampipe_export_installer.sh
script:
/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- aws
You can pass the configuration to the command with the --config
argument:
steampipe_export_aws --config '<your_config>' aws_glue_catalog_table