Table: aws_glue_dev_endpoint - Query AWS Glue Development Endpoints using SQL
AWS Glue development endpoints are interactive programming interfaces for AWS Glue. They provide an environment in which you can learn, write, debug, and test extract, transform, and load (ETL) scripts before deploying them.
Table Usage Guide
The aws_glue_dev_endpoint table in Steampipe provides information about development endpoints within AWS Glue. As a developer or data engineer, you can query endpoint-specific details, including the endpoint status, security configuration, subnet ID, and VPC ID. Use this table to identify endpoints with specific security configurations, verify endpoint statuses, and review the network configuration of each endpoint. The schema also covers attributes such as the endpoint name, role ARN, public key, and creation time.
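As a sketch of the kind of check described above, the following query lists endpoints that have no security configuration, together with their network placement. It uses only columns listed in this table's schema. The assumption that an unset configuration is returned as NULL (rather than an empty string) is not confirmed by this page, so adjust the filter if your data differs.

```sql
-- Endpoints without a security configuration, with their network placement.
-- Assumption: an unset security_configuration is returned as NULL.
select
  endpoint_name,
  status,
  vpc_id,
  subnet_id,
  security_group_ids
from
  aws_glue_dev_endpoint
where
  security_configuration is null;
```

The query uses no engine-specific functions, so it should run unchanged on both Postgres and SQLite.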
Examples
Basic info
Explore the status and availability of your AWS Glue development endpoints, including their creation timestamps, versions, and addresses. This can help you monitor the health and performance of your endpoints, ensuring they are functioning optimally and are up-to-date.
Postgres:
select
  endpoint_name,
  status,
  availability_zone,
  created_timestamp,
  extra_jars_s3_path,
  glue_version,
  private_address,
  public_address
from
  aws_glue_dev_endpoint;

SQLite:
select
  endpoint_name,
  status,
  availability_zone,
  created_timestamp,
  extra_jars_s3_path,
  glue_version,
  private_address,
  public_address
from
  aws_glue_dev_endpoint;
List dev endpoints that are not in ready state
List the development endpoints whose status is anything other than READY. This helps you spot endpoints that are not yet usable, for example because they are still being provisioned or have failed.
Postgres:
select
  endpoint_name,
  status,
  created_timestamp,
  extra_jars_s3_path,
  glue_version,
  private_address,
  public_address
from
  aws_glue_dev_endpoint
where
  status <> 'READY';

SQLite:
select
  endpoint_name,
  status,
  created_timestamp,
  extra_jars_s3_path,
  glue_version,
  private_address,
  public_address
from
  aws_glue_dev_endpoint
where
  status <> 'READY';
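To see why an endpoint is not ready, a variation of the query above can select the failure_reason and last_update_status columns from the schema. This is a minimal sketch; either column may be empty for endpoints that have not failed or been updated.

```sql
-- Endpoints that are not READY, with any recorded failure details.
select
  endpoint_name,
  status,
  last_update_status,
  failure_reason
from
  aws_glue_dev_endpoint
where
  status <> 'READY'
order by
  endpoint_name;
```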
List dev endpoints updated in the last 30 days
List the development endpoints that were modified within the last 30 days. This is useful for tracking recent changes to your endpoints.
Postgres:
select
  title,
  arn,
  status,
  glue_version,
  last_modified_timestamp
from
  aws_glue_dev_endpoint
where
  last_modified_timestamp >= now() - interval '30' day;

SQLite:
select
  title,
  arn,
  status,
  glue_version,
  last_modified_timestamp
from
  aws_glue_dev_endpoint
where
  last_modified_timestamp >= datetime('now', '-30 day');
List dev endpoints older than 30 days
List the development endpoints that were created more than 30 days ago. This can help you understand long-term usage and identify endpoints that may be candidates for cleanup or resource reallocation.
Postgres:
select
  endpoint_name,
  arn,
  status,
  glue_version,
  created_timestamp
from
  aws_glue_dev_endpoint
where
  created_timestamp <= now() - interval '30' day;

SQLite:
select
  endpoint_name,
  arn,
  status,
  glue_version,
  created_timestamp
from
  aws_glue_dev_endpoint
where
  created_timestamp <= datetime('now', '-30 day');
Get subnet details attached to a particular dev endpoint
Look up the subnet attached to a particular development endpoint, including its availability zone, available IP address count, and CIDR block. This is useful for reviewing the endpoint's network configuration and managing network resources.
Postgres:
select
  e.endpoint_name,
  s.availability_zone,
  s.available_ip_address_count,
  s.cidr_block,
  s.default_for_az,
  s.map_customer_owned_ip_on_launch,
  s.map_public_ip_on_launch,
  s.state
from
  aws_glue_dev_endpoint as e,
  aws_vpc_subnet as s
where
  e.endpoint_name = 'test5'
  and e.subnet_id = s.subnet_id;

SQLite:
select
  e.endpoint_name,
  s.availability_zone,
  s.available_ip_address_count,
  s.cidr_block,
  s.default_for_az,
  s.map_customer_owned_ip_on_launch,
  s.map_public_ip_on_launch,
  s.state
from
  aws_glue_dev_endpoint as e
  join aws_vpc_subnet as s on e.subnet_id = s.subnet_id
where
  e.endpoint_name = 'test5';
Get extra jars s3 bucket details for a dev endpoint
Determine the configuration details of specific S3 buckets that are linked to a development endpoint in AWS Glue. This is useful for assessing the versioning status, policy, and object lock configuration of these buckets, aiding in security and management tasks.
Postgres:
select
  e.endpoint_name,
  split_part(j, '/', 3) as extra_jars_s3_bucket,
  b.versioning_enabled,
  b.policy,
  b.object_lock_configuration,
  b.restrict_public_buckets
from
  aws_glue_dev_endpoint as e,
  aws_s3_bucket as b,
  unnest(string_to_array(e.extra_jars_s3_path, ',')) as j
where
  b.name = split_part(j, '/', 3)
  and e.endpoint_name = 'test34';

SQLite:
Error: SQLite does not support the unnest, split_part, or string_to_array functions.
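Although the Postgres functions are unavailable, a similar result can likely be approximated in SQLite with json_each and substr. The sketch below is not part of the original documentation and has not been verified against the plugin. It assumes that every path has the form s3://<bucket>/<key>, that paths are separated by commas without surrounding spaces, and that no path contains a double quote.

```sql
-- Hypothetical SQLite workaround (assumptions: paths look like s3://<bucket>/<key>,
-- are comma-separated without spaces, and contain no double quotes).
select
  e.endpoint_name,
  -- Skip the 5-character 's3://' prefix, then take everything before the next '/'.
  substr(j.value, 6, instr(substr(j.value, 6), '/') - 1) as extra_jars_s3_bucket,
  b.versioning_enabled,
  b.policy,
  b.object_lock_configuration,
  b.restrict_public_buckets
from
  aws_glue_dev_endpoint as e
  -- Turn 'a,b' into the JSON array ["a","b"] so json_each yields one row per path.
  join json_each('["' || replace(e.extra_jars_s3_path, ',', '","') || '"]') as j
  join aws_s3_bucket as b
    on b.name = substr(j.value, 6, instr(substr(j.value, 6), '/') - 1)
where
  e.endpoint_name = 'test34';
```

Endpoints whose extra_jars_s3_path is NULL produce no rows here, because json_each receives a NULL argument.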
Schema for aws_glue_dev_endpoint
Name | Type | Operators | Description |
---|---|---|---|
_ctx | jsonb | | Steampipe context in JSON form. |
account_id | text | =, !=, ~~, ~~*, !~~, !~~* | The AWS Account ID in which the resource is located. |
akas | jsonb | | Array of globally unique identifier strings (also known as) for the resource. |
arn | text | | The Amazon Resource Name (ARN) of the DevEndpoint. |
availability_zone | text | | The AWS Availability Zone where this DevEndpoint is located. |
created_timestamp | timestamp with time zone | | The point in time at which this DevEndpoint was created. |
endpoint_name | text | = | The name of the DevEndpoint. |
extra_jars_s3_path | text | | The path to one or more Java .jar files in an S3 bucket that should be loaded in your DevEndpoint. |
extra_python_libs_s3_path | text | | The paths to one or more Python libraries in an Amazon S3 bucket that should be loaded in your DevEndpoint. Multiple values must be complete paths separated by a comma. |
failure_reason | text | | The reason for a current failure in this DevEndpoint. |
glue_version | text | | Glue version determines the versions of Apache Spark and Python that Glue supports. |
last_modified_timestamp | timestamp with time zone | | The point in time at which this DevEndpoint was last modified. |
last_update_status | text | | The status of the last update. |
number_of_nodes | bigint | | The number of Glue Data Processing Units (DPUs) allocated to this DevEndpoint. |
number_of_workers | bigint | | The number of workers of a defined workerType that are allocated to the development endpoint. |
partition | text | | The AWS partition in which the resource is located (aws, aws-cn, or aws-us-gov). |
private_address | text | | A private IP address to access the DevEndpoint within a VPC if the DevEndpoint is created within one. |
public_address | text | | The public IP address used by this DevEndpoint. The PublicAddress field is present only when you create a non-virtual private cloud (VPC) DevEndpoint. |
public_key | text | | The public key to be used by this DevEndpoint for authentication. |
public_keys | jsonb | | A list of public keys to be used by the DevEndpoints for authentication. |
region | text | | The AWS Region in which the resource is located. |
role_arn | text | | The Amazon Resource Name (ARN) of the IAM role used in this DevEndpoint. |
security_configuration | text | | The name of the SecurityConfiguration structure to be used with this DevEndpoint. |
security_group_ids | jsonb | | A list of security group identifiers used in this DevEndpoint. |
sp_connection_name | text | =, !=, ~~, ~~*, !~~, !~~* | Steampipe connection name. |
sp_ctx | jsonb | | Steampipe context in JSON form. |
status | text | | The current status of this DevEndpoint. |
subnet_id | text | | The subnet ID for this DevEndpoint. |
title | text | | Title of the resource. |
vpc_id | text | | The ID of the virtual private cloud (VPC) used by this DevEndpoint. |
worker_type | text | | The type of predefined worker that is allocated to the development endpoint. Accepts a value of Standard, G.1X, or G.2X. |
yarn_endpoint_address | text | | The YARN endpoint address used by this DevEndpoint. |
zeppelin_remote_spark_interpreter_port | bigint | | The Apache Zeppelin port for the remote Apache Spark interpreter. |
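As an illustration of how the capacity columns in the schema can be combined, the following sketch summarizes endpoints by Glue version and worker type. It relies only on columns listed above. Rows where worker_type or number_of_workers is not set will appear with empty values.

```sql
-- Count endpoints and total allocated workers per Glue version and worker type.
select
  glue_version,
  worker_type,
  count(*) as endpoint_count,
  sum(number_of_workers) as total_workers
from
  aws_glue_dev_endpoint
group by
  glue_version,
  worker_type
order by
  endpoint_count desc;
```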
Export
This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.
You can download the tarball for your platform from the Releases page, but it is simplest to install the exporter with the steampipe_export_installer.sh script:
/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- aws
You can pass the configuration to the command with the --config argument:
steampipe_export_aws --config '<your_config>' aws_glue_dev_endpoint