steampipe plugin install aws

Table: aws_glue_dev_endpoint - Query AWS Glue Development Endpoints using SQL

AWS Glue development endpoints are interactive programming interfaces for AWS Glue. They provide an environment in which you can learn, write, and test scripts that extract, transform, and load (ETL) data, so you can debug your ETL scripts before deploying them.

Table Usage Guide

The aws_glue_dev_endpoint table in Steampipe provides detailed information about development endpoints within AWS Glue. As a developer or data engineer, you can query endpoint-specific details such as status, security configuration, subnet ID, and VPC ID. Use this table to find endpoints with a particular security configuration, verify endpoint status, or review the network configuration of each endpoint. The schema also includes the endpoint name, role ARN, public key, and creation time.
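As a minimal sketch of the security-configuration use case described above, the following query lists endpoints that have no security configuration attached, together with their network placement. It uses only columns from the schema on this page; the assumption that a missing configuration shows up as NULL (rather than an empty string) has not been verified against the plugin.

```sql
-- Sketch: endpoints without a security configuration.
-- Assumption: an unset security_configuration is returned as NULL.
select
  endpoint_name,
  status,
  vpc_id,
  subnet_id,
  security_configuration
from
  aws_glue_dev_endpoint
where
  security_configuration is null;
```

If your results show empty strings instead of NULL, adjust the filter accordingly.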

Examples

Basic info

List your AWS Glue development endpoints with their status, availability zone, creation timestamp, Glue version, and addresses. This gives you a quick view of which endpoints are ready and which Glue version each one runs.

select
endpoint_name,
status,
availability_zone,
created_timestamp,
extra_jars_s3_path,
glue_version,
private_address,
public_address
from
aws_glue_dev_endpoint;

List dev endpoints that are not in ready state

Find development endpoints that are not in the READY state. This helps you spot endpoints that are still being provisioned or that have failed.

select
endpoint_name,
status,
created_timestamp,
extra_jars_s3_path,
glue_version,
private_address,
public_address
from
aws_glue_dev_endpoint
where
status <> 'READY';

List dev endpoints updated in the last 30 days

Find development endpoints that were modified within the last 30 days. This is useful for tracking recent changes.

Postgres:

select
title,
arn,
status,
glue_version,
last_modified_timestamp
from
aws_glue_dev_endpoint
where
last_modified_timestamp >= now() - interval '30' day;

SQLite:

select
title,
arn,
status,
glue_version,
last_modified_timestamp
from
aws_glue_dev_endpoint
where
last_modified_timestamp >= datetime('now', '-30 day');

List dev endpoints older than 30 days

Find development endpoints that were created more than 30 days ago. Long-running endpoints can be worth reviewing to confirm they are still needed.

Postgres:

select
endpoint_name,
arn,
status,
glue_version,
created_timestamp
from
aws_glue_dev_endpoint
where
created_timestamp <= now() - interval '30' day;

SQLite:

select
endpoint_name,
arn,
status,
glue_version,
created_timestamp
from
aws_glue_dev_endpoint
where
created_timestamp <= datetime('now', '-30 day');

Get subnet details attached to a particular dev endpoint

Look up the subnet attached to a specific development endpoint, including its availability zone, available IP address count, and CIDR block. This is useful for managing network capacity for the endpoint.

Postgres:

select
e.endpoint_name,
s.availability_zone,
s.available_ip_address_count,
s.cidr_block,
s.default_for_az,
s.map_customer_owned_ip_on_launch,
s.map_public_ip_on_launch,
s.state
from
aws_glue_dev_endpoint as e,
aws_vpc_subnet as s
where
e.endpoint_name = 'test5'
and e.subnet_id = s.subnet_id;

SQLite:

select
e.endpoint_name,
s.availability_zone,
s.available_ip_address_count,
s.cidr_block,
s.default_for_az,
s.map_customer_owned_ip_on_launch,
s.map_public_ip_on_launch,
s.state
from
aws_glue_dev_endpoint as e
join aws_vpc_subnet as s on e.subnet_id = s.subnet_id
where
e.endpoint_name = 'test5';

Get extra jars s3 bucket details for a dev endpoint

Determine the configuration details of specific S3 buckets that are linked to a development endpoint in AWS Glue. This is useful for assessing the versioning status, policy, and object lock configuration of these buckets, aiding in security and management tasks.

Postgres:

select
e.endpoint_name,
split_part(j, '/', 3) as extra_jars_s3_bucket,
b.versioning_enabled,
b.policy,
b.object_lock_configuration,
b.restrict_public_buckets
from
aws_glue_dev_endpoint as e,
aws_s3_bucket as b,
unnest (string_to_array(e.extra_jars_s3_path, ',')) as j
where
b.name = split_part(j, '/', 3)
and e.endpoint_name = 'test34';

SQLite:

Error: SQLite does not support the unnest, split_part, or string_to_array functions.
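For SQLite, a partial workaround is possible when extra_jars_s3_path holds a single path. The sketch below is not part of the original examples: it assumes the path has the form s3://bucket/key and extracts the bucket name with the built-in substr and instr functions. It does not handle comma-separated lists of paths.

```sql
-- SQLite sketch, assuming a single path such as 's3://my-bucket/libs/a.jar'.
-- substr(path, 6) drops the 's3://' prefix; instr finds the first '/' after it.
select
  e.endpoint_name,
  b.name as extra_jars_s3_bucket,
  b.versioning_enabled,
  b.policy,
  b.object_lock_configuration,
  b.restrict_public_buckets
from
  aws_glue_dev_endpoint as e
  join aws_s3_bucket as b
    on b.name = substr(
      e.extra_jars_s3_path,
      6,
      instr(substr(e.extra_jars_s3_path, 6), '/') - 1
    )
where
  e.endpoint_name = 'test34';
```

Rows whose path does not match the assumed form simply produce no match in the join.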

Schema for aws_glue_dev_endpoint

| Name | Type | Operators | Description |
|---|---|---|---|
| _ctx | jsonb | | Steampipe context in JSON form. |
| account_id | text | =, !=, ~~, ~~*, !~~, !~~* | The AWS Account ID in which the resource is located. |
| akas | jsonb | | Array of globally unique identifier strings (also known as) for the resource. |
| arn | text | | The Amazon Resource Name (ARN) of the DevEndpoint. |
| availability_zone | text | | The AWS Availability Zone where this DevEndpoint is located. |
| created_timestamp | timestamp with time zone | | The point in time at which this DevEndpoint was created. |
| endpoint_name | text | = | The name of the DevEndpoint. |
| extra_jars_s3_path | text | | The path to one or more Java .jar files in an S3 bucket that should be loaded in your DevEndpoint. |
| extra_python_libs_s3_path | text | | The paths to one or more Python libraries in an Amazon S3 bucket that should be loaded in your DevEndpoint. Multiple values must be complete paths separated by a comma. |
| failure_reason | text | | The reason for a current failure in this DevEndpoint. |
| glue_version | text | | Glue version determines the versions of Apache Spark and Python that Glue supports. |
| last_modified_timestamp | timestamp with time zone | | The point in time at which this DevEndpoint was last modified. |
| last_update_status | text | | The status of the last update. |
| number_of_nodes | bigint | | The number of Glue Data Processing Units (DPUs) allocated to this DevEndpoint. |
| number_of_workers | bigint | | The number of workers of a defined workerType that are allocated to the development endpoint. |
| partition | text | | The AWS partition in which the resource is located (aws, aws-cn, or aws-us-gov). |
| private_address | text | | A private IP address to access the DevEndpoint within a VPC if the DevEndpoint is created within one. |
| public_address | text | | The public IP address used by this DevEndpoint. The PublicAddress field is present only when you create a non-virtual private cloud (VPC) DevEndpoint. |
| public_key | text | | The public key to be used by this DevEndpoint for authentication. |
| public_keys | jsonb | | A list of public keys to be used by the DevEndpoints for authentication. |
| region | text | | The AWS Region in which the resource is located. |
| role_arn | text | | The Amazon Resource Name (ARN) of the IAM role used in this DevEndpoint. |
| security_configuration | text | | The name of the SecurityConfiguration structure to be used with this DevEndpoint. |
| security_group_ids | jsonb | | A list of security group identifiers used in this DevEndpoint. |
| sp_connection_name | text | =, !=, ~~, ~~*, !~~, !~~* | Steampipe connection name. |
| sp_ctx | jsonb | | Steampipe context in JSON form. |
| status | text | | The current status of this DevEndpoint. |
| subnet_id | text | | The subnet ID for this DevEndpoint. |
| title | text | | Title of the resource. |
| vpc_id | text | | The ID of the virtual private cloud (VPC) used by this DevEndpoint. |
| worker_type | text | | The type of predefined worker that is allocated to the development endpoint. Accepts a value of Standard, G.1X, or G.2X. |
| yarn_endpoint_address | text | | The YARN endpoint address used by this DevEndpoint. |
| zeppelin_remote_spark_interpreter_port | bigint | | The Apache Zeppelin port for the remote Apache Spark interpreter. |
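The jsonb columns in the schema can be expanded into rows when you need one value per line. As an illustrative sketch (Postgres only, and assuming security_group_ids is a JSON array of strings as its description suggests), the following query returns one row per endpoint and security group:

```sql
-- Postgres sketch: one row per (endpoint, security group ID).
-- Assumption: security_group_ids is a JSON array of text values.
select
  e.endpoint_name,
  e.vpc_id,
  sg as security_group_id
from
  aws_glue_dev_endpoint as e,
  jsonb_array_elements_text(e.security_group_ids) as sg;
```

Endpoints whose security_group_ids is null or empty are omitted from the result, because the implicit cross join yields no rows for them.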

Export

This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.

You can download the tarball for your platform from the Releases page, but it is simplest to install them with the steampipe_export_installer.sh script:

/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- aws

You can pass the configuration to the command with the --config argument:

steampipe_export_aws --config '<your_config>' aws_glue_dev_endpoint