Table: aws_emr_instance_group - Query AWS EMR Instance Groups using SQL
The AWS Elastic MapReduce (EMR) Instance Group is a component of Amazon EMR that organizes EC2 instances in a cluster. It is used to host big data frameworks like Apache Hadoop, Spark, HBase, and others for processing vast amounts of data. These groups can be resized manually or automatically, depending on workload requirements.
Table Usage Guide
The aws_emr_instance_group table in Steampipe provides you with information about instance groups within AWS Elastic MapReduce (EMR). This table allows you, as a DevOps engineer, to query instance group-specific details, including instance group ID, instance type, instance count, and associated metadata. You can use it to gather insights on instance groups, such as their current state and market type. The schema outlines the attributes of the EMR instance group, including the cluster ID, instance group type, EBS block devices, and scaling policies.
Examples
Basic info
List your Amazon EMR instance groups along with their types and current states. This can be useful for managing resources and understanding the state of your EMR clusters.
select
  id,
  arn,
  cluster_id,
  instance_group_type,
  state
from
  aws_emr_instance_group;
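If only the groups that are currently running are of interest, the basic query can be narrowed with a filter on the state column. This is a minimal sketch rather than an official example; it assumes 'RUNNING' is the state you care about, which is the value used by the other examples on this page.

```sql
-- Sketch: list only instance groups whose state is RUNNING.
-- 'RUNNING' is taken from the other examples on this page; adjust as needed.
select
  id,
  cluster_id,
  instance_group_type,
  instance_type,
  state
from
  aws_emr_instance_group
where
  state = 'RUNNING';
```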
Get the master instance type used for a cluster
Identify the type of master instances used in a cluster to better understand your resource usage and optimize your configurations.
select
  ig.id as instance_group_id,
  ig.cluster_id,
  c.name as cluster_name,
  ig.instance_type
from
  aws_emr_instance_group as ig,
  aws_emr_cluster as c
where
  ig.cluster_id = c.id
  and ig.instance_group_type = 'MASTER';
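The same join against aws_emr_cluster can be reused to see which clusters rely on Spot capacity. The following is an illustrative sketch, not part of the original examples: it filters on the market column (documented in the schema as ON_DEMAND or SPOT) and shows bid_price alongside it. An explicit join is used here purely as a style choice.

```sql
-- Sketch: instance groups provisioned from the Spot market, with cluster names.
-- Columns come from the schema on this page; the query itself is illustrative.
select
  ig.id as instance_group_id,
  c.name as cluster_name,
  ig.instance_group_type,
  ig.instance_type,
  ig.market,
  ig.bid_price
from
  aws_emr_instance_group as ig
  join aws_emr_cluster as c on ig.cluster_id = c.id
where
  ig.market = 'SPOT';
```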
Get the count of running instances (core and master) per cluster
Explore the distribution of active instances across different clusters to effectively manage resources and ensure optimal performance. This can help in identifying clusters that might be overburdened or underutilized.
select
  cluster_id,
  sum(running_instance_count) as running_instance_count
from
  aws_emr_instance_group
where
  state = 'RUNNING'
group by
  cluster_id;
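Note that the query above sums every instance group in the RUNNING state, so task groups are counted as well. If the count should cover core and master groups only, as the heading suggests, one possible variant adds a filter on instance_group_type. This is a sketch under that assumption, not the plugin authors' own example.

```sql
-- Sketch: running instance count per cluster, limited to MASTER and CORE groups.
-- TASK groups are excluded by the instance_group_type filter.
select
  cluster_id,
  sum(running_instance_count) as running_instance_count
from
  aws_emr_instance_group
where
  state = 'RUNNING'
  and instance_group_type in ('MASTER', 'CORE')
group by
  cluster_id;
```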
Schema for aws_emr_instance_group
Name | Type | Operators | Description |
---|---|---|---|
_ctx | jsonb | | Steampipe context in JSON form. |
account_id | text | =, !=, ~~, ~~*, !~~, !~~* | The AWS Account ID in which the resource is located. |
akas | jsonb | | Array of globally unique identifier strings (also known as) for the resource. |
arn | text | | The Amazon Resource Name (ARN) specifying the instance group. |
autoscaling_policy | jsonb | | An automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster. |
bid_price | text | | The maximum price you are willing to pay for Spot Instances. If specified, indicates that the instance group uses Spot Instances. |
cluster_id | text | | The unique identifier for the cluster. |
configurations | jsonb | | A list of configurations supplied for an EMR cluster instance group. Only available for Amazon EMR releases 4.x or later. |
configurations_version | bigint | | The version number of the requested configuration specification for this instance group. |
custom_ami_id | text | | The custom AMI ID to use for the provisioned instance group. |
ebs_block_devices | jsonb | | The EBS block devices that are mapped to this instance group. |
ebs_optimized | boolean | | Indicates whether the instance group is EBS-optimized. An Amazon EBS-optimized instance uses an optimized configuration stack and provides additional, dedicated capacity for Amazon EBS I/O. |
id | text | | The identifier of the instance group. |
instance_group_type | text | | The type of the instance group. Valid values are MASTER, CORE or TASK. |
instance_type | text | | The EC2 instance type for all instances in the instance group. |
last_successfully_applied_configurations | jsonb | | A list of configurations that were most recently applied successfully for an instance group. |
last_successfully_applied_configurations_version | bigint | | The version number of the configuration specification that was most recently applied successfully for an instance group. |
market | text | | The marketplace to provision instances for this group. Valid values are ON_DEMAND or SPOT. |
name | text | | The name of the instance group. |
partition | text | | The AWS partition in which the resource is located (aws, aws-cn, or aws-us-gov). |
region | text | | The AWS Region in which the resource is located. |
requested_instance_count | bigint | | The target number of instances for the instance group. |
running_instance_count | bigint | | The number of instances currently running in this instance group. |
shrink_policy | jsonb | | Policy for customizing shrink operations. |
sp_connection_name | text | =, !=, ~~, ~~*, !~~, !~~* | Steampipe connection name. |
sp_ctx | jsonb | | Steampipe context in JSON form. |
state | text | | The current state of the instance group. |
state_change_reason | jsonb | | The status change reason details for the instance group. |
status_timeline | jsonb | | The timeline of the instance group status over time. |
title | text | | Title of the resource. |
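The Operators column lists the comparison operators documented for a column. In PostgreSQL, ~~ is the operator form of LIKE, ~~* of ILIKE, and !~~ of NOT LIKE. As a rough sketch, a query restricted by account_id and sp_connection_name might look like the following; the account ID and the connection name pattern are placeholders, not values from this documentation.

```sql
-- Sketch: filter on the two columns that list operators in the schema.
-- '123456789012' and 'aws%' are hypothetical placeholder values.
select
  id,
  cluster_id,
  instance_group_type,
  requested_instance_count,
  running_instance_count
from
  aws_emr_instance_group
where
  account_id = '123456789012'
  and sp_connection_name like 'aws%';
```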
Export
This table is available as a standalone Exporter CLI. Steampipe exporters are standalone binaries that allow you to extract data using Steampipe plugins without a database.
You can download the tarball for your platform from the Releases page, but it is simplest to install it with the steampipe_export_installer.sh script:
/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- aws
You can pass the configuration to the command with the --config argument:
steampipe_export_aws --config '<your_config>' aws_emr_instance_group