steampipe plugin install aws

Table: aws_emr_instance_group - Query AWS EMR Instance Groups using SQL

An Amazon EMR (Elastic MapReduce) instance group is a set of EC2 instances within an EMR cluster that share a role (master, core, or task) and an instance type. The instances host big data frameworks such as Apache Hadoop, Spark, and HBase for processing large volumes of data. Instance groups can be resized manually or automatically as workload requirements change.

Table Usage Guide

The aws_emr_instance_group table in Steampipe provides you with information about instance groups within Amazon EMR. This table allows you, as a DevOps engineer, to query instance group details, including the instance group ID, instance type, requested and running instance counts, and associated metadata. You can use this table to gather insights on instance groups, such as their current state and market type (On-Demand or Spot). The schema covers the attributes of each instance group, including the cluster ID, instance group type, EBS block devices, and autoscaling policy.
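As an illustration of the market type attribute mentioned above, the following sketch lists instance groups provisioned on the Spot market together with their bid price. It uses only columns listed in the schema for this table; the filter value 'SPOT' is taken from the valid values given in the market column description.

```sql
-- Sketch: list instance groups that use Spot Instances.
-- All columns come from the aws_emr_instance_group schema.
select
  id,
  cluster_id,
  instance_group_type,
  instance_type,
  market,
  bid_price
from
  aws_emr_instance_group
where
  market = 'SPOT';
```

Changing the filter to 'ON_DEMAND' should return the remaining groups.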

Examples

Basic info

List your Amazon EMR instance groups along with their type and current state. This can be useful for managing resources and understanding the state of your EMR clusters.

select
  id,
  arn,
  cluster_id,
  instance_group_type,
  state
from
  aws_emr_instance_group;

Get the master instance type used for a cluster

Identify the type of master instances used in a cluster to better understand your resource usage and optimize your configurations.

select
  ig.id as instance_group_id,
  ig.cluster_id,
  c.name as cluster_name,
  ig.instance_type
from
  aws_emr_instance_group as ig,
  aws_emr_cluster as c
where
  ig.cluster_id = c.id
  and ig.instance_group_type = 'MASTER';

Get the count of running instances (core and master) per cluster

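Explore the distribution of running instances across clusters to manage resources and spot clusters that might be overburdened or underutilized. Note that this query sums running_instance_count across every instance group in the RUNNING state, so task groups are included as well; filter on instance_group_type to restrict the count to core and master groups.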

select
  cluster_id,
  sum(running_instance_count) as running_instance_count
from
  aws_emr_instance_group
where
  state = 'RUNNING'
group by
  cluster_id;
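If only core and master instances should be counted, one possible variant adds a filter on instance_group_type. This is a sketch rather than part of the original examples; the values 'MASTER' and 'CORE' come from the valid values listed in the schema for that column.

```sql
-- Sketch: count running instances per cluster,
-- restricted to master and core instance groups.
select
  cluster_id,
  sum(running_instance_count) as running_instance_count
from
  aws_emr_instance_group
where
  state = 'RUNNING'
  and instance_group_type in ('MASTER', 'CORE')
group by
  cluster_id;
```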

Schema for aws_emr_instance_group

| Name | Type | Operators | Description |
| --- | --- | --- | --- |
| _ctx | jsonb | | Steampipe context in JSON form, e.g. connection_name. |
| account_id | text | | The AWS Account ID in which the resource is located. |
| akas | jsonb | | Array of globally unique identifier strings (also known as) for the resource. |
| arn | text | | The Amazon Resource Name (ARN) specifying the instance group. |
| autoscaling_policy | jsonb | | An automatic scaling policy for a core instance group or task instance group in an Amazon EMR cluster. |
| bid_price | text | | The maximum price you are willing to pay for Spot Instances. If specified, indicates that the instance group uses Spot Instances. |
| cluster_id | text | | The unique identifier for the cluster. |
| configurations | jsonb | | A list of configurations supplied for an EMR cluster instance group. Only available for Amazon EMR releases 4.x or later. |
| configurations_version | bigint | | The version number of the requested configuration specification for this instance group. |
| ebs_block_devices | jsonb | | The EBS block devices that are mapped to this instance group. |
| ebs_optimized | boolean | | Indicates whether the instance group is EBS-optimized. An Amazon EBS-optimized instance uses an optimized configuration stack and provides additional, dedicated capacity for Amazon EBS I/O. |
| id | text | | The identifier of the instance group. |
| instance_group_type | text | | The type of the instance group. Valid values are MASTER, CORE or TASK. |
| instance_type | text | | The EC2 instance type for all instances in the instance group. |
| last_successfully_applied_configurations | jsonb | | A list of configurations that were most recently applied successfully for an instance group. |
| last_successfully_applied_configurations_version | bigint | | The version number of the configuration specification that was most recently applied successfully for an instance group. |
| market | text | | The marketplace to provision instances for this group. Valid values are ON_DEMAND or SPOT. |
| name | text | | The name of the instance group. |
| partition | text | | The AWS partition in which the resource is located (aws, aws-cn, or aws-us-gov). |
| region | text | | The AWS Region in which the resource is located. |
| requested_instance_count | bigint | | The target number of instances for the instance group. |
| running_instance_count | bigint | | The number of instances currently running in this instance group. |
| shrink_policy | jsonb | | Policy for customizing shrink operations. |
| state | text | | The current state of the instance group. |
| state_change_reason | jsonb | | The status change reason details for the instance group. |
| status_timeline | jsonb | | The timeline of the instance group status over time. |
| title | text | | Title of the resource. |
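
The requested_instance_count and running_instance_count columns can be compared to find instance groups that have not yet reached their target size, for example while a resize is in progress. The query below is a minimal sketch using only columns from the schema above; the alias instance_shortfall is a hypothetical name chosen for illustration.

```sql
-- Sketch: find instance groups whose running count differs from the target.
-- instance_shortfall is an illustrative alias, not a column of the table.
select
  id,
  cluster_id,
  name,
  requested_instance_count,
  running_instance_count,
  requested_instance_count - running_instance_count as instance_shortfall
from
  aws_emr_instance_group
where
  requested_instance_count <> running_instance_count;
```

A positive instance_shortfall suggests the group is still scaling up, while a negative value suggests it is scaling down.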

Export

This table is available as a standalone Exporter CLI. Steampipe exporters are standalone binaries that allow you to extract data using Steampipe plugins without a database.

You can download the tarball for your platform from the Releases page, but it is simplest to install the exporter with the steampipe_export_installer.sh script:

/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- aws

You can pass the configuration to the command with the --config argument:

steampipe_export_aws --config '<your_config>' aws_emr_instance_group