Table: aws_s3_object - Query AWS S3 Object using SQL
The AWS S3 Object is a fundamental entity within the Amazon S3 service. It is a simple storage unit, which users can store and retrieve, ranging from 0 bytes to 5 terabytes of data. It offers robust, secure and scalable storage for data backup, archival and analytics.
Table Usage Guide
The aws_s3_object
table in Steampipe provides you with information about objects within AWS Simple Storage Service (S3). This table enables you, as a DevOps engineer, to query object-specific details, including its size, last modified date, storage class, and associated metadata. You can utilize this table to gather insights on objects, such as objects' storage utilization, retrieval of object metadata, verification of object encryption status, and more. The schema outlines the various attributes of the S3 object, including the bucket name, key, size, storage class, and associated tags.
Important Notes
- You must specify a
bucket_name
in a where or join clause in order to use this table. - It's recommended that you specify the
prefix
column when querying buckets with a large number of objects to reduce the query time. - The
body
column returns the raw bytes of the object data as a string. If the bytes entirely consist of valid UTF8 runes, e.g.,.txt files
, an UTF8 data will be set as column value and you will be able to query the object body (refer example below). However, for the invalid UTF8 runes, e.g.,.png files
, the bas64 encoding of the bytes will be set as column value and you will not be able to query the object body for those objects. - Using this table adds to the cost of your monthly bill from AWS. Optimizations have been put in place to minimize the impact as much as possible. You should refer to AWS S3 Pricing to understand the cost implications.
Examples
Basic info
Explore which items in your specified AWS S3 bucket have been recently modified or have a specific storage class. This is beneficial in managing storage costs and ensuring data integrity.
select key, arn, bucket_name, last_modified, storage_class, version_idfrom aws_s3_objectwhere bucket_name = 'steampipe-test';
select key, arn, bucket_name, last_modified, storage_class, version_idfrom aws_s3_objectwhere bucket_name = 'steampipe-test';
List all objects with a prefix
in a bucket
This query is useful for pinpointing specific locations within a storage bucket where objects have a certain prefix. This can assist in organizing and managing data in a more efficient manner by focusing on a specific subset of objects.
select key, arn, bucket_name, last_modified, storage_class, version_idfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and prefix = 'test/logs/2021/03/01/12';
select key, arn, bucket_name, last_modified, storage_class, version_idfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and prefix = 'test/logs/2021/03/01/12';
Get object with a key
in a bucket
Discover the details of a specific object within a designated bucket, which can be useful for tracking changes, assessing storage class, or identifying versions. This can be particularly beneficial for maintaining data integrity and managing storage within an AWS S3 bucket.
select key, arn, bucket_name, last_modified, storage_class, version_idfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and prefix = 'test/logs/2021/03/01/12/abc.txt';
select key, arn, bucket_name, last_modified, storage_class, version_idfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and prefix = 'test/logs/2021/03/01/12/abc.txt';
List all objects which are encrypted with CMK in a bucket
Explore which objects within a specific S3 bucket have been encrypted using a Customer Managed Key (CMK). This is particularly useful for auditing security measures and ensuring compliance with data protection regulations.
select key, arn, bucket_name, last_modified, storage_class, version_idfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and sse_kms_key_id is not null;
select key, arn, bucket_name, last_modified, storage_class, version_idfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and sse_kms_key_id is not null;
List all objects which were not modified in the last 3 months in a bucket
Determine the areas in your storage where objects have not been updated in the last three months. This can assist in identifying outdated or unused data, optimizing storage use and managing costs.
select key, arn, bucket_name, last_modified, storage_class, version_idfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and last_modified < current_date - interval '3 months';
select key, arn, bucket_name, last_modified, storage_class, version_idfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and last_modified < date('now', '-3 months');
List all objects in a bucket where any user other than the OWNER
has FULL_CONTROL
Identify instances where any user, other than the owner, has full control over all objects in a specific bucket. This query is useful for assessing security risks and ensuring proper access management in your AWS S3 buckets.
select s.key, s.bucket_name, s.owner, acl_grant -> 'Grantee' as grantee, acl_grant ->> 'Permission' as permissionfrom aws_s3_object as s, jsonb_array_elements(aws_s3_object.acl -> 'Grants') as acl_grantwhere bucket_name = 'steampipe-test' and acl_grant ->> 'Permission' = 'FULL_CONTROL' and acl_grant -> 'Grantee' ->> 'ID' != aws_s3_object.owner ->> 'ID';
select s.key, s.bucket_name, s.owner, json_extract(acl_grant, '$.Grantee') as grantee, json_extract(acl_grant, '$.Permission') as permissionfrom aws_s3_object as s, json_each(json_extract(aws_s3_object.acl, '$.Grants')) as acl_grantwhere bucket_name = 'steampipe-test' and json_extract(acl_grant, '$.Permission') = 'FULL_CONTROL' and json_extract(json_extract(acl_grant, '$.Grantee'), '$.ID') != json_extract(json_extract(aws_s3_object.owner, '$.ID'));
List all objects in a bucket where the legal hold is enabled
Identify instances where legal holds are active within a specific storage bucket. This can be useful for maintaining compliance and managing data retention policies.
select key, bucket_name, object_lock_legal_hold_statusfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and object_lock_legal_hold_status = 'ON';
select key, bucket_name, object_lock_legal_hold_statusfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and object_lock_legal_hold_status = 'ON';
List all objects in a bucket with governance lock mode enabled
Discover the segments that have a governance lock mode enabled within a specific bucket. This is useful for maintaining compliance and ensuring data immutability in regulated industries.
select key, bucket_name, object_lock_retain_until_date, object_lock_mode, object_lock_legal_hold_statusfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and object_lock_mode = 'GOVERNANCE';
select key, bucket_name, object_lock_retain_until_date, object_lock_mode, object_lock_legal_hold_statusfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and object_lock_mode = 'GOVERNANCE';
List all objects in a bucket which are set to be retained for more than 1 year from now
Discover the objects within a specific storage area that are scheduled to be kept for over a year from the current date. This query can be useful in understanding long-term data retention policies and identifying potential areas for storage optimization.
select s.key, s.bucket_name, s.object_lock_retain_until_date, s.object_lock_mode, s.object_lock_legal_hold_statusfrom aws_s3_object as swhere bucket_name = 'steampipe-test' and object_lock_retain_until_date > current_date + interval '1 year';
select key, bucket_name, object_lock_retain_until_date, object_lock_mode, object_lock_legal_hold_statusfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and date(object_lock_retain_until_date) > date('now', '+1 year');
List objects without the 'application' tags key
Discover the segments that have the 'application' tag key in the 'steampipe-test' bucket. This can be particularly useful when trying to identify specific objects within a large bucket for management or organization purposes.
select key, bucket_name, jsonb_pretty(tags) as tagsfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and tags ->> 'application' is not null;
select key, bucket_name, tagsfrom aws_s3_objectwhere bucket_name = 'steampipe-test' and json_extract(tags, '$.application') is not null;
List all objects where bucket key is disabled
Determine the areas in which the bucket key is disabled in your AWS S3 objects and buckets. This information can be useful for identifying potential security risks or configuration issues.
select key, o.arn as object_arn, bucket_name, last_modified, bucket_key_enabledfrom aws_s3_object as o, aws_s3_bucket as bwhere o.bucket_name = b.name and not bucket_key_enabled;
select key, o.arn as object_arn, bucket_name, last_modified, bucket_key_enabledfrom aws_s3_object as o join aws_s3_bucket as b on o.bucket_name = b.namewhere not bucket_key_enabled;
List all objects where buckets do not block public access
Determine the areas in which your stored objects are potentially exposed to public access due to their hosting buckets' security settings. This is useful for identifying potential security risks and ensuring proper access controls are in place.
select key, arn, bucket_name, last_modified, storage_classfrom aws_s3_objectwhere bucket_name in ( select name from aws_s3_bucket where not block_public_acls or not block_public_policy or not ignore_public_acls or not restrict_public_buckets );
select key, arn, bucket_name, last_modified, storage_classfrom aws_s3_objectwhere bucket_name in ( select name from aws_s3_bucket where block_public_acls = 0 or block_public_policy = 0 or ignore_public_acls = 0 or restrict_public_buckets = 0 );
Get data details of a particular object in a bucket
Discover the segments that provide specific information about an object within a particular storage area. This can be useful for monitoring changes over time or verifying the integrity of the object.
select s.key, b ->> 'awsAccountId' as account_id, b ->> 'digestEndTime' as digest_end_time, b ->> 'digestPublicKeyFingerprint' as digest_public_key_fingerprint, b ->> 'digestS3Bucket' as digest_s3_bucket, b ->> 'digestStartTime' as digest_start_timefrom aws_s3_object as s, jsonb_array_elements(body :: jsonb) as bwhere bucket_name = 'steampipe-test' and prefix = 'test1/log_text.txt';
select s.key, json_extract(b.value, '$.awsAccountId') as account_id, json_extract(b.value, '$.digestEndTime') as digest_end_time, json_extract(b.value, '$.digestPublicKeyFingerprint') as digest_public_key_fingerprint, json_extract(b.value, '$.digestS3Bucket') as digest_s3_bucket, json_extract(b.value, '$.digestStartTime') as digest_start_timefrom aws_s3_object as s, json_each(json(body)) as bwhere bucket_name = 'steampipe-test' and prefix = 'test1/log_text.txt';
Schema for aws_s3_object
Name | Type | Operators | Description |
---|---|---|---|
_ctx | jsonb | Steampipe context in JSON form. | |
accept_ranges | text | Indicates that a range of bytes was specified. | |
account_id | text | =, !=, ~~, ~~*, !~~, !~~* | The AWS Account ID in which the resource is located. |
acl | jsonb | ACLs define which AWS accounts or groups are granted access along with the type of access. | |
akas | jsonb | Array of globally unique identifier strings (also known as) for the resource. | |
arn | text | The ARN of the AWS S3 Object. | |
body | text | The raw bytes of the object data as a string. If the bytes entirely consists of valid UTF8 runes, an UTF8 is sent otherwise the bas64 encoding of the bytes is sent. | |
bucket_key_enabled | boolean | Indicates whether the object uses an S3 Bucket Key for server-side encryption with Amazon Web Services KMS (SSE-KMS) | |
bucket_name | text | = | The name of the container bucket of this object. |
cache_control | text | Specifies caching behavior along the request/reply chain. | |
checksum | jsonb | The checksum or digest of the object. | |
checksum_crc32 | text | The base64-encoded, 32-bit CRC32 checksum of the object. This will only be present if it was uploaded with the object. With multipart uploads, this may not be a checksum value of the object. | |
checksum_crc32c | text | The base64-encoded, 32-bit CRC32C checksum of the object. This will only be present if it was uploaded with the object. With multipart uploads, this may not be a checksum value of the object. | |
checksum_sha1 | text | The base64-encoded, 160-bit SHA-1 digest of the object. This will only be present if it was uploaded with the object. With multipart uploads, this may not be a checksum value of the object. | |
checksum_sha256 | text | The base64-encoded, 256-bit SHA-256 digest of the object. This will only be present if it was uploaded with the object. With multipart uploads, this may not be a checksum value of the object. | |
content_disposition | text | Specifies presentational information for the object. | |
content_encoding | text | Specifies what content encodings have been applied to the object. | |
content_language | text | The language the content is in. | |
content_length | text | Size of the body in bytes. | |
content_range | text | The portion of the object returned in the response. | |
content_type | text | A standard MIME type describing the format of the object data. | |
delete_marker | boolean | Specifies whether the object retrieved was (true) or was not (false) a delete marker. | |
etag | text | The entity tag of the object. | |
expiration | text | If the object expiration is configured (see PUT Bucket lifecycle), the response includes this header. It includes the expiry-date and rule-id key-value pairs providing object expiration information. The value of the rule-id is URL-encoded. | |
expires | timestamp with time zone | The date and time at which the object is no longer cacheable. | |
key | text | The name that you assign to an object. You use the object key to retrieve the object. | |
last_modified | timestamp with time zone | Last modified time of the object. | |
metadata | jsonb | A map of metadata to store with the object in S3. | |
object_lock_legal_hold_status | text | Like a retention period, a legal hold prevents an object version from being overwritten or deleted. A legal hold remains in effect until removed. | |
object_lock_mode | text | The Object Lock mode currently in place for this object. | |
object_lock_retain_until_date | timestamp with time zone | The date and time when this object's Object Lock will expire. | |
object_parts | jsonb | A collection of parts associated with a multipart upload. | |
owner | jsonb | The owner of the object. | |
partition | text | The AWS partition in which the resource is located (aws, aws-cn, or aws-us-gov). | |
prefix | text | = | The prefix of the key of the object. |
region | text | The AWS Region in which the object is located. | |
replication_status | text | Amazon S3 can return this if your request involves a bucket that is either a source or destination in a replication rule. | |
request_charged | text | If present, indicates that the requester was successfully charged for the request. | |
restore | text | Provides information about object restoration action and expiration time of the restored object copy. | |
server_side_encryption | text | The server-side encryption algorithm used when storing this object in Amazon S3. | |
size | bigint | Size in bytes of the object. | |
sp_connection_name | text | =, !=, ~~, ~~*, !~~, !~~* | Steampipe connection name. |
sp_ctx | jsonb | Steampipe context in JSON form. | |
sse_customer_algorithm | text | If server-side encryption with a customer-provided encryption key was requested, the response will include this header confirming the encryption algorithm used. | |
sse_customer_key_md5 | text | If server-side encryption with a customer-provided encryption key was requested, the response will include this header to provide round-trip message integrity verification of the customer-provided encryption key. | |
sse_kms_key_id | text | If present, specifies the ID of the Amazon Web Services Key Management Service(Amazon Web Services KMS) symmetric customer managed key that was used for the object. | |
storage_class | text | The class of storage used to store the object. | |
tag_count | text | The number of tags, if any, on the object. | |
tags | jsonb | A map of tags for the resource. | |
tags_src | jsonb | A list of tags assigned to the object. | |
title | text | Title of the resource. | |
version_id | text | The version ID of the object. | |
website_redirection_location | text | If the bucket is configured as a website, redirects requests for this object to another object in the same bucket or to an external URL. |
Export
This table is available as a standalone Exporter CLI. Steampipe exporters are stand-alone binaries that allow you to extract data using Steampipe plugins without a database.
You can download the tarball for your platform from the Releases page, but it is simplest to install them with the steampipe_export_installer.sh
script:
/bin/sh -c "$(curl -fsSL https://steampipe.io/install/export.sh)" -- aws
You can pass the configuration to the command with the --config
argument:
steampipe_export_aws --config '<your_config>' aws_s3_object