How to Use AWS CLI to Manage Data in Amazon S3 | Day 43 of 90 Days of DevOps

Learn how to create and delete S3 buckets, upload and download files, sync folders, generate pre-signed URLs, and list buckets and objects with the AWS CLI


Are you a DevOps learner who wants to use S3 programmatic access with AWS CLI? If yes, then this blog post is for you. In this post, I’ll explain what S3 is and how to use AWS CLI, a unified tool to manage your AWS services from the command line.

By the end of this post, you’ll be able to:

  • Create and delete S3 buckets

  • Upload and download files to and from S3 buckets

  • Sync local folders with S3 buckets

  • Generate pre-signed URLs for S3 objects

  • List S3 buckets and objects

Let’s get started!

What is S3?

Amazon Simple Storage Service (Amazon S3) is an object storage service that provides a secure and scalable way to store and access data on the cloud. It is designed for storing any kind of data, such as text files, images, videos, backups, and more. S3 is commonly used for various purposes, including backup and restore, data archiving, content distribution, and hosting static websites.

S3 has the following features and benefits:

  • S3 offers different storage classes to optimize cost and performance based on your data access patterns. The available classes include S3 Standard, S3 Intelligent-Tiering, S3 Standard-IA (Infrequent Access), S3 One Zone-IA, S3 Glacier Instant Retrieval, S3 Glacier Flexible Retrieval, and S3 Glacier Deep Archive.

  • S3 has storage management features that you can use to manage costs, meet regulatory requirements, reduce latency, and save multiple distinct copies of your data for compliance requirements. For example, you can use lifecycle management to automatically move your data to different storage classes as it ages, object lock to prevent your data from being deleted or overwritten for a specified period, and replication to copy your data to multiple regions for high availability and disaster recovery.

  • S3 offers various mechanisms to control access to your buckets and objects. You can use access control lists (ACLs) and bucket policies to define granular permissions for different users and applications. You can also use encryption and encryption keys to protect your data at rest and in transit.

  • S3 provides logging and monitoring tools that you can use to track and analyze how your S3 resources are being used. You can also use features like S3 Object Lambda and Event Notifications to transform data and trigger workflows to automate a variety of other processing activities at scale.

To use S3, you need to create buckets and objects. A bucket is a container for storing objects, and an object is a file or a piece of data that you store in a bucket. Each object has a unique key, which is the name of the object, and a value, which is the data of the object. You can also store metadata, which is additional information about the object, such as the size, date, and content type.

You can access S3 using various methods, such as the AWS Management Console, the AWS SDKs, the AWS CLI, or the S3 API. In this blog post, we’ll focus on using the AWS CLI, which is a unified tool that provides a command-line interface for interacting with AWS services.

Before we start, you’ll need to have the following prerequisites:

  • An AWS account with access to Amazon S3 and Amazon EC2

  • An EC2 instance running Linux or Windows

  • The AWS CLI installed and configured on your EC2 instance

  • A text editor of your choice

Creating and Deleting S3 Buckets

An S3 bucket is a container for storing objects in Amazon S3. Each bucket has a name that must be globally unique across all AWS accounts and Regions. You can create up to 100 buckets per AWS account by default, but you can request a limit increase if you need more.

To create an S3 bucket using the AWS CLI, you can use the aws s3 mb command. The syntax of the command is:

aws s3 mb s3://bucket-name

where bucket-name is the name of the bucket that you want to create. For example, to create a bucket named ajits-bucket-2023, you can run the following command:

aws s3 mb s3://ajits-bucket-2023

You should see a message like this:

make_bucket: ajits-bucket-2023

This means that the bucket has been created successfully. You can verify that the bucket exists by using the aws s3 ls command, which lists all the buckets in your AWS account. You should see something like this:

2023-11-14 02:03:04 ajits-bucket-2023
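
By default, `aws s3 mb` creates the bucket in the CLI's configured default region. If you want the bucket in a specific region, you can pass `--region` explicitly. A quick sketch, using the same example bucket name (`ap-south-1` is just an illustrative choice; pick your own region):

```shell
# Create a bucket in an explicit region (ap-south-1 here).
# If --region is omitted, the CLI uses the default region from
# your configuration (set via `aws configure`).
aws s3 mb s3://ajits-bucket-2023 --region ap-south-1
```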

To delete an S3 bucket using the AWS CLI, you can use the aws s3 rb command. The syntax of the command is:

aws s3 rb s3://bucket-name

where bucket-name is the name of the bucket that you want to delete. For example, to delete the bucket that we just created, you can run the following command:

aws s3 rb s3://ajits-bucket-2023

You should see a message like this:

remove_bucket: ajits-bucket-2023

This means that the bucket has been deleted successfully. You can verify that the bucket no longer exists by using the aws s3 ls command again. You should not see the bucket in the output.

Note: You can only delete an empty bucket using the aws s3 rb command. If the bucket contains any objects, you’ll need to delete them first using the aws s3 rm command or the --force option.
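
The two approaches from the note above can be sketched like this, reusing the example bucket name:

```shell
# Delete every object in the bucket first...
aws s3 rm s3://ajits-bucket-2023 --recursive

# ...then delete the now-empty bucket
aws s3 rb s3://ajits-bucket-2023

# Or do both in a single step
aws s3 rb s3://ajits-bucket-2023 --force
```

Prefer the two-step version when you want to review what `rm --recursive` will delete before removing the bucket itself.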

Uploading and Downloading Files to and from S3 Buckets

Once you have an S3 bucket, you can upload and download files to and from it using the AWS CLI. A file in Amazon S3 is called an object, and it has a key and a value. The key is the name of the object, and the value is the content of the object. Each object can be up to 5 TB in size, and you can store an unlimited number of objects in a bucket.

To upload a file to an S3 bucket using the AWS CLI, you can use the aws s3 cp command. The syntax of the command is:

aws s3 cp source destination

where source is the path of the file that you want to upload, and destination is the S3 URI of the bucket and the key of the object that you want to create. For example, to upload a file named hello.txt from your EC2 instance to a bucket named ajits-bucket-2023 with the key hello.txt, you can run the following command:

aws s3 cp hello.txt s3://ajits-bucket-2023/hello.txt

You should see a message like this:

upload: ./hello.txt to s3://ajits-bucket-2023/hello.txt

This means that the file has been uploaded successfully. You can verify that the object exists by using the aws s3 ls command with the S3 URI of the bucket. You should see something like this:

2023-11-14 02:14:00         23 hello.txt

To download a file from an S3 bucket using the AWS CLI, you can use the aws s3 cp command again, but with the source and destination reversed. For example, to download the file that we just uploaded from the bucket to your EC2 instance, you can run the following command:

aws s3 cp s3://ajits-bucket-2023/hello.txt hello.txt

You should see a message like this:

download: s3://ajits-bucket-2023/hello.txt to ./hello.txt

This means that the file has been downloaded successfully. You can verify that the file exists by using the ls command on your EC2 instance. You should see something like this:

hello.txt

Note: You can also use the aws s3 mv command to move files to and from S3 buckets, instead of copying them. This will delete the source file after the transfer is complete.
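
A short sketch of the `mv` variant, using a hypothetical file named `report.txt` and the same example bucket:

```shell
# Upload a file and delete the local copy once the transfer completes
aws s3 mv report.txt s3://ajits-bucket-2023/report.txt

# Move the object back out of the bucket into the current directory
aws s3 mv s3://ajits-bucket-2023/report.txt .
```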

Syncing Local Folders with S3 Buckets

Sometimes, you may want to sync the contents of a local folder with an S3 bucket, or vice versa. This can be useful for backup, migration, or synchronization purposes. You can use the AWS CLI to sync folders with S3 buckets using the aws s3 sync command. The syntax of the command is:

aws s3 sync source destination

where source and destination are the paths of the folders or the S3 URIs of the buckets that you want to sync. For example, to sync a local folder named my-folder with a bucket named ajits-bucket-2023, you can run the following command:

aws s3 sync my-folder s3://ajits-bucket-2023

This copies any files that exist in the local folder but are missing or outdated in the bucket. Note that sync is one-directional: it makes the destination match the source, not the other way around (to sync in the opposite direction, swap the source and destination arguments). You should see messages like this:

upload: my-folder/file1.txt to s3://ajits-bucket-2023/file1.txt

This means that the sync has been completed successfully. You can verify that the folder and the bucket have the same contents by using the ls and aws s3 ls commands. You should see something like this:

my-folder:
file1.txt  hello.txt

s3://ajits-bucket-2023:
2023-11-14 02:56:40          0 file1.txt
2023-11-14 02:14:00         23 hello.txt

Note: You can use the --delete option to delete any files that are in the destination but not in the source. This will make the destination match the source exactly.
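
Because `--delete` is destructive, it's worth previewing the changes first with `--dryrun`. A minimal sketch with the example folder and bucket:

```shell
# Preview what would change (including deletions) without touching anything
aws s3 sync my-folder s3://ajits-bucket-2023 --delete --dryrun

# Make the bucket match the local folder exactly, removing any objects
# that no longer exist locally
aws s3 sync my-folder s3://ajits-bucket-2023 --delete
```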

Generating Pre-Signed URLs for S3 Objects

A pre-signed URL is a URL that grants temporary access to an S3 object, without requiring AWS credentials. You can use pre-signed URLs to share S3 objects with others or to allow others to upload or download S3 objects. Pre-signed URLs have an expiration time, after which they become invalid.

To generate a pre-signed URL for an S3 object using the AWS CLI, you can use the aws s3 presign command. The syntax of the command is:

aws s3 presign s3://bucket-name/object-key

where bucket-name and object-key are the name of the bucket and the key of the object that you want to generate a pre-signed URL for. For example, to generate a pre-signed URL for the file hello.txt that we uploaded earlier, you can run the following command:

aws s3 presign s3://ajits-bucket-2023/hello.txt

You should see a URL like this:

https://ajits-bucket-2023.s3.ap-south-1.amazonaws.com/hello.txt?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAYKSQYUGGDOPB6TH3%2F20231113%2Fap-south-1%2Fs3%2Faws4_request&X-Amz-Date=20231113T212920Z&X-Amz-Expires=3600&X-Amz-SignedHeaders=host&X-Amz-Signature=21b7d769ba821701e575b57d9f4069322f6799d1e15a15f2cee3da4d3d096f8c

This is the pre-signed URL that you can share with others to grant them access to the file. The URL contains the following components:

  • The bucket name and the object key

  • The AWS access key ID of the user who generated the URL

  • The signature of the URL, which is calculated using the AWS secret access key of the user who generated the URL

  • The validity period of the URL (the X-Amz-Expires parameter, in seconds), together with the timestamp at which it was signed (X-Amz-Date)

By default, the pre-signed URL expires in 3600 seconds (one hour) from the time it is generated. You can change the expiration time by using the --expires-in option with the aws s3 presign command. The value of the option is the number of seconds that you want the URL to be valid for. For example, to generate a pre-signed URL that expires in 10 minutes, you can run the following command:

aws s3 presign s3://ajits-bucket-2023/hello.txt --expires-in 600

You should see a similar URL, but with the X-Amz-Expires parameter set to 600 instead of 3600, reflecting the new expiration time.

You can test the pre-signed URL by opening it in a browser or using a tool like curl. You should be able to access the file without any authentication. However, if you try to access the URL after it has expired, you’ll get an error message like this:

<Error>
<Code>AccessDenied</Code>
<Message>Request has expired</Message>
<RequestId>ZS40VWNC265KCA8G</RequestId>
<HostId>txGSoSf4qrwtgnkFrda0IrRzYGWenxj1Za/k5L4I/m/0PIGIPLbFZW4SdwVctGoZPfwRThMHl0g=</HostId>
</Error>

This means that the URL is no longer valid and you’ll need to generate a new one if you want to access the file again.
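
A quick sketch of testing the URL with curl; the `<pre-signed-url>` placeholder stands for the full URL printed by `aws s3 presign` (pass it quoted, since it contains `&` characters):

```shell
# Download the object through the pre-signed URL; no AWS credentials
# are needed
curl -o hello.txt "<pre-signed-url>"

# Print only the HTTP status code to check whether the URL still works
# (200 while valid, 403 once it has expired)
curl -s -o /dev/null -w "%{http_code}\n" "<pre-signed-url>"
```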

Note: The aws s3 presign command only generates URLs for downloading (GET) objects. To create pre-signed URLs for uploads, or to customize parameters such as the HTTP method or response headers, you can use the AWS SDKs (for example, the generate_presigned_url method in boto3 for Python). For more information, see the AWS documentation.

Listing S3 Buckets and Objects

Another useful task that you can perform with the AWS CLI is listing the S3 buckets and objects in your AWS account. This can help you to keep track of your data and manage your storage usage. You can use the aws s3 ls and aws s3api list-buckets commands to list S3 buckets and objects using the AWS CLI.

The aws s3 ls command is a high-level command that lists the S3 buckets or objects in a simple and human-readable format. The syntax of the command is:

aws s3 ls [s3://bucket-name]

where s3://bucket-name is the optional S3 URI of the bucket that you want to list the objects in. If you omit the bucket name, the command will list all the buckets in your AWS account. For example, to list all the buckets in your AWS account, you can run the following command:

aws s3 ls

You should see an output like this:

2023-11-14 02:13:54 ajits-bucket-2023

This output shows the creation date and the name of each bucket in your AWS account.

To list the objects in a specific bucket, you can specify the bucket name in the command. For example, to list the objects in the bucket ajits-bucket-2023, you can run the following command:

aws s3 ls s3://ajits-bucket-2023

You should see an output like this:

2023-11-14 02:56:40          0 file1.txt
2023-11-14 02:14:00         23 hello.txt

This output shows the last modified date, the size, and the key of each object in the bucket.
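
A few useful variations on `aws s3 ls`, sketched with the example bucket (the `logs/` prefix is hypothetical, just to illustrate prefix filtering):

```shell
# List all objects under the bucket, descending into "folders", with
# human-readable sizes and a summary line at the end
aws s3 ls s3://ajits-bucket-2023 --recursive --human-readable --summarize

# Restrict the listing to keys that start with a given prefix
aws s3 ls s3://ajits-bucket-2023/logs/
```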

The aws s3api list-buckets command is a low-level command that lists the S3 buckets in your AWS account using the S3 API. The syntax of the command is:

aws s3api list-buckets

This command requires no parameters, though it accepts the global CLI options such as --query and --output. When you run it, you should see an output like this:

{
    "Buckets": [
        {
            "Name": "ajits-bucket-2023",
            "CreationDate": "2023-11-13T20:43:54+00:00"
        }
    ],
    "Owner": {
        "ID": "2dea46f6bb11505667678a7a8fd9b319c2113393408dbcb3b4223a5a8d14122c"
    }
}

This output shows the name and the creation date of each bucket in your AWS account, as well as the canonical ID of the bucket owner. The output is in JSON format, which is more structured and machine-readable than the output of the aws s3 ls command.

To list the objects in a specific bucket using the S3 API, you can use the aws s3api list-objects command (or its newer revision, aws s3api list-objects-v2, which AWS recommends for new applications). The syntax of the command is:

aws s3api list-objects --bucket bucket-name

where bucket-name is the name of the bucket that you want to list the objects in. For example, to list the objects in the bucket ajits-bucket-2023, you can run the following command:

aws s3api list-objects --bucket ajits-bucket-2023

You should see an output like this:

{
    "Contents": [
        {
            "Key": "file1.txt",
            "LastModified": "2023-11-13T21:26:40+00:00",
            "ETag": "\"d41d8cd98f00b204e9800998ecf8427e\"",
            "Size": 0,
            "StorageClass": "STANDARD",
            "Owner": {
                "ID": "2dea46f6bb11505667678a7a8fd9b319c2113393408dbcb3b4223a5a8d14122c"
            }
        },
        {
            "Key": "hello.txt",
            "LastModified": "2023-11-13T20:44:00+00:00",
            "ETag": "\"a9e77f60fb986b347e8b65d5e3dc2e00\"",
            "Size": 23,
            "StorageClass": "STANDARD",
            "Owner": {
                "ID": "2dea46f6bb11505667678a7a8fd9b319c2113393408dbcb3b4223a5a8d14122c"
            }
        }
    ],
    "RequestCharged": null
}

This output shows the key, the last modified date, the ETag, the size, the storage class, and the owner of each object in the bucket. The output is also in JSON format, which is more structured and machine-readable than the output of the aws s3 ls command.

Note: You can use the --output option with the aws s3api commands to change the format of the output. The supported formats are JSON, text, and table. For more information, see the AWS CLI documentation.
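
The --output option pairs well with the --query option (a JMESPath expression) for extracting specific fields. A small sketch:

```shell
# Render the bucket list as an ASCII table instead of JSON
aws s3api list-buckets --output table

# Combine --query with text output to print just the bucket names,
# one per line (handy for scripting)
aws s3api list-buckets --query 'Buckets[].Name' --output text
```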

That’s it for this section. In this section, I showed you how to list S3 buckets and objects using the AWS CLI. You learned how to use the aws s3 ls and aws s3api list-buckets commands to list the buckets in your AWS account, and how to use the aws s3 ls and aws s3api list-objects commands to list the objects in a specific bucket. You also learned how to change the output format of the commands using the --output option.

Conclusion

In this blog post, I showed you how to use the AWS CLI to manage data in Amazon S3. You learned how to perform the following tasks:

  • Creating and deleting S3 buckets

  • Uploading and downloading files to and from S3 buckets

  • Syncing local folders with S3 buckets

  • Generating pre-signed URLs for S3 objects

  • Listing S3 buckets and objects

By using the AWS CLI, you can automate tasks, manage resources, and perform operations that are not available in the AWS Management Console. You can also use the AWS CLI to interact with other AWS services, such as Amazon EC2, Amazon DynamoDB, and AWS Lambda. The AWS CLI is a powerful and versatile tool that can help you work with AWS more efficiently and effectively.

I hope you enjoyed this blog post and learned something new. If you have any questions or feedback, please feel free to leave a comment below. I’d love to hear from you.

Thank you for reading and happy coding!
