What is the most cost effective Amazon S3 storage tier for data that is not often accessed but requires high availability?

Amazon Simple Storage Service (Amazon S3 for short) is a secure, durable and highly available object storage service from AWS that lets you store flat files (called objects) of up to 5 TB each. Amazon S3 offers unlimited storage for your objects in the cloud, with up to 99.99% availability and 11 nines of durability guarantees. Amazon S3 stores your data in a specific region of your choice across the 24 available regions of the AWS global infrastructure, and you can access that data from anywhere in the world. The built-in availability of the service comes from data replication across different Availability Zones within the region you selected, so your data remains accessible even if a failure occurs in a particular Availability Zone. You can also leverage the cross-region replication feature of Amazon S3 to replicate your data to another region, depending on your availability preferences and business needs.

Amazon S3, launched in 2006, is one of the oldest AWS services, and it has been continuously improved and expanded since then. Over time, many additional features and storage classes have been added to Amazon S3, enabling low cost and secure object storage for businesses of all sizes. Amazon S3 lets you create lifecycle policies for your objects, use version control, encrypt your data at rest and manage access controls using built-in security features such as access control lists and bucket policies.
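
These features can also be enabled programmatically. As a minimal sketch (the bucket name is hypothetical), turning on versioning and default encryption at rest with boto3 might look like this:

```python
import boto3

s3 = boto3.client("s3")

# Turn on object versioning for the bucket.
s3.put_bucket_versioning(
    Bucket="my-example-bucket",
    VersioningConfiguration={"Status": "Enabled"},
)

# Enforce default encryption at rest (SSE-S3 / AES-256) for new objects.
s3.put_bucket_encryption(
    Bucket="my-example-bucket",
    ServerSideEncryptionConfiguration={
        "Rules": [
            {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
        ]
    },
)
```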

In this blog post, we’ll focus on how Amazon S3 provides low cost object storage for businesses through its available storage classes, and on how to optimize your storage costs by right-sizing your storage class based on your needs. Let’s start with the Amazon S3 storage classes.

Amazon S3 Storage Classes

Amazon S3 is an object-based storage platform that enables you to upload and store your files in buckets. For each object you upload to S3, you can select the optimal storage class in line with your access needs. Amazon S3 offers 6 different storage classes, so you can choose the one that specifically addresses your data access and cost requirements. All of these storage classes are designed to provide scalable, available, durable, secure and high performance storage for your objects.
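
As a concrete illustration, here is a minimal boto3 sketch (bucket, key and file names are hypothetical) of selecting a storage class when uploading an object; omitting the StorageClass parameter defaults to S3 Standard:

```python
import boto3

s3 = boto3.client("s3")

# Upload a file and choose its storage class explicitly; leaving StorageClass
# out defaults to STANDARD.
with open("q4-summary.pdf", "rb") as f:
    s3.put_object(
        Bucket="my-example-bucket",
        Key="reports/q4-summary.pdf",
        Body=f,
        # Other valid values include ONEZONE_IA, INTELLIGENT_TIERING,
        # GLACIER and DEEP_ARCHIVE.
        StorageClass="STANDARD_IA",
    )
```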

[Diagram: Amazon S3 storage classes. Source: AWS]

1. S3 Standard

S3 Standard is the default storage class within Amazon S3. It includes the 99.99% availability and 11 nines of durability commitment for your data and provides low latency and high throughput. This storage class is optimal for frequently accessed data that needs immediate access. For example, if you are considering using Amazon S3 for cloud applications or websites that continuously need data access at maximum speed, S3 Standard is a great option for you.

2. S3 Standard-Infrequent Access

Let’s say you still need immediate access to your data, but not as often as you would with the S3 Standard class. S3 Standard-IA provides a lower cost object storage option for this case. For a solid comparison, while S3 Standard charges $0.021 to $0.023 per GB (based on the storage size), S3 Standard-IA charges only $0.0125 per GB, almost half of the former. Alongside the cost benefits, this class requires a minimum storage duration of 30 days and charges a retrieval fee per GB of data requested. S3 Standard-IA is really useful if you store data that you don’t need to access often, but when you do, you require immediate access. This storage class also provides 11 nines of durability, replicates your data across a minimum of 3 Availability Zones and offers 99.9% availability, which is slightly lower than S3 Standard. Example use cases for this class include secondary copies of your mission critical data backups and data kept for disaster recovery processes. Since you don’t need such data often, S3 Standard-IA allows you to save cost while still ensuring rapid access. If you already have objects in S3 Standard, you can switch them to Standard-IA with an in-place copy, as sketched below.
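
A minimal boto3 sketch of that in-place copy, with hypothetical bucket and key names; the object is copied onto itself with a new storage class:

```python
import boto3

s3 = boto3.client("s3")

# Move an existing object from S3 Standard to Standard-IA by copying it onto
# itself while overriding the storage class.
s3.copy_object(
    Bucket="my-example-bucket",
    Key="backups/db-snapshot-2021-06.tar.gz",
    CopySource={
        "Bucket": "my-example-bucket",
        "Key": "backups/db-snapshot-2021-06.tar.gz",
    },
    StorageClass="STANDARD_IA",
    MetadataDirective="COPY",  # keep the existing object metadata
)
```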

3. S3 Intelligent-Tiering

This great option is the only cloud storage class that reduces your costs automatically. S3 Intelligent-Tiering monitors your data access patterns and decides which access tier to move your data into. There are 2 access tiers within this class, namely the Frequent Access and Infrequent Access tiers. If an object is not accessed for 30 days, it is automatically moved into the Infrequent Access tier to help you reduce and optimize your costs. The cost varies by tier, specifically between $0.0125 and $0.023 per GB for the Infrequent and Frequent Access tiers. There is also a fixed monitoring and automation charge of $0.0025 per 1,000 objects.

[Diagram: S3 Intelligent-Tiering access tiers. Source: AWS]

This option is extremely useful for data whose future access patterns you don’t know and for unpredictable workloads. Let’s say your application serves movie reviews and ratings, and you aren’t sure when interest in a specific movie will decline; S3 Intelligent-Tiering helps you save on storage costs while the data is not being accessed.

In November 2020, AWS announced 2 new access tiers for S3 Intelligent-Tiering to help you reduce storage costs for rarely accessed objects. While the Frequent Access and Infrequent Access tiers automatically move your objects based on unpredictable access patterns, these 2 new tiers optimize costs for objects that are rarely touched: an object moves to the Archive Access tier if not accessed for 90 days, and to the Deep Archive Access tier if not accessed for 180 days. Once you access an object stored in the Archive or Deep Archive Access tiers, it is moved back to the Frequent Access tier, just as with the Infrequent Access tier. These newly introduced tiers have the same performance, pricing and retrieval-time characteristics as the S3 Glacier and S3 Glacier Deep Archive storage classes (explained below), respectively. You can opt in to these tiers to achieve cost savings through automated, intelligent movement of your data across cost-optimizing tiers.
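
Opting in to the Archive Access and Deep Archive Access tiers is configured per bucket, optionally scoped by prefix or tags. A hedged boto3 sketch, with a hypothetical bucket, prefix and configuration Id:

```python
import boto3

s3 = boto3.client("s3")

# Opt objects under the "logs/" prefix into the optional Archive Access and
# Deep Archive Access tiers of S3 Intelligent-Tiering.
s3.put_bucket_intelligent_tiering_configuration(
    Bucket="my-example-bucket",
    Id="archive-rarely-accessed-logs",
    IntelligentTieringConfiguration={
        "Id": "archive-rarely-accessed-logs",
        "Filter": {"Prefix": "logs/"},
        "Status": "Enabled",
        "Tierings": [
            {"Days": 90, "AccessTier": "ARCHIVE_ACCESS"},
            {"Days": 180, "AccessTier": "DEEP_ARCHIVE_ACCESS"},
        ],
    },
)
```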

4. S3 One Zone-IA

Following S3 Standard-IA, S3 One Zone-IA offers an even cheaper option for data that is accessed infrequently but needs to be available rapidly when requested. The main point of this class is that your data is stored in only one Availability Zone, which affects its availability: the availability offering of this storage class drops to 99.5%. Your data is not replicated across different Availability Zones (unlike all other storage classes), so you are responsible for ensuring that the data you store in S3 One Zone-IA can be recreated and is not mission critical. The cost drops to $0.01 per GB within this storage class.

5. S3 Glacier

S3 Glacier is one of the low cost data archiving options available within Amazon S3. The service costs $0.004 per GB, less than one fifth of the S3 Standard rate. While S3 Glacier also provides 11 nines of durability and 99.99% availability for your archived data, its retrieval times are longer than all of the previous classes. The standard retrieval time ranges from 3 to 5 hours, so there is no immediate access option. The service requires a minimum storage duration of 90 days and a minimum billable object size of 40 KB. It is useful for data that you access least frequently and can afford to wait for, such as your data archives. There are also other retrieval options that let you reduce or increase retrieval time based on your needs. For instance, expedited retrieval gives you access to archived data in 1 to 5 minutes, but it is so much more expensive that it can erase the cost savings you gained from this class in the first place.
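
Retrieving an object from the S3 Glacier storage class is a two-step process: you first request a restore (choosing a retrieval tier), then download the temporary copy once it is ready. A minimal boto3 sketch with hypothetical names:

```python
import boto3

s3 = boto3.client("s3")

# Request a temporary copy of an object archived in the S3 Glacier storage
# class. Tier can be "Expedited", "Standard" or "Bulk", trading retrieval
# speed against cost.
s3.restore_object(
    Bucket="my-example-bucket",
    Key="archives/2019/audit-logs.zip",
    RestoreRequest={
        "Days": 7,  # keep the restored copy available for 7 days
        "GlacierJobParameters": {"Tier": "Standard"},  # roughly 3-5 hours
    },
)
```

You can then poll head_object and inspect the Restore field in the response to see when the temporary copy becomes available for download.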

6. S3 Glacier Deep Archive

S3 Glacier Deep Archive, the lowest cost object storage class, enables you to archive your data and meet long term storage requirements with minimal storage costs. The 11 nines of durability and 99.99% availability offering ensures that your archived data is stored safely. Whether you archive data for compliance requirements or for business data retention objectives, S3 Glacier Deep Archive lets you store data for around $1 per terabyte per month. This storage class expects a minimum storage duration of 180 days. It is extremely cost-effective and efficient for data that you don’t really need to access outside of archival or documentation requirements.

For the full list of current pricing details for the Amazon S3 storage classes, check here:

aws.amazon.com/s3/pricing

Amazon S3 Cost Optimization with Right-Sizing Storage Classes

With the 6 storage classes described above, it is easy to address the different storage requirements of your various data sets. Lowering your storage costs on Amazon S3 requires effective planning and continuous evaluation of your data access patterns. Based on your access and retrieval urgency requirements, you can achieve cost savings by selecting the right storage class and optimize further with S3 Lifecycle policies.

Understanding Your Needs

When you first start using Amazon S3, or when you are looking for ways to optimize your spending, it is essential to clearly define the nature of your data and application needs. The availability, durability, resiliency, accessibility, size and duration characteristics of the S3 storage classes address distinct requirements. Start by assessing the nature of your application and your goals. You should be able to answer how much uptime you require for your data, how urgently you need access, your performance targets, the size of the data, and so on. Can you tolerate a single Availability Zone, or do you need continuous, automated tier optimization for unpredictable workloads? All of these decisions and requirements help you choose the right storage class and achieve cost savings.

Organizing Your Data

After clearly defining your needs, you should also spend time organizing your data. Bucket names, prefixes and resource tags at the individual object level help you define your large data sets effectively and choose the right storage classes and tiers later on. Let’s say your finance department needs immediate access to your customer data, but only infrequent access to last season’s sales or last year’s assets. You can use different storage classes for these different needs if you define and track your data effectively with proper tags and organization. You can also use this information for S3 Lifecycle management and decide on data transitions between classes, as in the sketch below.
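
Once your data is organized with prefixes and tags, a lifecycle policy can act on exactly those labels. The following boto3 sketch (bucket name, tag and durations are all hypothetical) transitions finance-tagged objects to Standard-IA after 90 days and to S3 Glacier after a year, then expires them after roughly seven years:

```python
import boto3

s3 = boto3.client("s3")

# Tier and expire objects tagged department=finance based on their age.
s3.put_bucket_lifecycle_configuration(
    Bucket="my-example-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "finance-data-tiering",
                "Filter": {"Tag": {"Key": "department", "Value": "finance"}},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 90, "StorageClass": "STANDARD_IA"},
                    {"Days": 365, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 2555},  # roughly 7 years
            }
        ]
    },
)
```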

Monitoring and Analyzing Your Access Patterns and Spending

As your operations evolve over time, it is important to monitor and analyze your data access patterns and service usage for cost optimization. Here, AWS capabilities such as Storage Class Analysis, Amazon CloudWatch metrics and S3 Server Access Logging help you understand your access patterns. Storage Class Analysis lets you analyze and understand object access patterns and then decide how to define your lifecycle policies for transition or expiration actions on your S3 objects. It helps you define when to transition objects to storage classes designed for less (or more) frequent access, or simply set an expiration time after which objects are deleted.
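
Storage Class Analysis is configured per bucket, optionally scoped to a prefix or tag, and can export its daily findings to another bucket. A boto3 sketch with hypothetical bucket names and configuration Id:

```python
import boto3

s3 = boto3.client("s3")

# Analyze access patterns for objects under "data/" and export daily CSV
# results to a separate destination bucket.
s3.put_bucket_analytics_configuration(
    Bucket="my-example-bucket",
    Id="data-prefix-analysis",
    AnalyticsConfiguration={
        "Id": "data-prefix-analysis",
        "Filter": {"Prefix": "data/"},
        "StorageClassAnalysis": {
            "DataExport": {
                "OutputSchemaVersion": "V_1",
                "Destination": {
                    "S3BucketDestination": {
                        "Format": "CSV",
                        "Bucket": "arn:aws:s3:::my-analysis-results-bucket",
                        "Prefix": "storage-class-analysis/",
                    }
                },
            }
        },
    },
)
```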

On the other hand, you can use Amazon CloudWatch metrics to track daily storage metrics across your buckets and spot growth patterns in your objects. Similarly, S3 Server Access Logging helps you analyze the requests made to your buckets and understand existing data access patterns. It also eases the analysis of large data sets across various applications and helps you track and monitor your bucket records in a more structured manner. You can then decide which storage classes to use based on these metrics.
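
For example, S3 publishes daily BucketSizeBytes and NumberOfObjects metrics to CloudWatch. A small boto3 sketch (bucket name hypothetical) that pulls two weeks of the Standard-storage size metric:

```python
from datetime import datetime, timedelta

import boto3

cloudwatch = boto3.client("cloudwatch")

# Fetch the daily bucket-size metric that S3 publishes for the Standard
# storage class over the last two weeks.
response = cloudwatch.get_metric_statistics(
    Namespace="AWS/S3",
    MetricName="BucketSizeBytes",
    Dimensions=[
        {"Name": "BucketName", "Value": "my-example-bucket"},
        {"Name": "StorageType", "Value": "StandardStorage"},
    ],
    StartTime=datetime.utcnow() - timedelta(days=14),
    EndTime=datetime.utcnow(),
    Period=86400,  # one data point per day
    Statistics=["Average"],
)

for point in sorted(response["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"].date(), round(point["Average"] / 1024**3, 2), "GiB")
```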

Last but not least, while you can manually configure S3 Lifecycle policies to manage your data based on your preferred duration limits, using the insights gained from the services above, you can also use the Intelligent-Tiering option to automate this process and ensure that performance is never compromised while your objects move between the most cost-efficient tiers. In the end, defining your needs and the corresponding class options within Amazon S3 helps you store your data in the most effective storage class alternatives for increased cost savings.

Interested in leveraging cloud-based storage services for maximum availability and durability for your business data? Book an Appointment now to optimize your data storage and reduce costs!