S3 Storage Classes

Introduction

In the previous article of this series, I gave you an introductory overview of S3 buckets, what they are, and what they are used for. S3 is one of the most widely used AWS services, and although it’s not difficult to store your files using this service, there are many aspects you need to master to get the most out of it. In this article, we will be talking about the storage classes available in S3.

Why do you need to know about storage classes?

Regarding storage classes, you should keep in mind that they influence the following aspects of your objects stored in S3:

  • Storage price
  • Data retrieval price
  • Availability
  • Data retrieval time
  • Minimum storage duration charges

Durability does not vary between the different S3 storage classes. Every class is designed for 99.999999999% durability (yes, that’s 11 9s 😂), which means that if you store 10 million objects, you can expect to lose one, on average, every 10,000 years.
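To see where the "one object every 10,000 years" figure comes from, here is a small sketch of the arithmetic. It assumes the durability figure can be read as an annual per-object survival probability, which is a simplification; the function name is mine, not part of any AWS API.

```python
from fractions import Fraction

# Eleven nines of durability, kept as an exact fraction to avoid float rounding.
DURABILITY = Fraction("0.99999999999")


def expected_annual_losses(object_count: int) -> Fraction:
    """Average number of objects lost per year for the given object count."""
    return object_count * (1 - DURABILITY)


losses = expected_annual_losses(10_000_000)
years_per_lost_object = 1 / losses
print(years_per_lost_object)  # 10000
```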

Basically, you need to analyze, depending on the access pattern for the objects stored in the S3 buckets, which storage class is best, considering the aspects mentioned above. Before we look at each of the storage classes, we need to understand these terms:

Storage price

This represents how much AWS will charge for storing objects in S3 buckets.

Data retrieval price

This represents how much AWS will charge, per GB, each time you retrieve an object; that is, it only applies when you read data back from the bucket.

Availability

This measures the percentage of time during which our objects can be accessed; for example, S3 Standard is designed for 99.99% availability.

99.99% availability implies that the service can be unavailable for up to about 53 minutes per year.
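The conversion from an availability percentage to yearly downtime is simple arithmetic. The sketch below assumes a 365-day year; the helper name and the two extra percentages are only illustrative comparison points.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def max_downtime_minutes(availability_percent: float) -> float:
    """Maximum minutes per year a service can be down at this availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for pct in (99.99, 99.9, 99.5):
    print(f"{pct}% -> {max_downtime_minutes(pct):.1f} minutes/year")
```

For 99.99% this gives roughly 52.6 minutes, which is where the rounded figure of 53 minutes comes from.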

Data retrieval time

This represents the time it takes S3 to return an object to us; it can be instantaneous, in the order of minutes, or even in the order of several hours. It is very important to analyze this when choosing the storage class.

Minimum storage duration charges

If applicable, this represents the minimum storage time that AWS will charge us for each object. For example, if this value is 30 days for a class, it means that for each object, we will have to pay for the equivalent of storing it for 30 days, even if we delete it before this period or transition it to another storage class.
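A quick way to reason about this rule is to compute the billable days as the larger of the days actually stored and the class minimum. The sketch below is only an illustration: the price is a made-up number, and prorating a month as 30 days is a simplifying assumption, not how AWS invoices are computed.

```python
def billable_days(days_stored: int, minimum_days: int = 0) -> int:
    """Days AWS would bill for: never fewer than the class minimum."""
    return max(days_stored, minimum_days)


def storage_charge(size_gb: float, days_stored: int,
                   price_per_gb_month: float, minimum_days: int = 0) -> float:
    """Approximate storage charge, treating a month as 30 days (assumption)."""
    days = billable_days(days_stored, minimum_days)
    return size_gb * price_per_gb_month * days / 30


# Hypothetical price of 0.01 per GB-month; 100 GB deleted after 10 days.
print(storage_charge(100, 10, 0.01, minimum_days=30))  # billed as 30 days
print(storage_charge(100, 10, 0.01, minimum_days=0))   # billed as 10 days
```

With a 30-day minimum, deleting the object after 10 days costs the same as keeping it for the full 30.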

S3 Standard

This is the default class for all objects we store in our S3 bucket. It is used for frequently accessed data and provides high durability and performance. It replicates objects across a minimum of three Availability Zones and is designed for 99.99% availability. It has the highest storage price of all classes, but there is no per-GB data retrieval charge. In addition, minimum storage duration charges do not apply, and data retrieval is instantaneous.

Use Cases

  • Hosting static websites.
  • Temporary storage service, for example, when it’s necessary to keep application logs for a day. In this case, it’s viable because minimum storage duration charges do not apply to this class.

S3 Standard Infrequent Access

This storage class (S3 Standard-IA) is used to store objects that are accessed infrequently but still need to be returned instantaneously. It replicates objects across a minimum of three Availability Zones and is designed for 99.9% availability, slightly lower than S3 Standard. This class applies charges for a minimum storage duration of 30 days, and data retrieval is instantaneous. The storage price is lower than that of the S3 Standard class, but this class does have a data retrieval price, which is measured per GB.

Use Cases

  • Data backups.
  • Any application where infrequently accessed data must remain available for a long time but needs to be retrieved immediately.
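Since S3 Standard is only the default, any other class has to be requested explicitly when the object is uploaded. The sketch below builds the arguments for boto3's `put_object` call, whose `StorageClass` parameter accepts the API names listed in the set. The helper, bucket name, and key are hypothetical, and the actual upload is left as a comment because it needs boto3 and AWS credentials.

```python
# API names of the storage classes discussed in this article.
VALID_STORAGE_CLASSES = {
    "STANDARD",
    "STANDARD_IA",
    "ONEZONE_IA",
    "INTELLIGENT_TIERING",
    "GLACIER_IR",
    "GLACIER",
    "DEEP_ARCHIVE",
}


def put_object_args(bucket: str, key: str, body: bytes,
                    storage_class: str = "STANDARD") -> dict:
    """Build keyword arguments for an S3 put_object call (hypothetical helper)."""
    if storage_class not in VALID_STORAGE_CLASSES:
        raise ValueError(f"unknown storage class: {storage_class}")
    return {"Bucket": bucket, "Key": key, "Body": body,
            "StorageClass": storage_class}


args = put_object_args("example-bucket", "backups/db.dump", b"...", "STANDARD_IA")
print(args["StorageClass"])  # STANDARD_IA

# With boto3 installed and credentials configured, the upload would be:
#   import boto3
#   boto3.client("s3").put_object(**args)
```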

S3 One Zone Infrequent Access

This storage class (S3 One Zone-IA) is used to store objects with the same characteristics as the previous class, with the difference that they must be easily reproducible, as this class stores data in only one Availability Zone. As a result, it is designed for 99.5% availability, the lowest among all S3 storage classes. It is cheaper than the S3 Standard Infrequent Access class, applies charges for a minimum storage duration of 30 days, and data retrieval is instantaneous.

Use Cases

  • The cheapest solution for storing infrequently accessed data that must be retrieved immediately.
  • Applications that do not require the availability and resilience of the classes we have seen so far.
  • For storing data that can be easily regenerated.

S3 Intelligent-Tiering

This class continuously moves our objects between access tiers whenever their access pattern changes. It is very useful from an operational point of view because each object ends up in the cheapest tier that matches its access pattern. Data retrieval charges do not apply and there is no minimum storage duration, but AWS charges a small monthly monitoring and automation fee per object. Objects stay in the Intelligent-Tiering class and move between its tiers as follows:

  • Frequent Access tier, priced like S3 Standard, by default (automatic)
  • Infrequent Access tier, priced like S3 Standard Infrequent Access, when the object has not been accessed for 30 days (automatic)
  • Archive Instant Access tier, priced like S3 Glacier Instant Retrieval, when the object has not been accessed for 90 days (automatic)
  • Archive Access tier, priced like S3 Glacier Flexible Retrieval, when the object has not been accessed for 90 days or more (optional, configurable)
  • Deep Archive Access tier, priced like S3 Glacier Deep Archive, when the object has not been accessed for 180 days or more (optional, configurable)
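The tiering rules in the list above can be sketched as a small function. The names it returns are the Intelligent-Tiering access tiers (Frequent Access, Infrequent Access, Archive Instant Access, Archive Access, Deep Archive Access), which correspond to the five bullets in order. The function and its parameters are hypothetical, and it assumes that enabling an optional archive tier takes precedence over the automatic ones once its threshold is reached.

```python
from typing import Optional


def intelligent_tier(days_idle: int,
                     archive_after: Optional[int] = None,
                     deep_archive_after: Optional[int] = None) -> str:
    """Return the access tier for an object idle for `days_idle` days.

    `archive_after` and `deep_archive_after` are the optional, configurable
    thresholds; None means that tier is not enabled.
    """
    if archive_after is not None and archive_after < 90:
        raise ValueError("archive tier needs at least 90 days without access")
    if deep_archive_after is not None and deep_archive_after < 180:
        raise ValueError("deep archive tier needs at least 180 days without access")

    if deep_archive_after is not None and days_idle >= deep_archive_after:
        return "Deep Archive Access"
    if archive_after is not None and days_idle >= archive_after:
        return "Archive Access"
    if days_idle >= 90:
        return "Archive Instant Access"
    if days_idle >= 30:
        return "Infrequent Access"
    return "Frequent Access"


print(intelligent_tier(45))                     # Infrequent Access
print(intelligent_tier(120))                    # Archive Instant Access
print(intelligent_tier(120, archive_after=90))  # Archive Access
```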

Use Cases

  • Useful in applications where the object access pattern is unpredictable.
  • For buckets containing some frequently accessed data and other infrequently accessed data.
  • If the data access pattern changes all the time.
  • If data is accessed by users in variable time periods.
  • If you want to avoid lifecycle policies.

S3 Glacier

This class (now named S3 Glacier Flexible Retrieval) is very useful for archiving data. It is a very cheap storage solution for data that does not need to be retrieved instantaneously. It replicates data across a minimum of 3 Availability Zones and is designed for 99.99% availability. Minimum storage duration charges of 90 days apply, and data retrieval prices are high. Objects can be moved directly from S3 Standard or S3 Standard-IA to this class using lifecycle policies.

Use Cases

  • Archiving data for a certain number of years before the data can be deleted.
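As a rough illustration of this use case, the sketch below builds a lifecycle rule that moves objects to Glacier after some days and deletes them after a retention period. The dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` expects; the helper name, the prefix, and the day counts are hypothetical examples.

```python
def archive_rule(prefix: str, archive_after_days: int, delete_after_days: int) -> dict:
    """Build one lifecycle rule: transition to Glacier, then expire (hypothetical helper)."""
    if delete_after_days <= archive_after_days:
        raise ValueError("objects must be archived before they are deleted")
    return {
        "ID": f"archive-{prefix.strip('/') or 'all'}",
        "Status": "Enabled",
        "Filter": {"Prefix": prefix},
        "Transitions": [{"Days": archive_after_days, "StorageClass": "GLACIER"}],
        "Expiration": {"Days": delete_after_days},
    }


# Archive invoices after 90 days and delete them after roughly 7 years.
config = {"Rules": [archive_rule("invoices/", 90, 7 * 365)]}
print(config["Rules"][0]["ID"])  # archive-invoices

# With boto3 installed and credentials configured, this could be applied with:
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```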

Retrieval Options

Expedited

  • Provides fast access to a subset of archived objects.
  • Allows data access within 1 to 5 minutes, as long as the file does not exceed 250 MB.
  • From an operational standpoint, expedited requests are only served when retrieval capacity is available, unless you purchase provisioned capacity.

Standard

  • This is the default option when retrieving an object stored in the Glacier class.
  • Allows data access within 3 to 5 hours.

Bulk

  • This is the cheapest data retrieval option.
  • Returns large amounts of data, typically within 5 to 12 hours.
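An archived object has to be restored before it can be read, and the retrieval option is chosen at that point. The sketch below builds the `RestoreRequest` structure that boto3's `restore_object` call accepts, with the tier passed under `GlacierJobParameters`. The helper name and the example values are hypothetical.

```python
GLACIER_TIERS = {"Expedited", "Standard", "Bulk"}


def restore_request(days_available: int, tier: str = "Standard") -> dict:
    """Build a restore request keeping the restored copy for `days_available` days."""
    if tier not in GLACIER_TIERS:
        raise ValueError(f"unknown retrieval tier: {tier}")
    if days_available < 1:
        raise ValueError("the restored copy must be kept for at least one day")
    return {"Days": days_available, "GlacierJobParameters": {"Tier": tier}}


request = restore_request(7, "Bulk")
print(request)

# With boto3 installed and credentials configured, the restore would be started with:
#   import boto3
#   boto3.client("s3").restore_object(
#       Bucket="example-bucket", Key="archive/2019.tar", RestoreRequest=request)
```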

S3 Glacier Deep Archive

This class offers the lowest storage cost in S3. It supports long-term retention and digital preservation of data, and it is primarily used for data that must be retained for 7 years or more. It replicates data across at least 3 Availability Zones. Minimum storage duration charges of 180 days apply.

Use Cases

  • Archiving data that will be very rarely accessed and does not have a strict retrieval time.

Retrieval Options

Standard

  • This is the default data retrieval option for this storage class.
  • Data is restored within 12 hours.

Bulk

  • Cheaper than the Standard option.
  • Data is restored within 48 hours.
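Putting the retrieval times of both archive classes side by side, we can sketch a helper that picks the cheapest option that still meets a deadline. The hour values are the upper bounds quoted in this article; the ordering assumes that slower options are cheaper, and the function itself is a hypothetical illustration rather than anything AWS provides.

```python
from typing import Optional

# Retrieval options per class, cheapest first, with their upper bound in hours.
RETRIEVAL_HOURS = {
    "GLACIER": [("Bulk", 12), ("Standard", 5), ("Expedited", 5 / 60)],
    "DEEP_ARCHIVE": [("Bulk", 48), ("Standard", 12)],
}


def cheapest_tier(storage_class: str, deadline_hours: float) -> Optional[str]:
    """Return the cheapest retrieval option that finishes within the deadline."""
    for tier, max_hours in RETRIEVAL_HOURS[storage_class]:
        if max_hours <= deadline_hours:
            return tier
    return None  # no option is fast enough for this class


print(cheapest_tier("GLACIER", 24))       # Bulk
print(cheapest_tier("GLACIER", 1))        # Expedited
print(cheapest_tier("DEEP_ARCHIVE", 6))   # None
```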

Recommendation

Below, I share how I decide which storage class to use when working with S3:

  • If the data is accessed frequently, for example, I am storing the files for a static website in the bucket, I use the Standard class.
  • If the data is not accessed frequently but needs to be returned instantaneously, I decide between S3 Standard Infrequent Access and S3 One Zone Infrequent Access based on whether the data can be easily reproduced.
  • If the data does not need to be returned instantaneously, I decide based on the frequency of access: if it’s once every three months, I use Glacier with Expedited retrieval; if the data is accessed once a year, I use Glacier with Standard or Bulk retrieval; and if the data will be accessed less than once a year, I use Glacier Deep Archive.
  • If the object access pattern is unclear, I simply use Intelligent-Tiering so that the storage class of each object changes depending on the access pattern.
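The decision process above can be written down as a small function. This is only a sketch of my own rules of thumb, not an official AWS recommendation; the parameter names are hypothetical, and the cut-offs between the stated frequencies (four or more accesses per year, at least one per year) are assumptions I made to turn the prose into code.

```python
from typing import Optional


def recommend(frequent: Optional[bool],
              needs_instant: bool = True,
              reproducible: bool = False,
              accesses_per_year: float = 0.0) -> str:
    """Suggest a storage class; `frequent=None` means the pattern is unknown."""
    if frequent is None:
        return "Intelligent-Tiering"
    if frequent:
        return "Standard"
    if needs_instant:
        return "One Zone-IA" if reproducible else "Standard-IA"
    if accesses_per_year >= 4:   # about once every three months
        return "Glacier (Expedited retrieval)"
    if accesses_per_year >= 1:   # about once a year
        return "Glacier (Standard or Bulk retrieval)"
    return "Glacier Deep Archive"


print(recommend(True))                                           # Standard
print(recommend(False, needs_instant=True, reproducible=True))   # One Zone-IA
print(recommend(False, needs_instant=False, accesses_per_year=0.5))
```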

See you soon

That’s all for now. This was the second article in this series, which aims to give you a complete guide to S3 buckets. In the next article, we’ll be talking about lifecycle policies. See you soon.

