paulund

Module 6: Storage

Introduction to Storage

AWS offers several storage types, each designed for a different use case:

  • Block storage — Low-latency volumes that attach directly to EC2 instances. Best suited to databases and applications that need consistent, high-performance access to data. The primary service is EBS (Elastic Block Store).
  • Object storage — Stores data as individual objects, each consisting of the data itself, a unique identifier, and metadata. Objects are organised in buckets. Best suited to images, videos, and other unstructured data. The primary service is Amazon S3.
  • File storage — Uses a hierarchical directory structure, similar to a traditional file system. Suitable for shared file access across multiple instances. The primary services are Amazon EFS (Elastic File System) and Amazon FSx.
  • AWS Storage Gateway — A hybrid cloud storage service that gives your on-premises infrastructure access to virtually unlimited cloud storage.
  • AWS Elastic Disaster Recovery — Replicates critical workloads to AWS so that you can recover applications and data quickly after an incident.

Block Storage

  • EC2 Instance Store provides temporary block storage on disks physically attached to the host running the EC2 instance. The data survives a reboot but is lost if the instance is stopped, hibernated, or terminated.
  • EBS (Elastic Block Store) provides persistent block storage volumes. Unlike instance store, EBS volumes survive instance restarts and can be detached from one instance and attached to another.
  • EBS is best suited to databases and applications that require consistent, low-latency performance.

Amazon EBS Data Lifecycle

Managing EBS volumes effectively involves creating, backing up, and eventually deleting them. AWS provides tools to automate this process:

  • EBS Snapshots are point-in-time copies of an EBS volume. You can schedule snapshots on a regular basis — hourly, daily, weekly, or monthly — using Amazon Data Lifecycle Manager.
  • Snapshots support incremental backups, meaning only the data that has changed since the last snapshot is saved. This makes them faster and cheaper to create and store than full copies.
  • Snapshots are commonly used for disaster recovery, data migration, and routine backups.
  • The customer is responsible for managing EBS snapshots and configuring Data Lifecycle Manager.
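
The incremental behaviour described above can be illustrated with a small toy model. This is only a sketch of the idea, not how EBS stores snapshots internally: the Volume and Snapshot classes and the take_snapshot function are hypothetical names, blocks are modelled as a dictionary, and deleted blocks are not handled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Volume:
    """Toy block device: maps a block index to that block's contents."""
    blocks: Dict[int, bytes] = field(default_factory=dict)


@dataclass
class Snapshot:
    """Holds only the blocks that changed since the parent snapshot."""
    changed: Dict[int, bytes]
    parent: Optional["Snapshot"] = None

    def restore(self) -> Dict[int, bytes]:
        """Rebuild the full volume state by replaying the chain oldest-first."""
        chain = []
        snap: Optional["Snapshot"] = self
        while snap is not None:
            chain.append(snap)
            snap = snap.parent
        full: Dict[int, bytes] = {}
        for s in reversed(chain):
            full.update(s.changed)
        return full


def take_snapshot(volume: Volume, parent: Optional[Snapshot] = None) -> Snapshot:
    """Save only the blocks that differ from the state captured by `parent`."""
    previous = parent.restore() if parent is not None else {}
    changed = {i: b for i, b in volume.blocks.items() if previous.get(i) != b}
    return Snapshot(changed=changed, parent=parent)


vol = Volume({0: b"boot", 1: b"data-v1", 2: b"logs"})
first = take_snapshot(vol)           # first snapshot copies every block
vol.blocks[1] = b"data-v2"           # modify a single block
second = take_snapshot(vol, first)   # second snapshot stores only that block
print(len(first.changed), len(second.changed))
```

The first snapshot holds all three blocks and the second holds one, yet restoring the second still yields the complete, current volume. That is why incremental snapshots are cheaper to store without giving up a full point-in-time copy.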

Amazon Simple Storage Service (S3)

Amazon S3 is AWS's object storage service. It organises data into buckets and can store effectively unlimited amounts of data.

  • Well suited to documents, images, videos, and other unstructured data.
  • Supports versioning, which lets you keep, retrieve, and restore every version of an object.
  • Offers lifecycle management to automatically transition objects between storage classes or delete them after a defined period.
  • Supports multiple storage classes, each optimised for a different access pattern and cost profile.
  • A common use case is hosting static websites, where S3 serves the HTML, CSS, and image files directly.
  • All objects stored in S3 are private by default. You must explicitly grant access.
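
To make the object and versioning ideas above concrete, here is a minimal in-memory sketch. It is not the S3 API: VersionedBucket, ObjectVersion, put, and get are hypothetical names, and version IDs are simple integers here, whereas S3 uses opaque string identifiers.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ObjectVersion:
    """One stored version: the data itself, an identifier, and metadata."""
    version_id: int
    data: bytes
    metadata: dict


class VersionedBucket:
    """Toy bucket in which every key keeps all versions ever written."""

    def __init__(self) -> None:
        self._objects: Dict[str, List[ObjectVersion]] = {}

    def put(self, key: str, data: bytes, **metadata) -> int:
        """Store a new version under `key` and return its version ID."""
        versions = self._objects.setdefault(key, [])
        version = ObjectVersion(len(versions) + 1, data, metadata)
        versions.append(version)
        return version.version_id

    def get(self, key: str, version_id: Optional[int] = None) -> bytes:
        """Return the latest version, or a specific one if `version_id` is given."""
        versions = self._objects[key]
        if version_id is None:
            return versions[-1].data
        return versions[version_id - 1].data


bucket = VersionedBucket()
bucket.put("report.txt", b"draft", content_type="text/plain")
bucket.put("report.txt", b"final", content_type="text/plain")
print(bucket.get("report.txt"))      # latest version
print(bucket.get("report.txt", 1))   # earlier version is still retrievable
```

Overwriting a key adds a version rather than replacing the old data, which is what makes it possible to restore an object after an accidental overwrite.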

S3 Storage Classes and Lifecycle

S3 offers a range of storage classes, each designed for a specific access pattern and cost profile:

  • S3 Standard — General-purpose storage for frequently accessed data, including dynamic websites.
  • S3 Standard-IA (Infrequent Access) — Lower-cost storage for data that is accessed less frequently but still needs to be retrieved quickly when needed.
  • S3 One Zone-IA — A lower-cost option for infrequently accessed data that does not require resilience across multiple Availability Zones.
  • S3 Express One Zone — Optimised for low-latency access to data stored within a single Availability Zone.
  • S3 Intelligent-Tiering — Automatically moves objects between access tiers based on how frequently they are accessed, removing the need to manage tiering manually.
  • S3 Glacier Instant Retrieval — Low-cost archival storage for data that is rarely accessed but needs to be retrieved in milliseconds when it is.
  • S3 Glacier Flexible Retrieval — Low-cost storage for data that is infrequently accessed and can tolerate retrieval times of minutes to hours.
  • S3 Glacier Deep Archive — The lowest-cost storage class, designed for long-term archives that are rarely accessed and can tolerate retrieval times of up to 12 hours (standard retrieval) or 48 hours (bulk retrieval).
  • S3 Outposts — Delivers S3 storage capabilities to your on-premises environment.

You can configure lifecycle policies to automatically transition objects between storage classes or delete them after a set period, reducing both manual effort and storage costs.
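
As a rough illustration, a lifecycle policy can be expressed as a small configuration document. The sketch below builds one as a plain Python dictionary in the shape that, to the best of my knowledge, boto3's put_bucket_lifecycle_configuration accepts. The helper build_lifecycle_rule, the rule ID, the "logs/" prefix, and the day thresholds are all assumptions chosen for the example, and nothing is sent to AWS.

```python
import json


def build_lifecycle_rule(rule_id, prefix, transitions, expire_after_days=None):
    """Return one lifecycle rule as a plain dict.

    transitions: list of (days, storage_class) pairs, for example
    [(30, "STANDARD_IA"), (90, "GLACIER")].
    """
    rule = {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": days, "StorageClass": storage_class}
            for days, storage_class in sorted(transitions)
        ],
    }
    if expire_after_days is not None:
        # Delete objects once they reach this age.
        rule["Expiration"] = {"Days": expire_after_days}
    return rule


config = {
    "Rules": [
        build_lifecycle_rule(
            rule_id="archive-logs",          # hypothetical rule name
            prefix="logs/",                  # hypothetical key prefix
            transitions=[(30, "STANDARD_IA"), (90, "GLACIER")],
            expire_after_days=365,
        )
    ]
}
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, a dict like this could be
# passed as the LifecycleConfiguration argument of
# put_bucket_lifecycle_configuration, together with a Bucket name.
# That call is deliberately not made in this sketch.
```

In this example, objects under logs/ move to Standard-IA after 30 days, to Glacier Flexible Retrieval (API value "GLACIER") after 90 days, and are deleted after a year.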

Amazon Elastic File System (EFS)

  • EFS is a fully managed file storage service that can be accessed simultaneously from multiple EC2 instances.
  • The file system grows and shrinks automatically as you add or remove files, without requiring you to provision capacity in advance.
  • Common use cases include content management, web serving, and shared data access across teams or instances.
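  • EFS is a Regional service. A standard file system stores data across multiple Availability Zones, and instances in any Availability Zone in the Region can mount it. Access from another Region is possible, for example over an inter-Region VPC peering connection, and EFS replication can maintain a copy of the file system in a second Region.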
  • Lifecycle policies can automatically move files that have not been accessed for a defined period to lower-cost storage classes, such as EFS Infrequent Access. Unlike S3 lifecycle rules, they do not delete files.
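
The lifecycle idea above comes down to an age check on each file's last access. EFS performs this internally, so the following is only an illustrative sketch: the function files_to_transition, the file paths, the dates, and the 30-day threshold are all hypothetical.

```python
from datetime import date, timedelta


def files_to_transition(last_access, today, days_threshold=30):
    """Return paths whose last access is at least `days_threshold` days old."""
    cutoff = today - timedelta(days=days_threshold)
    return sorted(
        path for path, accessed in last_access.items() if accessed <= cutoff
    )


# Hypothetical shared file system contents with last-access dates.
last_access = {
    "/shared/cms/banner.png": date(2024, 1, 5),
    "/shared/cms/index.html": date(2024, 3, 1),
    "/shared/reports/2023.csv": date(2023, 12, 1),
}
candidates = files_to_transition(last_access, today=date(2024, 3, 10))
print(candidates)
```

Only the two files untouched for more than 30 days are selected. The recently read index.html stays in the standard tier.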

Amazon FSx

  • FSx is a managed file storage service that provides high-performance file systems tailored to specific workloads.
  • It supports multiple file system types, including Windows File Server and Lustre (for high-performance computing).
  • FSx makes it straightforward to migrate existing Windows file servers into the cloud without changing your applications.

AWS Storage Gateway

  • Storage Gateway is a hybrid solution that connects your on-premises infrastructure to AWS cloud storage.
  • You can extend your local storage capacity into the cloud. For example, a file gateway presents S3 buckets to your on-premises applications as ordinary file shares over NFS or SMB.

AWS Elastic Disaster Recovery

  • Elastic Disaster Recovery continuously replicates your critical workloads to AWS.
  • In the event of a disaster, you can recover your applications and data in the cloud within minutes, minimising downtime and data loss.
  • This service is well suited to organisations that need to ensure business continuity without maintaining a full secondary data centre.

Comparing Storage Services

Service       Best For
Amazon S3     Static resources, documents, images, and videos
Amazon EBS    Scalable block storage attached to EC2 instances
Amazon EFS    Scalable file storage accessible from multiple instances