Resilience on AWS at Code Freeze 2023
January 12, 2023 #aws #architecture #disasterrecovery #conference
I had a great time talking about Resilience on AWS at Code Freeze 2023 and wanted to share some links to related content.
Some definitions from the reliability pillar of the AWS Well-Architected Framework:
- Resilience - The ability of a workload to recover from infrastructure or service disruptions, dynamically acquire computing resources to meet demand, and mitigate disruptions, such as misconfigurations or transient network issues
- Reliability - The ability of a workload or application to perform its intended function correctly and consistently
- Availability - The percentage of time that a workload is available for use (e.g. 99.99% available)
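To make the availability number concrete, here is a minimal sketch that turns a percentage into a yearly downtime budget. The function name `downtime_minutes_per_year` is my own for illustration, and the calculation assumes a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; assumes a non-leap year


def downtime_minutes_per_year(availability_percent):
    """Return the yearly downtime budget, in minutes, implied by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    unavailable_fraction = (100 - availability_percent) / 100
    return unavailable_fraction * MINUTES_PER_YEAR


# Print the downtime budget for a few common targets.
for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {downtime_minutes_per_year(target):.2f} minutes/year")
```

At 99.99%, the budget works out to roughly 52.6 minutes of downtime per year, which is a useful sanity check when deciding how much resilience a workload really needs.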
Some best practices from the reliability pillar that are key to resilience:
- REL05 Design interactions in a distributed system to mitigate or withstand failures
- REL04 Design interactions in a distributed system to prevent failures
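One common way to put "mitigate or withstand failures" into practice is to limit retries and space them out with capped exponential backoff and jitter. The sketch below is only an illustration under my own assumptions: `TransientError` and `call_with_retries` are hypothetical names, not part of any AWS SDK, and the default delays are arbitrary.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def call_with_retries(operation, max_attempts=4, base_delay=0.1, max_delay=2.0,
                      sleep=time.sleep):
    """Call operation(); on TransientError, retry with capped exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # retry budget exhausted: surface the failure to the caller
            # The backoff ceiling doubles on each attempt but never exceeds max_delay.
            backoff_cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random time in [0, backoff_cap] so clients do not retry in lockstep.
            sleep(random.uniform(0, backoff_cap))
```

Capping the number of attempts keeps a struggling dependency from being flooded with retries, and the jitter spreads out the retries that do happen. The `sleep` parameter is injectable only so the behavior can be tested without actually waiting.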
As with security and sustainability, resilience is a shared responsibility between AWS and the customer; AWS is responsible for resilience of the cloud, and our customers are responsible for resilience of their workloads in the cloud. Some resources to help:
- Shared Responsibility Model for Resiliency
- AWS Solutions Library: Solutions for Resilience
- The Amazon Builders’ Library
- AWS Fault Isolation Boundaries (AWS Whitepaper, publication date: November 16, 2022)