Introduction
Each section below has a brief introduction, followed by a diagram. I then break down the diagram and link to documentation that verifies the practices shown and answers further questions.
Table of Contents
- Private and Durable Event Driven Architecture
- RDS Availability and Disaster Recovery
- Other Diagrams I’ve Published
Private and Durable Event Driven Architecture
This zero-trust architecture keeps traffic off the public internet and provides back pressure during high-volume events, preventing potential data loss.
Breakdown
- The ECS task is running a Docker container.
- RDS security group only allows traffic from the ECS task.
- Data is encrypted in transit between the ECS task and the RDS instances using SSL/TLS.
- The ECS task’s IAM role grants the task access to the SQS queue and ensures least-privilege access.
- The ECS task uses a VPC Endpoint to access the SQS queue, ensuring the traffic remains in the VPC.
- KMS is used to encrypt data at rest when a message is placed onto an SQS queue. The key’s policy grants encrypt/decrypt permissions to SQS and the ECS task’s IAM role.
- The durability of SQS ensures messages are not dropped due to any number of factors (volume of traffic, bugs, etc).
- If the ECS task fails to process a message successfully after the queue's configured number of receive attempts, SQS moves the message to a dead-letter queue (DLQ). Messages on the DLQ can be retried, or they expire after a set amount of time.
- A CloudWatch alarm will be triggered when a message is placed on the DLQ.
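The pieces above can be sketched as plain configuration documents. This is a minimal illustration, not the configuration behind the diagram: the ARNs, queue names, alarm name, and `max_receive_count` value are all hypothetical, and it assumes the ECS task is a consumer of the queue. The dictionaries follow the shapes of an IAM policy document, the SQS `RedrivePolicy` queue attribute, and the parameters of CloudWatch's `PutMetricAlarm`.

```python
import json

# Hypothetical identifiers, used only for illustration.
QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:orders"
DLQ_ARN = "arn:aws:sqs:us-east-1:123456789012:orders-dlq"
KEY_ARN = "arn:aws:kms:us-east-1:123456789012:key/EXAMPLE-KEY-ID"


def task_role_policy(queue_arn: str, key_arn: str) -> dict:
    """Least-privilege policy for a task that only consumes one queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ConsumeQueue",
                "Effect": "Allow",
                # Only the actions a consumer needs, scoped to one queue.
                "Action": [
                    "sqs:ReceiveMessage",
                    "sqs:DeleteMessage",
                    "sqs:GetQueueAttributes",
                ],
                "Resource": queue_arn,
            },
            {
                "Sid": "DecryptMessages",
                "Effect": "Allow",
                # A consumer of an SSE-KMS queue needs to decrypt messages.
                "Action": ["kms:Decrypt"],
                "Resource": key_arn,
            },
        ],
    }


def redrive_policy(dlq_arn: str, max_receive_count: int = 5) -> str:
    """RedrivePolicy attribute value: SQS expects a JSON string."""
    return json.dumps(
        {"deadLetterTargetArn": dlq_arn, "maxReceiveCount": max_receive_count}
    )


def dlq_alarm_params(dlq_name: str) -> dict:
    """Alarm parameters that fire when any message is visible on the DLQ."""
    return {
        "AlarmName": f"{dlq_name}-has-messages",  # hypothetical name
        "Namespace": "AWS/SQS",
        "MetricName": "ApproximateNumberOfMessagesVisible",
        "Dimensions": [{"Name": "QueueName", "Value": dlq_name}],
        "Statistic": "Maximum",
        "Period": 60,
        "EvaluationPeriods": 1,
        "Threshold": 0,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
    }


if __name__ == "__main__":
    print(json.dumps(task_role_policy(QUEUE_ARN, KEY_ARN), indent=2))
    print(redrive_policy(DLQ_ARN))
    print(json.dumps(dlq_alarm_params("orders-dlq"), indent=2))
```

Scoping each statement to a single ARN, rather than `*`, is what makes the role least-privilege. A producer task would need different actions (for example `sqs:SendMessage` and `kms:GenerateDataKey`), so treat the action lists here as a starting point.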
RDS Availability and Disaster Recovery
High availability (HA) refers to a system’s ability to remain operational and accessible despite faults or failures in its components. Cross-region backups provide off-site disaster recovery, contributing to a larger DR strategy.
Breakdown
- Three availability zones provide the capability to reach quorum in the event of a network partition.
- Multi-AZ cluster deployments provision a writer DB instance and two reader DB instances, each in a separate Availability Zone within the same AWS Region.
- A Multi-AZ cluster uses DNS to provide two types of endpoints:
  - functional endpoints*: read and write endpoints used by your code
  - instance endpoints: for troubleshooting specific RDS instances
- RDS provides automatic failover if a writer instance experiences an outage. AWS recommends keeping the client-side DNS TTL under 60 seconds so that clients resolve the updated writer DNS address promptly.
- Automated RDS backups (nice).
- Configure RDS to replicate snapshots and transaction logs to a different AWS Region.
* RDS docs simply call these “endpoints”
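To make the failover and DNS point concrete, here is a minimal sketch of a client that looks up the writer endpoint again on every connection attempt instead of reusing a cached address. The names (`resolve_writer`, `connect_with_retry`), the port, the retry counts, and the endpoint hostname are hypothetical, and `connect` stands in for whatever database driver you use. Note that the operating system or runtime may still cache DNS answers, which is where the TTL setting matters.

```python
import socket
import time
from typing import Any, Callable


def resolve_writer(
    hostname: str,
    port: int = 5432,
    resolver: Callable[..., list] = socket.getaddrinfo,
) -> str:
    """Look up the endpoint on every call so a DNS update is picked up."""
    infos = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr).
    return infos[0][4][0]


def connect_with_retry(
    hostname: str,
    connect: Callable[[str], Any],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    resolver: Callable[..., list] = socket.getaddrinfo,
) -> Any:
    """Retry with exponential backoff, re-resolving DNS before each attempt."""
    last_error = None
    for attempt in range(attempts):
        try:
            address = resolve_writer(hostname, resolver=resolver)
            return connect(address)
        except OSError as exc:  # includes DNS lookup failures
            last_error = exc
            sleep(base_delay * 2 ** attempt)
    raise ConnectionError(
        f"could not reach {hostname} after {attempts} attempts"
    ) from last_error
```

In use, you would pass the cluster's writer endpoint as `hostname` and a small wrapper around your driver's connect call as `connect`. Passing `resolver` and `sleep` as parameters is only there to make the helper easy to test without a network.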