2024 AWS Solutions Architect Associate exam guide and tips

This 2024 exam guide includes the latest tips for the certification. I have also included some extra tips for your day-to-day cloud operations. Enjoy the article, and if you need any help, please get in touch with me. The exam covers four domains:

  1. Design Secure Architectures
  2. Design Resilient Architectures
  3. Design High-Performing Architectures
  4. Design Cost-Optimized Architectures

While this post covers most of the content, going through books is a great way to improve your AWS knowledge.

1. Design Secure Architectures

IAM – Identity and Access Management

IAM controls access to AWS resources. It cannot be used to log in to the operating system of your servers. Like other AWS services, IAM falls under the shared responsibility model: you, not AWS, are responsible for assigning the right access to the right users and resources.

Principals

An IAM entity that lets you manage AWS resources. It can be temporary or permanent.
There are three types of principals:

  • Root Users
  • IAM Users
  • Roles/Temporary Security Token

Root user

  • The identity created when you first open your AWS account.
  • Best practice is to lock it away and not use it for daily tasks.
  • It has access to all AWS resources.

IAM Users:

  • Like user accounts in Windows.
  • A user account is persistent; it does not expire unless an administrator deletes it.
  • A new IAM user has neither an access key nor a password.

Roles / Temporary Security Tokens

  • Roles grant access privileges for a limited time. (Unlike an IAM user, which is persistent, a role session is temporary.)
  • When a service (actor) assumes a role, AWS provides a temporary security token. A role session is valid for 15 minutes up to 12 hours; tokens issued through GetSessionToken or GetFederationToken can last up to 36 hours.
  • Roles do not have a password or long-term access keys.
  • Use cases of Roles:
    • EC2 server: Instead of saving S3 credentials on an EC2 server, attach a role to the server so it can access S3 without a username and password.
    • Granting another AWS account permission to access your resources.
  • It is recommended to assign roles to EC2 instances instead of storing AWS credentials in code. The AWS SDK picks up role credentials automatically, which makes this much more secure in production.
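
The lifetime of a temporary token can be sketched in a few lines of Python. This is only an illustration: it assumes the AssumeRole bounds of 15 minutes to 12 hours (other token APIs have different limits), and the function names are made up for this post.

```python
from datetime import datetime, timedelta, timezone

# Assumed limits for an AssumeRole session, in seconds (15 minutes to 12 hours).
MIN_SESSION_SECONDS = 15 * 60
MAX_SESSION_SECONDS = 12 * 60 * 60


def session_expiry(issued_at: datetime, duration_seconds: int) -> datetime:
    """Return the moment a temporary token issued at `issued_at` stops being valid."""
    if not MIN_SESSION_SECONDS <= duration_seconds <= MAX_SESSION_SECONDS:
        raise ValueError("duration must be between 900 and 43200 seconds")
    return issued_at + timedelta(seconds=duration_seconds)


def is_expired(expiry: datetime, now: datetime) -> bool:
    """A token is unusable once the current time reaches its expiry."""
    return now >= expiry


issued = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
expiry = session_expiry(issued, 3600)  # a one-hour session
print(expiry.isoformat())
```

The point to take away is that a role never holds permanent credentials; whoever uses it has to come back for a fresh token once the old one expires.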

IAM Federation

Lets roles be attached to users who are not AWS IAM users. Example: a company that has migrated its resources to AWS does not need to recreate its users in AWS. It can integrate IAM with its Active Directory and attach roles to the Active Directory users.

IAM can integrate with the providers below:

  • OpenID Connect (OIDC) for Google or Facebook authentication.
  • SAML 2.0 for Active Directory.

Policies

A set of actions that can be executed on resources.

{ 
  "Version": "2012-10-17", 
  "Statement": { 
    "Effect": "Allow", 
    "Action": "s3:ListBucket", 
    "Resource": "arn:aws:s3:::example_bucket" 
  } 
}

A policy is written in JSON and can be attached to a group, user or role.
Effect: Can be either Allow or Deny.
Action: The API operations the policy applies to (for example, s3:ListBucket).
Resource: The Amazon Resource Name (ARN) of the resource the action applies to.
Condition: An optional list of conditions that must be met for the statement to apply.
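
To see how Effect, Action and Resource work together, here is a toy evaluator in Python. It is a simplified sketch and not the real IAM evaluation engine: it ignores Condition, principals and case-insensitive action matching, and the Deny statement is a hypothetical addition of mine to show that an explicit Deny always wins.

```python
import json
from fnmatch import fnmatchcase

policy = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:ListBucket",
         "Resource": "arn:aws:s3:::example_bucket"},
        # Hypothetical extra statement, not part of the policy shown above.
        {"Effect": "Deny", "Action": "s3:DeleteObject",
         "Resource": "arn:aws:s3:::example_bucket/*"},
    ],
}


def as_list(value):
    """Policy fields may hold a single item or a list of items."""
    return value if isinstance(value, list) else [value]


def is_allowed(policy: dict, action: str, resource: str) -> bool:
    """Toy evaluation: an explicit Deny wins, otherwise an Allow is required."""
    allowed = False
    for statement in as_list(policy["Statement"]):
        action_match = any(fnmatchcase(action, a) for a in as_list(statement["Action"]))
        resource_match = any(fnmatchcase(resource, r) for r in as_list(statement["Resource"]))
        if action_match and resource_match:
            if statement["Effect"] == "Deny":
                return False
            allowed = True
    return allowed  # no matching Allow means an implicit deny


print(json.dumps(policy, indent=2))
print(is_allowed(policy, "s3:ListBucket", "arn:aws:s3:::example_bucket"))
```

Anything that is not explicitly allowed is denied by default, which is why the function starts from `allowed = False`.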

A policy can be associated with an IAM user in the following ways:

  • Inline policy.
  • Managed policy.
  • Group policy.

There are 6 types of policies:

  1. Identity-based policies: JSON policies attached to users, groups or roles. There are two types: inline and managed. Managed policies are recommended because they avoid duplication and can be attached to multiple identities.
  2. Resource-based policies: policies attached to AWS resources such as an S3 bucket.
  3. IAM permissions boundaries: set the maximum permissions an identity-based policy can grant to an IAM entity.
  4. Service Control Policies (SCPs): JSON policies that specify the maximum permissions for an organization or organizational unit.
  5. Access Control Lists (ACLs): let you control which principals in another account can access a resource.
  6. Session policies: policies passed as a parameter when a temporary session is created for a role or federated user.
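
A handy mental model for permissions boundaries and SCPs is that they cap what an identity-based policy can grant, so the effective permissions are the overlap of all layers. The sketch below models each layer as a plain set of allowed actions. That is a deliberate simplification (no explicit denies, no resource-based policies), and the action sets are made-up examples.

```python
def effective_actions(identity_policy: set, boundary: set, scp: set) -> set:
    """An action is usable only if every layer allows it (set intersection)."""
    return identity_policy & boundary & scp


identity = {"s3:GetObject", "s3:PutObject", "ec2:StartInstances"}
boundary = {"s3:GetObject", "s3:PutObject"}
scp = {"s3:GetObject", "ec2:StartInstances"}

print(effective_actions(identity, boundary, scp))  # {'s3:GetObject'}
```

Notice that neither the boundary nor the SCP grants anything on its own; they can only narrow what the identity policy already allows.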

Security in multiple Accounts

Using multiple accounts offers a stronger security posture. Each account is responsible for a specific workload. A landing zone is a multi-account environment set up along these lines. A typical landing zone has the following accounts:

  1. Security Account
  2. Log Archive
  3. Networking
  4. Management
  5. Workload

AWS Control Tower lets you build a secure landing zone.

SCPs are guardrails written in JSON that apply to accounts. For example, you have 50 accounts and you are required to block users from performing actions without MFA. This can be achieved with an SCP.
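
As a rough sketch of what such a guardrail could look like, the Python snippet below builds an SCP document and prints it as JSON. The choice of the two EC2 actions and the Sid are my own assumptions for illustration; aws:MultiFactorAuthPresent is the condition key AWS provides for MFA checks.

```python
import json

# Hypothetical SCP: deny two EC2 actions when the request was not made with MFA.
require_mfa_scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyStopAndTerminateWithoutMFA",
            "Effect": "Deny",
            "Action": ["ec2:StopInstances", "ec2:TerminateInstances"],
            "Resource": "*",
            "Condition": {
                # BoolIfExists also matches requests that carry no MFA information at all.
                "BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}
            },
        }
    ],
}

scp_json = json.dumps(require_mfa_scp, indent=2)
print(scp_json)
```

Attached to an organizational unit, a document like this would apply to every account underneath it, which is what makes SCPs practical when you have 50 accounts. Test it on a sandbox account first, because a Deny in an SCP cannot be overridden from inside the account.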

AWS SSO

AWS SSO (Single Sign-On), now named AWS IAM Identity Center, is the newer way of managing user access. It lets you integrate with third-party identity providers such as Okta, Azure AD or any other SAML provider. Access is granted through permission sets.

For the exam: you don’t need to know what a permission set is or how it works.

Secrets Management

Parameter Store and Secrets Manager can both be used to store secrets. Secrets Manager offers more, such as secret versioning, a higher request rate and automatic rotation of secrets.

AWS Security Services

AWS Security Hub: Aggregates findings from multiple security sources and gives you a compliance dashboard.

AWS Inspector: Scans EC2 instances (through an agent) for vulnerabilities, and also integrates with ECR and Lambda.

AWS GuardDuty: Threat detection service that continuously monitors your AWS account, covering sign-in activity, RDS and more.

AWS Macie: Helps identify PII (personally identifiable information) in S3 buckets. If you are asked which service detects PII data, the answer is always Macie.

AWS Shield: Offers DDoS protection at Layers 3 and 4.

AWS WAF: The web application firewall that protects against attacks such as XSS and SQL injection.

AWS Network Firewall: Controls inbound and outbound traffic. You can define which domains can be accessed from your network.

AWS Direct Connect

Lets you connect your in-house network to an AWS data centre over a dedicated network connection. Traffic flows privately, not over the internet. This is highly secure.

AWS VPN

The alternative to Direct Connect. You can connect to AWS services from your on-premises network over the internet through an encrypted tunnel. On-premises devices can then reach the private IPs of your VPC.

 

2. Design Resilient Architectures

In this section, AWS tests your knowledge of storage, of decoupling systems so that one failing component does not cause major downtime, and of designing a multi-tier architecture whose tiers can scale independently. It is mostly about knowing which services exist and which one you would pick, so a solid understanding of AWS services is important.

Storage types:

1. Instance store: Block storage physically attached to the host of some EC2 instance types. Very cheap and ideal for temporary storage. Data is lost when the instance is stopped or terminated (it survives a reboot).

2. EBS volume: This type of storage is normally mounted on a single server. Key point: data on EBS can persist beyond the EC2 lifecycle!

There are four types of EBS volumes (the use cases and the IOPS are important):

  1. Cold HDD (sc1)
    • Use case: great for infrequently accessed, throughput-intensive workloads.
    • Max IOPS / Volume: 250.
    • Cannot be used as a boot volume.
  2. Throughput Optimized HDD (st1)
    • Use case: big data, log processing.
    • Max IOPS / Volume: 500.
  3. General Purpose SSD (gp2, gp3)
    • Use case: OS boot volume, databases.
    • Max IOPS / Volume: 16,000.
  4. Provisioned IOPS SSD (io1, io2)
    • Use case: critical applications, hosting ELK, applications that need very fast data access.
    • Max IOPS / Volume: 64,000 (io2 Block Express goes up to 256,000).

 

EBS offers Multi-Attach, which lets you attach a single volume to multiple instances, but only if they are in the same Availability Zone. This is only available for Provisioned IOPS SSD volumes (io1 or io2).
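
The two Multi-Attach conditions are easy to mix up in the exam, so here they are as a small Python check. This is just a memory aid with a made-up function name, not an AWS API, and the Availability Zone names are examples.

```python
# Volume types that support Multi-Attach, as described above.
MULTI_ATTACH_TYPES = {"io1", "io2"}


def can_multi_attach(volume_type: str, volume_az: str, instance_azs: list) -> bool:
    """True only for io1/io2 volumes whose instances all sit in the volume's AZ."""
    same_az = all(az == volume_az for az in instance_azs)
    return volume_type in MULTI_ATTACH_TYPES and same_az


print(can_multi_attach("io2", "eu-west-1a", ["eu-west-1a", "eu-west-1a"]))  # True
print(can_multi_attach("gp3", "eu-west-1a", ["eu-west-1a", "eu-west-1a"]))  # False
print(can_multi_attach("io2", "eu-west-1a", ["eu-west-1a", "eu-west-1b"]))  # False
```

If a question needs shared storage across Availability Zones, neither result above helps; that is where EFS comes in.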

 

EFS – Elastic File System

Can be mounted on multiple EC2 instances at once. This is a key difference from EBS. Don’t confuse this with EBS Multi-Attach, as they serve different purposes.

An EFS file system can be easily mounted on an EC2 instance, but it cannot be used as a boot volume.

AWS S3

S3 acts as object storage.

  • Each object consists of metadata (system metadata set by Amazon, plus any you add) and data (your content).
  • An object can be from 0 bytes to 5 TB in size.
  • S3 objects are replicated across multiple devices within a region!
  • S3 objects are saved in a container called a “bucket”; think of the bucket as the root folder.
  • To prevent accidental object deletion, enable versioning and MFA Delete.
  • S3 data can be replicated to other regions, which is usually done for compliance. Note: by default only new objects are replicated.
  • Bucket names are unique across all AWS accounts!
  • A bucket name must be between 3 and 63 characters and can contain only lowercase letters, numbers, hyphens and periods.
  • Consistency model of S3
    • S3 provides strong read-after-write consistency: after a successful PUT of a new object, a read returns the latest object.
    • Since December 2020 the same applies when you overwrite (PUT) or DELETE an existing object; the older eventual-consistency behaviour no longer applies.
  • There are eight S3 storage classes:
    1. S3 Standard: 99.99% availability, ideal for frequently accessed data.
    2. S3 Intelligent-Tiering: Think of it as automation that monitors access patterns and moves objects to the appropriate tier. For example, an object that has not been accessed for 30 days is moved to the infrequent access tier. S3 Intelligent-Tiering has no impact on performance. It is a great choice when access patterns are unknown and you want to save cost in the long run.
    3. S3 Standard-IA (Infrequent Access): 99.9% availability, ideal for less frequently accessed data, cheaper than S3 Standard, minimum billable object size 128 KB.
    4. S3 One Zone-IA (One Zone-Infrequent Access): 99.5% availability. It is 20% cheaper than S3 Standard-IA. Data is stored in a single Availability Zone. Ideal for objects that are accessed less frequently but need rapid access when requested. If data resilience is not important and you would like to reduce cost, this is the ideal class.
    5. S3 Glacier Instant Retrieval: Provides the fastest retrieval from archive storage! (This is important to note.) 128 KB minimum object size.
    6. S3 Glacier Flexible Retrieval: Used for data archiving such as long-term compliance. There is no upfront cost; pricing is per GB stored. Data is encrypted by default. This is good for data that needs to be accessed once or twice a year. Provides three methods to access data:
      1) Expedited: Retrieve data within 1 to 5 minutes.
      2) Standard: Retrieve data within 3 to 5 hours.
      3) Bulk: Retrieve data within 5 to 12 hours.
    7. S3 Glacier Deep Archive: Lowest cost of all S3 storage classes.
    8. S3 on Outposts: delivers S3 object storage to your on-premises AWS Outposts environment.
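
Exam questions on storage classes usually boil down to a couple of decisions, which the Python sketch below tries to capture. The decision rules are my own simplification of the list above, and the input labels ("frequent", "hours" and so on) are invented for this example; the returned strings are the storage class names used by the S3 API.

```python
def pick_storage_class(access: str, resilient: bool = True, retrieval: str = "instant") -> str:
    """Pick a storage class from a rough description of the workload.

    access:    "unknown", "frequent", "infrequent" or "archive"
    resilient: whether the data must survive the loss of an Availability Zone
    retrieval: for archives, "instant", "hours" or "rare"
    """
    if access == "unknown":
        return "INTELLIGENT_TIERING"
    if access == "frequent":
        return "STANDARD"
    if access == "infrequent":
        return "STANDARD_IA" if resilient else "ONEZONE_IA"
    if access == "archive":
        archive_classes = {"instant": "GLACIER_IR", "hours": "GLACIER", "rare": "DEEP_ARCHIVE"}
        if retrieval not in archive_classes:
            raise ValueError(f"unknown retrieval need: {retrieval}")
        return archive_classes[retrieval]
    raise ValueError(f"unknown access pattern: {access}")


print(pick_storage_class("infrequent", resilient=False))  # ONEZONE_IA
print(pick_storage_class("archive", retrieval="hours"))   # GLACIER
```

In a real project you would normally let a lifecycle rule move objects between classes over time rather than choosing once at upload.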

S3 Object Lock

Object Lock is one of my most used S3 features when operating in a regulated industry. It helps you assure a compliance officer that once data is saved, no entity can modify or delete it.

To lock an object, we must select one of the retention modes below:

1.  Compliance mode

Object versions cannot be overwritten or deleted by any user, including the root user. This is the most secure mode, as the only way to delete the object is by closing the AWS account.

Tip: If you get a scenario where data must not be modified or deleted at all, the answer is Compliance mode.

2. Governance mode
  • Most users can’t overwrite or delete an object version or alter its lock settings.
  • Some users can be granted special permissions to change the retention settings or delete the object.

Tip: If you are asked about objects that can be deleted by only a few users, the answer is Governance mode.
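
The difference between the two modes fits in one small function. This is a toy model rather than how S3 implements it; the function and parameter names are mine, and the bypass flag stands in for the s3:BypassGovernanceRetention permission.

```python
from datetime import datetime, timezone


def can_delete_version(mode: str, retain_until: datetime, now: datetime,
                       has_bypass_permission: bool = False) -> bool:
    """Decide whether a locked object version may be deleted at time `now`."""
    if now >= retain_until:
        return True  # the retention period is over, the lock no longer applies
    if mode == "COMPLIANCE":
        return False  # nobody can delete it, not even the root user
    if mode == "GOVERNANCE":
        return has_bypass_permission  # only specially permitted users
    raise ValueError(f"unknown mode: {mode}")


retain_until = datetime(2030, 1, 1, tzinfo=timezone.utc)
today = datetime(2024, 6, 1, tzinfo=timezone.utc)

print(can_delete_version("COMPLIANCE", retain_until, today, has_bypass_permission=True))  # False
print(can_delete_version("GOVERNANCE", retain_until, today, has_bypass_permission=True))  # True
```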

S3 Glacier retrieval policies

When data is stored in Glacier, at some point we will want to retrieve it. Retrieval comes at a cost. There are three data retrieval policy options:

  1. Free Tier Only:
    • You can use this to avoid any data retrieval costs.
    • S3 rejects retrieval requests that exceed your AWS Free Tier allowance.
  2. Max Retrieval Rate:
    • Use this when you want to retrieve more data than your Free Tier allowance. (It’s good practice to start with Free Tier Only and then move to Max Retrieval Rate.)
  3. No Retrieval Limit:
    • This is the default option, and data retrieval costs apply.

S3 Glacier Vault Lock

Glacier supports Vault Lock policies. These let you write a JSON policy that controls, for example, when archives in a vault can be deleted.

AWS EC2

Key points to remember:

Understand the difference between the payment plans

On-Demand instances:

  • Pay per hour.
  • Use cases: development/testing environments.
  • If you are going to production and are not 100% confident of your resource utilisation, start with On-Demand, then move to Reserved.

Reserved instances

  • 1- or 3-year agreement (up to 72% discount compared to On-Demand).
  • Capacity reservation is possible with Reserved Instances!
  • Use cases: production-ready applications, applications that will be live for a long time.
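
Whether Reserved beats On-Demand depends on how many hours the instance actually runs, because a reservation is paid for whether you use it or not. Here is a back-of-the-envelope calculation in Python. The hourly rate and the 40% discount are made-up numbers for illustration, not AWS prices.

```python
HOURS_PER_YEAR = 24 * 365


def yearly_cost_on_demand(hourly_rate: float, utilization: float) -> float:
    """On-Demand only charges for the share of the year the instance runs."""
    return hourly_rate * HOURS_PER_YEAR * utilization


def yearly_cost_reserved(hourly_rate: float, discount: float) -> float:
    """A reservation is billed for every hour of the year, at a discounted rate."""
    return hourly_rate * (1 - discount) * HOURS_PER_YEAR


def break_even_utilization(discount: float) -> float:
    """Above this share of running hours, Reserved is the cheaper option."""
    return 1 - discount


rate, discount = 0.10, 0.40  # hypothetical $/hour and discount
for utilization in (0.5, 0.8):
    on_demand = yearly_cost_on_demand(rate, utilization)
    reserved = yearly_cost_reserved(rate, discount)
    cheaper = "Reserved" if reserved < on_demand else "On-Demand"
    print(f"{utilization:.0%} utilization: {cheaper} is cheaper")
```

With these numbers the break-even point is 60% utilization, which is why the advice is to measure your real usage on On-Demand first and only then commit.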

Spot Instances

  • You set the maximum price you are willing to pay (formerly a bid).
  • Use cases: applications where the termination of an instance won’t cause an issue.

Savings Plans:

This is a newer, flexible pricing plan similar to Reserved Instances. The discount is tied to a committed amount of usage rather than to a specific instance type. It also applies to Fargate.

Dedicated Hosts

  • You pay for a physical EC2 server dedicated to your use. Ideally, this is used for compliance purposes only.

How Spot Instances work: if the Spot price goes above your maximum (bid) price, or there is not enough capacity, the EC2 instance receives a termination notice and is terminated two minutes later.
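
The interruption rule can be written down as a tiny Python function. This is only a model of the rule described above, with names I made up; real workloads learn about the interruption through the instance metadata or an EventBridge event rather than by computing it themselves.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

INTERRUPTION_NOTICE = timedelta(minutes=2)


def interruption_time(spot_price: float, max_price: float,
                      capacity_available: bool, now: datetime) -> Optional[datetime]:
    """Return when the instance will be terminated, or None if it keeps running."""
    if spot_price > max_price or not capacity_available:
        return now + INTERRUPTION_NOTICE
    return None


now = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
print(interruption_time(0.05, 0.04, True, now))  # price too high: terminated at 09:02
print(interruption_time(0.03, 0.04, True, now))  # None, the instance keeps running
```

Two minutes is not long, so Spot workloads should checkpoint their progress or be safe to restart from scratch.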

There are three tenancy options:

    1. Shared tenancy: Multiple customers’ instances run on the same hardware (default option).
    2. Dedicated instance: Hardware that runs instances of a single customer only.
    3. Dedicated host: Physical server with its full EC2 capacity dedicated to the user. It’s mostly used for licensing reasons.

Launch Templates: Similar to launch configurations, but with versioning. A launch template can have several versions, each with different parameters. They are used by Auto Scaling groups.

Instance store data is lost when an Amazon EC2 instance is stopped or terminated (it survives a reboot). It is temporary data!

The public IP changes when an instance is stopped and started. To avoid the change of IP, associate an Elastic IP with your instance.

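Since February 2024, AWS charges for every public IPv4 address, so an Elastic IP incurs an hourly charge whether or not it is attached to a running instance. (Previously, only unassociated Elastic IPs were charged.)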

Bootstrapping: This allows you to execute a script when an instance boots. Usually, it involves installing certain packages or configuring Chef/Puppet.
Ideally, you would write a cloud-init script that installs Chef, Puppet or Ansible, which then handles the provisioning of the instance.
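
A bootstrap script is passed to the instance as user data. The Python sketch below builds a small shell script and base64-encodes it, which is the form the EC2 API expects (the console, CLI and SDKs may do the encoding for you). It assumes an Amazon Linux style image with yum, and the package names are placeholders.

```python
import base64


def build_user_data(packages: list) -> str:
    """Build a shell script that runs once when the instance first boots."""
    lines = ["#!/bin/bash", "set -euo pipefail", "yum update -y"]
    if packages:
        lines.append("yum install -y " + " ".join(packages))
    return "\n".join(lines) + "\n"


def encode_user_data(script: str) -> str:
    """Base64-encode the script for use as EC2 user data."""
    return base64.b64encode(script.encode("utf-8")).decode("ascii")


script = build_user_data(["git", "htop"])  # hypothetical package list
encoded = encode_user_data(script)
print(script)
```

Keep the script itself small: install your configuration management tool here and let that tool do the heavy lifting.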

Enhanced Networking: An EC2 feature that improves network performance. Note: only specific EC2 instance types support it, and it can be enabled only in a VPC.

Termination Protection prevents accidental deletion of an EC2 instance.

Placement Group: Lets you place multiple EC2 instances in a group, which provides lower network latency between them.

AWS Databases:

AWS Aurora Serverless is a database option where you don’t need to manage scaling yourself. Unlike RDS, where you pay for the instance type, with Aurora Serverless you pay only for storage and the capacity consumed.

Aurora – Multi Master

  • Use it when you want immediate failover for the write node (HA).
  • Every node handles reads and writes.

Aurora – Global Aurora 

  • Cross-region read replicas

Aurora Global Database (recommended)

  1. 1 primary region (read/write)
  2. Up to 5 secondary (read-only) regions, with replication lag typically under 1 second
  3. Up to 16 read replicas per secondary region
  4. Helps decrease latency
  5. Exam tip: If the question asks for cross-region replication in under 1 second, the answer is Aurora Global Database

In my experience these are the most common points that you should know before taking the exam:

  1. Understand the difference between
    1. OLTP – Online Transaction Processing (database types: AWS RDS, AWS Aurora)
    2. OLAP – Online Analytic Processing (AWS Redshift)
  2. Understand the difference between
    • RPO (Recovery Point Objective) – the acceptable amount of data loss, measured in time.
    • RTO (Recovery Time Objective) – the acceptable time between a failure and your application being live again.
  3. Manual DB snapshots are not deleted automatically, unlike automated DB snapshots!
  4. To create a fault-tolerant and highly available database architecture, implement Multi-AZ. When the primary database fails, the standby is promoted to primary.
  5. Use the DNS name in your application to connect to the database. If the database fails, AWS will update the DNS record so it won’t impact your application. (Used in Multi-AZ.)
  6. Use read replicas for websites with heavy read traffic, to offload the primary database.
  7. Using the AWS Redshift COPY command for bulk imports is much more efficient than raw SQL INSERT queries.
  8. Amazon DynamoDB is the AWS Managed NoSQL database.
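
RPO and RTO are easier to remember with a worked example. In the Python sketch below all timestamps and targets are made up: the last backup ran at 10:00, the database failed at 10:45 and was restored at 11:15.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, failure: datetime) -> timedelta:
    """Everything written after the last backup is lost (compare with the RPO)."""
    return failure - last_backup


def downtime(failure: datetime, restored: datetime) -> timedelta:
    """How long the application was unavailable (compare with the RTO)."""
    return restored - failure


def meets_objectives(last_backup: datetime, failure: datetime, restored: datetime,
                     rpo: timedelta, rto: timedelta) -> bool:
    """True when both the data loss and the downtime stay within their targets."""
    return (data_loss_window(last_backup, failure) <= rpo
            and downtime(failure, restored) <= rto)


last_backup = datetime(2024, 1, 1, 10, 0)
failure = datetime(2024, 1, 1, 10, 45)
restored = datetime(2024, 1, 1, 11, 15)

print(data_loss_window(last_backup, failure))  # 0:45:00 of data lost
print(downtime(failure, restored))             # 0:30:00 of downtime
print(meets_objectives(last_backup, failure, restored,
                       rpo=timedelta(hours=1), rto=timedelta(minutes=15)))  # False
```

Here the RPO of one hour is met, but the 15 minute RTO is missed, so the design would need a faster recovery path such as Multi-AZ failover.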

CloudFront

  • CDN of AWS
  • To reduce the distance between the user and the webserver, the CDN stores a cached version
    of the content in various locations called “edge locations”. The user is routed to the nearest
    edge location.
  • Cloudfront can work with AWS resources such as S3 or non-AWS resources (websites not
    hosted on AWS)

Key points in CloudFront

  • Distributions: the CDN domain name. In order to use CloudFront you need to create a
    distribution, for example d123123.cloudfront.net. All you need to do is replace your domain
    name with the distribution name. Example: https://infinitypp.com/media/user-pic.jpg would
    become https://d123123.cloudfront.net/media/user-pic.jpg
  • Origins: the location from which CloudFront fetches files. CloudFront can fetch files from
    the sources below:
    • S3 Bucket.
    • Custom origin.
    • EC2 Instance.
    • ELB
    • AWS Elemental MediaPackage Endpoint
    • AWS Elemental MediaStore container
  • Cache Control: by default, cached objects expire after 24 hours. This can be controlled by
    using the Cache-Control header. To remove files from the cache before they expire, the
    invalidation API needs to be called.
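
Swapping your own domain for the distribution domain can be done with the standard library. The helper below is a minimal sketch with a function name of my choosing; it reuses the example URL and the d123123.cloudfront.net distribution from above.

```python
from urllib.parse import urlsplit, urlunsplit


def to_cloudfront_url(url: str, distribution_domain: str) -> str:
    """Replace the host of `url` with the CloudFront distribution domain."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=distribution_domain))


print(to_cloudfront_url("https://infinitypp.com/media/user-pic.jpg",
                        "d123123.cloudfront.net"))
# https://d123123.cloudfront.net/media/user-pic.jpg
```

The path and query string are left untouched, so CloudFront requests the same path from the origin on a cache miss.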

ElastiCache

A managed in-memory cache. To offload the database, you can cache DB query results in ElastiCache.

There are two types of ElastiCache:
  • Redis: fully managed Redis engine.
    1. Allows persistent data storage
    2. Atomic operations
    3. Pub/sub messaging
  • Memcached
    1. Low maintenance
    2. Multithreading
    3. Memcached doesn’t support persistent storage!
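
Caching database results usually follows the cache-aside pattern: look in the cache first, and only go to the database on a miss. The Python sketch below uses a small in-process class as a stand-in for ElastiCache so it can run anywhere. The class, the function names and the fake database are all assumptions for illustration; in practice you would use a Redis or Memcached client.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache such as ElastiCache (illustration only)."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired entries count as a miss
            return None
        return value

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl)


def get_user(user_id, cache, load_from_db):
    """Cache-aside: try the cache, fall back to the database, then fill the cache."""
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = load_from_db(user_id)
    cache.set(key, value)
    return value


db_calls = []


def fake_db(user_id):
    """Pretend database that records how often it is queried."""
    db_calls.append(user_id)
    return {"id": user_id}


cache = TTLCache(ttl_seconds=60)
get_user(1, cache, fake_db)
get_user(1, cache, fake_db)
print(len(db_calls))  # 1, the second call was served from the cache
```

The TTL matters: too long and users see stale data, too short and the database gets little relief.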

 

There are more topics to be covered, but the above should be good enough for a review of the important sections. I’ll post part 2 soon.