Amazon Database

Database and AWS


  1.  Relational
  2.  NoSQL

Relational is most common

  •  Online Transaction Processing OLTP, frequent writing
  •  Online Analytic Processing OLAP, data warehouse

Amazon Relational Database Service (RDS)

supports 6 types of database engines

  1. MySQL
  2. Oracle
  3. PostrgreSQL
  4. Microsoft SQL Server
  5. MariaDB
  6. Amazon Aurora

Data Warehouses – Amazon Redshift

central repository for data from many sources

  • OLAP
  • Batch jobs

Amazon Redshift is designed for OLAP

NoSQL Amazon DynamoDB

  • allows for horizontal scalability
  • non-relational
  • do not have same table and columns as relational
  • Key/Value store or document stores
  • used for managing
    • user session state
    • user profiles
    • shopping cart
    • time series database

Amazon RDS simplifies:

  • setup
  • operations
  • scaling

Amazon RDS offload common tasks such as:

  • backups
  • patching
  • scaling
  • replication

Amazon RDS has no shell access and restricts access to certain system procedures and tables

DB Parameter group

a container for engine config values to be applied to one or more DB instances

DB Options

a container for engine features, ex: Oracle statspack

Operations Benefits

  • reliability
  • simplify common tasks

Database Engines


support Multi-AZ deployment for HA and Read Replicas for horizontal scaling


support Multi-AZ deployment for HA and Read Replicas for horizontal scaling


support Multi-AZ deployment for HA and Read Replicas for horizontal scaling



  • Standard Edition One
  • Standard Edition
  • Enterprise Edition

Microsoft SQL Server


  • Express Edition
  • Web Edition
  • Standard Edition
  • Enterprise Edition


Oracle and Microsoft SQL server

  • License include
    • AWS holds licenses
    • included in price
    • Oracle – Standard Edition One
    • Microsoft – Express, Web, Standard

Amazon Aurora

5 times the performance of MySQL

Creates a DB cluster that has one or more instances and a cluster volume. Cluster Volume is a virtual database storage volume spanning multiple AZs.

Primary Instance is the main instance that supports both read and write

Amazon Aurora Replicas are secondary instances that supports read operations

each DB cluster can have up to 15 Amazon Aurora Replicas in addition to primary instances.

Can Locate Replicas in multiple AZs

Storage Options

  • users EBS
  • 4 to 6 TB
  • up to 30,000 IOPS

Backup and Recovery

Automated backups

tracks change sand backs up DB

creates storage volume snapshot.

Backup Retention Period

  • Default: retains 1 day of backups
  • Max: 35 days of backups

Delete DB and all automated backups are deleted also

Daily during a configurable 30 minute maintenance window

Manual Backups

manually initiated and snapshots are kept until explicitly deleted


when restored and only default Security Group and DB parameter is set, After completing recovery you must reset Security Group and custom DB parameters

High Availability with Multi-AZ

allows you to crate a dB cluster across Multiple AZs.

Lets you meet:

  • RPO – recover point objective
    • max period of data loss that is acceptable
  • RTO – recovery time objective
    • max amount of downtime that is permitted to recover from backup

Automatically replicated data from primary to slave

Automatically performs failover in the event:

  • loss of primary AZ
  • loss of network connectivity to primary
  • compute unit failure on primary
  • storage failure on primary

Failover takes 1-2 minutes

Scale up and out

when going up in compute, memory or storage, changes are scheduled during next maintenance or immediately using ModifyDBInstance

Horizontal Scalability with Partitioning

partitioning or sharding allows horizontal scaling

NoSQL are designed to scale horizontally

Horizontal Scalability with Read Replicas

  • heavy read workloads
  • offload reporting

To enhance disaster recovery or reduce global latencies use cross region read replicas to serve read requests


  • use IAM Policies
  • Private Subnets
  • ACL
  • Security Groups
  • SSL

Amazon Redshift

  • petabyte scale data warehouse
  • relational DB for OLAP
  • uses PostgreSQL

Clusters and Nodes

key component of Redshift is a cluster


  • Leader node
  • one or more compute nodes

client only interacts with leader node

Dense Compute

up to 32 TB using fast SSDs

Dense Storage

up to 2 PB using magnetic disks

Increase query performance by adding compute nodes

Resize Operation on Redshift

  • creates new cluster
  • DB becomes read only until finished

Amazon Redshift uses compression

Samples then selects best compression scheme

Distribution Styles

  • Even: default
  • Key: distributed based on values in one column
  • All: full copy of table distributed to every node

Sort Keys

  • compound: more efficient when using prefixes
  • interleaved: equal weight to each column

Loading Data

  • standard insert and update
  • bulk uses Copy
  • Unload for exporting

Query Data

  • Workload Management (WLM): queues and prioritized queries


  • automatically stored internally on S3 but can do manual snapshots

Amazon DynamoDB

NoSQL – fast, low latency

must define a certain read or write capacity and Amazon DynamoDB auto adds capacity to support it

  • on fast SSDs
  • requires tables have a primary key
  • stores info as key/value pair

Supports 5 Scalar Data Types

  • String
  • Number
  • Binary
  • Boolean
  • Null

Set Data Types

unique list of one or more scalar types

  • String
  • Number
  • Binary

Document Data Types

represents multiple nested attributes

  • List: ordered list
  • Map: unordered list

Primary Key is made of one attributes

Partition and Sort key

  • primary key is mad of two attributes
  • first attribute is partition key
  • second attribute is sort (or range) key

Secondary Indexs can be made on a table

  • Global
  • Local


  • Query: primary search operation that requires a partition key attribute name and a distinct value
  • Scan: reads every time in table