Module 5 Storage and Databases

Instance Stores and Amazon Elastic Block Store (Amazon EBS)

Block-Level Storage
- Stores files as blocks (bytes stored on disk)
- When data changes, only the affected blocks are updated
- Behaves like a hard drive
- Used by file systems and databases
Instance Store Volumes
- Block storage that an EC2 instance can provide
- Physically attached to the host that the EC2 instance runs on
- If the EC2 instance is stopped or terminated, the data on the instance store volume is deleted
  - The instance might start up on another host
- Useful for temporary data, scratch files, and data that can be easily recreated without consequence
- Not suitable for persisting data beyond the lifecycle of the EC2 instance
Amazon Elastic Block Store (EBS)
- Creates virtual hard drives (volumes) that are attached to an EC2 instance
- Not tied to the host
- Persists data beyond the lifecycle of the EC2 instance
- Can configure the volume you want (size, type) and attach it to the EC2 instance
- Can take incremental backups of data (snapshots)
  - configurable
  - after the first snapshot, only the blocks of data that have changed are saved
- Up to 16 TiB per volume
- SSD by default, with HDD options
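
The incremental snapshot idea above can be illustrated with a small sketch. It is a toy model, not how EBS is implemented: the 4-byte block size and the helper names `split_into_blocks` and `incremental_snapshot` are assumptions made up for this example.

```python
from typing import Dict, List

BLOCK_SIZE = 4  # bytes per block; tiny on purpose so the example is easy to follow


def split_into_blocks(data: bytes) -> List[bytes]:
    """Cut a byte string into fixed-size blocks, the unit block storage works in."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def incremental_snapshot(previous: List[bytes], current: List[bytes]) -> Dict[int, bytes]:
    """Return only the blocks that differ from the previous snapshot."""
    changed: Dict[int, bytes] = {}
    for index, block in enumerate(current):
        if index >= len(previous) or previous[index] != block:
            changed[index] = block
    return changed


volume_v1 = split_into_blocks(b"AAAABBBBCCCCDDDD")
volume_v2 = split_into_blocks(b"AAAAXXXXCCCCDDDD")  # only the second block changed

# The first snapshot has nothing to compare against, so every block is saved.
first_snapshot = incremental_snapshot([], volume_v1)
# The second snapshot saves just the block that changed.
second_snapshot = incremental_snapshot(volume_v1, volume_v2)

print(len(first_snapshot), "blocks saved in the first snapshot")
print(len(second_snapshot), "block saved in the second snapshot")
```

Only block 1 differs between the two volume states, so the second snapshot holds a single block instead of a full copy.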

Amazon Simple Storage Service (Amazon S3)

Object storage for data that needs to be stored somewhere
Store and retrieve an unlimited amount of data
Data is stored as objects
- Each object contains data, metadata and a key
  - metadata = information about the data, how it is used, object size
  - key = unique ID of the object
- When an object is updated, the whole object is replaced
- Objects are stored in buckets
- Max size of an object = 5 TB
- Suited to write once, read many workloads
- Each object has a URL
- Can version objects (keeps a history of the object; can roll back if it is deleted)
- Can create permissions (visibility, write access) for multiple buckets
- Different tiers/classes
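
To make the object model concrete, here is a minimal in-memory sketch of a versioned bucket. It is not the S3 API: `ToyBucket`, `StoredObject` and their methods are hypothetical names, and the key and metadata values are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class StoredObject:
    data: bytes
    metadata: Dict[str, str]


@dataclass
class ToyBucket:
    """Hypothetical in-memory stand-in for a versioned bucket (not the S3 API)."""
    versions: Dict[str, List[Optional[StoredObject]]] = field(default_factory=dict)

    def put(self, key: str, data: bytes, **metadata: str) -> None:
        # An update never patches bytes in place: a whole new object is stored.
        self.versions.setdefault(key, []).append(StoredObject(data, dict(metadata)))

    def delete(self, key: str) -> None:
        # With versioning, a delete adds a marker instead of erasing history.
        self.versions.setdefault(key, []).append(None)

    def get(self, key: str) -> Optional[StoredObject]:
        history = self.versions.get(key, [])
        return history[-1] if history else None

    def rollback(self, key: str) -> None:
        # Drop the latest version (or delete marker) to expose the previous one.
        self.versions[key].pop()


bucket = ToyBucket()
bucket.put("reports/2024.csv", b"a,b\n1,2\n", content_type="text/csv")
bucket.put("reports/2024.csv", b"a,b\n1,3\n", content_type="text/csv")
bucket.delete("reports/2024.csv")
deleted_view = bucket.get("reports/2024.csv")  # None: the delete marker is the latest entry
bucket.rollback("reports/2024.csv")
restored = bucket.get("reports/2024.csv")      # the second upload is visible again
```

Each `put` stores a complete new object under the same key, which mirrors the "whole object is replaced" behaviour, while the version history is what makes the rollback after a delete possible.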
Storage classes
- Standard
  - 11 nines (99.999999999%) of durability: the probability that an object remains intact over one year
  - For frequently accessed data
  - Multiple copies are stored across at least 3 Availability Zones
  - Content distribution
  - Data analytics
  - Static website hosting
    Upload all static files (HTML etc.) to S3 and select the option to host the bucket as a website
- Standard-Infrequent Access (Standard-IA)
  - For data that is accessed less frequently but requires rapid access when needed
  - Backups, disaster recovery files, or long-term storage
  - Audit data stored for several years can be moved to other classes
  - Lower storage price
  - Higher retrieval price
  - Multiple copies are stored across at least 3 Availability Zones
- One Zone-Infrequent Access (One Zone-IA)
  - Data is stored in a single Availability Zone
  - Lower storage price than Standard-IA
  - Saves costs on storage
  - For data that can be easily reproduced if the zone fails or the data is lost
- Intelligent-Tiering
  - Data with unknown or changing access patterns
  - Monthly monitoring and automation fee per object
- Glacier Instant Retrieval
  - For archived data that needs immediate access
  - Access time of milliseconds (same performance as Standard)
- Glacier Flexible Retrieval
  - Low-cost storage
  - Takes from 1 minute to 12 hours to access data, depending on the retrieval option
  - Audit data stored for several years
  - Uses vaults
- Glacier Deep Archive
  - Lowest cost
  - Retrieval within 12 to 48 hours
  - Long retention
  - Aimed at data accessed once or twice a year
  - Multiple copies are stored across at least 3 Availability Zones
- Outposts
  - Creates buckets on AWS Outposts
  - Easier to retrieve locally
  - Keeps the data at your on-premises site
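
The guidance in the storage class list can be condensed into a small decision helper. This is only a sketch of the rules of thumb in these notes: the function `suggest_storage_class`, its parameters and the threshold of two accesses per year are assumptions, and a real choice would also weigh pricing and minimum storage durations.

```python
def suggest_storage_class(
    access_pattern: str,               # "frequent", "infrequent", "archive" or "unknown"
    needs_instant_access: bool = True,
    easily_recreated: bool = False,
    accesses_per_year: int = 12,
) -> str:
    """Map a rough description of the data to one of the classes listed in the notes."""
    if access_pattern == "unknown":
        # Unknown or changing access patterns: let S3 move the data between tiers.
        return "S3 Intelligent-Tiering"
    if access_pattern == "frequent":
        return "S3 Standard"
    if access_pattern == "infrequent":
        # A single zone is only acceptable when the data can be reproduced.
        return "S3 One Zone-IA" if easily_recreated else "S3 Standard-IA"
    if access_pattern == "archive":
        if needs_instant_access:
            return "S3 Glacier Instant Retrieval"
        if accesses_per_year <= 2:
            return "S3 Glacier Deep Archive"
        return "S3 Glacier Flexible Retrieval"
    raise ValueError(f"unknown access pattern: {access_pattern!r}")


print(suggest_storage_class("infrequent", easily_recreated=True))
print(suggest_storage_class("archive", needs_instant_access=False, accesses_per_year=1))
```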
Lifecycle policies
- Set up rules to move data between tiers
- e.g. after x days, move to another class
- Automatic tiering (S3 Intelligent-Tiering)
  - If you haven't accessed an object for 30 consecutive days, Amazon S3 automatically moves it to the infrequent access tier, S3 Standard-IA. If you access an object in the infrequent access tier, Amazon S3 automatically moves it to the frequent access tier, S3 Standard.
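
A rule of the form "after x days move to another class" might look like the sketch below. The dictionary follows the shape that boto3's `put_bucket_lifecycle_configuration` accepts for its `LifecycleConfiguration` argument, but the rule ID, prefix and day counts are invented, and `storage_class_at` is a hypothetical helper that only simulates the outcome locally.

```python
# Shape follows the LifecycleConfiguration argument of boto3's
# put_bucket_lifecycle_configuration; rule ID, prefix and day counts are made up.
lifecycle_configuration = {
    "Rules": [
        {
            "ID": "archive-audit-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "audit/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
            ],
        }
    ]
}


def storage_class_at(age_days: int, rule: dict) -> str:
    """Return the class an object of the given age would be in under one rule."""
    current = "STANDARD"  # assumption: objects are uploaded to S3 Standard
    for transition in sorted(rule["Transitions"], key=lambda t: t["Days"]):
        if age_days >= transition["Days"]:
            current = transition["StorageClass"]
    return current


rule = lifecycle_configuration["Rules"][0]
for age in (10, 45, 400):
    print(age, "days ->", storage_class_at(age, rule))
```

Nothing here talks to AWS. With credentials configured, the same dictionary could be passed to an S3 client call along with a bucket name.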
EBS vs S3

Amazon Elastic File System (EFS)

A type of file storage
- a storage server uses block storage with a local file system to organize files. Clients access data through file paths.
With shared file storage you have to ensure that
- multiple clients can access the same data at the same time
- the storage can handle the amount of data
- it scales with increased demand
- backups are taken
- data is stored redundantly
- the servers holding the data are managed
EFS handles all this
EFS
- Multiple instances can access (read/write) the data in EFS at the same time
- Linux file system
- Regional resource
- Automatically scales
- Available across different Availability Zones
  - allows for concurrent access
Differences with EBS
- EBS does not scale automatically: once a volume is attached to an EC2 instance, its capacity is fixed unless you resize it
- An EBS volume must be in the same Availability Zone as the EC2 instance
Can be accessed from AWS cloud services or from on-premises servers
- on-premises access uses AWS Direct Connect
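
The placement difference between EBS and EFS can be expressed as a toy model. The classes `Instance`, `EbsVolume` and `EfsFileSystem` are hypothetical and only encode the two rules stated in these notes; they are not AWS SDK types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    region: str
    availability_zone: str


@dataclass(frozen=True)
class EbsVolume:
    availability_zone: str

    def can_attach(self, instance: Instance) -> bool:
        # An EBS volume lives in one Availability Zone and attaches only there.
        return instance.availability_zone == self.availability_zone


@dataclass(frozen=True)
class EfsFileSystem:
    region: str

    def can_mount(self, instance: Instance) -> bool:
        # EFS is a regional resource, so any Availability Zone in the Region qualifies.
        return instance.region == self.region


web_a = Instance("web-a", region="eu-west-1", availability_zone="eu-west-1a")
web_b = Instance("web-b", region="eu-west-1", availability_zone="eu-west-1b")

volume = EbsVolume(availability_zone="eu-west-1a")
shared_files = EfsFileSystem(region="eu-west-1")

print([volume.can_attach(i) for i in (web_a, web_b)])
print([shared_files.can_mount(i) for i in (web_a, web_b)])
```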

Amazon Relational Database Service (RDS)

An RDBMS is useful for stored data that has relationships with other stored data
Data is stored in tables
Tables are defined by schemas
Relationships between data in tables are expressed via keys
Querying is done via a standard language such as SQL
- so is defining the schema of the tables (definition)
- and issuing commands (updates/deletes/writes)
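
The relational ideas above (schemas, keys, SQL for definition, commands and queries) can be tried locally with Python's built-in `sqlite3` module. SQLite stands in for an RDS engine here purely for convenience; the `customers` and `orders` tables and their rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Definition: the schema describes each table, and customer_id is the key
# that relates an order to a customer.
conn.executescript(
    """
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       REAL NOT NULL
    );
    """
)

# Commands: writes and updates.
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Ana'), (2, 'Ben')")
conn.execute(
    "INSERT INTO orders (customer_id, total) VALUES (1, 20.0), (1, 5.5), (2, 12.0)"
)
conn.execute("UPDATE orders SET total = 6.0 WHERE total = 5.5")

# Query: join the two tables through the key and aggregate per customer.
rows = conn.execute(
    """
    SELECT c.name, COUNT(o.id), SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id
    ORDER BY c.name
    """
).fetchall()
print(rows)
```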
Supported DB engines include
- PostgreSQL
- MySQL
- Oracle
- SQL Server
Security
- encryption at rest
- encryption in transit
Migrate a database from on premises to AWS
- Lift-and-shift migration (run the database on EC2)
- Gives more control over OS, memory, CPU, storage capacity, etc.
Or use Amazon RDS, a managed DB service
- supports the major DB engines
- hardware provisioning
- automated patching
- backups
- redundancy
- failover
- disaster recovery
Amazon Aurora
- enterprise-class RDBMS
- a more managed DB system
- MySQL- or PostgreSQL-compatible flavours
  - AWS states up to 5 times faster than standard MySQL and up to 3 times faster than standard PostgreSQL
- Reduces unnecessary I/O
- cheaper than commercial DB engines
- data is replicated across facilities (6 copies at any one time)
- up to 15 read replicas
- Continuous backup to S3
Relational databases can be slow, due to the overhead of queries/commands that span several tables
They suit business analytics that run over many tables

Amazon DynamoDB

Serverless DB
Data is stored in tables, as items with attributes
Handles the storage and automatic scaling; data is stored redundantly across multiple AZs, with millisecond response times
- Don't have to provision, patch or manage servers, or install, maintain or operate software
Does not use SQL and does not need a defined schema
Useful for data that is not rigid (i.e. cannot be defined by a schema) and needs high performance
Non-relational DB
Has simple, flexible schemas
Can add/remove attributes of items in a table at any time
Not every item must have the same attributes
Stores data as key-value pairs
- key = identifies the item
- value = the item's attributes
Queries are much simpler
- focus on a collection of items from one table
- not on queries across multiple tables
- leads to quick response times and high scalability
It is purpose-built and fits a specific use case
Most data is used for lookup lists
- this can be served by a non-relational DB rather than a SQL DB
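
The item and attribute model can be sketched with plain dictionaries. This is an in-memory illustration rather than the DynamoDB API: `ToyTable`, `put_item` and `get_item` are hypothetical names chosen for the sketch, and the staff records are made up.

```python
from typing import Any, Dict, Optional

Item = Dict[str, Any]


class ToyTable:
    """Hypothetical in-memory table keyed by one attribute (not the DynamoDB API)."""

    def __init__(self, key_name: str) -> None:
        self.key_name = key_name
        self.items: Dict[Any, Item] = {}

    def put_item(self, item: Item) -> None:
        # Only the key attribute is required; everything else is free-form.
        if self.key_name not in item:
            raise ValueError(f"item is missing key attribute {self.key_name!r}")
        self.items[item[self.key_name]] = dict(item)

    def get_item(self, key: Any) -> Optional[Item]:
        # A lookup touches one table and one key: no joins involved.
        return self.items.get(key)


staff = ToyTable(key_name="staff_id")
# The two items share the key attribute but otherwise have different attributes.
staff.put_item({"staff_id": 1, "name": "Ana", "city": "Lisbon"})
staff.put_item({"staff_id": 2, "name": "Ben", "certifications": ["CCP"]})

print(staff.get_item(2))
```

No schema had to be declared before the second item introduced a `certifications` attribute, which is the flexibility the notes describe.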

Amazon Redshift

Used for analysing data about what has already happened
Using a traditional RDBMS for analytics on data that is constantly updated
- causes performance issues
- an RDBMS is built for high-speed, real-time ingestion, rather than complex queries over lots of data
- a variety of data spread across sources is also hard to analyse this way
Use data warehousing instead
- engineered for big data and historical analytics instead of operational analysis
- For questions that look backwards, rather than at the current information for current processing (which is what an RDBMS is built for)
Redshift
- A data warehouse (DW) that is tuned, resilient and highly scalable
- Can handle multiple petabytes of data

AWS Database Migration Service (AWS DMS)

Helps migrate databases onto AWS securely and easily
Source DB remains fully operational during the migration
- reduces the downtime
Don't have to migrate to the same type of DB
- same type migrations = homogeneous
  - straightforward
- source and target are different = heterogeneous
  - Two-step process
    First convert the schema structure, data types and DB code with the AWS Schema Conversion Tool to match the target DB
    Then use DMS to migrate the data
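
The homogeneous versus heterogeneous distinction boils down to a simple branch, sketched below. The function `migration_steps` is hypothetical and compares engine names as plain strings; it only lists the steps from these notes and does not call any AWS service.

```python
from typing import List


def migration_steps(source_engine: str, target_engine: str) -> List[str]:
    """List the steps for a migration, based on whether the engines match."""
    same_engine = source_engine.strip().lower() == target_engine.strip().lower()
    if same_engine:
        # Homogeneous: schema and data types already match, so DMS alone is enough.
        return ["Migrate data with AWS DMS"]
    # Heterogeneous: convert first, then migrate.
    return [
        "Convert schema, data types and database code with the AWS Schema Conversion Tool",
        "Migrate data with AWS DMS",
    ]


print(migration_steps("MySQL", "mysql"))
print(migration_steps("Oracle", "PostgreSQL"))
```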
Can migrate from on premises to EC2 or RDS
Other migrations include
- dev/test DB migrations
  - copy prod data to a test env (one-off or continuously)
- DB consolidations
  - have multiple DBs but move to one DB
- continuous DB replication
  - for disaster recovery or geographic separation

Additional Database Services

Amazon DocumentDB
- Document DB
- supports MongoDB workloads
Amazon Neptune
- graph DB
- works with highly connected datasets
- e.g. recommendation engines, fraud detection, and knowledge graphs
Amazon Quantum Ledger Database (Amazon QLDB)
- review a complete history of all the changes that have been made to your application data
- Data is never deleted
Amazon Managed Blockchain
- create and manage blockchain networks with open-source frameworks
- Blockchain is a distributed ledger system that lets multiple parties run transactions and share data without a central authority.
Amazon ElastiCache
- adds a caching layer on top of a DB
- improves read times for common requests
- two engine types: Redis and Memcached
Amazon DynamoDB Accelerator (DAX)
- in-memory cache for DynamoDB
- improves response times from milliseconds to microseconds
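
A caching layer in front of a database is often used in a lazy-loading (cache-aside) style, which the sketch below illustrates. A dictionary stands in for Redis or Memcached, and `CachedReader`, `slow_database_lookup` and the 60-second TTL are assumptions invented for the example.

```python
import time
from typing import Callable, Dict, List, Tuple


class CachedReader:
    """Hypothetical lazy-loading cache; a dict stands in for Redis or Memcached."""

    def __init__(self, load_from_db: Callable[[str], str], ttl_seconds: float = 60.0) -> None:
        self.load_from_db = load_from_db
        self.ttl_seconds = ttl_seconds
        self.entries: Dict[str, Tuple[float, str]] = {}  # key -> (expiry time, value)
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        entry = self.entries.get(key)
        if entry is not None and entry[0] > time.monotonic():
            # Cache hit: answer from memory without touching the database.
            self.hits += 1
            return entry[1]
        # Cache miss (or expired entry): read from the database and remember the result.
        self.misses += 1
        value = self.load_from_db(key)
        self.entries[key] = (time.monotonic() + self.ttl_seconds, value)
        return value


db_calls: List[str] = []


def slow_database_lookup(key: str) -> str:
    # Stand-in for a real query; records each call so the effect of caching is visible.
    db_calls.append(key)
    return f"row-for-{key}"


cache = CachedReader(slow_database_lookup)
first = cache.get("user:1")   # miss: goes to the database
second = cache.get("user:1")  # hit: served from the cache
print(cache.hits, cache.misses, db_calls)
```

The second read never reaches the database, which is where the improved read times for common requests come from.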

Links

https://aws.amazon.com/products/storage
https://aws.amazon.com/blogs/storage/
https://aws.amazon.com/getting-started/hands-on/?awsf.getting-started-category=category%23storage&awsf.getting-started-content-type=content-type%23hands-on
https://aws.amazon.com/solutions/case-studies/?customer-references-cards.sort-by=item.additionalFields.publishedDate&customer-references-cards.sort-order=desc&awsf.customer-references-location=*all&awsf.customer-references-segment=*all&awsf.customer-references-product=product%23vpc%7Cproduct%23api-gateway%7Cproduct%23cloudfront%7Cproduct%23route53%7Cproduct%23directconnect%7Cproduct%23elb&awsf.customer-references-category=category%23storage
https://aws.amazon.com/dms/
https://aws.amazon.com/products/databases
https://aws.amazon.com/getting-started/deep-dive-databases/
https://aws.amazon.com/blogs/database/
https://aws.amazon.com/solutions/case-studies/?customer-references-cards.sort-by=item.additionalFields.publishedDate&customer-references-cards.sort-order=desc&awsf.customer-references-location=*all&awsf.customer-references-segment=*all&awsf.customer-references-product=product%23vpc%7Cproduct%23api-gateway%7Cproduct%23cloudfront%7Cproduct%23route53%7Cproduct%23directconnect%7Cproduct%23elb&awsf.customer-references-category=category%23databases


Last updated 1 year ago
