Saturday, August 17, 2013

Fun with AWS Architecture

...or What I did on my summer vacation.

OK, I only took two days off to review some things I learned in a 3-day AWS architecture class...

http://aws.amazon.com/training/architect/

I do actually take real vacations on occasion. But here's what I have been up to...

Route 53 (DNS Hosting)

Moved all my domains off expensive, redundant DNS hosting to Route 53. DNSStuff.com says there's something odd about the configuration, but the sites run quickly and it allows me to do some interesting things with static content on S3 (more on that below). I was able to export the zone file from EasyDNS and import it into Amazon. I'm still testing this out, but so far so good. EasyDNS has served me well but is a bit expensive now compared to Amazon, and Route 53 is the only AWS service offered with a 100% SLA because they have servers in something like 43 locations. That's some decent coverage.

http://aws.amazon.com/route53/

IAM Roles (Security)

AWS security allows you to set up roles, and when an EC2 instance (a server or virtual server) is launched you give it a role. That role is allowed to do certain things. This prevents having to hard-code permissions in files on the server or embed security credentials in an application. The application specifies the role it should use and gets a temporary token from the AWS Security Token Service. Amazon has set up the instances to securely manage the roles and rotate the temporary credentials periodically.

http://docs.aws.amazon.com/IAM/latest/UserGuide/role-usecase-ec2app.html

http://docs.aws.amazon.com/STS/latest/UsingSTS/Welcome.html#AccessingSTS

http://docs.aws.amazon.com/STS/latest/UsingSTS/UsingTokens.html#RequestWithSTS
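The AWS SDK for Java can pick up the role's temporary credentials for you. Here's a minimal sketch, assuming an instance launched with a role that allows S3 access (the client, bucket listing and class name are just illustrative):

import com.amazonaws.auth.InstanceProfileCredentialsProvider;
import com.amazonaws.services.s3.AmazonS3Client;

public class RoleCredentialsExample {
    public static void main(String[] args) {
        // On an EC2 instance launched with a role, the SDK fetches temporary
        // credentials from the instance metadata service and refreshes them as
        // they are rotated, so nothing is hard coded in the application.
        InstanceProfileCredentialsProvider credentials = new InstanceProfileCredentialsProvider();

        // Any AWS client can be built on top of the role's temporary credentials.
        AmazonS3Client s3 = new AmazonS3Client(credentials);
        System.out.println(s3.listBuckets());
    }
}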

S3 (Application Storage for Static Content)

S3 is for static or occasionally changing content, such as images, that application servers need but that doesn't really belong in a database. You can also host a whole static web site on S3, which is cheaper than using an EC2 instance.

http://docs.aws.amazon.com/AmazonS3/latest/dev/website-hosting-custom-domain-walkthrough.html

http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/HowToAliasRRS.html#CreateAliasRRSConsole

There are some benefits to hosting content and images on separate domains for speedier web site loading. I've done this for years, but using S3 instead of EC2 and saving money is pretty cool. You can also put CloudFront (Amazon's CDN) in front of it to distribute content to multiple parts of the world, but I'm not there yet.
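Turning on website hosting for a bucket is basically a one-liner with the SDK. A minimal sketch, assuming a bucket named after the domain it will serve (the bucket name here is hypothetical):

import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.BucketWebsiteConfiguration;

public class StaticSiteSetup {
    public static void main(String[] args) {
        AmazonS3Client s3 = new AmazonS3Client();

        // Hypothetical bucket; for Route 53 alias records the bucket name
        // should match the domain it serves.
        String bucket = "static.example.com";

        // Enable website hosting with index and error documents.
        s3.setBucketWebsiteConfiguration(bucket,
                new BucketWebsiteConfiguration("index.html", "error.html"));
    }
}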

For my purposes I set up a static content domain and an image domain. I made the images publicly accessible. The other content I want to go through an analysis engine first, so I'm keeping that bucket private and granting access to specific roles:

http://docs.aws.amazon.com/AmazonS3/latest/UG/EditingBucketPermissions.html

http://docs.aws.amazon.com/AmazonS3/latest/dev/AccessPolicyLanguage_UseCases_s3_a.html
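For the image bucket, objects can simply be uploaded with a public-read ACL; the private bucket keeps the default owner-only permissions and relies on the bucket policy and roles above. A small sketch with hypothetical bucket and file names:

import java.io.File;

import com.amazonaws.services.s3.AmazonS3Client;
import com.amazonaws.services.s3.model.CannedAccessControlList;
import com.amazonaws.services.s3.model.PutObjectRequest;

public class ImageUpload {
    public static void main(String[] args) {
        AmazonS3Client s3 = new AmazonS3Client();

        // Hypothetical bucket and object; the public-read ACL makes the image
        // world-readable, while the other bucket stays private by default.
        s3.putObject(new PutObjectRequest("images.example.com", "logo.png", new File("logo.png"))
                .withCannedAcl(CannedAccessControlList.PublicRead));
    }
}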

DynamoDB (NoSQL Database)

I set up a table in DynamoDB for logging requests. DynamoDB is a NoSQL database hosted on SSDs, so it's very fast. It will be faster to save data here than in a traditional SQL database and possibly cheaper than the SQL Server EC2 instance I'm also hosting; I'll do some analysis after getting this all set up. DynamoDB also integrates with Amazon's Hadoop MapReduce service (see below) to run queries and analyze NoSQL data efficiently, provided the data is structured for parallel processing.

http://aws.amazon.com/dynamodb/
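Writing a request-log record is a single putItem call. A minimal sketch with hypothetical table and attribute names:

import java.util.HashMap;
import java.util.Map;

import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;

public class RequestLogWriter {
    public static void main(String[] args) {
        AmazonDynamoDBClient dynamo = new AmazonDynamoDBClient();

        // Hypothetical attributes for one logged request.
        Map<String, AttributeValue> item = new HashMap<String, AttributeValue>();
        item.put("RequestId", new AttributeValue("req-12345"));
        item.put("Timestamp", new AttributeValue().withN(Long.toString(System.currentTimeMillis())));
        item.put("Path", new AttributeValue("/index.html"));

        dynamo.putItem("RequestLog", item);
    }
}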

SQS (Queue)

I set up an SQS (Simple Queue Service) queue to accept asynchronous request-logging messages and feed them into an application that saves the data to DynamoDB. The problem when setting up DynamoDB is that it requires estimating throughput, and throughput will vary widely for the web sites I host based on time of day. Rather than try to predict it, I can set the throughput very low to save money and feed all the requests through the queue. The beauty of the queue is that it retains messages for up to four days and feeds them into DynamoDB over time. I can use the AWS asynchronous client so logging won't hold up the web pages from loading, and I'm in no hurry to get the data into the database. Additionally, this decouples logging from my application, so any issues with logging will not bring down the application; the queue scales to handle traffic as needed.

http://aws.amazon.com/sqs/

http://tech.shazam.com/server/using-sqs-to-throttle-dynamodb-throughput/
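Sending the log message with the asynchronous SQS client looks roughly like this (queue URL and message body are hypothetical); the send happens on a background thread so the page response isn't delayed:

import com.amazonaws.services.sqs.AmazonSQSAsyncClient;
import com.amazonaws.services.sqs.model.SendMessageRequest;

public class AsyncLogSender {
    public static void main(String[] args) {
        // The async client performs the send on a background thread.
        AmazonSQSAsyncClient sqs = new AmazonSQSAsyncClient();

        String queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/request-log";
        sqs.sendMessageAsync(new SendMessageRequest(queueUrl, "{\"path\":\"/index.html\"}"));
    }
}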

In addition to the above, I want to design for horizontal scaling: a flexible, cost-effective architecture that can expand and contract based on usage. Also, the way SQS is priced, batching messages will save money:

http://docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/throughput.html
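Since SQS charges per request, sending up to ten messages in one batch cuts the number of billable requests. A sketch, again with a hypothetical queue URL:

import java.util.Arrays;

import com.amazonaws.services.sqs.AmazonSQSClient;
import com.amazonaws.services.sqs.model.SendMessageBatchRequest;
import com.amazonaws.services.sqs.model.SendMessageBatchRequestEntry;

public class BatchedLogSender {
    public static void main(String[] args) {
        AmazonSQSClient sqs = new AmazonSQSClient();

        String queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/request-log";

        // Up to 10 messages can share a single request.
        sqs.sendMessageBatch(new SendMessageBatchRequest(queueUrl, Arrays.asList(
                new SendMessageBatchRequestEntry("msg1", "{\"path\":\"/a\"}"),
                new SendMessageBatchRequestEntry("msg2", "{\"path\":\"/b\"}"))));
    }
}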


EC2 Instances  (Web Servers)

I changed the code on my web servers to test out my theories with one web site. I have a generic servlet that takes an HTTP GET request, sends a message to the logging queue, then returns static content. I will add business analysis and appropriate decision making to this later. This web server is in a VPC (virtual private cloud, or Amazon's name for a virtual private network) in a public subnet with Internet access so it can receive web requests.

http://aws.amazon.com/ec2/
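The servlet itself is simple: log the request to the queue asynchronously, then return the static content. A stripped-down sketch (queue URL and response body are placeholders):

import java.io.IOException;

import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import com.amazonaws.services.sqs.AmazonSQSAsyncClient;
import com.amazonaws.services.sqs.model.SendMessageRequest;

public class LoggingServlet extends HttpServlet {

    // Hypothetical queue URL; on EC2 the client picks up the instance's IAM role.
    private static final String QUEUE_URL =
            "https://sqs.us-east-1.amazonaws.com/123456789012/request-log";

    private final AmazonSQSAsyncClient sqs = new AmazonSQSAsyncClient();

    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException {
        // Fire-and-forget the log message so the response is not held up.
        sqs.sendMessageAsync(new SendMessageRequest(QUEUE_URL, request.getRequestURI()));

        // Return the static content (placeholder here).
        response.setContentType("text/html");
        response.getWriter().println("<html><body>static content</body></html>");
    }
}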

VPC (Virtual Private Cloud / Network)

I already had a VPC set up to keep certain servers in private and public subnets but here's some info on that:

http://aws.amazon.com/vpc/

http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-network-security.html

Still working on configuring my Cisco firewall and VPN to work with the above. (Time Factor.)

EC2  (Application Server - Queue Reader Service)

Created an application that reads messages from the queue. It logs errors to an error-log repository and sends the request data to DynamoDB. I am reviewing AWS buffering to determine if there is a more cost-effective way to retrieve the messages, and also the details to ensure no message is ever processed twice. I initially thought this might require multiple threads to talk to DynamoDB, but DynamoDB is so fast that only one is needed even though I chose the lowest possible throughput.

Also had to handle errors:

http://docs.aws.amazon.com/AWSSimpleQueueService/latest/APIReference/Query_QueryErrors.html
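The read loop looks roughly like this: long-poll the queue, write each message to DynamoDB, and delete the message only after the write succeeds so nothing is lost if the write fails (queue URL and table name are hypothetical):

import java.util.HashMap;
import java.util.Map;

import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient;
import com.amazonaws.services.dynamodbv2.model.AttributeValue;
import com.amazonaws.services.sqs.AmazonSQSClient;
import com.amazonaws.services.sqs.model.Message;
import com.amazonaws.services.sqs.model.ReceiveMessageRequest;

public class QueueReader {
    public static void main(String[] args) {
        AmazonSQSClient sqs = new AmazonSQSClient();
        AmazonDynamoDBClient dynamo = new AmazonDynamoDBClient();

        String queueUrl = "https://sqs.us-east-1.amazonaws.com/123456789012/request-log";

        while (true) {
            // Long polling (up to 20 seconds) avoids paying for empty receives.
            ReceiveMessageRequest receive = new ReceiveMessageRequest(queueUrl)
                    .withWaitTimeSeconds(20)
                    .withMaxNumberOfMessages(10);

            for (Message message : sqs.receiveMessage(receive).getMessages()) {
                try {
                    Map<String, AttributeValue> item = new HashMap<String, AttributeValue>();
                    item.put("RequestId", new AttributeValue(message.getMessageId()));
                    item.put("Body", new AttributeValue(message.getBody()));
                    dynamo.putItem("RequestLog", item);

                    // Delete only after a successful write; a failed message
                    // becomes visible again after the visibility timeout.
                    sqs.deleteMessage(queueUrl, message.getReceiptHandle());
                } catch (Exception e) {
                    // Placeholder error handling: log and let the message retry.
                    e.printStackTrace();
                }
            }
        }
    }
}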


EMR (Amazon Elastic MapReduce - Hadoop)

Also working on using MapReduce to run queries and generate traffic reports for customers. Amazon's MapReduce basically spins up Hadoop on EC2 instances.

http://aws.amazon.com/elasticmapreduce/
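A job flow can also be kicked off from the SDK. This is only a rough sketch; the jar, S3 paths, instance types and class name are hypothetical:

import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClient;
import com.amazonaws.services.elasticmapreduce.model.HadoopJarStepConfig;
import com.amazonaws.services.elasticmapreduce.model.JobFlowInstancesConfig;
import com.amazonaws.services.elasticmapreduce.model.RunJobFlowRequest;
import com.amazonaws.services.elasticmapreduce.model.StepConfig;

public class TrafficReportJob {
    public static void main(String[] args) {
        AmazonElasticMapReduceClient emr = new AmazonElasticMapReduceClient();

        // Hypothetical MapReduce jar that reads request logs and writes reports.
        StepConfig reportStep = new StepConfig()
                .withName("Traffic report")
                .withHadoopJarStep(new HadoopJarStepConfig()
                        .withJar("s3://my-bucket/traffic-report.jar")
                        .withArgs("s3://my-bucket/logs/", "s3://my-bucket/reports/"));

        RunJobFlowRequest request = new RunJobFlowRequest()
                .withName("Traffic reports")
                .withLogUri("s3://my-bucket/emr-logs/")
                .withSteps(reportStep)
                .withInstances(new JobFlowInstancesConfig()
                        .withInstanceCount(3)
                        .withMasterInstanceType("m1.small")
                        .withSlaveInstanceType("m1.small"));

        System.out.println(emr.runJobFlow(request).getJobFlowId());
    }
}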

Future plans...


The ultimate goal is to set up a horizontally scalable architecture that can be built from a single file plus backups which expands and contracts automatically based on traffic needs and is fault tolerant. (Fault tolerant such that the Netflix Chaos Monkey could not hurt it: http://techblog.netflix.com/2012/07/chaos-monkey-released-into-wild.html)

Move data, files and application logic around for better security, fault tolerance, performance and cost optimization. Set up load balancing and auto scaling. Check out the CloudFront CDN. I should be hosting across multiple availability zones with a master/slave RDS database. Check out the email and SMS services to replace other things I'm doing. Move backups to Glacier for cost-effective archiving. Write CloudFormation scripts to spin up the entire architecture from backup with basically one file for disaster recovery. And lots of other fun stuff...