What is AWS and why should you care?
The behemoth that is Amazon.com shared its 2017 Q2 results recently, and in revenue terms showed no sign of slowing its impressive long-term surge to become one of the largest companies in the world. Near the end of July, the Amazon share price peaked at over $1,052, briefly moving CEO and founder Jeff Bezos ahead of Bill Gates at the top of the global rich list.
Amazon is famous for its e-commerce business, but in recent times much of the market interest in its financial results has revolved around AWS.
AWS stands for Amazon Web Services, the company's subsidiary that offers on-demand cloud computing platforms. Across 2016, AWS brought in revenue of US$12.2 billion, and with US$3.66 billion in Q1 2017 (up 42% YoY) and US$4.10 billion in Q2 2017 (also up 42% YoY), you can see why investors get so excited.
Amazon themselves describe AWS as "a secure cloud services platform, offering compute power, database storage, content delivery and other functionality to help businesses scale and grow".
AWS is sometimes referred to as "the largest company you've never heard of", as they provide web and data platforms for some of the largest companies in the world, including Airbnb, Reddit, and (a direct competitor to Amazon's Prime Video service) Netflix! However, the fact that its products are in most cases priced on demand (pay for what you use) means that their solutions are just as applicable and cost-effective for a keen data scientist working on a Kaggle competition at home as they are for a global corporation.
AWS offer over 70 products and services, which can be set up and controlled within the AWS web-based console. Many of these will be of specific interest to analysts and data scientists alike; I've summarised a few below:
Elastic Compute Cloud (EC2) - each EC2 'instance' is essentially a cloud-based virtual machine on which you can install and use pretty much any software you like. These virtual machines can be set up as Windows or Linux and can be optimised for memory, performance, and storage - and are priced accordingly by hour of usage. It's easy enough to try this out for free, however: AWS offer a year-long free trial allowing you to use a t2.micro instance. There are pre-built AMIs (Amazon Machine Images) which enable you to easily install a web-based version of RStudio, for example.
Lambda - enables you to run code/requests/functions in a serverless environment: you don't manage any servers yourself and only pay while your code is running, meaning there is potential for huge cost savings. Lambda requires code in either Node.js, Java, C#, or Python. These requests can be set up to run automatically, e.g. when a new object is added to your AWS S3 bucket it is then transferred or re-sized, or when a new order comes in to your online business the data is formatted and transferred to Redshift or DynamoDB. Keep an eye on AWS Lambda in the future, it could really revolutionise compute functionality for businesses.
Redshift - A data warehouse product for structured data that is based on PostgreSQL and offers petabyte-scale warehousing. It uses parallel query execution, so it can be extremely fast. Again, it's priced on-demand and can be far cheaper than traditional server-based solutions.
DynamoDB - A NoSQL database service for unstructured data
Simple Storage Service (S3) - use a web interface to access stored objects & data. Again, its on-demand pricing means you only pay for what you use.
Elastic MapReduce (EMR) - Gives you the ability to spin up powerful Hadoop clusters in a matter of minutes. It allows you to access huge computing and processing power but keep costs to a minimum, as you can tell the cluster to automatically terminate once processing is complete. EMR allows you to incorporate other key tools like Apache Spark, Apache Zeppelin, and Hue.
Lex - Chat-bots that use Deep Learning (specifically Natural Language Processing) enabling you to create applications that engage with users or customers
Rekognition - A service that allows you to instantly apply image recognition, facial recognition and analysis, facial similarity, and image moderation. This is one of my favourites to play around with. You can try their demo here - however you'll have to create an account first.
Machine Learning - A service that allows anyone to gain the benefits of advanced and scalable Machine Learning models and algorithms
Game & App development platforms, Internet of Things, Business Productivity tools - the list goes on! Well worth a look just to see what is possible using AWS!
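To make the Lambda example from the list above a little more concrete, here is a minimal sketch in Python of a function that could be triggered when a new object lands in an S3 bucket. It only reads the bucket name and object key out of the incoming event; the actual re-sizing or transfer step is left as a comment. The function name, bucket name and file name are made up for illustration, and the sample event is a cut-down, hand-written version of the notification S3 sends, so treat this as a starting point rather than a finished implementation.

```python
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Collect the bucket and key of every object named in an S3 event."""
    uploaded = []
    for record in event.get("Records", []):
        s3_info = record["s3"]
        bucket = s3_info["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications,
        # so a space in a file name shows up as '+'.
        key = unquote_plus(s3_info["object"]["key"])
        uploaded.append({"bucket": bucket, "key": key})
        # A real function would do its work here, for example re-sizing
        # the image or copying the object somewhere else.
    return uploaded


# Local try-out with a hand-written sample event (hypothetical names).
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "my-example-bucket"},
                "object": {"key": "photos/holiday+pic.jpg"},
            }
        }
    ]
}

if __name__ == "__main__":
    print(lambda_handler(sample_event, None))
```

Because the handler is just an ordinary Python function, you can run it on your own machine with a sample event like this before deploying anything to AWS.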
To summarise, from an analytics and data science/machine learning point of view, I think AWS are really pushing the boundaries within this industry and are leading the way. Google, Microsoft, and IBM are well aware of this and are doing their best to try and scrape together some market share, so it'll be an interesting next few years in this space.
If you're not using AWS already, I'd definitely recommend signing up to the year-long free trial and having a bit of a play with what is available. For your job and career prospects it's well worth having a basic knowledge of what can be done!
Hopefully this has been an interesting introduction to the world of AWS. Please feel free to share!