Big Data and Analytics - General Notes and Resources.
Docker is a containerization platform. It is developed primarily for Linux, but a version with a more limited feature set is available for Windows as well.
This section contains notes on Amazon EC2 and related compute services.
Note: AWS Elastic Beanstalk is not covered here; it is covered in the section on management services.
This section contains the study notes for DynamoDB. The initial set of notes covers the essentials, followed by some advanced topics. If you have any doubts or questions, please ask in a comment; if the question is valid, we will create a page and share the link.
This section of notes covers S3 and the other storage services topics for the associate-level certifications for Amazon Web Services (AWS).
Here I will add notes on security basics and IAM. While some of them are very basic, others may be best understood after learning about other services. So please feel free to come back and read them again once you have gone through the notes in the other books.
Apache Spark is a unified analytics engine for big data processing, with built-in modules for streaming, SQL, machine learning and graph processing.
Apache ZooKeeper is a software project of the Apache Software Foundation, providing an open source distributed configuration service, synchronization service, and naming registry for large distributed systems.
Apache Kafka is an open source, publish-subscribe based distributed messaging system. From an architecture perspective, Kafka is closer to traditional messaging systems such as ActiveMQ or RabbitMQ. However, from a Big Data and Hadoop perspective, Kafka can be compared with Scribe or Flume, as it is useful for processing activity stream data.