Apache Cassandra


Apache Cassandra Course Syllabus

    Module 1 : What is Big Data?

    • Features and benefits of Big Data
    • Velocity, Variety and Veracity explained
    • Characteristics of Big Data
    • Big Data Architecture
    • Distributed Computing
    • Overview of Stream and Batch processing

    Module 2 : Introduction to Cassandra

    • Understanding NoSQL
    • Types of NoSQL databases
    • What is wrong with relational databases
    • NoSQL Ecosystem
    • Overview of Cassandra
    • Features of Cassandra
    • High availability
    • Replication and Multiple data centers

    Module 3 : Architecture of Cassandra

    • Understanding high-level Cassandra architecture
    • Peer-to-Peer design
    • Network topology
    • Virtual Nodes
    • Components of Cassandra
    • Partitioner and Replication
    • Memtables and SSTables
    • Bloom Filters
    • Managers and Services
    • Cassandra read and write process
    • Failure scenarios

    Module 4 : Installation and Configuration

    • Overview of Cassandra version history
    • Prerequisites for Installation
    • Installing Cassandra from binaries
    • Verifying and running Cassandra
    • Command-line client Interface
    • CLI Commands
    • Logging setup in Cassandra
    • Replication Factor
    • Creating and setting up a Cluster
    • Miscellaneous settings

    Module 5 : Cassandra Data Model

    • Introduction to Data Model
    • Data Types and Dynamic Columns
    • Amending Data Types
    • Counter Columns
    • Cassandra DDL and DML process
    • Composite Keys
    • Collection Columns in Cassandra
    • Table Operations and CRUD on Cassandra
    • Design differences between RDBMS and Cassandra
    • Data Modeling Best Practices

    Module 6 : Indexes and Composite Columns

    • Overview of Indexes and their benefits
    • Indexes in a Distributed Database
    • Clustered Indexes vs Non-Clustered Indexes
    • Secondary Indexes
    • Composite Columns
    • Data Partitioning
    • Data Colocation

    Module 7 : Working with MapReduce

    • Overview of MapReduce
    • Batch Processing
    • MapReduce Integration with Cassandra
    • The Thrift-based approach
    • Stream Analytics

    Module 8 : Cassandra Query Language (CQL)

    • Introduction to CQL
    • CQL Syntax
    • Database Users and Roles
    • Control Commands
    • Data Definition and Manipulation
    • Complex queries
    • Built-in and User-defined Functions
    • Run CQL Scripts from the command line
    • JSON support

    Module 9 : Cassandra Interfaces

    • Java interfaces for connecting to Cassandra
    • ODBC interface for connecting to Cassandra

    Module 10 : Data Migration

    • Understanding Data Migration and Analytics
    • Understanding Pig
    • Pig with Cassandra
    • Apache Hive
    • Understanding UDF
    • Hive with Cassandra
    • Data Migration from any Database to Cassandra

    Module 11 : Performance Tuning and Monitoring

    • Understanding Performance Indicators
    • CPU and Memory resource utilization
    • Understanding Logical and Physical Reads
    • JVM Settings
    • Concurrency
    • Configuration for Data Cache
    • Off-Heap vs On-Heap
    • Stress Testing for Cassandra
    • Overview of Monitoring
    • Cassandra Monitoring Tools
    • Logging
    • Cassandra MBeans
    • Health Check