Mastering Apache Cassandra 3.x: An expert guide to improving database scalability and availability without compromising performance

Build, manage, and configure a high-performing, reliable NoSQL database for your applications with Cassandra. Key Features: Write programs more efficiently using Cassandra's features with the help of examples. Configure Cassandra and fine-tune its parameters depending on your needs. Integrate Cassand...

Bibliographic Details
Other Authors: Ploetz, Aaron (author); Malepati, Tejaswi (author); Neeraj, Nishant (author)
Format: eBook
Language: English
Published: Birmingham : Packt 2018.
Edition: Third edition
See on Biblioteca Universitat Ramon Llull: https://discovery.url.edu/permalink/34CSUC_URL/1im36ta/alma991009631843006719
Table of Contents:
  • Cover
  • Title Page
  • Copyright and Credits
  • Packt Upsell
  • Foreword
  • Contributors
  • Table of Contents
  • Preface
  • Chapter 1: Quick Start
  • Introduction to Cassandra
  • High availability
  • Distributed
  • Partitioned row store
  • Installation
  • Configuration
  • cassandra.yaml
  • cassandra-rackdc.properties
  • Starting Cassandra
  • Cassandra Cluster Manager
  • A quick introduction to the data model
  • Using Cassandra with cqlsh
  • Shutting down Cassandra
  • Summary
  • Chapter 2: Cassandra Architecture
  • Why was Cassandra created?
  • RDBMS and problems at scale
  • Cassandra and the CAP theorem
  • Cassandra's ring architecture
  • Partitioners
  • ByteOrderedPartitioner
  • RandomPartitioner
  • Murmur3Partitioner
  • Single token range per node
  • Vnodes
  • Cassandra's write path
  • Cassandra's read path
  • On-disk storage
  • SSTables
  • How data was structured in prior versions
  • How data is structured in newer versions
  • Additional components of Cassandra
  • Gossiper
  • Snitch
  • Phi failure-detector
  • Tombstones
  • Hinted handoff
  • Compaction
  • Repair
  • Merkle tree calculation
  • Streaming data
  • Read repair
  • Security
  • Authentication
  • Authorization
  • Managing roles
  • Client-to-node SSL
  • Node-to-node SSL
  • Summary
  • Chapter 3: Effective CQL
  • An overview of Cassandra data modeling
  • Cassandra storage model for versions 3.0 and beyond
  • Data cells
  • cqlsh
  • Logging into cqlsh
  • Problems connecting to cqlsh
  • Local cluster without security enabled
  • Remote cluster with user security enabled
  • Remote cluster with auth and SSL enabled
  • Connecting with cqlsh over SSL
  • Converting the Java keyStore into a PKCS12 keyStore
  • Exporting the certificate from the PKCS12 keyStore
  • Modifying your cqlshrc file
  • Testing your connection via cqlsh
  • Getting started with CQL
  • Creating a keyspace
  • Single data center example
  • Multi-data center example
  • Creating a table
  • Simple table example
  • Clustering key example
  • Composite partition key example
  • Table options
  • Data types
  • Type conversion
  • The primary key
  • Designing a primary key
  • Selecting a good partition key
  • Selecting a good clustering key
  • Querying data
  • The IN operator
  • Writing data
  • Inserting data
  • Updating data
  • Deleting data
  • Lightweight transactions
  • Executing a BATCH statement
  • The expiring cell
  • Altering a keyspace
  • Dropping a keyspace
  • Altering a table
  • Truncating a table
  • Dropping a table
  • Truncate versus drop
  • Creating an index
  • Caution with implementing secondary indexes
  • Dropping an index
  • Creating a custom data type
  • Altering a custom type
  • Dropping a custom type
  • User management
  • Creating a user and role
  • Altering a user and role
  • Dropping a user and role
  • Granting permissions
  • Revoking permissions
  • Other CQL commands
  • COUNT
  • DISTINCT
  • LIMIT
  • STATIC
  • User-defined functions
  • cqlsh commands
  • CONSISTENCY
  • COPY
  • DESCRIBE
  • TRACING
  • Summary
  • Chapter 4: Configuring a Cluster
  • Evaluating instance requirements
  • RAM
  • CPU
  • Disk
  • Solid state drives
  • Cloud storage offerings
  • SAN and NAS
  • Network
  • Public cloud networks
  • Firewall considerations
  • Strategy for many small instances versus few large instances
  • Operating system optimizations
  • Disable swap
  • XFS
  • Limits
  • limits.conf
  • sysctl.conf
  • Time synchronization
  • Configuring the JVM
  • Garbage collection
  • CMS
  • G1GC
  • Garbage collection with Cassandra
  • Installation of JVM
  • JCE
  • Configuring Cassandra
  • cassandra.yaml
  • cassandra-env.sh
  • cassandra-rackdc.properties
  • dc
  • rack
  • dc_suffix
  • prefer_local
  • cassandra-topology.properties
  • jvm.options
  • logback.xml
  • Managing a deployment pipeline
  • Orchestration tools
  • Configuration management tools
  • Recommended approach
  • Local repository for downloadable files
  • Summary
  • Chapter 5: Performance Tuning
  • Cassandra-Stress
  • The Cassandra-Stress YAML file
  • name
  • size
  • population
  • cluster
  • Cassandra-Stress results
  • Write performance
  • Commitlog mount point
  • Scaling out
  • Scaling out a data center
  • Read performance
  • Compaction strategy selection
  • Optimizing read throughput for time-series models
  • Optimizing tables for read-heavy models
  • Cache settings
  • Appropriate uses for row-caching
  • Compression
  • Chunk size
  • The bloom filter configuration
  • Read performance issues
  • Other performance considerations
  • JVM configuration
  • Cassandra anti-patterns
  • Building a queue
  • Query flexibility
  • Querying an entire table
  • Incorrect use of BATCH
  • Network
  • Summary
  • Chapter 6: Managing a Cluster
  • Revisiting nodetool
  • A warning about using nodetool
  • Scaling up
  • Adding nodes to a cluster
  • Cleaning up the original nodes
  • Adding a new data center
  • Adjusting the cassandra-rackdc.properties file
  • A warning about SimpleStrategy
  • Streaming data
  • Scaling down
  • Removing nodes from a cluster
  • Removing a live node
  • Removing a dead node
  • Other removenode options
  • When removenode doesn't work (nodetool assassinate)
  • Assassinating a node on an older version
  • Removing a data center
  • Backing up and restoring data
  • Taking snapshots
  • Enabling incremental backups
  • Recovering from snapshots
  • Maintenance
  • Replacing a node
  • Repair
  • A warning about incremental repairs
  • Cassandra Reaper
  • Forcing read repairs at consistency - ALL
  • Clearing snapshots and incremental backups
  • Snapshots
  • Incremental backups
  • Compaction
  • Why you should never invoke compaction manually
  • Adjusting compaction throughput due to available resources
  • Summary
  • Chapter 7: Monitoring
  • JMX interface
  • MBean packages exposed by Cassandra
  • JConsole (GUI)
  • Connection and overview
  • Viewing metrics
  • Performing an operation
  • JMXTerm (CLI)
  • Connection and domains
  • Getting a metric
  • Performing an operation
  • The nodetool utility
  • Monitoring using nodetool
  • describecluster
  • gcstats
  • getcompactionthreshold
  • getcompactionthroughput
  • getconcurrentcompactors
  • getendpoints
  • getlogginglevels
  • getstreamthroughput
  • gettimeout
  • gossipinfo
  • info
  • netstats
  • proxyhistograms
  • status
  • tablestats
  • tpstats
  • verify
  • Administering using nodetool
  • cleanup
  • drain
  • flush
  • resetlocalschema
  • stopdaemon
  • truncatehints
  • upgradeSSTable
  • Metric stack
  • Telegraf
  • Installation
  • Configuration
  • JMXTrans
  • Installation
  • Configuration
  • InfluxDB
  • Installation
  • Configuration
  • InfluxDB CLI
  • Grafana
  • Installation
  • Configuration
  • Visualization
  • Alerting
  • Custom setup
  • Log stack
  • The system/debug/gc logs
  • Filebeat
  • Installation
  • Configuration
  • Elasticsearch
  • Installation
  • Configuration
  • Kibana
  • Installation
  • Configuration
  • Troubleshooting
  • High CPU usage
  • Different garbage-collection patterns
  • Hotspots
  • Disk performance
  • Node flakiness
  • All-in-one Docker
  • Creating a database and other monitoring components locally
  • Web links
  • Summary
  • Chapter 8: Application Development
  • Getting started
  • The path to failure
  • Is Cassandra the right database?
  • Good use cases for Apache Cassandra
  • Use and expectations around application data consistency
  • Choosing the right driver
  • Building a Java application
  • Driver dependency configuration with Apache Maven
  • Connection class
  • Other connection options
  • Retry policy
  • Default keyspace
  • Port
  • SSL
  • Connection pooling options
  • Starting simple - Hello World!
  • Using the object mapper
  • Building a data loader
  • Asynchronous operations
  • Data loader example
  • Summary
  • Chapter 9: Integration with Apache Spark
  • Spark
  • Architecture
  • Installation
  • Running custom Spark Docker locally
  • Configuration
  • The web UI
  • Master
  • Worker
  • Application
  • PySpark
  • Connection config
  • Accessing Cassandra data
  • SparkR
  • Connection config
  • Accessing Cassandra data
  • RStudio
  • Connection config
  • Accessing Cassandra data
  • Jupyter
  • Architecture
  • Installation
  • Configuration
  • Web UI
  • PySpark through Jupyter
  • Summary
  • Appendix: References
  • Chapter 1 - Quick Start
  • Chapter 2 - Cassandra Architecture
  • Chapter 3 - Effective CQL
  • Chapter 4 - Configuring a Cluster
  • Chapter 5 - Performance Tuning
  • Chapter 6 - Managing a Cluster
  • Chapter 7 - Monitoring
  • Chapter 8 - Application Development
  • Chapter 9 - Integration with Apache Spark
  • Other Books You May Enjoy
  • Index