
Cassandra Interview Questions & Answers Part 2

In this tutorial, let’s look at the second set of interview questions and answers on Cassandra.

31. What is a tombstone in Cassandra?

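A tombstone in Cassandra is a marker that records that a particular piece of data has been deleted. Because SSTables are immutable, a delete is written as a tombstone with its own timestamp rather than removing the data in place. The tombstone is replicated like any other write so that every replica learns about the delete, and compaction drops it together with the data it shadows once the table's gc_grace_seconds period has passed.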
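The way a tombstone shadows older data can be sketched with a small toy model. This is not Cassandra's storage code: the `Cell` class, the `reconcile` and `read` helpers, and the use of `None` as the tombstone marker are all assumptions made for illustration. The sketch only shows the idea that the version with the newest timestamp wins, so a newer tombstone hides an older value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Cell:
    """One version of a value; value=None stands for a tombstone (toy convention)."""
    value: Optional[str]
    timestamp: int  # write timestamp, larger means newer

    @property
    def is_tombstone(self) -> bool:
        return self.value is None


def reconcile(*versions: Cell) -> Cell:
    """Pick the version with the highest timestamp (last write wins)."""
    return max(versions, key=lambda cell: cell.timestamp)


def read(*versions: Cell) -> Optional[str]:
    """Return the live value, or None when the newest version is a tombstone."""
    winner = reconcile(*versions)
    return None if winner.is_tombstone else winner.value


old_write = Cell("alice@example.com", timestamp=100)
delete = Cell(None, timestamp=200)           # tombstone written by a delete
rewrite = Cell("alice@new.example", timestamp=300)

print(read(old_write, delete))               # None: the tombstone is newer
print(read(old_write, delete, rewrite))      # alice@new.example
```

If the tombstone were discarded before every replica had seen it, a replica still holding `old_write` could bring the deleted value back, which is why tombstones are kept for a grace period.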

32. What is a counter in Cassandra?

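A counter in Cassandra is a special column type that supports atomic increments and decrements of a single 64-bit integer value. Counters come with restrictions: a table that contains counter columns can hold only counters outside its primary key, and increments are not idempotent, so a retried update after a timeout can over-count.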

33. What is a materialized view in Cassandra?

A materialized view in Cassandra is a precomputed view of a table that can be queried independently. Materialized views are updated automatically as the underlying data changes, and they can improve read performance for specific use cases.

34. What is the role of the partition key in Cassandra?

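The partition key in Cassandra determines which nodes in the cluster are responsible for storing a particular piece of data: the partitioner hashes the key to a token, and the token maps to a position on the ring. It also groups related data together, because all rows that share a partition key are stored in the same partition.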
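A minimal sketch of how a partition key can be mapped to an owning node on a token ring. The `TokenRing` class, the node names, and the 32-bit token space are hypothetical, and MD5 from the standard library stands in for the hash function; Cassandra's default partitioner is Murmur3Partitioner, and real clusters also place several replicas per token range.

```python
import bisect
import hashlib

RING_SIZE = 2**32  # toy token space, an assumption for this sketch


class TokenRing:
    """Maps tokens to nodes: a node owns the range ending at its own token."""

    def __init__(self, node_tokens):
        ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in ring]
        self._nodes = [node for _, node in ring]

    @staticmethod
    def token_for(partition_key):
        # Hash the partition key to a position on the ring (MD5 is a stand-in).
        digest = hashlib.md5(partition_key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % RING_SIZE

    def owner_of_token(self, token):
        # First node whose token is >= the given token; wrap around at the end.
        idx = bisect.bisect_left(self._tokens, token) % len(self._tokens)
        return self._nodes[idx]

    def owner(self, partition_key):
        return self.owner_of_token(self.token_for(partition_key))


ring = TokenRing({"node-a": 2**30, "node-b": 2**31, "node-c": 3 * 2**30})
for key in ("user:1", "user:2", "user:3"):
    print(key, "->", ring.owner(key))
```

The same key always hashes to the same token, so every coordinator can work out the owner of a partition without asking a central directory.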

35. What is a secondary index in Cassandra?

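A secondary index in Cassandra is an index on a non-primary key column that allows you to query the data based on that column. Each node indexes only its own local data, so a query that uses the index without also restricting the partition key may have to contact many nodes. Secondary indexes also tend to perform poorly on columns with very high or very low cardinality, so they should be used carefully.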

36. What is the difference between a clustering key and a partition key in Cassandra?

In Cassandra, the partition key is used to determine which node in the cluster is responsible for storing a particular piece of data, and it is also used to group related data together in the same partition. The clustering key is used to determine the order of the data within a partition.
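The division of labour between the two keys can be illustrated with a toy in-memory table. The `ToyTable` class and the sensor data are hypothetical and say nothing about the on-disk format; the sketch only shows that the partition key picks the group a row belongs to, while the clustering key fixes the row's position inside that group.

```python
import bisect
from collections import defaultdict


class ToyTable:
    """Rows grouped by partition key and kept sorted by clustering key."""

    def __init__(self):
        # partition key -> list of (clustering key, value), sorted by clustering key
        self._partitions = defaultdict(list)

    def insert(self, partition_key, clustering_key, value):
        rows = self._partitions[partition_key]
        keys = [ck for ck, _ in rows]
        idx = bisect.bisect_left(keys, clustering_key)
        if idx < len(rows) and rows[idx][0] == clustering_key:
            rows[idx] = (clustering_key, value)     # same primary key: overwrite
        else:
            rows.insert(idx, (clustering_key, value))

    def read_partition(self, partition_key):
        return list(self._partitions.get(partition_key, []))


table = ToyTable()
table.insert("sensor-1", 30, "21.5C")
table.insert("sensor-1", 10, "20.9C")
table.insert("sensor-2", 20, "18.0C")
table.insert("sensor-1", 20, "21.1C")

print(table.read_partition("sensor-1"))
# [(10, '20.9C'), (20, '21.1C'), (30, '21.5C')]
```

Rows come back in clustering order regardless of insertion order, which is why range queries over the clustering key within one partition are cheap.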

37. What is a write-ahead log (WAL) in Cassandra?

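A write-ahead log (WAL) is a log of data modifications that is written to disk before the modifications are applied to the in-memory data structures. In Cassandra this role is played by the commit log: each write is appended to the commit log before it is applied to the memtable, which provides crash recovery and durability.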

38. What is a read path in Cassandra?

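The read path in Cassandra is the process of retrieving data from the cluster in response to a read request. The coordinator node contacts as many replicas as the consistency level requires. On each replica, the read checks the memtable and then the SSTables, using bloom filters and caches to skip SSTables that cannot contain the data, and merges the versions it finds by timestamp. The coordinator then reconciles the replica responses to produce the final result.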

39. What is a write path in Cassandra?

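The write path in Cassandra is the process of storing data in the cluster in response to a write request. The coordinator node forwards the write to the replicas that own the partition and waits for as many acknowledgements as the consistency level requires. On each replica, the write is appended to the commit log and applied to the memtable; when the memtable fills up, it is flushed to disk as an immutable SSTable.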

40. What is a gossip protocol in Cassandra?

The gossip protocol in Cassandra is a peer-to-peer communication protocol used by nodes in the cluster to share information about the cluster’s topology, status, and schema. It helps ensure that all nodes have a consistent view of the cluster.
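The core idea of gossip, merging two nodes' views so that newer information spreads, can be sketched as follows. This is a heavily simplified toy: the `merge_views` and `gossip_round` functions and the single integer heartbeat per endpoint are assumptions for illustration, and the real protocol carries more state per endpoint and exchanges it in several messages.

```python
def merge_views(local, remote):
    """Keep, for each endpoint, the entry with the higher heartbeat version."""
    merged = dict(local)
    for endpoint, version in remote.items():
        if version > merged.get(endpoint, -1):
            merged[endpoint] = version
    return merged


def gossip_round(view_a, view_b):
    """Two nodes exchange views; afterwards both hold the merged result."""
    merged = merge_views(view_a, view_b)
    return merged, dict(merged)


# Each view maps an endpoint to the latest heartbeat version seen for it.
node_a_view = {"10.0.0.1": 5, "10.0.0.2": 3}
node_b_view = {"10.0.0.2": 7, "10.0.0.3": 1}

node_a_view, node_b_view = gossip_round(node_a_view, node_b_view)
print(node_a_view)
# {'10.0.0.1': 5, '10.0.0.2': 7, '10.0.0.3': 1}
```

Repeating such pairwise exchanges with randomly chosen peers lets information reach every node without any node having to contact all the others.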

41. What is a hinted handoff in Cassandra?

A hinted handoff in Cassandra is a mechanism that helps writes eventually reach all replicas of the data, even if some replicas are temporarily unavailable. When the coordinator node cannot deliver a write to a replica because that replica is down, it stores a hint locally; the write itself can still succeed if enough other replicas acknowledge it to satisfy the consistency level. The hint is replayed when the unavailable replica comes back online, provided the outage did not outlast the configured hint window.
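The store-then-replay behaviour can be sketched with a toy coordinator. The `Replica` and `Coordinator` classes are hypothetical, and the sketch ignores hint expiry, consistency levels, and on-disk hint storage; it shows only that a missed write is remembered and delivered later.

```python
from collections import defaultdict


class Replica:
    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}

    def apply(self, key, value):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        self.data[key] = value


class Coordinator:
    def __init__(self, replicas):
        self.replicas = replicas
        self.hints = defaultdict(list)  # replica name -> [(key, value), ...]

    def write(self, key, value):
        """Send the write to every replica; keep a hint for each one that is down."""
        acks = 0
        for replica in self.replicas:
            try:
                replica.apply(key, value)
                acks += 1
            except ConnectionError:
                self.hints[replica.name].append((key, value))
        return acks

    def replay_hints(self, replica):
        """Deliver stored hints once the replica is reachable again."""
        if not replica.up:
            return
        for key, value in self.hints.pop(replica.name, []):
            replica.apply(key, value)


r1, r2, r3 = Replica("r1"), Replica("r2"), Replica("r3")
coordinator = Coordinator([r1, r2, r3])

r3.up = False
acks = coordinator.write("user:1", "alice")   # r3 misses the write
print(acks, r3.data)                          # 2 {}

r3.up = True
coordinator.replay_hints(r3)
print(r3.data)                                # {'user:1': 'alice'}
```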

42. What is a commit log in Cassandra?

A commit log in Cassandra is a sequential log of all writes to the database. The commit log is used for durability and crash recovery. When a node crashes, the commit log is used to recover the state of the database.
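The recovery idea can be sketched with a toy node that appends each write to a log file before updating its in-memory table and replays the log on startup. The `ToyNode` class, the JSON-lines log format, and the fsync on every write are assumptions chosen to keep the example short; they are not Cassandra's commit log format or its sync policy.

```python
import json
import os
import tempfile


class ToyNode:
    """Appends every write to a log file, then applies it to an in-memory table."""

    def __init__(self, log_path):
        self.log_path = log_path
        self.memtable = {}
        self._replay()

    def _replay(self):
        # On startup, rebuild the in-memory state from the log, oldest entry first.
        if not os.path.exists(self.log_path):
            return
        with open(self.log_path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                self.memtable[record["key"]] = record["value"]

    def write(self, key, value):
        # Log first and force it to disk, then update memory.
        with open(self.log_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.memtable[key] = value


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "commitlog.jsonl")
    node = ToyNode(path)
    node.write("user:1", "alice")
    node.write("user:2", "bob")
    del node                       # simulate a crash: the in-memory table is lost

    recovered = ToyNode(path)      # restart replays the log
    print(recovered.memtable)      # {'user:1': 'alice', 'user:2': 'bob'}
```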

43. What is a batch statement in Cassandra?

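A batch statement in Cassandra groups multiple write statements into a single request. A logged batch guarantees that either all of its statements are eventually applied or none are; it does not provide isolation across partitions. Batches are mainly a tool for keeping related writes consistent, not for speed: a batch that touches a single partition saves round trips, but a batch that spans many partitions puts extra load on the coordinator and is often slower than issuing the writes separately.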

44. What is a lightweight transaction in Cassandra?

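A lightweight transaction in Cassandra is a conditional write, such as INSERT ... IF NOT EXISTS or UPDATE ... IF, that is applied only if the condition holds at the moment of the write. Cassandra uses the Paxos consensus protocol among the replicas to evaluate the condition and apply the write as one linearizable compare-and-set step. Lightweight transactions are noticeably slower than normal writes because of the extra round trips, but they provide stronger consistency guarantees.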
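The compare-and-set semantics of a conditional write can be sketched in a single process. The `ToyRegister` class and its method names are hypothetical, and one in-process lock stands in for the agreement that Cassandra reaches across replicas; the sketch shows only what the caller observes, namely that the check and the write happen as one step and the result reports whether the write was applied.

```python
import threading


class ToyRegister:
    """Conditional writes on a dict; the lock makes check-and-write one step."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()

    def insert_if_not_exists(self, key, value):
        with self._lock:
            if key in self._rows:
                return False          # condition failed, nothing written
            self._rows[key] = value
            return True

    def update_if(self, key, expected, new_value):
        with self._lock:
            if key not in self._rows or self._rows[key] != expected:
                return False          # someone else changed it first
            self._rows[key] = new_value
            return True

    def get(self, key):
        with self._lock:
            return self._rows.get(key)


users = ToyRegister()
print(users.insert_if_not_exists("alice", "alice@example.com"))   # True
print(users.insert_if_not_exists("alice", "other@example.com"))   # False
print(users.update_if("alice", "alice@example.com", "a@new.example"))  # True
print(users.get("alice"))                                         # a@new.example
```

A typical use is claiming a unique username: of two clients that race to insert the same key, exactly one sees `True`.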

45. What is a compaction in Cassandra?

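A compaction in Cassandra is the process of merging multiple SSTables (sorted string tables) into a single SSTable. During the merge, Cassandra keeps only the most recent version of each piece of data and discards overwritten values and expired tombstones. Compaction helps reclaim disk space and improves read performance, because reads have fewer SSTables to consult.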
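A minimal sketch of the merge step, using sorted Python lists as stand-ins for SSTables. The `compact` function and the `(key, timestamp, value)` row layout are assumptions for illustration, and the sketch treats every tombstone as already past its grace period, which a real compaction would have to check before dropping it.

```python
import heapq
from itertools import groupby

TOMBSTONE = None  # toy convention: a value of None marks a delete


def compact(*sstables):
    """Merge key-sorted tables of (key, timestamp, value) rows into one table.

    Keeps only the newest version of each key and drops keys whose newest
    version is a tombstone.
    """
    merged = heapq.merge(*sstables, key=lambda row: row[0])
    output = []
    for _, versions in groupby(merged, key=lambda row: row[0]):
        newest = max(versions, key=lambda row: row[1])
        if newest[2] is not TOMBSTONE:
            output.append(newest)
    return output


older = [("a", 1, "apple"), ("b", 1, "banana"), ("c", 1, "cherry")]
newer = [("b", 2, "blueberry"), ("c", 3, TOMBSTONE)]

print(compact(older, newer))
# [('a', 1, 'apple'), ('b', 2, 'blueberry')]
```

Five rows across two tables shrink to two rows in one table: the overwritten `banana` and the deleted `cherry` no longer take up space, and a read for any key has a single table to consult.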

46. What is a bloom filter in Cassandra?

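A bloom filter is a probabilistic data structure used to test whether an element is a member of a set. It can return false positives but never false negatives. Cassandra keeps a bloom filter for each SSTable and checks it to decide whether that SSTable might contain the requested partition, which improves read performance by skipping SSTables that cannot contain the data.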
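A small bloom filter can be sketched in a few lines. The `BloomFilter` class, its default sizes, and the use of seeded SHA-256 hashes are choices made for this illustration, not Cassandra's implementation; the point is the asymmetry between "definitely not present" and "possibly present".

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing the item with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(item))


sstable_filter = BloomFilter()
for partition_key in ("user:1", "user:2", "user:3"):
    sstable_filter.add(partition_key)

print(sstable_filter.might_contain("user:2"))   # True: added keys are never missed
print(sstable_filter.might_contain("user:99"))  # usually False, but may be a false positive
```

When `might_contain` returns `False`, a read can skip that table entirely; when it returns `True`, the table still has to be checked, because the answer may be a false positive.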

47. What is a key cache in Cassandra?

A key cache in Cassandra is an in-memory cache that maps partition keys to the position of their data in an SSTable. A key cache hit lets a read skip the partition index lookup and seek directly to the data, which reduces the number of disk seeks.

48. What is a row cache in Cassandra?

A row cache in Cassandra is an in-memory cache that holds the contents of frequently accessed rows. A row cache hit lets a read be served without touching disk at all, but the cache uses more memory than the key cache and suits tables that are read far more often than they are written.

49. What is a column family in Cassandra?

A column family in Cassandra is a container for data that is organized as rows and columns. Each row has a unique key, and each column has a name and a value.

50. What is the difference between a column family and a table in Cassandra?

In Cassandra, a column family is an older term for a container for data that is organized as rows and columns. A table is the current term for the same concept.

51. What is a partition key in Cassandra?

A partition key in Cassandra is the value used to determine which node in a cluster is responsible for storing a particular piece of data. The partition key distributes data across the cluster and is the first component of the primary key; it may consist of one column or several.

52. What is a clustering column in Cassandra?

A clustering column in Cassandra is a column used to sort rows within a partition. Clustering columns allow you to retrieve data in a specific order and can improve read performance.

53. What is a compound key in Cassandra?

A compound key in Cassandra is a primary key that consists of multiple columns. The first component is the partition key, which can itself span several columns (a composite partition key), and the remaining columns are clustering columns.

54. What is the difference between a partition key and a clustering column in Cassandra?

In Cassandra, a partition key is used to determine which node in a cluster is responsible for storing a particular piece of data. A clustering column is used to sort rows within a partition.

55. What is a secondary index in Cassandra?

A secondary index in Cassandra is an index on a non-partition key column. Secondary indexes allow you to query data based on a column other than the partition key.

56. What is a materialized view in Cassandra?

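A materialized view in Cassandra is a denormalized copy of a base table, stored under a different primary key, that is created to optimize specific queries. Materialized views can improve read performance by letting a query look up data directly by the view's key instead of scanning or filtering the base table; the trade-off is extra work on every write to the base table.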

57. What is a tombstone in Cassandra?

A tombstone in Cassandra is a marker that indicates that a piece of data has been deleted. Tombstones are used to ensure that deleted data is eventually removed from all replicas.

58. What is the compaction strategy in Cassandra?

The compaction strategy in Cassandra is a per-table setting that determines how SSTables are selected and merged during compaction. Common choices are SizeTieredCompactionStrategy, LeveledCompactionStrategy, and TimeWindowCompactionStrategy. The compaction strategy affects read and write performance and disk space usage.

59. What is a replica in Cassandra?

A replica in Cassandra is a copy of a piece of data that is stored on a different node in the cluster. Replicas are used to provide fault tolerance and ensure that data is available even if some nodes fail.

60. What is a consistency level in Cassandra?

A consistency level in Cassandra is a per-request setting that determines how many replicas must acknowledge a read or write before it is considered successful, for example ONE, QUORUM, or ALL. Higher consistency levels make it more likely that reads return the most recent write, at the cost of higher latency and lower availability when replicas are down.
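The arithmetic behind quorum reads and writes can be worked through in a short sketch. The function names are hypothetical; the rule they encode is that a quorum is a majority of the replicas, and that a read is guaranteed to overlap the latest successful write when the number of replicas read plus the number of replicas written exceeds the replication factor.

```python
def quorum(replication_factor):
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def read_overlaps_write(read_replicas, write_replicas, replication_factor):
    """True when every read set must share at least one replica with every write set."""
    return read_replicas + write_replicas > replication_factor


rf = 3
q = quorum(rf)
print(q)                                   # 2
print(read_overlaps_write(q, q, rf))       # True:  QUORUM reads + QUORUM writes
print(read_overlaps_write(1, 1, rf))       # False: ONE reads + ONE writes
print(read_overlaps_write(1, rf, rf))      # True:  ONE reads + ALL writes
```

With a replication factor of 3, QUORUM on both reads and writes tolerates one replica being down while still guaranteeing the overlap, which is why it is a common default choice.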
