Batching in Cassandra lets us group multiple data operations (inserts, updates, deletes) into a single request, which reduces the number of network round trips. Logged batches, the default, are atomic: either all of the statements in the batch are eventually applied or none are. Batching does not always improve write performance, though: a batch that spans many partitions adds work for the coordinator node and can be slower than issuing the writes individually.
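Cassandra supports two types of batch operations: logged and unlogged.
Logged batches are the default batch type in Cassandra. They guarantee atomicity by first writing the batch to a batch log, so that every statement is eventually applied even if the coordinator fails partway through. Unlogged batches skip the batch log, so a batch that spans several partitions may be only partially applied. In both cases, isolation holds only among the statements that target the same partition.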
Syntax of Batch in Cassandra
The syntax for a batch statement in Cassandra is as follows:
BEGIN BATCH
    // First operation
    // Second operation
    // ...
APPLY BATCH;
The BEGIN BATCH keyword signals the start of a batch statement, and APPLY BATCH indicates the end of the statement. In between these two keywords, we can specify any number of data operations, such as INSERT, UPDATE, and DELETE.
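As a minimal sketch of mixing the three statement types in one batch, the example below combines an INSERT, an UPDATE, and a DELETE. The users schema, the ids, and the email value are assumptions made for illustration; the original text does not define the table.

```cql
-- Hypothetical schema, assumed for illustration only.
CREATE TABLE IF NOT EXISTS users (
    id    text PRIMARY KEY,
    name  text,
    email text
);

-- One logged batch containing three different kinds of modification.
BEGIN BATCH
    INSERT INTO users (id, name) VALUES ('10', 'Example User');
    UPDATE users SET email = 'user10@example.com' WHERE id = '10';
    DELETE FROM users WHERE id = '11';
APPLY BATCH;
```

Only data modification statements are allowed between BEGIN BATCH and APPLY BATCH; a SELECT cannot be part of a batch.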
Examples of Batch in Cassandra
Let’s take a look at some examples of batch operations in Cassandra.
Suppose we want to insert three records into a table named users. We can use a simple logged batch to perform this operation:
BEGIN BATCH
    INSERT INTO users (id, name, email) VALUES ('1', 'John Doe', '[email protected]');
    INSERT INTO users (id, name, email) VALUES ('2', 'Jane Smith', '[email protected]');
    INSERT INTO users (id, name, email) VALUES ('3', 'Bob Johnson', '[email protected]');
APPLY BATCH;
In this example, we are inserting three records into the users table using a batch statement. All three insert operations are logged and will be executed atomically.
If we don’t require the atomicity guarantee, we can use an unlogged batch, which avoids the overhead of the batch log. The syntax for an unlogged batch is the same as for a logged batch, except that we add the keyword UNLOGGED before BATCH.
BEGIN UNLOGGED BATCH
    INSERT INTO users (id, name, email) VALUES ('4', 'Alice Brown', '[email protected]');
    INSERT INTO users (id, name, email) VALUES ('5', 'David Lee', '[email protected]');
APPLY BATCH;
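In this example, we are inserting two records into the users table using an unlogged batch. Assuming id is the partition key, the two rows belong to different partitions, so the batch is not atomic: if a failure occurs partway through, one insert may be applied without the other.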
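Unlogged batches are usually a better fit when every statement targets the same partition, because Cassandra then applies the whole batch as a single mutation. The sketch below assumes a hypothetical user_events table, not part of the original example, in which user_id is the partition key and event_id is a clustering column.

```cql
-- Hypothetical table, assumed for illustration only.
CREATE TABLE IF NOT EXISTS user_events (
    user_id  text,
    event_id int,
    action   text,
    PRIMARY KEY (user_id, event_id)
);

-- Both rows share the partition key '4', so the batch touches one partition.
BEGIN UNLOGGED BATCH
    INSERT INTO user_events (user_id, event_id, action) VALUES ('4', 1, 'signup');
    INSERT INTO user_events (user_id, event_id, action) VALUES ('4', 2, 'login');
APPLY BATCH;
```

Because both inserts land in the same partition, they are applied together and in isolation, even though the batch is unlogged.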
We can also use batch statements to perform conditional updates. For example, suppose we have a table named products with a quantity column, and we want to update the quantity of two products with the IDs 123 and 456, but only if the current quantity is greater than 0. We can use a batch statement to perform this operation:
BEGIN BATCH
    UPDATE products SET quantity = 10 WHERE id = '123' IF quantity > 0;
    UPDATE products SET quantity = 5 WHERE id = '456' IF quantity > 0;
APPLY BATCH;
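In this example, we are updating the quantity of two products using a batch statement, with each update guarded by a condition specified using the IF keyword. In a batch, the conditions are evaluated together: if any condition is not met, none of the statements in the batch are applied. Note that Cassandra requires all conditional statements in a batch to target the same table and the same partition, so if id is the partition key of products, this batch is rejected and the two updates have to be sent as separate statements.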
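A conditional batch works when every row it touches lives in one partition. The sketch below assumes a hypothetical product_stock table, not part of the original example, where product_id is the partition key and warehouse is a clustering column, so both conditional updates fall in the same partition.

```cql
-- Hypothetical table, assumed for illustration only.
CREATE TABLE IF NOT EXISTS product_stock (
    product_id text,
    warehouse  text,
    quantity   int,
    PRIMARY KEY (product_id, warehouse)
);

-- Both updates target partition '123'; the batch is applied only if
-- both IF conditions hold at the time of execution.
BEGIN BATCH
    UPDATE product_stock SET quantity = 10
        WHERE product_id = '123' AND warehouse = 'east' IF quantity > 0;
    UPDATE product_stock SET quantity = 5
        WHERE product_id = '123' AND warehouse = 'west' IF quantity > 0;
APPLY BATCH;
```

The result of such a batch includes an [applied] column that reports whether the conditions were met and the updates were written.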
Batching in Cassandra is a powerful feature that allows us to perform multiple data operations in a single