http://shermandigital.com/blog/designing-a-cassandra-data-model/
When inserting records, Cassandra will hash the value of the inserted data's partition key; Cassandra uses this hash value to determine which node is responsible for storing the data.
For Cassandra to look up a set of data (a set of rows, in relational terms) in a single operation, we have to store all of that data under the same partition key. To summarize, rows in Cassandra are essentially data embedded within a partition, because the data share the same partition key.
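As a rough illustration of that routing step, the sketch below hashes a partition key and maps it onto a node. This is a simplification: Cassandra actually assigns data to nodes via Murmur3 tokens on a ring, and the node names and MD5 hash here are made up.

```python
import hashlib

# Hypothetical cluster members (Cassandra uses a token ring, not a flat list).
NODES = ["node-a", "node-b", "node-c"]

def node_for_partition_key(partition_key: str) -> str:
    # Hash the partition key and map the hash onto one of the nodes.
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return NODES[int(digest, 16) % len(NODES)]

# All rows sharing a partition key are routed to the same node.
assert node_for_partition_key("USA") == node_for_partition_key("USA")
```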
Data model goals
- Spread data evenly around the cluster. Partitions are distributed around the cluster based on a hash of the partition key. To distribute work across nodes, it's desirable for every node in the cluster to have roughly the same amount of data.
- Minimize the number of partitions read. Partitions are groups of columns that share the same partition key. Since each partition may reside on a different node, the query coordinator will generally need to issue separate commands to separate nodes for each partition we query.
- Satisfy a query by reading a single partition. This means we will use roughly one table per query. Supporting multiple query patterns usually means we need more than one table. Data duplication is encouraged.
Column families are represented in Cassandra as a map of sorted maps. The partition key acts as the lookup value; the sorted map consists of column keys and their associated values.
Map<PartitionKey, SortedMap<ColumnKey, ColumnValue>>
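A toy in-memory analogue of that structure can make it concrete. Real Cassandra storage is an on-disk SSTable structure, and the helper names below are made up; Python dicts aren't sorted maps, so the inner map is sorted on read.

```python
from collections import defaultdict

# Toy analogue of Map<PartitionKey, SortedMap<ColumnKey, ColumnValue>>.
column_families = defaultdict(dict)

def insert(partition_key, column_key, value):
    column_families[partition_key][column_key] = value

def read_partition(partition_key):
    # Emulate the sorted inner map by sorting column keys on read.
    return sorted(column_families[partition_key].items())

insert("USA", ("VA", "Arlington"), "CrossFit Route 7")
insert("USA", ("CA", "San Diego"), "CrossFit Invictus")
# read_partition("USA") yields the columns in clustering-key order,
# with ("CA", "San Diego") before ("VA", "Arlington").
```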
Compound key
Compound keys include multiple columns in the primary key, but these additional columns do not necessarily affect the partition key. A partition key with multiple columns is known as a composite key.
CREATE TABLE crossfit_gyms_by_location (
  country_code text,
  state_province text,
  city text,
  gym_name text,
  PRIMARY KEY (country_code, state_province, city, gym_name)
);
Note that only the first column of the primary key above is considered the partition key; the rest of the columns are clustering keys. This means that while the primary key represents a unique gym record/row, all gyms within a country reside on the same partition.
Clustering key
Clustering keys are responsible for sorting data within a partition. Each primary key column after the partition key is considered a clustering key. In the crossfit_gyms_by_location example, country_code is the partition key; state_province, city, and gym_name are the clustering keys. Clustering keys are sorted in ascending order by default. So when we query for all gyms in the United States, the result set will be ordered first by state_province in ascending order, followed by city in ascending order, and finally gym_name in ascending order.
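The default ascending ordering behaves like a plain tuple sort over the clustering key columns. The gym rows below are hypothetical; this is only an analogy for the order in which Cassandra returns rows within a partition.

```python
# Hypothetical (state_province, city, gym_name) rows within the 'USA' partition.
rows = [
    ("VA", "Arlington", "CrossFit Route 7"),
    ("CA", "San Diego", "CrossFit Invictus"),
    ("CA", "Los Angeles", "Paradiso CrossFit"),
]

# Default clustering order: ascending by state_province, then city, then gym_name.
ordered = sorted(rows)
# ordered[0] is ("CA", "Los Angeles", "Paradiso CrossFit");
# ordered[-1] is ("VA", "Arlington", "CrossFit Route 7").
```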
Order by
To sort in descending order, add a WITH clause to the end of the CREATE TABLE statement.
CREATE TABLE crossfit_gyms_by_location (
  country_code text,
  state_province text,
  city text,
  gym_name text,
  PRIMARY KEY (country_code, state_province, city, gym_name)
) WITH CLUSTERING ORDER BY (state_province DESC, city ASC, gym_name ASC);
The partition key is not part of the CLUSTERING ORDER BY clause because its values are hashed; partitions are distributed around the cluster rather than stored near each other in sorted order.
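One way to picture the mixed ordering above is a two-pass stable sort over hypothetical rows: sort by the ascending keys first, then by the descending key with `reverse=True`. Again, this is an analogy for the returned order, not for how Cassandra stores data.

```python
# Hypothetical (state_province, city, gym_name) rows within one partition.
rows = [
    ("CA", "San Diego", "CrossFit Invictus"),
    ("VA", "Arlington", "CrossFit Route 7"),
    ("CA", "Los Angeles", "Paradiso CrossFit"),
]

# Emulate CLUSTERING ORDER BY (state_province DESC, city ASC, gym_name ASC).
rows.sort(key=lambda r: (r[1], r[2]))        # city, gym_name ascending
rows.sort(key=lambda r: r[0], reverse=True)  # state_province descending (stable)
# Result: the VA row first, then the CA rows in ascending city order.
```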
Composite key
Composite keys are partition keys that consist of multiple columns. The crossfit_gyms_by_location example only used country_code for partitioning. The result is that all gyms in the same country reside within a single partition. This can lead to wide rows. In the case of our example, there are over 7,000 CrossFit gyms in the United States, so using the single-column partition key results in a single partition holding over 7,000 rows.
To avoid wide rows, we can move to a composite key consisting of additional columns. If we change the partition key to include the state_province and city columns, the partition hash value will no longer be calculated from country_code alone. Now, each combination of country_code, state_province, and city will have its own hash value and be stored in a separate partition within the cluster. We accomplish this by nesting parentheses around the columns we want included in the composite key.
CREATE TABLE crossfit_gyms_by_city (
  country_code text,
  state_province text,
  city text,
  gym_name text,
  opening_date timestamp,
  PRIMARY KEY ((country_code, state_province, city), opening_date, gym_name)
) WITH CLUSTERING ORDER BY (opening_date ASC, gym_name ASC);
When issuing a CQL query, you must include all partition key columns, at a minimum. You can then apply an additional filter by adding each clustering key in the order in which the clustering keys appear.
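For the crossfit_gyms_by_city table above, the following would be valid queries: all three partition key columns are present, and clustering keys are added in declaration order (the opening date value is illustrative).

```sql
SELECT * FROM crossfit_gyms_by_city
 WHERE country_code = 'USA' AND state_province = 'VA' AND city = 'Arlington';

SELECT * FROM crossfit_gyms_by_city
 WHERE country_code = 'USA' AND state_province = 'VA' AND city = 'Arlington'
   AND opening_date = '2012-03-15' AND gym_name = 'CrossFit Route 7';
```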
Invalid queries:
SELECT * FROM crossfit_gyms_by_city WHERE country_code = 'USA' AND state_province = 'VA';
SELECT * FROM crossfit_gyms_by_city WHERE country_code = 'USA' AND state_province = 'VA' AND city = 'Arlington' AND gym_name = 'CrossFit Route 7';
The first invalid query is missing the city partition key column. The second invalid query uses the clustering key gym_name without including the preceding clustering key opening_date.
The order of clustering keys matters because the clustering keys provide the sort order of the result set. Because of the clustering key's responsibility for sorting, we know all data matching the first clustering key will be adjacent to all other data matching that clustering key.
In our example, this means all gyms with the same opening date will be grouped together in alphabetical order. Gyms with different opening dates will appear in temporal order.
Because we know the order, CQL can easily truncate sections of the partition that don't match our query to satisfy the WHERE conditions pertaining to columns that are not part of the partition key. However, because the clustering key gym_name is secondary to clustering key opening_date, gyms will appear in alphabetical order only for gyms opened on the same day (within a particular city, in this case). Therefore, we can't specify the gym name in our CQL query without first specifying an opening date.
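Because the clustering keys are stored sorted, every matching slice of a partition is contiguous and can be located without scanning the whole partition. A rough Python analogy using binary search, with made-up data:

```python
import bisect

# Clustering keys (opening_date, gym_name) for one hypothetical partition,
# kept in sorted order as Cassandra would store them.
clustering_keys = [
    ("2011-05-01", "CrossFit Ballston"),
    ("2012-03-15", "CrossFit Route 7"),
    ("2012-03-15", "Trident CrossFit"),
    ("2014-09-09", "Patriot CrossFit"),
]

# All gyms opened on 2012-03-15 form a contiguous slice, found by binary
# search on the leading clustering key ("\uffff" sorts after any ASCII name).
lo = bisect.bisect_left(clustering_keys, ("2012-03-15", ""))
hi = bisect.bisect_right(clustering_keys, ("2012-03-15", "\uffff"))
slice_ = clustering_keys[lo:hi]  # the two gyms opened that day
```

This is also why a query on gym_name alone can't be satisfied: rows are only adjacent by gym_name within a single opening_date.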
Cassandra Features
- Continuous availability. The peer-to-peer replication of data to nodes within a cluster results in no single point of failure. This is true even across data centers.
- Linear performance when scaling nodes in a cluster. If three nodes are achieving 3,000 writes per second, adding three more nodes will result in a cluster of six nodes achieving 6,000 writes per second.
- Tunable consistency. If we want to replicate data across three nodes, we can have a replication factor of three, yet not necessarily wait for all three nodes to acknowledge the write. Data will eventually be written to all three nodes, but we can acknowledge the write after writing the data to one or more nodes without waiting for the full replication to finish.
- Flexible data model. Every row can have a different number of columns with support for many types of data.
- Query language (CQL) with a SQL-like syntax.
- Support for Java Monitoring Extensions (JMX). Metrics about performance, latency, system usage, etc. are available for consumption by other applications.
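As an aside on the tunable-consistency point above, the replication factor is set per keyspace and the consistency level per operation; in cqlsh it can be set for the session. The keyspace definition below is illustrative.

```sql
CREATE KEYSPACE gyms
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- In cqlsh: acknowledge each write after a single replica responds, while
-- the data is still replicated to all three replicas in the background.
CONSISTENCY ONE;
```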
Cassandra Limitations
- No join or subquery support for aggregation. According to Cassandra’s documentation, this is by design, encouraging denormalization of data into partitions that can be queried efficiently from a single node, rather than gathering data from across the entire cluster.
- Ordering is set at table creation time on a per-partition basis. This avoids clients attempting to sort billions of rows at run time.
- All data for a single partition must fit on disk in a single node in the cluster.
- It's recommended to keep the number of rows within a partition below 100,000 items and the disk size under 100 MB.
- A single column value is limited to 2 GB (1 MB is recommended).
A less obvious limitation of Cassandra is its lack of row-level consistency. Modifications to a column family (table) that affect the same row and are processed with the same timestamp will result in a tie.
In the event of a tie Cassandra follows two rules:
- Deletes take precedence over inserts/updates.
- If there are two updates, the one with the lexically larger value wins.
This means for inserts/updates, Cassandra resolves row-level ties by comparing values at the column (cell) level, writing the greater value. This can result in one update modifying one column while another update modifies another column, resulting in rows with combinations of values that never existed.
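A toy sketch of that tie-breaking behavior, with hypothetical column values and timestamps:

```python
# Cell-level last-write-wins with the article's tie rule: on equal
# timestamps, the lexically larger value survives, per column.
def resolve_cell(a, b):
    value_a, ts_a = a
    value_b, ts_b = b
    if ts_a != ts_b:
        return a if ts_a > ts_b else b
    return a if value_a > value_b else b  # tie: larger value wins

# Two updates to the same row, written with the same timestamp (100).
update_1 = {"city": ("Arlington", 100), "gym_name": ("Zebra CrossFit", 100)}
update_2 = {"city": ("Boston", 100),    "gym_name": ("Alpha CrossFit", 100)}

merged = {col: resolve_cell(update_1[col], update_2[col]) for col in update_1}
# merged keeps "Boston" from update_2 and "Zebra CrossFit" from update_1:
# a combination of values that neither client ever wrote.
```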