NGCC Summary
The Next Generation Cassandra Conference (NGCC) is an annual meeting of developers, contributors, committers, driver authors and anyone interested in the development of Apache Cassandra where they can all get together and talk about the technical and community direction of the Cassandra project. This year the NGCC was held in San Antonio, Texas.
The talks and discussions were largely driven by large operators who also contribute heavily to the Cassandra codebase. Reflecting this, we saw a lot more work being put into the manageability, stability and performance of Cassandra than in previous years.
This NGCC was also the first since Datastax announced its withdrawal from the project. Despite this, it was great to see a number of Datastax-employed contributors who came to NGCC in a personal and private capacity.
Upcoming features
Some of the most exciting and notable upcoming features and improvements are as follows:
- Pluggable storage with RocksDB, delivering considerable performance improvements. The headline act is pluggable storage and the RocksDB storage engine. RocksDB moves a lot of the storage engine code into a C++ implementation, which dramatically reduces GC pressure and has resulted in Scylla-like low p95 latency. I’m not sure what the net performance increase is (Instagram has only benchmarked with a constant disk I/O load of 20mbps), but I would hazard a guess that it is significant. The great thing about pluggable storage is that it also opens the way for integrating many new and different storage technologies, such as in-memory storage.
- Virtual Tables. Some great work is being done here by Jeff Jirsa. This will provide an API that developers can target to create virtual tables in Cassandra. The initial use case is to expose a large number of JMX metrics in a read-only manner as a table in Cassandra. Many RDBMSs take a similar approach, and it can make life much easier from an operations perspective. For example, you could create a driver that routes requests away from replicas that have a newgen occupancy > 80% (i.e. are about to GC). Eventually, this could lead to foreign-data-wrapper-like functionality in Cassandra.
- CDC improvements. Uber is spending a ton of time improving CDC performance, as they have built some in-process CDC mechanisms, which means this code path is going to be better tested. This is going to lend itself to enterprise-grade features like better auditing.
- Decoupling redundancy from availability. Both Instagram and Apple have ongoing work, using different approaches, to allow Cassandra to have nodes that act as hint stores or lightweight replicas in specific situations. This allows operators and users to decouple data redundancy from availability, with the net result of achieving higher availability than RF3 with fewer nodes. Similar approaches are used in Amazon Aurora and other databases. However, this is sure to be a controversial topic, as without strong guidance it potentially makes it easier to shoot yourself in the foot.
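To make the virtual tables example above a little more concrete, here is a minimal Python sketch of the routing decision a driver could make. It assumes heap metrics have already been read from some metrics table. The `ReplicaHeap` type, its field names and the `preferred_replicas` function are all hypothetical, made up for illustration, and do not reflect an actual Cassandra or driver API.

```python
from dataclasses import dataclass
from typing import List

# Threshold from the example in the text: above 80% newgen occupancy
# a replica is assumed to be close to a GC pause.
NEWGEN_THRESHOLD = 0.80


@dataclass(frozen=True)
class ReplicaHeap:
    """Hypothetical row of heap metrics for one replica."""
    address: str
    newgen_used_bytes: int
    newgen_max_bytes: int

    @property
    def newgen_occupancy(self) -> float:
        # Guard against a missing or zero maximum to avoid division by zero.
        if self.newgen_max_bytes <= 0:
            return 0.0
        return self.newgen_used_bytes / self.newgen_max_bytes


def preferred_replicas(
    replicas: List[ReplicaHeap], threshold: float = NEWGEN_THRESHOLD
) -> List[ReplicaHeap]:
    """Order replicas so that those likely to GC soon are tried last."""
    calm = [r for r in replicas if r.newgen_occupancy <= threshold]
    busy = [r for r in replicas if r.newgen_occupancy > threshold]
    # Busy replicas are demoted rather than dropped, so a request can
    # still be served if every replica is above the threshold.
    return calm + sorted(busy, key=lambda r: r.newgen_occupancy)
```

Demoting rather than excluding busy replicas is a deliberate choice in this sketch: a stale or misleading metric should cost some latency, not availability.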
Community health
There is a slowdown in commits, mainly because a number of active committers are no longer working on the project on a daily basis following the Datastax withdrawal. This is slowly being resolved as other contributors and committers step up to fill the gaps, but it takes time to build the capability. We at Instaclustr are also doing our part by dedicating full-time resources to working on the project.
Project Quality
A central theme that arose from a number of technical and community discussions was how to improve the quality of Apache Cassandra releases. After the experiment with tick-tock, plus other issues with previous releases, there are a number of features in Apache Cassandra that, while functional, have some quite sharp edge cases (e.g. Materialised Views, Incremental Repair).
Discussion ranged from our approach to testing and our release cadence to sectioning off new features behind experimental flags. Based on the outcome of these discussions and the mailing list discussions, we should start to see the stability and performance of Apache Cassandra improve in new releases.
Summary
All in all, there was a wonderful vibe throughout the day, in a room of people way smarter than me who are incredibly passionate about an open source project that powers applications and services used by a significant proportion of the world’s population on a daily basis.
Despite the challenge of having a large contributor withdraw, the community has come out the other side more diverse, more focused and with a drive to deliver a much more stable and performant database in the coming years.