Our engineering team operates on an agile methodology with two-week sprints that result in a new release. This allows us to regularly update our platform with new features for our customers and keep our managed environments up to date with the latest releases and patches.
Since a major upgrade to our management system was released early in 2015, we have been operating on a two-week cycle of sprints and releases. The engineering team has put in place the tools we need to keep the development process agile and lean. We have implemented continuous integration services, automated build and release procedures, and development processes that are paying real dividends by allowing us to accelerate the delivery of new functionality to our customers.
While we’ve been putting these internal improvements in place, we’ve also released a number of enhancements to our system that are improving the service we provide to our customers.
Some of the highlights over the last couple of months include:
- Upgrading our standard Apache Cassandra image to 2.0.13 and then 2.0.14 – 2.0.13 has been rolled out initially to all our customers who required the fix to the nodetool cleanup; we are beginning the process of rolling out 2.0.14 to our remaining customers.
- Rebuilding our base OS image for Cassandra nodes to use the latest version of CoreOS, removing the use of BTRFS, which had been causing us issues in managing available disk space on the OS and logging partitions of our nodes; all customers will be upgraded to this image as part of the 2.0.14 upgrade.
- Providing the ability for customers to add nodes to their cluster directly from our dashboard. Currently, this logs a ticket for our support team to check the health of the cluster before releasing our provisioning system to automatically provision the nodes. We have been deliberately cautious in including this manual check and expect to remove it in the near future to allow immediate provisioning of new nodes by customers.
- Providing the ability for customers to whitelist a network range through the firewall to their Cassandra cluster. Previously, only individual IPs were supported. This feature is important if you are connecting to Instaclustr from an environment such as Heroku where you can’t control the source IP.
- Introducing a new monitoring architecture based on Riemann. This immediately improved our ability to detect and respond to issues with customer Cassandra clusters. It also lays the architectural foundation for planned future enhancements such as providing greater customer visibility into cluster status (above what is currently available through OpsCenter) and development of predictive monitoring and automated failure resolution.
- Starting to restructure our dashboard UI to accommodate some of the new features that we have planned.
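For readers wondering what whitelisting a network range (rather than individual IPs) looks like in practice, the check can be sketched in a few lines of Python. This is only an illustration of the idea, not a description of how our firewall is implemented: the function name `is_allowed`, the example whitelist, and the addresses (taken from the reserved documentation ranges) are all assumptions made for the example.

```python
import ipaddress

def is_allowed(source_ip, whitelist):
    """Return True if source_ip matches any whitelist entry.

    Each entry may be a single IP ("203.0.113.7") or a CIDR range
    ("198.51.100.0/24"). A single IP is treated as a one-address network.
    """
    addr = ipaddress.ip_address(source_ip)
    return any(
        # strict=False tolerates entries with host bits set, e.g. "198.51.100.5/24"
        addr in ipaddress.ip_network(entry, strict=False)
        for entry in whitelist
    )

# Hypothetical whitelist: one individual IP and one network range.
whitelist = ["203.0.113.7", "198.51.100.0/24"]

print(is_allowed("203.0.113.7", whitelist))    # individual IP entry
print(is_allowed("198.51.100.42", whitelist))  # any address inside the range
print(is_allowed("203.0.113.8", whitelist))    # not covered by any entry
```

With only individual IPs supported, every possible source address would need its own entry; a range entry covers every address in the block with a single rule, which is what makes it practical when the source IP can change.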
As well as these features that are already released, we’ve made big steps towards completing a number of other significant features that we will be releasing over the next month or two. We expect the foundations we have laid to allow us to accelerate the pace of enhancements and are excited about driving the Instaclustr service to amazing levels of reliability, functionality and ease of use.