Running a database in the cloud can provide easily attainable benefits, but some databases are simply better suited to cloud infrastructure than others. In this second post in our series about hosting databases in the cloud, we’ll dive into cloud-native databases. Read on to learn more, or check out the first part of the series about the benefits of cloud infrastructure if you haven’t seen it yet.
The best DB in the cloud
Even though most databases can run in the cloud, picking the right starting point is half the work done. Getting the best results will come down to the architecture of the database. A cloud-native database that is designed, built, and run on the cloud computing delivery model could provide the most value in the long run. This can be true even when factoring in the possible costs of updating your application layer to support a new database architecture if you are migrating from an older database.
Cloud-native software is specialised for working on a cloud platform and as such can provide features beyond traditional models. In the case of databases, cloud-native data is usually stored and structured in ways that improve and encourage flexibility. From a database perspective, flexible data models enhance clustering capabilities and provide elasticity.
Advantages of the cloud-native approach
Elasticity refers to the degree to which a system can adjust to changes in workload by provisioning and de-provisioning resources to match current demand. In an ideal environment, the aim is to avoid both over- and underprovisioning. This can be achieved by actively tuning the system resources according to the load, a task for which the cloud is an ideal platform.
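To make the idea of matching resources to demand more concrete, here is a minimal sketch of an elasticity rule in Python. The function name, the 70% target utilisation, and the node limits are all assumptions chosen for illustration; they do not come from any particular database or cloud API.

```python
import math


def desired_nodes(load_rps, capacity_per_node_rps,
                  target_utilisation=0.7, min_nodes=1, max_nodes=10):
    """Return how many nodes to run for the current load (hypothetical rule).

    Each node is assumed to handle `capacity_per_node_rps` requests per
    second at full load. We aim to keep nodes near `target_utilisation`
    so there is headroom (no underprovisioning) without paying for idle
    capacity (no overprovisioning).
    """
    if capacity_per_node_rps <= 0:
        raise ValueError("capacity_per_node_rps must be positive")
    if not 0 < target_utilisation <= 1:
        raise ValueError("target_utilisation must be in (0, 1]")

    # Nodes needed so that each one sits at or below the target utilisation.
    needed = math.ceil(load_rps / (capacity_per_node_rps * target_utilisation))

    # Clamp to the allowed cluster size.
    return max(min_nodes, min(max_nodes, needed))


if __name__ == "__main__":
    for load in (0, 2000, 5000, 100_000):
        print(load, "req/s ->", desired_nodes(load, 1000), "nodes")
```

Running a rule like this on a schedule, and provisioning or removing nodes to match its result, is one simple way to approximate elasticity. Real systems typically add smoothing and cool-down periods to avoid scaling back and forth too often.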
Besides resource matching and cost optimisation, cloud-native databases can also be designed for geographic resilience. With the advantages of clustering, modern databases can react to changes in the network to rebalance or heal themselves. Advanced automation is even capable of detecting regional user patterns. This information can be used to bring relevant data closer to the users to improve performance on demand. Thanks to a fully featured application programming interface, database automation is truly at home in the cloud.
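As a rough illustration of using regional access patterns to decide where data should live, the sketch below counts which region requests a given record most often. The log format, region names, and function name are hypothetical; an actual cloud-native database would make this decision internally and with far more signals.

```python
from collections import Counter


def preferred_region(access_log, key):
    """Return the region that requested `key` most often, or None.

    `access_log` is assumed to be an iterable of (region, key) pairs,
    a made-up format used only for this example.
    """
    counts = Counter(region for region, k in access_log if k == key)
    if not counts:
        return None
    # most_common(1) returns a list with the single (region, count) pair
    # that has the highest count.
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    log = [
        ("eu-west", "user:42"),
        ("eu-west", "user:42"),
        ("us-east", "user:42"),
        ("us-east", "user:7"),
    ]
    print(preferred_region(log, "user:42"))
    print(preferred_region(log, "user:7"))
```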
Cloud-native databases to check out
Databases native to the cloud are differentiated according to their data model, relational or non-relational. Both models have their advantages.
Applications have commonly been built around standard SQL databases, which needs to be taken into account when migrating to the cloud. Although relational databases are widely adopted, the technology was not originally designed for distributed systems or the cloud. But there are exceptions such as CockroachDB. It is a good example of a cloud-native relational database that is built for scaling and resilience.
Meanwhile, non-relational databases have generally been easier to deploy in the cloud and provide efficient horizontal scalability. NoSQL databases are built to service heavy read and write loads and can be very elastic. As such, they are natively well suited to running in the cloud. As an example, MongoDB is one of the most popular NoSQL databases and very cloud orientated.
Performance in cloud-native applications
NoSQL databases, and MongoDB especially, are often employed together with larger services such as the Apache Kafka distributed streaming platform. This high-performance open-source stream processing platform allows for collecting and processing large numbers of messages in real time. It enables you to accept streaming data such as website click streams, events, transactions or other telemetry in real time and at scale.
Kafka is built to be distributable, easily scalable and highly fault tolerant. Adding more nodes to scale horizontally and tackle growing loads is fairly straightforward, and automatic replication of the data over more than one node maintains availability if a node fails. Thanks to Kafka’s architecture, it’s well suited for running in the cloud when paired with a database such as MongoDB. The benefits of the cloud are emphasised even further by deploying onto a fast provider. Database management specialists at Aiven demonstrated the differences between cloud providers for running Kafka by benchmarking the large clouds in comparison to UpCloud.
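The availability benefit of replication can be shown with a small toy model in Python. This is only a sketch: the round-robin placement and the function names are assumptions made for the example and are not Kafka’s actual replica assignment logic.

```python
def assign_replicas(num_partitions, nodes, replication_factor):
    """Place each partition on `replication_factor` distinct nodes.

    Uses simple round-robin placement (an assumption for this toy model).
    """
    if not 1 <= replication_factor <= len(nodes):
        raise ValueError("replication_factor must be between 1 and len(nodes)")
    return {
        p: [nodes[(p + i) % len(nodes)] for i in range(replication_factor)]
        for p in range(num_partitions)
    }


def available_partitions(assignment, failed_nodes):
    """Return partitions that still have at least one replica on a live node."""
    failed = set(failed_nodes)
    return [
        p for p, replicas in assignment.items()
        if any(node not in failed for node in replicas)
    ]


if __name__ == "__main__":
    nodes = ["node1", "node2", "node3"]

    replicated = assign_replicas(6, nodes, replication_factor=2)
    unreplicated = assign_replicas(6, nodes, replication_factor=1)

    # With two copies of each partition, losing one node loses no data access.
    print(available_partitions(replicated, ["node1"]))
    # With a single copy, every partition stored on the failed node is gone.
    print(available_partitions(unreplicated, ["node1"]))
```

In this model, a replication factor of two keeps every partition reachable after a single node failure, while a replication factor of one does not.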
Write performance (3 nodes @ 8 GB RAM, 2 CPU, 400 GB disk each):
Above are the benchmark results of write performance in a distributed 3-node environment. Each cloud server was configured with 8 GB RAM, 2 CPU, and 400 GB storage. Kafka managed to write 320,000 messages per second on UpCloud, 205,000 on Azure, 170,000 on Google and 160,000 messages per second on AWS.
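To put these figures in proportion, the short Python snippet below computes how UpCloud’s write throughput compares with each of the other providers, using only the numbers quoted above. The dictionary and function names are just illustrative.

```python
# Messages per second written by Kafka in the 3-node benchmark quoted above.
results = {
    "UpCloud": 320_000,
    "Azure": 205_000,
    "Google": 170_000,
    "AWS": 160_000,
}


def speedup_over(results, baseline):
    """Return baseline throughput divided by each other provider's throughput."""
    base = results[baseline]
    return {
        name: round(base / value, 2)
        for name, value in results.items()
        if name != baseline
    }


if __name__ == "__main__":
    for name, ratio in speedup_over(results, "UpCloud").items():
        print(f"UpCloud vs {name}: {ratio}x")
```

By this calculation, the UpCloud result is roughly 1.56 times the Azure figure, 1.88 times the Google figure, and exactly twice the AWS figure.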
The takeaway
Cloud-native databases are one of the best options for building data handling in the cloud. The flexibility of the cloud as a platform along with the advantages of the cloud-native approach allows your data to achieve great performance at affordable prices.
This post was the second part of our series on databases in the cloud. Stay tuned for a follow-up on database management systems. In the meantime, check out the full story on Apache Kafka benchmarks by Aiven in the article below.