Install MongoDB Sharded Cluster on K8S


Prologue

Hello everyone, welcome back to the "What I know about MongoDB" series, a series of articles about MongoDB, an extremely popular NoSQL database.

In the previous article, I showed you how to install a MongoDB sharded cluster on Linux servers. Manually installing each component and configuring sharding and replication gives you a better understanding of the system.

In this article, I will show you how to install a MongoDB sharded cluster on K8S using a Helm chart. This installation requires you to understand the architecture and components of MongoDB, which makes it easy to customize the configuration parameters provided by this Helm chart.

Installing MongoDB on K8S using a Helm chart is quite simple, because most of the installation and configuration steps are already automated. My job is just to choose the specific configuration for each component I want (mongos, configsvr, shardsvr), for example:

  • Whether to use sharding
  • How many shards to use
  • Whether each shard has a replica set configuration
  • Whether the config server has a replica set configuration

Deployment model

I will install it on a K8S cluster running on EC2 servers.

The MongoDB cluster I will install follows the same model as the previous post: 3 shardsvr (each shardsvr has 3 replicas), a configsvr with 3 replicas, and 2 mongos services.

Objectives to be achieved:

  • Ensure that replicas of the same shardsvr are not placed on the same node
  • Ensure that replicas of the configsvr are not placed on the same node

To store data for MongoDB on K8S, I will use Persistent Volumes (here I use AWS's EBS storage, creating a Storage Class for automatic PV provisioning). If you run K8S on-prem, you can use NFS as a storage backend instead.

Deployment

A summary of the implementation steps is as follows:

  • Prepare the environment: K8S, Storage Class
  • Prepare helm-chart
  • Customize the configuration parameters
  • Install helm-chart and check the results.

Prepare the Kubernetes environment

The K8S cluster has already been built with 4 worker nodes (for a model with 3 shards, you should have at least 3 nodes). Now I will install the Storage Class backed by AWS EBS.

To do this, you need to:

  • Authorize EC2s to have read and write access to EBS
  • Install EBS CSI Driver on K8S
  • Install Storage Class on K8S

The content above is mainly AWS-related, so I will not go into depth here. The result is a Storage Class that provisions PVs automatically; its name will be used in the configuration when installing MongoDB with the Helm chart.
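As a reference, a minimal Storage Class for the EBS CSI driver might look like the sketch below; the name ebs-sc is a hypothetical one that I will reuse in the values snippets later:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ebs-sc                      # hypothetical name, referenced later
provisioner: ebs.csi.aws.com        # the EBS CSI driver installed above
volumeBindingMode: WaitForFirstConsumer
parameters:
  type: gp3                         # EBS volume type
```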

Download the MongoDB Helm chart

First, create a directory to hold the Helm chart and the installation files.

I will use the bitnami/mongodb-sharded Helm chart for the installation. The first step is to download the Helm chart to your machine so that you can customize it and use it:
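A minimal sketch of the download steps (the working directory name mongodb-install is hypothetical):

```bash
# create a working directory and fetch the chart source
mkdir mongodb-install && cd mongodb-install
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
# download and extract the chart into ./mongodb-sharded
helm pull bitnami/mongodb-sharded --untar
```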

At this point we have a mongodb-sharded directory containing the Helm chart used to install MongoDB. It contains a values.yaml file holding the parameters that can be customized when installing this chart. I will copy it out and customize the parameters to suit my needs. The directory containing the installation files will then look like this:
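For example (the exact chart contents depend on the chart version):

```bash
# copy values.yaml out of the chart to customize it
cp mongodb-sharded/values.yaml custom-mongo-val.yaml
```

```
.
├── custom-mongo-val.yaml
└── mongodb-sharded
    ├── Chart.yaml
    ├── README.md
    ├── charts
    ├── templates
    └── values.yaml
```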

Customize the configuration parameters

This is the most important part of installing software from a Helm chart, and also the part that requires the most experience. A parameter file (the helm values file) is usually very long, sometimes several thousand lines. So before configuring the parameters, I need to know what I am going to install and which parameters I need to customize.

My goal is to install a MongoDB sharded cluster with the following requirements:

  • There are 3 shards, each shard with 3 replicas
  • The configsvr has 3 replicas
  • Mongos has 2 replicas
  • MongoDB will only be used by applications inside K8S, so there is no need to expose it externally (a ClusterIP service is enough). If you need to expose it externally, you can use NodePort.
  • Enable metrics so the cluster can later be monitored with Prometheus and Grafana

Basically, the parameters of the helm values file allow us to customize the main objects mongos, configsvr, shardsvr, and the metrics section for monitoring.

I will go through and explain some of the most basic parameters to customize for each of the components above.

General parameters

In this part, I need to pay attention to the image information. I have a habit of retagging public images and pushing them to a private registry, so I have to edit the image information in this section to point to the private registry. Another note: when using a private registry, you need to create a secret that is used to pull images from it.
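A minimal sketch of the image section, assuming a hypothetical private registry registry.example.local and a pre-created pull secret (the key layout follows the bitnami/mongodb-sharded values file and may differ slightly between chart versions):

```yaml
image:
  registry: registry.example.local   # hypothetical private registry
  repository: bitnami/mongodb-sharded
  tag: 7.0.5                         # the tag you mirrored (assumption)
  pullSecrets:
    - my-registry-secret             # secret created for the private registry
```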

Customize the helm values

The default parameters of this Helm chart are already very good; I only need to change a few of them. The key parameters to change depend on the configuration you want to deploy.

Authentication configuration: here I do not configure passwords for the replica sets:
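For example, in recent versions of the chart this lives under the auth section (a sketch; older chart versions use different key names):

```yaml
auth:
  enabled: false   # no passwords for this demo cluster
```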

Configure the number of shardsvr in the cluster; I set it to 3 according to the original model:
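In the values file this is a single top-level parameter:

```yaml
shards: 3   # number of shardsvr replica sets in the cluster
```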

The service configuration defaults to ClusterIP on port 27017; I don't have to change anything here, but I mention it so that, if needed, it can be switched to NodePort to expose the cluster externally, for example:
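A sketch of switching to NodePort (the port value 30017 is hypothetical, and the exact key names can differ between chart versions):

```yaml
service:
  type: NodePort
  nodePorts:
    mongodb: 30017   # hypothetical NodePort to expose externally
```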

Customize the configuration of configsvr: set the parameters to create 3 replicas for the configsvr and configure a persistent volume for it, including the storageClass name and the required storage size:
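A sketch, assuming the Storage Class created earlier is named ebs-sc:

```yaml
configsvr:
  replicaCount: 3            # 3 configsvr replicas
  persistence:
    enabled: true
    storageClass: ebs-sc     # hypothetical Storage Class name
    size: 10Gi               # requested storage per replica (assumption)
```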

Customize the configuration of mongos: simply set 2 replicas. The podAntiAffinityPreset parameter automatically generates a podAntiAffinity rule that tries to spread this component's Pods across different nodes:
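For example:

```yaml
mongos:
  replicaCount: 2
  podAntiAffinityPreset: soft   # prefer spreading mongos Pods across nodes
```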

Customize the configuration of shardsvr: similarly, set 3 replicas for each shard and configure podAntiAffinityPreset: soft so that the Pods of a shardsvr prefer not to run on the same node:
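A sketch (the dataNode sub-section follows the bitnami/mongodb-sharded values layout; ebs-sc is the hypothetical Storage Class name):

```yaml
shardsvr:
  dataNode:
    replicaCount: 3               # 3 replicas per shard
    podAntiAffinityPreset: soft   # prefer different nodes for a shard's replicas
  persistence:
    enabled: true
    storageClass: ebs-sc
    size: 10Gi
```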

Note that we should only set the podAntiAffinityPreset parameter to soft, which corresponds to the preferredDuringSchedulingIgnoredDuringExecution configuration. This avoids the case where a Pod cannot find any node that satisfies the antiAffinity condition and gets stuck in the Pending state.

In summary, I will have a custom-mongo-val.yaml file containing the following custom parameters:
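Putting the pieces above together, a sketch of the full file under the stated assumptions (hypothetical registry, tag, and Storage Class values; key names may differ slightly between chart versions):

```yaml
image:
  registry: registry.example.local
  repository: bitnami/mongodb-sharded
  tag: 7.0.5
  pullSecrets:
    - my-registry-secret

auth:
  enabled: false          # no authentication for this demo

shards: 3                 # 3 shardsvr replica sets

service:
  type: ClusterIP         # internal access only

configsvr:
  replicaCount: 3
  podAntiAffinityPreset: soft
  persistence:
    enabled: true
    storageClass: ebs-sc
    size: 10Gi

mongos:
  replicaCount: 2
  podAntiAffinityPreset: soft

shardsvr:
  dataNode:
    replicaCount: 3
    podAntiAffinityPreset: soft
  persistence:
    enabled: true
    storageClass: ebs-sc
    size: 10Gi

metrics:
  enabled: true           # expose metrics for Prometheus/Grafana later
```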

Installation

Before installing, I can review the configuration that will actually be deployed before applying it to the system. I will use the helm template command to check what resources this Helm chart will generate with the custom configuration above:
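For example:

```bash
# render the manifests locally without installing anything
helm template mongodb ./mongodb-sharded -f custom-mongo-val.yaml | less
```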

With the default affinity configuration as above, the Helm chart generates an affinity rule for us like the following:
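For the configsvr, the generated rule looks roughly like this (release name mongodb assumed):

```yaml
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: mongodb-sharded
              app.kubernetes.io/instance: mongodb
              app.kubernetes.io/component: configsvr
          topologyKey: kubernetes.io/hostname
        weight: 1
```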

Each component carries the label app.kubernetes.io/component with the value mongos, configsvr, or shardsvr, and that label is the main factor used to implement the podAntiAffinity rule.

Simply put, the scheduler will prefer not to place a configsvr Pod on a node that already has a Pod with the label app.kubernetes.io/component=configsvr running on it. This is what "spreads" Pods of the same role across different nodes.

Now we can perform the installation:
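Assuming the chart directory and values file from the previous steps, and installing into a mongodb namespace:

```bash
helm install mongodb ./mongodb-sharded \
  -f custom-mongo-val.yaml \
  --namespace mongodb --create-namespace
```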

Check the result:
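For example:

```bash
# -o wide shows which node each Pod was scheduled on
kubectl -n mongodb get pods -o wide
```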

Thus we can see:

  • 2 Pods of mongos running on 2 nodes, node-2 and node-3
  • 3 Pods of configsvr running on 3 different nodes: node-1, node-2, and node-3
  • The first 3 Pods of shardsvr running on 3 different nodes

The essential point is that when we create 3 shards, this Helm chart creates 3 StatefulSets, one per shardsvr:
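For example (StatefulSet names follow the chart's naming pattern, shown here for the release name mongodb):

```bash
kubectl -n mongodb get statefulsets
# expected, roughly:
#   mongodb-mongodb-sharded-configsvr
#   mongodb-mongodb-sharded-shard0-data
#   mongodb-mongodb-sharded-shard1-data
#   mongodb-mongodb-sharded-shard2-data
```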

Because of how StatefulSets work, their Pods are created sequentially: the first Pod of each StatefulSet is created first, and only when it is up are the 2nd and 3rd Pods created.

And the final result when the installation is done: all 9 shardsvr Pods, 3 configsvr Pods, and 2 mongos Pods are in the Running state.

And for each Pod of the StatefulSets, we get a corresponding PVC and PV:
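For example:

```bash
# one PVC (and one bound PV) per data-bearing Pod
kubectl -n mongodb get pvc
kubectl get pv
```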

Check connection to DB

When the installation with the Helm chart completes, it prints instructions on how to connect to the DB. I create a Pod with a MongoDB client to connect to the newly installed cluster:
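A sketch based on the chart's post-install notes; the service name follows the pattern <release-name>-mongodb-sharded, and the image tag is an assumption:

```bash
kubectl run mongodb-client --rm -it --restart='Never' \
  --namespace mongodb \
  --image docker.io/bitnami/mongodb-sharded:7.0.5 \
  --command -- mongosh admin --host mongodb-mongodb-sharded
```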

Creating the shards and adding them to the cluster, as well as configuring the replica sets for shardsvr/configsvr, is fully automated. I can check the sharding status of the cluster as follows:
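From the mongo shell opened above:

```javascript
// prints the list of shards, the replica set members of each shard,
// and which databases/collections are sharded
sh.status()
```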

So I have installed a MongoDB sharded cluster on K8S quite simply. To summarize: once you get used to it, you only need to edit a few parameters and run a single command to set up a MongoDB cluster.

Thank you for taking the time to read this. If you find the article useful, please upvote and bookmark it to support me!

