The Terraform VMware Cloud Director Provider v3.11.0 now supports installing and managing Container Service Extension (CSE) 4.1, with a new set of improvements, the new vcd_rde_behavior_invocation data source, and updated guides for VMware Cloud Director users to deploy the required components.
In this blog post, we will be installing CSE 4.1 in an existing VCD and creating and managing a TKGm cluster.
Preparing the installation
First of all, we must make sure that all the prerequisites listed in the Terraform VCD Provider documentation are met. CSE 4.1 requires at least VCD 10.4.2; we can check our VCD version in the popup that shows up by clicking the About option inside the help "(?)" button next to our username in the top right corner:
Check that you also have ALB controllers available to be consumed from VMware Cloud Director, as the created clusters require them for load-balancing purposes.
Step 1: Installing the prerequisites
The first step of the installation mimics the UI wizard step in which prerequisites are created:
We will do this exact step programmatically with Terraform. To do that, let's clone the terraform-provider-vcd repository so we can download the required schemas, entities, and examples:
```shell
git clone https://github.com/vmware/terraform-provider-vcd.git
cd terraform-provider-vcd
git checkout v3.11.0
cd examples/container-service-extension/v4.1/install/step1
```
If we open 3.11-cse-install-2-cse-server-prerequisites.tf we can see that these configuration files create all the RDE framework components that CSE uses to work, consuming the schemas that are hosted in the GitHub repository, plus all the rights and roles that are needed. We won't customize anything inside these files, as they create the same objects as the UI wizard step shown in the above screenshot, which doesn't allow customization either.
Now we open 3.11-cse-install-3-cse-server-settings.tf; this one is equivalent to the next UI wizard step:
We can observe that the UI wizard allows us to set some configuration parameters, and if we look at terraform.tfvars.example we will see that the requested configuration values match.
Before applying all the Terraform configuration files that are available in this folder, we will rename terraform.tfvars.example to terraform.tfvars, and we will set the variables with correct values. The defaults that we can see in variables.tf and terraform.tfvars.example match those of the UI wizard, which should be good for CSE 4.1. In our case, our VMware Cloud Director has full Internet access, so we are not setting any custom Docker registry or certificates here.
We should also take into account that terraform.tfvars.example is asking for a username and password to create a user that will be used to provision API tokens for the CSE Server to run. We also leave these as they are, as we like the "cse_admin" username.
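As a rough orientation, the filled-in file could look like the sketch below. The variable names here are assumptions for illustration; the authoritative names are the ones declared in variables.tf of the cloned repository:

```hcl
# Illustrative terraform.tfvars for step 1 (variable names are assumptions;
# check variables.tf in the cloned repository for the exact names)
vcd_url                = "https://vcd.example.com"
administrator_user     = "administrator"
administrator_password = "change-me"

# User that the CSE Server will use to provision API tokens
cse_admin_username = "cse_admin"
cse_admin_password = "change-me"
```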
Once we review the configuration, we can safely complete this step by running:
```shell
terraform init
terraform apply
```
The plan should display all the elements that are going to be created. We complete the operation (by writing yes to the prompt) so the first step of the installation is done. This can be easily checked in the UI, as now the wizard doesn't ask us to complete this step; instead, it shows the CSE Server configuration we just applied:
Step 2: Configuring VMware Cloud Director and running the CSE Server
We move to the next step, which is located at examples/container-service-extension/v4.1/install/step2 of our cloned repository.
```shell
cd examples/container-service-extension/v4.1/install/step2
```
This step is the most customizable one, as it depends on our specific needs. Ideally, as the CSE documentation implies, there should be two Organizations: Solutions Organization and Tenant Organization, with Internet access so all the required Docker images and packages can be downloaded (or with access to an internal Docker registry if we had chosen a custom registry in the previous step).
We can inspect the different files available and change everything that doesn't match our needs. For example, if we already had the Organization VDCs created, we could change from using resources to using data sources instead.
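That resource-to-data-source swap could be sketched like this, assuming the VDC already exists (the Organization and VDC names below are illustrative):

```hcl
# If the Tenant Organization VDC already exists, read it with a data source
# instead of creating it with a vcd_org_vdc resource:
data "vcd_org_vdc" "tenant_vdc" {
  org  = "tenant_org" # illustrative names
  name = "tenant_vdc"
}

# References such as vcd_org_vdc.tenant_vdc.id elsewhere in the configuration
# would then become data.vcd_org_vdc.tenant_vdc.id
```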
In our case, the VMware Cloud Director appliance where we are installing CSE 4.1 is empty, so we need to create everything from scratch. That is what the files in this folder do: they create a basic and minimal set of components to make CSE 4.1 work.
Same as before, we rename terraform.tfvars.example to terraform.tfvars and inspect the file contents so we can set the correct configuration. As we mentioned, setting up the variables of this step depends on our needs and how we want to set up the networking, the NSX ALB, and which TKGm OVAs we want to offer to our tenants. We should also bear in mind that some constraints need to be met, like the VM Sizing Policies that are required for CSE to work being published to the VDCs, so let's read and understand the installation guide for that purpose.
Once we review the configuration, we can complete this step by running:
```shell
terraform init
terraform apply
```
Now we should verify that the plan is correct and matches what we want to achieve. It should create the two required Organizations, our VDCs, and most importantly, the networking configuration should allow Internet traffic to retrieve the required packages for the TKGm clusters to be provisioned without issues (remember that in the previous step, we didn't set any internal registry nor certificates). We complete the operation (by writing yes to the prompt) so the second step of the installation is done.
We can also double-check that everything is correct in the UI, or do a connectivity test by deploying a VM and using the console to ping an outside-world website.
Cluster creation with Terraform
Given that we have finished the installation process and we still have the cloned repository from the previous steps, we move to examples/container-service-extension/v4.1/cluster.
```shell
cd examples/container-service-extension/v4.1/cluster
```
The cluster is created by the configuration file 3.11-cluster-creation.tf, by also using the RDE framework. We encourage the readers to check both the vcd_rde documentation and the cluster management guide before proceeding, as it's crucial to understand how this resource works in Terraform and, most importantly, how CSE 4.1 uses it.
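For orientation, a TKGm cluster RDE declared in Terraform follows roughly this shape. This is a simplified sketch with illustrative values; the real file in the repository is the reference:

```hcl
# Simplified sketch of a CSE 4.1 cluster RDE (values are illustrative)
resource "vcd_rde" "k8s_cluster_instance" {
  org         = "tenant_org"
  name        = "example"
  rde_type_id = data.vcd_rde_type.capvcdcluster_type.id

  # The CSE Server resolves the RDE asynchronously, not Terraform
  resolve = false

  # JSON payload rendered from tkgmcluster.json.template; only a couple of
  # placeholders are shown here, the template requires more
  input_entity = templatefile("../entities/tkgmcluster.json.template", {
    name = "example"
    org  = "tenant_org"
    # remaining placeholders omitted in this sketch
  })
}
```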
We will open 3.11-cluster-creation.tf and inspect it, to immediately see that it uses the JSON template located at examples/container-service-extension/v4.1/entities/tkgmcluster.json.template. This is the payload that the CSE 4.1 RDE requires to initialize a TKGm cluster. We can customize this JSON to our needs; for example, we will remove the defaultStorageClassOptions block from it as we won't use storage in our clusters.
The initial JSON template tkgmcluster.json.template looks like this now:
```json
{
  "apiVersion": "capvcd.vmware.com/v1.1",
  "kind": "CAPVCDCluster",
  "name": "${name}",
  "metadata": {
    "name": "${name}",
    "orgName": "${org}",
    "site": "${vcd_url}",
    "virtualDataCenterName": "${vdc}"
  },
  "spec": {
    "vcdKe": {
      "isVCDKECluster": true,
      "markForDelete": ${delete},
      "forceDelete": ${force_delete},
      "autoRepairOnErrors": ${auto_repair_on_errors},
      "secure": {
        "apiToken": "${api_token}"
      }
    },
    "capiYaml": ${capi_yaml}
  }
}
```
There is nothing else that we can customize there, so we leave it like that.
The next thing that we notice is that we need a valid CAPVCD YAML; we can download it from here. We will deploy a v1.25.7 Tanzu cluster, so we download this one to start preparing it.
We open it with our editor and add the required snippets as stated in the documentation. We start with the kind: Cluster blocks that are required by the CSE Server to provision clusters:
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: ${CLUSTER_NAME}
  namespace: ${TARGET_NAMESPACE}
  labels: # We add this block
    cluster-role.tkg.tanzu.vmware.com/management: ""
    tanzuKubernetesRelease: ${TKR_VERSION}
    tkg.tanzu.vmware.com/cluster-name: ${CLUSTER_NAME}
  annotations: # We add this block
    TKGVERSION: ${TKGVERSION}
# ...
```
We added the two labels and annotations blocks, with the required placeholders TKR_VERSION, CLUSTER_NAME, and TKGVERSION. These placeholders are used to set the values via Terraform configuration.
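In Terraform, those ${...} placeholders can be filled with the templatefile function. A minimal sketch, assuming the prepared YAML sits next to the configuration (file name and values below are illustrative):

```hcl
# Render the prepared CAPVCD YAML, filling the placeholders we added above.
# The remaining placeholders of the template would need to be supplied too.
locals {
  capi_yaml = templatefile("cluster-template-v1.25.7.yaml", {
    CLUSTER_NAME     = "example"
    TARGET_NAMESPACE = "example-ns"
    TKR_VERSION      = "v1.25.7---vmware.2-tkg.1"
    TKGVERSION       = "v2.2.0"
    # ... plus every other ${VAR} the template declares
  })
}
```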
Now we add the MachineHealthCheck block, which will allow us to use one of the powerful new features of CSE 4.1 that remediates nodes in a failed status by replacing them, enabling cluster self-healing:
```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineHealthCheck
metadata:
  name: ${CLUSTER_NAME}
  namespace: ${TARGET_NAMESPACE}
  labels:
    clusterctl.cluster.x-k8s.io: ""
    clusterctl.cluster.x-k8s.io/move: ""
spec:
  clusterName: ${CLUSTER_NAME}
  maxUnhealthy: ${MAX_UNHEALTHY_NODE_PERCENTAGE}%
  nodeStartupTimeout: ${NODE_STARTUP_TIMEOUT}s
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: ${CLUSTER_NAME}
  unhealthyConditions:
    - type: Ready
      status: Unknown
      timeout: ${NODE_UNKNOWN_TIMEOUT}s
    - type: Ready
      status: "False"
      timeout: ${NODE_NOT_READY_TIMEOUT}s
---
```
Notice that the timeouts have an s suffix because the values introduced during installation were in seconds. If we hadn't put the values in seconds, or had used a value like 15m, we could remove the s suffix from these block options.
Let's add the last components, which are most relevant when specifying custom certificates during the installation process. In kind: KubeadmConfigTemplate we must add the preKubeadmCommands and useExperimentalRetryJoin blocks under the spec > users section:
```yaml
preKubeadmCommands:
  - mv /etc/ssl/certs/custom_certificate_*.crt /usr/local/share/ca-certificates && update-ca-certificates
useExperimentalRetryJoin: true
```
In kind: KubeadmControlPlane we must add the preKubeadmCommands and controllerManager blocks inside the kubeadmConfigSpec section:
```yaml
preKubeadmCommands:
  - mv /etc/ssl/certs/custom_certificate_*.crt /usr/local/share/ca-certificates && update-ca-certificates
controllerManager:
  extraArgs:
    enable-hostpath-provisioner: "true"
```
Once it's done, the resulting YAML should be similar to the one already provided in the examples/cluster folder, cluster-template-v1.25.7.yaml, as it uses the same version of Tanzu and has all of these additions already introduced. This is a good exercise to check whether our YAML is correct before proceeding further.
Once we review the crafted YAML, let's create a tenant user with the Kubernetes Cluster Author role. This user will be required to provision clusters:
```hcl
data "vcd_global_role" "k8s_cluster_author" {
  name = "Kubernetes Cluster Author"
}

resource "vcd_org_user" "cluster_author" {
  name     = "cluster_author"
  password = "dummyPassword" # This should probably be a sensitive variable and a bit safer.
  role     = data.vcd_global_role.k8s_cluster_author.name
}
```
Now, we can complete the customization of the configuration file 3.11-cluster-creation.tf by renaming terraform.tfvars.example to terraform.tfvars and configuring the parameters of our cluster. Let's check ours:
```hcl
vcd_url                 = "https://..."
cluster_author_user     = "cluster_author"
cluster_author_password = "dummyPassword"

cluster_author_token_file = "cse_cluster_author_api_token.json"

k8s_cluster_name       = "example"
cluster_organization   = "tenant_org"
cluster_vdc            = "tenant_vdc"
cluster_routed_network = "tenant_net_routed"

control_plane_machine_count = "1"
worker_machine_count        = "1"

control_plane_sizing_policy    = "TKG small"
control_plane_placement_policy = "\"\""
control_plane_storage_profile  = "*"

worker_sizing_policy    = "TKG small"
worker_placement_policy = "\"\""
worker_storage_profile  = "*"

disk_size     = "20Gi"
tkgm_catalog  = "tkgm_catalog"
tkgm_ova_name = "ubuntu-2004-kube-v1.25.7+vmware.2-tkg.1-8a74b9f12e488c54605b3537acb683bc"

pod_cidr     = "100.96.0.0/11"
service_cidr = "100.64.0.0/13"

tkr_version = "v1.25.7---vmware.2-tkg.1"
tkg_version = "v2.2.0"

auto_repair_on_errors = true
```
We can notice that control_plane_placement_policy = "\"\"", an empty quoted string; this is to avoid errors when we don't want to use a VM Placement Policy. We can check that the downloaded CAPVCD YAML forces us to put double quotes on this value when it is not used.
The tkr_version and tkg_version values were obtained from the ones already provided in the documentation.
Once we are happy with the different options, we apply the configuration:
```shell
terraform init
terraform apply
```
Now we should review the plan as much as possible to prevent errors. It should create the vcd_rde resource with the elements we provided.
We complete the operation (by writing yes to the prompt) so the cluster should start getting created. We can monitor the process either in the UI or with the two outputs provided as an example:
```hcl
locals {
  k8s_cluster_computed = jsondecode(vcd_rde.k8s_cluster_instance.computed_entity)
  being_deleted        = tobool(jsondecode(vcd_rde.k8s_cluster_instance.input_entity)["spec"]["vcdKe"]["forceDelete"])
  has_status           = lookup(local.k8s_cluster_computed, "status", null) != null
}

output "computed_k8s_cluster_status" {
  value = local.has_status && !local.being_deleted ? local.k8s_cluster_computed["status"]["vcdKe"]["state"] : null
}

output "computed_k8s_cluster_events" {
  value = local.has_status && !local.being_deleted ? local.k8s_cluster_computed["status"]["vcdKe"]["eventSet"] : null
}
```
Then we can run terraform refresh as many times as we want, to monitor the events with:
```shell
terraform output computed_k8s_cluster_status
terraform output computed_k8s_cluster_events
```
Once computed_k8s_cluster_status states provisioned, this step will be finished and the cluster will be ready to use. Let's retrieve the Kubeconfig, which in CSE 4.1 is done completely differently than in 4.0, as we are now required to invoke a Behavior to get it. In 3.11-cluster-creation.tf we can see a commented section that has a vcd_rde_behavior_invocation data source. If we uncomment it and do another terraform apply, we should be able to get the Kubeconfig by running:
```shell
terraform output kubeconfig
```
We can save it to a file to start interacting with our cluster and kubectl.
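As a hypothetical convenience, the Kubeconfig could also be written to disk from the same configuration with the hashicorp/local provider. The data source label get_kubeconfig and the filename below are our own assumptions, not names from the example files:

```hcl
# Hypothetical sketch: persist the Behavior invocation result to a file.
# "get_kubeconfig" is an assumed label for the commented data source in
# 3.11-cluster-creation.tf; adjust it to the real one after uncommenting.
resource "local_sensitive_file" "kubeconfig" {
  content  = data.vcd_rde_behavior_invocation.get_kubeconfig.result
  filename = "${path.root}/kubeconfig.yaml"
}
```

After another terraform apply, kubectl --kubeconfig=kubeconfig.yaml get nodes would confirm access to the cluster.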
Cluster update
Example use case: we realized that our cluster is too small, so we need to scale it up. We will set up 3 worker nodes.
To update it, we need to make sure that it is in provisioned status. For that, we can use the same mechanism that we used when the cluster creation started:
```shell
terraform output computed_k8s_cluster_status
```
This should display provisioned. If that is the case, we can proceed with the update.
As with the cluster creation, we first need to understand how the vcd_rde resource works to avoid errors, so it is encouraged to check both the vcd_rde documentation and the cluster management guide before proceeding. The main idea is that we must update the input_entity argument with the information that CSE saves in the computed_entity attribute; otherwise, we could break the cluster.
To do that, we can use the following output, which will return the computed_entity attribute:
```hcl
output "computed_k8s_cluster" {
  value = vcd_rde.k8s_cluster_instance.computed_entity # References the created cluster
}
```
Then we run this command to save it to a file for easier reading:
```shell
terraform output -json computed_k8s_cluster > computed.json
```
Let's open computed.json for inspection. We can easily see that it looks pretty much the same as tkgmcluster.json.template, but with the addition of a big "status" object that contains vital information about the cluster. This must be sent back on updates, so we copy the whole "status" object as it is and place it in the original tkgmcluster.json.template.
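For illustration only, extracting and re-attaching that "status" object could also be expressed with Terraform functions; the local names below are our own:

```hcl
# Hypothetical sketch: build an input entity that carries the computed "status"
# (local names are our own, not from the example files)
locals {
  computed_status = jsondecode(vcd_rde.k8s_cluster_instance.computed_entity)["status"]

  input_with_status = jsonencode(merge(
    jsondecode(vcd_rde.k8s_cluster_instance.input_entity),
    { status = local.computed_status }
  ))
}
```

Note that feeding such a local straight back into the same vcd_rde resource would create a self-referencing cycle, which is presumably why the guide relies on copying the object into the template by hand.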
After that, we can change worker_machine_count = 1 to worker_machine_count = 3 in the existing terraform.tfvars, then run terraform apply to complete the update process.
Now it is vital to verify and make sure that the output plan shows that the "status" object is being added to the input_entity payload. If that is not the case, we should stop the operation immediately and check what went wrong. If "status" is visible in the plan as being added, we can complete the update operation by writing yes to the prompt.
Cluster deletion
The main idea of deleting a TKGm cluster is that we should not use terraform destroy for that, even if that is the first idea we have in mind. The reason is that the CSE Server creates a lot of elements (VMs, Virtual Services, etc.) that would be left in an "orphan" state if we just deleted the cluster RDE. We need to let the CSE Server do the cleanup for us.
For that matter, the vcd_rde resource present in 3.11-cluster-creation.tf contains two special arguments that mimic the deletion option from the UI:
```hcl
delete       = false # Make this true to delete the cluster
force_delete = false # Make this true to forcefully delete the cluster
```
To trigger an asynchronous deletion process we should change them to true and execute terraform apply to perform an update. We must also introduce the most recent "status" object to the tkgmcluster.json.template when applying, pretty much like in the update scenario described in the previous section.
Final thoughts
We hope you enjoyed the process of installing CSE 4.1 in your VMware Cloud Director appliance. For a better understanding of the process, please read the existing installation and cluster management guides.