Installation of Components

Notes on the Installation Environment

The instructions for installing the prerequisite components of Data Analytics System use commands and syntax intended for a Linux terminal running a BASH shell.

In particular, the kubectl and helm commands are used without the explicit --kubeconfig parameter. The kubeconfig file of the cluster is therefore assumed to be referenced by the $KUBECONFIG environment variable (or to be saved as $HOME/.kube/config).

Furthermore, it is assumed that the host from which the commands are launched can reach the cluster at the URL of the server contained in the kubeconfig.
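As a quick sanity check of the two assumptions above, the following sketch prints the kubeconfig path that kubectl and helm fall back to. The helper name effective_kubeconfig is hypothetical and not part of the product. The kubectl commands in the comments need a reachable cluster, so they are left for manual use.

```shell
# Print the kubeconfig location used when --kubeconfig is not given:
# the value of $KUBECONFIG if it is set and non-empty, otherwise $HOME/.kube/config.
# Note: $KUBECONFIG may also hold a colon-separated list of files; it is printed as-is.
effective_kubeconfig() {
  printf '%s\n' "${KUBECONFIG:-$HOME/.kube/config}"
}

echo "kubeconfig in use: $(effective_kubeconfig)"

# With the kubeconfig in place, reachability can be checked manually, for example:
#   kubectl config view --minify -o jsonpath='{.clusters[0].cluster.server}'; echo
#   kubectl cluster-info
#   kubectl version
```

If kubectl cluster-info fails or times out, the host most likely cannot reach the server URL stored in the kubeconfig.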

The following programs must also be available on that host:

  • jq
  • yq
  • base64
  • curl
  • wget
  • xxd
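A minimal sketch for checking that the programs listed above are available on the installation host. The helper name missing_tools is hypothetical; kubectl and helm are added to the check because the rest of the procedure relies on them.

```shell
# Print each given command that cannot be found on PATH, one per line.
# Returns non-zero if at least one command is missing.
missing_tools() {
  local tool status=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      status=1
    fi
  done
  return "$status"
}

if missing=$(missing_tools jq yq base64 curl wget xxd kubectl helm); then
  echo "All required tools are available."
else
  printf 'Missing tools:\n%s\n' "$missing" >&2
fi
```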

The versions used for client and server are:

  • kubectl v1.33.5
  • helm v3.19.0
  • kubernetes v1.33.5

The Kubernetes distribution used as a reference is K3s v1.33.5+k3s1 (https://k3s-io.github.io/).

Introduction

For a specific Data Analytics System installation, several resources (templates, manifests, dumps, directories) will need to be customized and updated at the time of installation.

Furthermore, some environment variables need to be preset in a separate file (e.g., .env.sh) to support and simplify the installation procedure. To make these variables available in the terminal from which the commands are executed, load that file into the current shell with the source command.
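For illustration only, a variables file and its loading could look like the sketch below. The variable names NAMESPACE and TENANT_DOMAIN are taken from this page, but their values are hypothetical placeholders, and the actual set of variables depends on the specific installation.

```shell
# Sample .env.sh; every value below is a hypothetical placeholder.
# It is written to a temporary directory so the example leaves the current directory untouched.
env_dir=$(mktemp -d)
cat > "$env_dir/.env.sh" <<'EOF'
export NAMESPACE="example-ns"
export TENANT_DOMAIN="example.org"
EOF

# Load the variables into the current shell; in BASH, "source" and "." are equivalent.
. "$env_dir/.env.sh"

echo "NAMESPACE=$NAMESPACE TENANT_DOMAIN=$TENANT_DOMAIN"
```

The file must be loaded again in every new terminal session, because the variables only exist in the shell that sourced it.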

The components are installed using different methods (helm chart, operator, manifest, ...). For each component, a set of specific parameters is used, typically structured as follows:

HELM_URL                                URL of the charts
HELM_REPO                               Name of the specific repo
RELEASE_NAME                            Name of the specific application
HELM_VER                                Version of the chart for the specific application
NAMESPACE                               Namespace in which to install the components and the chart
VALUES_FILE_NAME                        File of values to use
HELM_CHART="$HELM_REPO/$RELEASE_NAME"   Calculated value
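As a rough sketch, the parameters above typically map onto helm commands as follows. All values are hypothetical placeholders, the function name install_release and the HELM_BIN override are assumptions introduced for this example, and the exact options used for each component may differ.

```shell
# Hypothetical placeholder values; the real ones are given for each component.
HELM_URL="https://charts.example.org"
HELM_REPO="example-repo"
RELEASE_NAME="example-app"
HELM_VER="1.2.3"
NAMESPACE="example-ns"
VALUES_FILE_NAME="values-example.yaml"
HELM_CHART="$HELM_REPO/$RELEASE_NAME"   # calculated value

# Register the chart repository, refresh the local index and install the release.
# HELM_BIN can be overridden (e.g. for a dry run); it defaults to helm.
install_release() {
  "${HELM_BIN:-helm}" repo add "$HELM_REPO" "$HELM_URL" &&
  "${HELM_BIN:-helm}" repo update &&
  "${HELM_BIN:-helm}" install "$RELEASE_NAME" "$HELM_CHART" \
    --version "$HELM_VER" \
    --namespace "$NAMESPACE" --create-namespace \
    --values "$VALUES_FILE_NAME"
}

# Preview the commands without touching the cluster:
#   HELM_BIN=echo install_release
# Run them for real:
#   install_release
```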

Note

  • It is advisable to review the output of the helm install command: in addition to reporting the outcome of the operation, it may contain useful information for using and/or testing the deployed components.

  • In general, always verify that each step has completed successfully (including steps that do not involve helm) before proceeding to the next one. To this end, you can use commands such as kubectl -n $NAMESPACE get all and/or kubectl get pod -A, or similar ones.
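The verification step described in the note can be sketched as a small helper. The function name check_namespace and the KUBECTL_BIN override are assumptions made for this example. The kubectl wait call is one possible way to block until pods are ready; it is not suitable for namespaces that contain completed Job pods, which never reach the Ready condition.

```shell
# List the resources of a namespace, then wait until all of its pods are Ready.
# KUBECTL_BIN can be overridden (e.g. for a dry run); it defaults to kubectl.
check_namespace() {
  local ns="$1"
  "${KUBECTL_BIN:-kubectl}" -n "$ns" get all &&
  "${KUBECTL_BIN:-kubectl}" -n "$ns" wait --for=condition=Ready pod --all --timeout=300s
}

# Usage, after installing a component into $NAMESPACE:
#   check_namespace "$NAMESPACE"
# Cluster-wide overview of all pods:
#   kubectl get pod -A
```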

Service Exposure

The user interfaces that need to be accessed outside the cluster are configured using Kubernetes services in NodePort mode to allow their exposure through a Reverse Proxy (external to the cluster itself) which also functions as a load balancer.

For illustrative purposes only, here is a snippet of a minimal configuration based on the Apache HTTP Server:

<VirtualHost <PUBLIC_IP>:443>
  ServerName <SERVICE_NAME>.<TENANT_DOMAIN>

  SSLEngine on
  SSLCertificateFile /etc/letsencrypt/live/<SERVICE_NAME_CERT>/fullchain.pem
  SSLCertificateKeyFile /etc/letsencrypt/live/<SERVICE_NAME_CERT>/privkey.pem

  Header set Content-Security-Policy "..."

  RequestHeader set ...
  ProxyRequests Off
  ProxyPreserveHost on

  <Proxy balancer://<SERVICE_NAME>>
    BalancerMember http://<NODE_1_IP>:<NODEPORT> route=u1
    ...
    BalancerMember http://<NODE_X_IP>:<NODEPORT> route=u<X>
  </Proxy>

  ProxyPass / balancer://<SERVICE_NAME>/
  ProxyPassReverse / balancer://<SERVICE_NAME>/
</VirtualHost>

This approach is not mandatory for the operation of the platform: the system administrator is free to evaluate and adopt alternatives (e.g., Traefik and load balancers implemented within the cluster itself).

Regardless of the approach chosen, it is still necessary to register the domain names (see Table 2) and to obtain TLS certificates in order to access the services over HTTPS.
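When a reverse proxy is used, the port to put in the balancer members can be read from the Service, and the published endpoint can then be tested with curl. The sketch below assumes a Service with a single exposed port. The helper names, the KUBECTL_BIN override and the example-* values are hypothetical.

```shell
# Print the NodePort assigned to the first port of a Service.
# Arguments: namespace, service name.
service_nodeport() {
  "${KUBECTL_BIN:-kubectl}" -n "$1" get service "$2" \
    -o jsonpath='{.spec.ports[0].nodePort}'
}

# Build the public URL of a service from its name and the tenant domain.
service_url() {
  printf 'https://%s.%s/\n' "$1" "$2"
}

# Print the HTTP status code returned by the public HTTPS endpoint.
https_status() {
  curl --silent --show-error --output /dev/null \
    --write-out '%{http_code}\n' "$(service_url "$1" "$2")"
}

# Usage (requires cluster access and a configured reverse proxy):
#   service_nodeport example-ns example-ui
#   kubectl get nodes -o wide        # node IP addresses
#   https_status example-ui example.org
```

A certificate error reported by curl at this stage usually points to a missing or mismatched TLS certificate on the reverse proxy.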