SUSE AI on SLES 15 SP6
In this post I will walk you through the steps needed to get a working SUSE AI stack on a single-node system. In my example I am using a gaming laptop that cost me about 900EUR and has an NVIDIA 3050 inside; I also upgraded the memory from 16G to 64G. The baseline, which is not covered in this post, is listed below in bullet form, with a short DNS verification sketch after the list.
- SLES installed in text mode (no graphical desktop required)
- DNS A record of hostname.domainname.tld pointing to your host IP
- DNS A record (wildcard) of *.apps.hostname.domainname.tld pointing to your host IP
- SSH access so all steps can be performed inside the SLES OS
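As a quick sanity check of the DNS prerequisites, something like the following should return your host IP for both the host record and any name under the wildcard. The hostnames here just follow the placeholder pattern from the list above; substitute your own.
# both queries should print the IP address of the SLES node
dig +short hostname.domainname.tld
dig +short anything.apps.hostname.domainname.tld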
Install NVIDIA GPU Driver
Normally the NVIDIA GPU Operator would install the driver for us, but today we are installing it separately so the nvidia-open drivers are used. From your SLES terminal or SSH connection you can run these steps to get your driver installed.
sudo -i
zypper ar https://download.nvidia.com/suse/sle15sp6/ nvidia-sle15sp6-main
zypper --gpg-auto-import-keys refresh
zypper install -y --auto-agree-with-licenses nvidia-open-driver-G06-signed-kmp nvidia-compute-utils-G06
exit
reboot
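After the reboot, a quick way to confirm the driver is loaded and the GPU is visible (nvidia-smi comes with the compute utils package installed above):
lsmod | grep nvidia
nvidia-smi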
Install RKE2
From your SLES node you can run these commands.
sudo -i
curl -sfL https://get.rke2.io | sh -
systemctl enable --now rke2-server.service
Note: this last step takes time depending on your internet speed.
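While rke2-server comes up you can follow its progress and, once it settles, check the node using the kubectl binary and kubeconfig that RKE2 ships by default:
journalctl -u rke2-server -f
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
/var/lib/rancher/rke2/bin/kubectl get nodes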
Copy over your kubeconfig and install kubectl/helm
This step will allow you to run kubectl and helm commands from inside the SLES node as the root user.
sudo -i
mkdir .kube
cat /etc/rancher/rke2/rke2.yaml > .kube/config
cd /usr/local/bin
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
chmod 700 kubectl
cd
curl -fsSL -o get_helm.sh https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3
chmod 700 get_helm.sh
./get_helm.sh
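A quick check that both tools work against the cluster, still as root and using the kubeconfig copied above:
kubectl get nodes
helm version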
Grab your helm and kubectl token from apps.rancher.io
Go to apps.rancher.io and click on the top right icon to go to user settings. Once there you can generate tokens.
sudo -i
helm registry login dp.apps.rancher.io -u elajoie@suse.com -p XYZXYZ
kubectl create secret docker-registry application-collection --docker-server=dp.apps.rancher.io \
  --docker-username=elajoie@suse.com --docker-password=XYZXYZ
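To confirm the pull secret was created with the name used throughout the rest of this post:
kubectl get secret application-collection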
Install Rancher Prime and CertManager
These steps will get your Rancher GUI working along with Cert-Manager for your ingress certificates.
sudo -i
helm repo add jetstack https://charts.jetstack.io
kubectl apply -f https://github.com/jetstack/cert-manager/releases/download/v1.6.1/cert-manager.crds.yaml
helm upgrade cert-manager jetstack/cert-manager --install --create-namespace --namespace cert-manager
helm repo add rancher-prime https://charts.rancher.com/server-charts/prime
kubectl create namespace cattle-system
helm upgrade rancher rancher-prime/rancher --install --create-namespace \
  --namespace cattle-system --set installCRDs=true --set hostname=ai.lajoie.de \
  --set bootstrapPassword=admin --set replicas=1
kubectl -n cattle-system rollout status deploy/rancher
Once the rollout is complete you can go to your URL, which in the example above is ai.lajoie.de. Once on the site, use the bootstrap password admin and set a new secure password for the admin user. With all these steps done, you now have a working single-node Rancher cluster.
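If you want to watch the pieces come up before opening the URL, these are ordinary status checks against the namespaces created above:
kubectl get pods -n cert-manager
kubectl get pods -n cattle-system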
Install NVIDIA GPU Operator
These steps label the GPU node, add the NVIDIA helm repository, and install the GPU Operator pointed at the RKE2 containerd configuration. Note that ai is the node name in this example; substitute your own.
sudo -i
kubectl label node ai accelerator=nvidia-gpu
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia
helm repo update
helm install --wait nvidiagpu -n gpu-operator --create-namespace \
  --set toolkit.env[0].name=CONTAINERD_CONFIG \
  --set toolkit.env[0].value=/var/lib/rancher/rke2/agent/etc/containerd/config.toml.tmpl \
  --set toolkit.env[1].name=CONTAINERD_SOCKET \
  --set toolkit.env[1].value=/run/k3s/containerd/containerd.sock \
  --set toolkit.env[2].name=CONTAINERD_RUNTIME_CLASS --set toolkit.env[2].value=nvidia \
  --set toolkit.env[3].name=CONTAINERD_SET_AS_DEFAULT \
  --set-string toolkit.env[3].value=true nvidia/gpu-operator
With the step above complete you can check the status of things with the two examples below.
kubectl describe pod -n gpu-operator -l app=nvidia-operator-validator
kubectl get node ai -o jsonpath='{.metadata.labels}' | jq | grep "nvidia.com"
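As an extra smoke test you can run a small CUDA workload. This is just a sketch based on the standard NVIDIA vectorAdd sample image, so the image tag may need adjusting, and runtimeClassName could be omitted since the operator was set as the default runtime above.
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: cuda-vectoradd
spec:
  restartPolicy: OnFailure
  runtimeClassName: nvidia
  containers:
  - name: cuda-vectoradd
    image: nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda11.7.1-ubuntu20.04
    resources:
      limits:
        nvidia.com/gpu: 1
EOF
# once the pod completes, the log should end with "Test PASSED"
kubectl logs cuda-vectoradd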
Install Local Storage (local-path)
Since we need PVCs for both Milvus and OWUI, we will use local-path as our storage solution.
kubectl apply -f \
  https://raw.githubusercontent.com/rancher/local-path-provisioner/master/deploy/local-path-storage.yaml
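A quick check that the provisioner is running and the storage class exists; the overrides below reference local-path explicitly, so it does not need to be the default class:
kubectl get pods -n local-path-storage
kubectl get storageclass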
Install SUSE AI
This release of SUSE AI in February 2025 is about two months old, and things are progressing fast, so by the time you read or see this there may be a ton of new features and new ways to install. We will do a two-step install: first we install Milvus, and second we install Ollama and OWUI together.
kubectl create namespace suseai
kubectl create secret docker-registry application-collection --docker-server=dp.apps.rancher.io \
  --docker-username=elajoie@suse.com --docker-password=XYZXYZ -n suseai
helm pull oci://dp.apps.rancher.io/charts/milvus --version 4.2.2
Now create your Milvus custom overrides file.
vi milvus_custom_overrides.yaml
global:
  imagePullSecrets:
  - application-collection
cluster:
  enabled: True
standalone:
  persistence:
    persistentVolumeClaim:
      storageClass: local-path
etcd:
  replicaCount: 1
  persistence:
    storageClassName: local-path
minio:
  mode: distributed
  replicas: 3
  rootUser: "admin"
  rootPassword: "adminminio"
  persistence:
    storageClass: local-path
  resources:
    requests:
      memory: 512Mi
kafka:
  enabled: true
  name: kafka
  replicaCount: 3
  broker:
    enabled: true
  cluster:
    listeners:
      client:
        protocol: 'PLAINTEXT'
      controller:
        protocol: 'PLAINTEXT'
  persistence:
    enabled: true
    annotations: {}
    labels: {}
    existingClaim: ""
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 8Gi
    storageClassName: "local-path"
helm upgrade --install \
  milvus oci://dp.apps.rancher.io/charts/milvus -n suseai \
  --version 4.2.2 -f milvus_custom_overrides.yaml
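The Milvus, etcd, MinIO and Kafka pods take a few minutes to settle; you can watch them come up in the suseai namespace:
kubectl get pods -n suseai -w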
Now create your Ollama and OWUI custom overrides file.
vi owui_custom_overrides.yaml
global:
  imagePullSecrets:
  - application-collection
image:
  registry: dp.apps.rancher.io
  repository: containers/open-webui
  tag: 0.3.32
  pullPolicy: IfNotPresent
ollamaUrls:
- http://open-webui-ollama.suseai.svc.cluster.local:11434
persistence:
  enabled: true
  storageClass: local-path
ollama:
  enabled: true
  image:
    registry: dp.apps.rancher.io
    repository: containers/ollama
    tag: 0.3.6
    pullPolicy: IfNotPresent
  imagePullSecrets: XYZXYZ
  ingress:
    enabled: false
  defaultModel: "gemma:2b"
  ollama:
    models:
      - "gemma:2b"
      - "llama3.1"
    gpu:
      enabled: true
      type: 'nvidia'
      number: 1
  persistentVolume:
    enabled: true
    storageClass: local-path
pipelines:
  enabled: False
  persistence:
    storageClass: local-path
ingress:
  enabled: true
  class: ""
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
  host: suseai.apps.ai.lajoie.de
  tls: true
extraEnvVars:
- name: DEFAULT_MODELS
  value: "gemma:2b"
- name: DEFAULT_USER_ROLE
  value: "user"
- name: WEBUI_NAME
  value: "SUSE AI"
- name: GLOBAL_LOG_LEVEL
  value: INFO
- name: RAG_EMBEDDING_MODEL
  value: "sentence-transformers/all-MiniLM-L6-v2"
- name: VECTOR_DB
  value: "milvus"
- name: MILVUS_URI
  value: http://milvus.suseai.svc.cluster.local:19530
helm upgrade --install open-webui oci://dp.apps.rancher.io/charts/open-webui \
  -n suseai --version 3.3.2 -f owui_custom_overrides.yaml
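The first start also pulls the gemma:2b and llama3.1 models, so the Ollama pod can take a while to become ready; the usual checks apply:
kubectl get pods -n suseai
kubectl get ingress -n suseai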
Now you should be able to reach the OWUI GUI at https://suseai.apps.ai.lajoie.de, or whatever URL you configured in your own setup.