
uyuni server 2024.xx, where xx > 08 #9461

Open
fritz0011 opened this issue Nov 12, 2024 · 8 comments
Labels
bug (Something isn't working), kubernetes (Kubernetes-related), P4

Comments

@fritz0011

Problem description

Please continue to support the Uyuni server release as RPM-based, or make the container/Helm deployment compatible with Rancher/k8s.
So far, in the case of a clean install, it is a real pain to set up and run on a Kubernetes cluster.

Steps to reproduce

1. See problem description
...

Uyuni version

2024.08

Uyuni proxy version (if used)

No response

Useful logs

No response

Additional information

No response

@fritz0011 fritz0011 added bug Something isn't working P5 labels Nov 12, 2024
@rjmateus
Member

Did you try the podman installation method?
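For reference, the podman-based install looks roughly like this (the FQDN and organization are placeholders, and it is assumed the podman subcommand takes the same --organization and --logLevel flags as the kubernetes one shown below):

# hedged sketch of the podman installation path
mgradm install podman uyuni.example.com --organization ORG --logLevel debug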

@fritz0011
Author

# mgradm install kubernetes uyuni-ORG.apps.DOMAIN --organization ORG --helm-uyuni-namespace uyuni-master --logLevel debug
4:48PM INF mgradm/cmd/cmd.go:66 > Welcome to mgradm
4:48PM INF mgradm/cmd/cmd.go:67 > Executing command: kubernetes
4:48PM DBG shared/utils/exec.go:66 > Running: timedatectl show --value -p Timezone
Administrator password:
Confirm the password:
4:48PM DBG shared/utils/exec.go:66 > Running: kubectl get node -o jsonpath={.items[0].status.nodeInfo.kubeletVersion}
4:48PM DBG shared/utils/exec.go:40 > Running: kubectl explain ingressroutetcp
4:48PM DBG shared/kubernetes/kubernetes.go:76 > No ingressroutetcp resource deployed error="exit status 1"
4:48PM DBG shared/utils/exec.go:66 > Running: kubectl get pod -A -o jsonpath={range .items[*]}{.spec.containers[*].args[0]}{.spec.containers[*].command}{end}
4:48PM DBG shared/utils/exec.go:66 > Running: kubectl get -o jsonpath={.items[?(@.metadata.name=="cert-manager")].status.readyReplicas} deploy -A
4:48PM DBG shared/utils/exec.go:66 > Running: kubectl get pod -o jsonpath={.items[?(@.metadata.labels.app=="webhook")].metadata.name} -A
4:48PM INF shared/kubernetes/utils.go:74 > Waiting for image of cert-manager-webhook-7d4c676646-spsk4 pod in  namespace to be pulled
4:48PM DBG shared/utils/exec.go:66 > Running: kubectl get event -o jsonpath={range .items[?(@.reason=="Failed")]}{.message}{"\n"}{end} --field-selector involvedObject.name=cert-manager-webhook-7d4c676646-spsk4 -A
4:48PM DBG shared/utils/exec.go:66 > Running: kubectl get event -o jsonpath={.items[?(@.reason=="Pulled")].message} --field-selector involvedObject.name=cert-manager-webhook-7d4c676646-spsk4 -A
⠋ kubectl get event -o jsonpath={range .items[?(@.reason=="Failed")]}{.message}{"\n"}{end} --field-selector involvedObject.name=cert-manager-webhook-7d4c676646-spsk4 -A
4:48PM DBG shared/utils/exec.go:66 > Running: kubectl get event -o jsonpath={range .items[?(@.reason=="Failed")]}{.message}{"\n"}{end} --field-selector involvedObject.name=cert-manager-webhook-7d4c676646-spsk4 -A
⠋ kubectl get event -o jsonpath={.items[?(@.reason=="Pulled")].message} --field-selector involvedObject.name=cert-manager-webhook-7d4c676646-spsk4 -A
⠋ kubectl get event -o jsonpath={range .items[?(@.reason=="Failed")]}{.message}{"\n"}{end} --field-selector involvedObject.name=cert-manager-webhook-7d4c676646-spsk4 -A

...and it gets into an infinite loop.

P.S. cert-manager is installed and functional too, in NS: cert-manager
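For reference, cert-manager readiness can be double-checked by hand with standard kubectl calls (the deployment names assume the stock cert-manager install; the pod name and event selector are taken from the log above):

# verify the webhook deployment reports ready replicas, mirroring mgradm's check
kubectl get deploy -n cert-manager cert-manager cert-manager-webhook
# inspect pull/failure events for the webhook pod mgradm keeps polling
kubectl get event -n cert-manager --field-selector involvedObject.name=cert-manager-webhook-7d4c676646-spsk4

Note the empty namespace in the "Waiting for image ... in  namespace" line above, which hints the detection code may have lost track of the namespace.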

@mcalmer mcalmer added P4 and removed P5 labels Jan 3, 2025
@cbosdo
Contributor

cbosdo commented Jan 6, 2025

@fritz0011 Kubernetes support is a work in progress, and the problem in your case seems to be that the cert-manager start-detection code isn't working correctly. I have more plans for Kubernetes, but those involve refactoring the Uyuni setup scripts and I have no idea when I'll be able to do it.

@cbosdo
Contributor

cbosdo commented Jan 7, 2025

@fritz0011 I have recently refactored this part. Did you try with the latest version of mgradm?

@cbosdo cbosdo added the kubernetes Kubernetes-related label Jan 7, 2025
@fritz0011
Author

fritz0011 commented Jan 7, 2025

> @cbosdo I have recently refactored this part. Did you try with the latest version of mgradm?

Way better! So, with the new mgradm + latest uyuni image:

mgradm install kubernetes uyuni-dev.apps.domain.local --organization <org> --volumes-database-size 150Gi --volumes-cache-size 100Gi --volumes-www-size 250Gi --volumes-packages-size 350Gi --kubernetes-uyuni-namespace uyuni

  • deployment completed
    -- PVCs created with their designated sizes; the uyuni pod starts but fails to reach the running state:
11:27PM ??? pod/ran-setup-check configured
11:27PM ??? Asserting correct java version...
11:27PM ??? postconf: fatal: open /etc/postfix/main.cf for reading: No such file or directory
11:27PM ??? Job for postfix.service failed because the control process exited with error code.
See "systemctl status postfix.service" and "journalctl -xeu postfix.service" for details.
11:27PM ??? /usr/lib/susemanager/bin/mgr-setup: line 150: /etc/sysconfig/postgresql: No such file or directory
11:27PM ??? ===============================================================================
!
! This shell operates within a container environment, meaning that not all
! modifications will be permanently saved in volumes.
!
! Please exercise caution when making changes, as some alterations may not
! persist beyond the current session.
!
===============================================================================
11:27PM ??? CREATE ROLE
11:27PM ??? cat: /pg_hba.conf: No such file or directory
11:27PM ??? mv: cannot stat '/pg_hba.conf': No such file or directory
11:27PM ??? sed: can't read /etc/apache2/conf.d/zz-spacewalk-www.conf: No such file or directory
11:27PM ??? sed: can't read /etc/apache2/listen.conf: No such file or directory
11:27PM ??? * Loading answer file: /root/spacewalk-answers.
11:27PM ??? ** Database: Setting up database connection for PostgreSQL backend.
11:27PM ??? ** Database: Populating database.
** Database: --clear-db option used.  Clearing database.
11:27PM ??? ** Database: Shutting down spacewalk services that may be using DB.
11:27PM ??? ** Database: Services stopped.  Clearing DB.
11:27PM ??? Running spacewalk-sql --select-mode-direct /usr/share/susemanager/db/postgres/deploy.sql
11:27PM ??? *** Progress: #
11:27PM ???
11:27PM ??? * Performing initial configuration.
11:27PM ??? There was a problem deploying the satellite configuration.  Exit value: 2.
Please examine /var/log/rhn/rhn_installation.log for more information.
11:27PM ??? CA Cert for OS Images: Packaging /etc/pki/trust/anchors/LOCAL-RHN-ORG-TRUSTED-SSL-CERT into /srv/susemanager/salt/images/rhn-org-trusted-ssl-cert-osimage-1.0-1.noarch.rpm
11:27PM ??? ERROR: spacewalk-setup failed
11:27PM ??? command terminated with exit code 2
Error: error running the setup script: exit status 2
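If the server container is still up at that point, the installer log named in the error can be read straight out of the pod (the component label and pod name here are assumptions):

# find the server pod
kubectl get pod -n uyuni -l app.kubernetes.io/component=server
# read the log mentioned in the error above
kubectl exec -n uyuni <server-pod> -- cat /var/log/rhn/rhn_installation.log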

However, a very annoying thing during uninstall...

mgradm uninstall --backend kubectl
10:05PM INF Welcome to mgradm
10:05PM INF Executing command: uninstall
10:05PM INF Would run kubectl delete -n uyuni job,deploy,svc,ingress,pvc,cm,secret,issuers,certificates -l app.kubernetes.io/part-of=uyuni
10:05PM INF Would remove file /var/lib/rancher/rke2/server/manifests/uyuni-ingress-nginx-config.yaml

10:05PM WRN Nothing has been uninstalled, run with --force to actually uninstall

The cluster becomes inoperable because of: Would remove file /var/lib/rancher/rke2/server/manifests/uyuni-ingress-nginx-config.yaml
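For what it's worth, RKE2 automatically applies any manifest placed under /var/lib/rancher/rke2/server/manifests/, so if the ingress config really was removed, putting the file back (its content is quoted later in this thread) should restore the ingress setup:

# restore the removed HelmChartConfig; RKE2 re-applies manifests from this directory
sudo cp uyuni-ingress-nginx-config.yaml /var/lib/rancher/rke2/server/manifests/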

@cbosdo
Contributor

cbosdo commented Jan 8, 2025

> @cbosdo I have recently refactored this part. Did you try with the latest version of mgradm?
>
> Way better! So, with the new mgradm + latest uyuni image:

good!

> 11:27PM ??? pod/ran-setup-check configured
> 11:27PM ??? Asserting correct java version...
> 11:27PM ??? postconf: fatal: open /etc/postfix/main.cf for reading: No such file or directory
> 11:27PM ??? Job for postfix.service failed because the control process exited with error code.
> See "systemctl status postfix.service" and "journalctl -xeu postfix.service" for details.
> 11:27PM ??? /usr/lib/susemanager/bin/mgr-setup: line 150: /etc/sysconfig/postgresql: No such file or directory

Strange. It seems that the initContainer somehow didn't populate the empty PVs with the files from the image.
Did you start from empty PVs?
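Leftover claims from a previous attempt would explain non-empty PVs; something like this (the label selector is the one from the uninstall output quoted below) lists and clears them:

# list claims left by a previous install
kubectl get pvc -n uyuni
# remove them so the next install starts from empty volumes
kubectl delete pvc -n uyuni -l app.kubernetes.io/part-of=uyuni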

> However, a very annoying thing during uninstall...
>
> mgradm uninstall --backend kubectl
> 10:05PM INF Welcome to mgradm
> 10:05PM INF Executing command: uninstall
> 10:05PM INF Would run kubectl delete -n uyuni job,deploy,svc,ingress,pvc,cm,secret,issuers,certificates -l app.kubernetes.io/part-of=uyuni
> 10:05PM INF Would remove file /var/lib/rancher/rke2/server/manifests/uyuni-ingress-nginx-config.yaml
>
> 10:05PM WRN Nothing has been uninstalled, run with --force to actually uninstall
>
> The cluster becomes inoperable because of: Would remove file /var/lib/rancher/rke2/server/manifests/uyuni-ingress-nginx-config.yaml

Having to use --force to effectively uninstall is by design... I wonder how that could break the cluster though, as it's supposed to change nothing. Do you have any additional info to help me? Ideally, running uninstall with --logLevel=debug would tell us what commands are actually run. Do you still have the uyuni-ingress-nginx-config.yaml file? Note that the file has been renamed during the refactoring...
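Putting those pieces together, the diagnostic run would look something like this (all flags appear elsewhere in this thread):

mgradm uninstall --backend kubectl --force --logLevel=debug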

@fritz0011
Author

@cbosdo

So, here it is:

  • restored the cluster from a snapshot => clean install for uyuni
    same error:

About this file: uyuni-ingress-nginx-config.yaml

apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-ingress-nginx
  namespace: kube-system
spec:
  valuesContent: |-
    controller:
      config:
        hsts: "false"
    tcp:
      80: "uyuni/uyuni-tcp:80"
      5432: "uyuni/uyuni-tcp:5432"
      9187: "uyuni/uyuni-tcp:9187"
      4505: "uyuni/uyuni-tcp:4505"
      4506: "uyuni/uyuni-tcp:4506"
      25151: "uyuni/uyuni-tcp:25151"
      5556: "uyuni/uyuni-tcp:5556"
      9800: "uyuni/uyuni-tcp:9800"
      5557: "uyuni/uyuni-tcp:5557"
    udp:
      69: "uyuni/uyuni-udp:69"

++ Suggestion: instead of one PVC per config directory:

NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
etc-apache2   Bound    pvc-db44858c-1368-4960-a98d-e19d42f32a7c   10Mi       RWO            longhorn       9h
etc-cobbler   Bound    pvc-a1b9b7c1-90ef-466f-b822-1d88809a9abd   10Mi       RWO            longhorn       9h
etc-postfix   Bound    pvc-b6a32dee-5e8d-44af-a6f3-c78cadfaf6a7   10Mi       RWO            longhorn       9h
etc-tomcat    Bound    pvc-fbe1d8dc-34ed-4a23-ad93-1784a6d1f3f1   10Mi       RWO            longhorn       9h

Use one PVC, e.g. etc-configs, mounted inside the container at /etc/configs (see the sketch after this list):

  • apache2 => customized rpm installs to /etc/configs/apache2
  • tomcat => customized rpm installs to /etc/configs/tomcat
  • postfix => customized rpm installs to /etc/configs/postfix
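A minimal sketch of that consolidated claim, applied with a kubectl here-document (the name etc-configs comes from the suggestion above; the size is illustrative, and the storage class mirrors the longhorn one in the PVC listing):

kubectl apply -n uyuni -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: etc-configs
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: longhorn
  resources:
    requests:
      storage: 40Mi
EOF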

@cbosdo
Contributor

cbosdo commented Jan 8, 2025

> @cbosdo
>
> So, here it is:
>
> • restored the cluster from a snapshot => clean install for uyuni
>   same error:

You mean you still have the No such file or directory errors?
You should have a setup job that has been started. This job's pod should have an initContainer filling the volumes with the files that are in the image before mounting them in the final container. It seems this script is failing somehow.

k get pod -A -lapp.kubernetes.io/component=server will give you the name of the setup pod.
Then run something like k logs -n <yourNS> uyuni-setup-<timestamp>-<ID> -c init-volumes to get the logs of that container.

> About this file: uyuni-ingress-nginx-config.yaml
>
> apiVersion: helm.cattle.io/v1
> kind: HelmChartConfig
> metadata:
>   name: rke2-ingress-nginx
>   namespace: kube-system
> spec:
>   valuesContent: |-
>     controller:
>       config:
>         hsts: "false"
>     tcp:
>       80: "uyuni/uyuni-tcp:80"
>       5432: "uyuni/uyuni-tcp:5432"
>       9187: "uyuni/uyuni-tcp:9187"
>       4505: "uyuni/uyuni-tcp:4505"
>       4506: "uyuni/uyuni-tcp:4506"
>       25151: "uyuni/uyuni-tcp:25151"
>       5556: "uyuni/uyuni-tcp:5556"
>       9800: "uyuni/uyuni-tcp:9800"
>       5557: "uyuni/uyuni-tcp:5557"
>     udp:
>       69: "uyuni/uyuni-udp:69"

At least the file looks correct. I don't understand what is broken in your cluster after the uninstall. Are there any errors / logs to help me?

> ++ Suggestion: instead of one PVC per config directory:
>
> NAME          STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
> etc-apache2   Bound    pvc-db44858c-1368-4960-a98d-e19d42f32a7c   10Mi       RWO            longhorn       9h
> etc-cobbler   Bound    pvc-a1b9b7c1-90ef-466f-b822-1d88809a9abd   10Mi       RWO            longhorn       9h
> etc-postfix   Bound    pvc-b6a32dee-5e8d-44af-a6f3-c78cadfaf6a7   10Mi       RWO            longhorn       9h
> etc-tomcat    Bound    pvc-fbe1d8dc-34ed-4a23-ad93-1784a6d1f3f1   10Mi       RWO            longhorn       9h
>
> Use one PVC, e.g. etc-configs, mounted inside the container at /etc/configs:
>
> • apache2 => customized rpm installs to /etc/configs/apache2
> • tomcat => customized rpm installs to /etc/configs/tomcat
> • postfix => customized rpm installs to /etc/configs/postfix

I don't really understand your suggestion. Could you please spell it out completely? We have several mounts for those folders to avoid persisting files we don't care that much about.
