
Not working on arm64 architecture #1568

Open
phiilu opened this issue Dec 30, 2023 · 17 comments
Labels
arm64 (Arm64 support)
hardware required (Hardware is required to support/test some feature or bug)

Comments

@phiilu

phiilu commented Dec 30, 2023

Describe the bug
I wanted to install mayastor via helm on my arm64 Talos 1.6.1 server, but I can't get it working. The etcd chart that is used depends on the image docker.io/bitnami/bitnami-shell:11-debian-11-r63, which is not built for arm64, so its volume-permissions init container fails to start.

 ➜  ~ kubectl describe pods mayastor-etcd-0
(...)
Events:
  Type     Reason     Age                  From               Message
  ----     ------     ----                 ----               -------
  Normal   Scheduled  16m                  default-scheduler  Successfully assigned mayastor/mayastor-etcd-0 to n2-storage
  Normal   Pulled     15m (x5 over 16m)    kubelet            Container image "docker.io/bitnami/bitnami-shell:11-debian-11-r63" already present on machine
  Normal   Created    15m (x5 over 16m)    kubelet            Created container volume-permissions
  Normal   Started    15m (x5 over 16m)    kubelet            Started container volume-permissions
  Warning  BackOff    103s (x70 over 16m)  kubelet            Back-off restarting failed container volume-permissions in pod mayastor-etcd-0_mayastor(52798499-a0d7-42ef-acdd-5519341ed07f)
 ➜  ~ kubectl logs pods/mayastor-etcd-0 -c volume-permissions
exec /bin/bash: exec format error
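
That "exec format error" is the kernel refusing to exec an amd64 binary on an arm64 node. A quick way to confirm the image ships no arm64 variant; any registry manifest tool works, docker manifest inspect is just one option:

 docker manifest inspect docker.io/bitnami/bitnami-shell:11-debian-11-r63
 # a multi-arch image lists one "platform" entry per architecture; if there
 # is no linux/arm64 entry (or the manifest is single-arch amd64), arm64
 # nodes end up running an amd64 binary and fail with "exec format error"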

To Reproduce

 helm repo add mayastor https://openebs.github.io/mayastor-extensions/
 ➜  ~ helm search repo mayastor --versions
NAME             	CHART VERSION	APP VERSION	DESCRIPTION
mayastor/mayastor	2.5.0        	2.5.0      	Mayastor Helm chart for Kubernetes
mayastor/mayastor	2.4.0        	2.4.0      	Mayastor Helm chart for Kubernetes
mayastor/mayastor	2.3.0        	2.3.0      	Mayastor Helm chart for Kubernetes
mayastor/mayastor	2.2.0        	2.2.0      	Mayastor Helm chart for Kubernetes
mayastor/mayastor	2.1.0        	2.1.0      	Mayastor Helm chart for Kubernetes
mayastor/mayastor	2.0.1        	2.0.1      	Mayastor Helm chart for Kubernetes
mayastor/mayastor	2.0.0        	2.0.0      	Mayastor Helm chart for Kubernetes

These are the helm values I used (the basePaths are specific to Talos):

# values.yaml
etcd:
  localpvScConfig:
    basePath: /var/openebs/local/{{ .Release.Name }}/etcd

loki-stack:
  localpvScConfig:
    basePath: /var/openebs/local/{{ .Release.Name }}/loki

io_engine:
  nodeSelector:
    openebs.io/engine: mayastor

nodeSelector: {}
Then I installed with:

 helm install mayastor mayastor/mayastor -n mayastor --create-namespace --version 2.5.0 --values values.yaml

Expected behavior
Mayastor works on an arm64 server.

OS info:

  • Distro: Talos v1.6.1
  • MayaStor: 2.5.0

Additional context (output of talosctl version):

Client:
	Tag:         v1.6.0
	SHA:         eddd188c
	Built:
	Go version:  go1.21.5 X:loopvar
	OS/Arch:     darwin/arm64
Server:
	NODE:        94.XX.XX.XX (IPv4)
	Tag:         v1.6.1
	SHA:         0af17af3
	Built:
	Go version:  go1.21.5 X:loopvar
	OS/Arch:     linux/arm64
	Enabled:     RBAC
@phiilu added the NEW (New issue) label Dec 30, 2023
@tiagolobocastro
Contributor

We don't currently provide arm64 builds of mayastor, only of the kubectl plugin.
This is not for a technical reason AFAIK, but rather because we have no hardware available to test on arm64 :(
Though IIRC there are a couple of users who build their own arm64 images from our code; maybe they can help out. I think they've raised issues here or on Slack.

@tiagolobocastro removed the NEW (New issue) label Jan 19, 2024
@tiagolobocastro added the arm64 (Arm64 support) label Jan 20, 2024
@tiagolobocastro
Contributor

Leaving this open as the tracking issue for arm64 support.
This is not on the roadmap, but if some external contributor has some arm servers for us to test on, we'd be happy to consider arm64 support.

@tiagolobocastro added the hardware required (Hardware is required to support/test some feature or bug) label Jan 20, 2024
@felipesere

What does testing look like here? I am about to set up a TuringPi 2 homelab with Talos and I wanted to use Mayastor for storage.

Given that I use a MacBook Air M1, I can probably also test on that in the interim?

@jphastings

I have a TuringPi 2 set up (2x RK1s + 2x CM4s) with Talos (great minds @felipesere!) and I'm having this same problem.

I'm inexperienced with kubernetes, but I can learn fast and am very happy to test anything other community members are able to create. (I'll be keeping an eye on this issue for others.)

@tiagolobocastro, what hardware would you need to be able to support this officially? (I'm asking to see if I, or others in the TuringPi 2 + Talos community, could find a way to provide it for you.)

@tiagolobocastro
Contributor

> What does testing look like here? I am about to set up a TuringPi 2 homelab with Talos and I wanted to use Mayastor for storage.

We have a bunch of repos for the different components of mayastor, though tbh the only place where the arch would affect things is the dataplane (this repo).
Besides per-repo CI we do lots of system testing on Hetzner VMs (these are kubernetes clusters on VMs).

> Given that I use a MacBook Air M1, I can probably also test on that in the interim?

M1 is a different kettle of fish as it's not linux, so things like udev won't work, there's no nvme-tcp initiator, etc.
Still, you might be able to build this repo, not sure. Maybe @hrudaya21 and @dsharma-dc have tried it?

@tiagolobocastro
Contributor

> I have a TuringPi 2 set up (2x RK1s + 2x CM4s) with Talos (great minds @felipesere!) and I'm having this same problem.
>
> I'm inexperienced with kubernetes, but I can learn fast and am very happy to test anything other community members are able to create. (I'll be keeping an eye on this issue for others.)
>
> @tiagolobocastro, what hardware would you need to be able to support this officially? (I'm asking to see if I, or others in the TuringPi 2 + Talos community, could find a way to provide it for you.)

We would need at least some VMs for the per-repo CI.
The tricky one would be system test, as that creates a swarm of VMs and runs for many hours. I guess we could perhaps run only the ones which we think would be affected by the CPU architecture. CC @avishnu

@jphastings

jphastings commented Mar 7, 2024

> We would need at least some VMs for the per-repo CI.

If providing this could be as simple as reaching an initial and monthly funding goal on something like Open Collective, then please let me know, and I can see if there's enough interest in the TuringPi forums!

(Naturally, any target funding amounts would need to be large enough to cover not just the first month of the VMs, but enough months to make long-term maintenance viable, even as monthly donation commitments fluctuate.)

@sebadob

sebadob commented Mar 20, 2024

I can confirm that it's not working with current versions.

I just tried it on a Raspberry Pi + K3s.

A few tiny things that I already fixed on the fly (see the values sketch below):

  • Got rid of docker.io/bitnami/bitnami-shell in the volume-permissions containers for loki and etcd, since it has been officially deprecated in favor of bitnami/os-shell, which provides an arm64 variant again.
  • Modified the etcd image tag, because not all tags exist for arm64. I just picked the current latest for testing, since it exists for arm64.
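
As a rough sketch, those two fixes can be expressed as helm value overrides. The key layout below follows the upstream Bitnami etcd subchart and is an assumption to verify against the chart actually deployed; the tags are placeholders:

 # values-arm64.yaml (hypothetical override; verify keys against the chart)
 etcd:
   volumePermissions:
     image:
       repository: bitnami/os-shell  # bitnami-shell is deprecated; os-shell ships arm64
       tag: latest                   # placeholder: pick any tag with an arm64 variant
   image:
     tag: latest                     # placeholder: not every etcd tag exists for arm64

The loki volume-permissions container would need the analogous image override under the loki-stack subchart's values.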

Sadly, etcd is the last thing that does not start up. The pods crash loop, exiting with

ERROR ==> Headless service domain does not have an IP per initial member in the cluster

after a while. This sounds like a config/setup issue rather than anything architecture-related, though; one way to poke at the DNS side is sketched below.
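
For anyone debugging the same error, one way to check whether the headless service actually resolves to one IP per initial etcd member; the service and namespace names below are assumptions based on the default chart naming for a release called mayastor:

 kubectl -n mayastor get endpoints mayastor-etcd-headless
 # run a throwaway pod to resolve the headless service from inside the cluster
 kubectl -n mayastor run dnstest -it --rm --restart=Never --image=busybox \
   -- nslookup mayastor-etcd-headless.mayastor.svc.cluster.local
 # the etcd error above means this lookup returns fewer A records than there
 # are initial cluster members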

Apart from this, just a question on the side:
Why use nats + etcd? If nats is already being used with Mayastor, why not just use the JetStream KV store from nats and drop etcd entirely? That would simplify the deployment and save resources.

edit:

Btw I just used the helm command from the docs without any further configuration:

helm install openebs openebs/openebs --namespace openebs --create-namespace --set mayastor.enabled=true

Regarding the one remaining etcd error, I found #1421, which is exactly about that. So I guess arm64 support would be fixed (so far) by just making the above changes to the helm charts.

@FriedCircuits

FriedCircuits commented Apr 24, 2024

Any update on this? I have a mixed cluster with both x86 and TuringPi 2 RK1 nodes. The suggestions by @sebadob worked great. Thanks.

Btw @sebadob What did you do to get csi node daemon set to run on ARM64? Mine are only scheduling on x86 nodes.

@sebadob

sebadob commented Apr 24, 2024

> Btw @sebadob What did you do to get csi node daemon set to run on ARM64? Mine are only scheduling on x86 nodes.

I don't have it running anymore, but they simply started; I did not do anything special there. This was an arm-only cluster.

But I moved away from it again because I did not like that the Mayastor pod busy-waits and consumes a full CPU all the time, even when the cluster is idle. I get that polling like this instead of going async reduces latency, but my clusters are usually rather small, and constant 100% CPU usage is not something I wanted.
So I cannot provide more information, sorry.

@tiagolobocastro
Contributor

> Btw @sebadob What did you do to get csi node daemon set to run on ARM64? Mine are only scheduling on x86 nodes.

Try using --set nodeSelector={}
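
If the plain --set form doesn't parse the empty map, a sketch of an alternative for an existing release, using --set-json (available since Helm 3.10):

 # clear the chart's default nodeSelector so pods can also schedule on arm64 nodes
 helm upgrade mayastor mayastor/mayastor -n mayastor --reuse-values \
   --set-json 'nodeSelector={}'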

@FriedCircuits

FriedCircuits commented Apr 24, 2024

> > Btw @sebadob What did you do to get csi node daemon set to run on ARM64? Mine are only scheduling on x86 nodes.
>
> Try using --set nodeSelector={}

Thanks, but it looks like the mayastor-csi-node image doesn't have an arm64 build:
https://hub.docker.com/r/openebs/mayastor-csi-node/tags

Darn, I bought some drives and was trying this out since Jiva was unreliable, has a memory leak in the NDM, and really doesn't like node restarts.

@tiagolobocastro
Contributor

I'm afraid we don't currently build arm64 images.
If you have an arm64 system you could try building the csi-node yourself. There are a couple of users who build their own; maybe you can ask on Slack.
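
For anyone attempting that, a minimal starting point on an arm64 Linux box might look like the sketch below. This assumes nix is installed; the exact targets for the csi-node image need to come from the repo's own build docs:

 # on an arm64 linux host with git and nix
 git clone https://github.com/openebs/mayastor.git && cd mayastor
 nix-shell   # enter the pinned dev environment the repo provides
 # from inside the shell, follow the repo's build docs to produce the
 # csi-node image, push it to your own registry, and point the helm
 # chart's image values at it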

@tidux

tidux commented Jun 5, 2024

Can you guys just get a Mac Mini or something if you need an ARM build host?

@ThorbenJ

ThorbenJ commented Jul 6, 2024

> Leaving this open as the tracking issue for arm64 support. This is not on the roadmap, but if some external contributor has some arm servers for us to test on, we'd be happy to consider arm64 support.

Hi, an OrangePi 5 Plus (http://www.orangepi.org/html/hardWare/computerAndMicrocontrollers/details/Orange-Pi-5-plus-32GB.html) has:

  • 8x CPU cores (4x 2.4GHz + 4x 1.8GHz)
  • 32GB LPDDR4X!
  • M.2 2280 slot for NVMe SSDs (PCIe 3.0 x4)
  • Another M.2 E-key slot that, with an adapter, gets you another M.2 M-key slot with 2x PCIe lanes
  • 2x 2.5Gbps NICs

It's not a big grunt server, but it only costs about USD 180, so it spoke to me as a great home-cluster option.

This is not an advert; it's just what I got six of (plus another three of the 16GB RAM version for the k8s control plane), and I would love to run OpenEBS on them. I wanted to use Local PV LVM on the M.2 SSDs for fast local storage, and Mayastor on large USB-attached SSDs for long-term redundant/resilient storage (think NextCloud with TBs of photos and videos; I care more about recovery than speed).

I would be happy to donate some money so that someone could buy one or two of these, if it meant official arm64 support soon (since it appears most of the work has already been done, but one has to build their own images). Please let me know.

@tiagolobocastro
Contributor

That's very kind of you, thank you.
I'll talk to the team about the possibility, though I don't think we're geared up to accept donations in any way; sending us the hardware might perhaps be easier.

@maxwnewcomer
Contributor

maxwnewcomer commented Dec 14, 2024

Just dropping in here because I have begun digging into getting OEP 3817 started. I figured building on aarch64-apple-darwin wouldn't work, but wanted to give it a shot anyway. Unfortunately, I confirmed my suspicion when running the docker build steps here; simply running nix-shell natively (no docker) raises a libaio dependency issue in spdk, and I'm sure there would be many more issues down the build pipeline.

During the spdk-dev build step with --platform linux/arm64 (the first docker build step), the sm3_mb/aarch64/sm3_mb_sve.lo library object file seems to be causing build issues (as noted in the initial issue linked on the OEP, #1751, and confirmed today on my arm machine running a non-darwin docker container).

I did get the first docker command linked above working with --platform linux/amd64 after running echo "filter-syscalls = false" >> /etc/nix/nix.conf (some qemu-under-docker-on-apple-virtualization bpf nonsense; see here). I will test the rest of the docker build steps and then maybe make a PR to the documentation describing the workaround for running on Apple silicon.

This may be a bit of a grind to get working on arm for everyone, but I know the first time we get helm install openebs --namespace openebs openebs/openebs --create-namespace --set mayastor.enabled=true working on an arm cluster it will be ~magical~!!
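
For anyone following along, that Apple-silicon workaround boils down to the snippet below, run wherever the nix daemon performs the build (the nix.conf path assumes a standard multi-user install):

 # relax nix's seccomp syscall filtering, which trips over qemu's binfmt
 # emulation when cross-building under Docker on Apple silicon
 echo "filter-syscalls = false" >> /etc/nix/nix.conf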

@ThorbenJ That OrangePi 5 looks sick, but I (not officially involved with the openebs maintenance team) will probably just try to get some stuff rolling on a Hetzner arm server. My Hetzner account got all screwed up last month, but it should hopefully be up and running sometime next week.
