feat: add checkpoint uds-core slim package #818
Checkpoint task passed in this PR (except for the actual publish task)
I'm not an approver but the code does look good to me. I would like to see more information on how to use this package though so it's more clear on how/why/when someone would want to use it.
Probably ignore all of the following, I tried testing CRIU ( Did you try If you use
docker rm -f count
sudo rm -rf /tmp/checkpoint
docker run -d --name=count busybox /bin/sh -c 'for i in $(seq 9999999); do echo "$i" && sleep 1; done'
docker checkpoint create --checkpoint-dir=/tmp/checkpoint count checkpoint1
docker rm count
docker create --name count busybox
# Apparently `docker start --checkpoint-dir` is broken, use workaround: https://github.com/moby/moby/issues/37344#issuecomment-450782189
# docker start --checkpoint-dir /tmp/checkpoint --checkpoint checkpoint1 count
sudo mv /tmp/checkpoint/checkpoint1 "/var/lib/docker/containers/$(docker ps -aq --no-trunc --filter name=count)/checkpoints/"
docker start --checkpoint=checkpoint1 count
docker ps
docker logs -f count

The biggest downside is that this is near impossible to use with Docker Desktop. A big advantage is that the cluster never actually "stops"; it's magically paused and resumed elsewhere.

Podman seems to support this too, and seems to be a bit more fully supported. k3d (somewhat) supports Podman too. Unlike Docker, Podman's CRIU support includes volumes and capturing multiple containers at once. It can apparently pack the checkpoint into an OCI image too (useful for publishing to GHCR?).

Except... this whole idea may be useless because I don't think CRIU supports checkpointing nested namespaces (which is how k3d works to embed sub-containers inside its parent Docker container for the k8s node).

limactl start template://podman-rootful
export DOCKER_HOST=unix://$HOME/.lima/podman-rootful/sock/podman.sock
k3d cluster create
limactl shell podman-rootful sudo podman container checkpoint --export=/tmp/lima/checkpoint.tgz k3d-k3s-default-server-0 k3d-k3s-default-serverlb
# Error:
# Can't dump nested pid namespace for 4663
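The Docker restore workaround earlier in this comment (move the checkpoint into the container's own checkpoints directory, then start from it) could be wrapped in a small helper, sketched below. This is only a sketch: the function names and the `DRY_RUN` and `CONTAINER_ID` variables are hypothetical and not part of Docker or this PR, and it assumes a Linux host with CRIU, Docker's experimental checkpoint support, and the default `/var/lib/docker` data root.

```shell
#!/bin/sh
# Hypothetical helper: function names, DRY_RUN and CONTAINER_ID are
# illustrative only. Assumes Linux, CRIU, and Docker's experimental
# checkpoint support.

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Directory where Docker looks for a container's checkpoints.
# Assumes the default data root /var/lib/docker; adjust if yours differs.
checkpoint_home() {
  echo "/var/lib/docker/containers/$1/checkpoints/"
}

# Restore checkpoint $3 (stored under directory $2) into container $1.
restore_checkpoint() {
  name=$1
  dir=$2
  checkpoint=$3
  # Look up the full container ID unless CONTAINER_ID is already set.
  id=${CONTAINER_ID:-$(docker ps -aq --no-trunc --filter "name=$name")}
  run sudo mv "$dir/$checkpoint" "$(checkpoint_home "$id")"
  run docker start "--checkpoint=$checkpoint" "$name"
}
```

Setting `DRY_RUN=1` and `CONTAINER_ID` before calling `restore_checkpoint count /tmp/checkpoint checkpoint1` prints the two commands without touching Docker, which is a cheap way to check the paths first. None of this addresses the nested pid namespace error from the Podman attempt.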
Few more comments + need an update to the release-please config to ensure the checkpoint zarf.yaml is versioned properly: https://github.com/defenseunicorns/uds-core/blob/gotta-go-fast/release-please-config.json#L14
Co-authored-by: Micah Nagel <[email protected]>
LGTM, thanks for the work on this!!! Would be great to revisit the macOS support at some point and look at other places we could checkpoint things as well.
Oops missed this one - need an update to the release-please config to ensure the checkpoint zarf.yaml is versioned properly: https://github.com/defenseunicorns/uds-core/blob/gotta-go-fast/release-please-config.json#L14
🤖 I have created a release *beep* *boop*

---

## [0.32.0](v0.31.2...v0.32.0) (2024-11-22)

### Features

* add ability to add custom netpols for prometheus-stack package ([#997](#997)) ([472f9c5](472f9c5))
* add checkpoint uds-core slim package ([#818](#818)) ([d95f6be](d95f6be))
* allow additional network rules for grafana and neuvector ([#1038](#1038)) ([5c84007](5c84007))

### Bug Fixes

* keycloak upgrade wait ([#1037](#1037)) ([1207812](1207812))

### Miscellaneous

* add variables for pepr memory requests in dev/demo bundles ([#1021](#1021)) ([867501c](867501c))
* architecture diagrams ([#1024](#1024)) ([d0bca43](d0bca43))
* **deps:** update grafana helm chart ([#998](#998)) ([25d4c29](25d4c29))
* **deps:** update grafana to v11.3.1 ([#1023](#1023)) ([8d3cf3a](8d3cf3a))
* **deps:** update husky to v9.1.7 ([#1014](#1014)) ([0d9a854](0d9a854))
* **deps:** update kfc for jest to v3.3.3 ([#1015](#1015)) ([eba189e](eba189e))
* **deps:** update neuvector to 5.4.0 ([#778](#778)) ([ccd0a32](ccd0a32))
* **deps:** update pepr to v0.40.1 ([#1025](#1025)) ([871bdad](871bdad))
* **deps:** update support-deps ([#1006](#1006)) ([bfb66a4](bfb66a4))
* **deps:** update support-deps ([#1019](#1019)) ([82dfb32](82dfb32))
* **deps:** update velero helm chart to v8 ([#999](#999)) ([e8187be](e8187be))
* fix duplicative checkpoint publish location ([#1020](#1020)) ([b497fc5](b497fc5))
* update diagrams ([#1035](#1035)) ([cca5e2c](cca5e2c))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Description
This adds a ~75% faster way to deploy or reset a full uds-core cluster (theoretically would work for other preloaded things like testing GitLab Runner w/GitLab too).
Normal:
Checkpoint:
Tradeoffs:
- sudo: not sure of a great way around this without mangling volume permissions for containerd

Related Issue
Fixes #N/A