[v0.11] Backport of Add jitter and resync to polling #3195

manno · 2025-01-09T14:50:40Z

Backport of #3151, refers to #3138

* Improve namespace target customization tests

These tests now verify that the created namespace bears the expected labels and annotations. This commit also paves the way for additional tests with customizations over unconfigured namespace labels and annotations, which currently cause a panic.

* Initialise options maps when empty

This prevents panics when namespace labels or annotations are configured as target customizations over nonexistent defaults.

* Use main branch of `rancher/fleet-test-data`

The required changes made in that repo have been merged.
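For context on the panic mentioned above: in Go, writing to a nil map panics, so merging customizations into defaults that were never configured needs the map to be allocated first. The following is a minimal sketch of that pattern, not Fleet's actual code; the `namespaceOptions` type and the `mergeInto` helper are hypothetical names used for illustration only.

```go
package main

import "fmt"

// namespaceOptions is a hypothetical stand-in for options carrying namespace
// labels and annotations; it is not Fleet's actual type.
type namespaceOptions struct {
	Labels      map[string]string
	Annotations map[string]string
}

// mergeInto copies every entry of src into dst, allocating dst first when it
// is nil. Writing to a nil map panics in Go, so the allocation is what makes
// customizations over unconfigured defaults safe.
func mergeInto(dst, src map[string]string) map[string]string {
	if len(src) == 0 {
		return dst
	}
	if dst == nil {
		dst = make(map[string]string, len(src))
	}
	for k, v := range src {
		dst[k] = v
	}
	return dst
}

func main() {
	defaults := namespaceOptions{} // no labels or annotations configured
	custom := namespaceOptions{
		Labels:      map[string]string{"team": "platform"},
		Annotations: map[string]string{"owner": "fleet"},
	}

	defaults.Labels = mergeInto(defaults.Labels, custom.Labels)
	defaults.Annotations = mergeInto(defaults.Annotations, custom.Annotations)

	fmt.Println(defaults.Labels["team"], defaults.Annotations["owner"])
}
```

Returning the map from the helper, rather than mutating in place, keeps the nil case explicit at the call site.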

This fixes a linter error.

* Fix charts repo name population

This simplifies reuse of variables across steps and jobs by making use of output variables, eliminating the need for additional environment variables.

* Fix base and target branch uses

When reusing a variable computed in another step, it is now explicitly sourced through outputs.

Made with ❤️️ by updatecli

When releasing test Fleet charts, the test release workflow looks for the latest existing Fleet release, to use it as a base before making a few edits. The previous logic used to find the latest available chart was buggy, in that it would list releases in alphabetical order, which could differ from semver. For instance, chart version `103.1.10+up0.9.11` would be listed between versions `103.1.0+up0.9.0` and `103.1.2+up0.9.2`. Instead, this commit simplifies resolution by first looking at the `package.yaml` file, extracting the chart version from there and looking for the corresponding Fleet version in the charts repository. Resolution would then fail if no corresponding version is found in the repository, but that is far less likely to happen than with the previous logic and would typically be a symptom of a broken state of the charts repository.
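The ordering problem described above can be reproduced with a short Go sketch. This is an illustration only, not the workflow's actual logic (which runs in CI scripts): the helper names are hypothetical, and the comparison handles only numeric `MAJOR.MINOR.PATCH` cores, ignoring build metadata after `+` and not supporting pre-release identifiers.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// coreParts returns the numeric MAJOR.MINOR.PATCH parts of a chart version,
// ignoring build metadata after "+", which semver excludes from precedence.
// Sketch only: malformed or pre-release parts are counted as 0.
func coreParts(v string) [3]int {
	core, _, _ := strings.Cut(v, "+")
	var parts [3]int
	for i, s := range strings.SplitN(core, ".", 3) {
		if n, err := strconv.Atoi(s); err == nil {
			parts[i] = n
		}
	}
	return parts
}

// lessSemver reports whether a precedes b when parts are compared as numbers.
func lessSemver(a, b string) bool {
	pa, pb := coreParts(a), coreParts(b)
	for i := range pa {
		if pa[i] != pb[i] {
			return pa[i] < pb[i]
		}
	}
	return false
}

// sortedLexically returns a copy of versions in plain string order.
func sortedLexically(versions []string) []string {
	out := append([]string(nil), versions...)
	sort.Strings(out)
	return out
}

// sortedBySemver returns a copy of versions in numeric version order.
func sortedBySemver(versions []string) []string {
	out := append([]string(nil), versions...)
	sort.SliceStable(out, func(i, j int) bool { return lessSemver(out[i], out[j]) })
	return out
}

func main() {
	versions := []string{"103.1.2+up0.9.2", "103.1.10+up0.9.11", "103.1.0+up0.9.0"}
	fmt.Println("lexical:", sortedLexically(versions))
	fmt.Println("semver: ", sortedBySemver(versions))
}
```

With string ordering, `103.1.2+up0.9.2` ends up last and would be picked as the latest release; with numeric ordering, `103.1.10+up0.9.11` is last.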

* Inject registered cluster name into multi-cluster tests

Depending on the context in which multi-cluster end-to-end tests are run, there may not be any registered downstream cluster called `second`. In such cases, such as when testing Fleet in Rancher, the CI workflow computes the name of the registered cluster and exports it as an environment variable, for the tests to use instead of `second`.

* Refer explicitly to Fleet API when labeling clusters

This prevents ambiguity by specifying the `fleet.cattle.io` domain when labeling clusters in multi-cluster end-to-end tests, with `kubectl` looking for `clusters.cluster.x-k8s.io` by default.

* Use non-managed downstream cluster in multi-cluster label tests

This makes the multi-cluster end-to-end test suite compatible with setups in which only one, non-managed downstream cluster exists, such as the test-Fleet-in-Rancher CI workflow.

* Use `dev-v2.10` as default base for test Fleet charts

This updates the default base test charts branch to match the current state of Rancher releases.

* Improve sharding end-to-end error reporting

This adds expectations on existing pods and clearer error messages to ease troubleshooting in case of failing or flaky tests.

* Run node selector check earlier

Sharding end-to-end tests exhibited flakiness, caused by the git job pod not being present in the cluster by the time checks were run to validate its node selector against that of the relevant shard. To prevent this, tests on the node selector are now run first, which incidentally also prevents `Eventually` from running more times awaiting a config map to be deployed.

Co-authored-by: renovate-rancher[bot] <119870437+renovate-rancher[bot]@users.noreply.github.com>

* Apply defaults from gitrepo restrictions

* Add unit tests covering defaults auth and assign

This covers a few happy and error cases with unit tests, uncovering a few typos in error messages in the process.

---------

Co-authored-by: Corentin Néau <[email protected]>

Made with ❤️️ by updatecli

Using the environment variable FLEET_E2E_DS_CLUSTER_COUNT, an arbitrary number of downstream clusters can be spawned, e.g.:

```
FLEET_E2E_DS_CLUSTER_COUNT=4 ./dev/setup-multi-cluster
```

This environment variable affects:

- dev/setup-k3ds
- dev/import-images-k3d
- dev/setup-multi-cluster

This fixes error messages shown in case of namespace or release name mismatch.

Add a new custom resource `HelmApp` (resource name open to debate) that describes a helm chart to be deployed. The resource contains all the fields from the classic `fleet.yaml` file plus a few new ones from the `GitRepo` resource.

`HelmApp` YAML example:

```yaml
apiVersion: fleet.cattle.io/v1alpha1
kind: HelmApp
metadata:
  name: sample1
  namespace: fleet-local
spec:
  helm:
    releaseName: testhelm
    repo: https://charts.bitnami.com/bitnami
    chart: postgresql
    version: 16.2.1
  insecureSkipTLSVerify: true
```

The implementation tries to share as much as possible from a `Bundle` spec inside the new resource, because it helps to "transform" the `HelmApp` into a deployment (no conversion is needed for most of the spec).

The new controller was also implemented by splitting the functionality into 2 controllers (similar to what we did for the `GitRepo` controller). This allows us to reuse most of the status handling code, as display fields in the status of the new resource are kept as similar as possible, to provide a consistent user experience and to integrate with the UI in the same way the `GitRepo` does.

When a new `HelmApp` resource is applied, it is transformed into a single `Bundle`, adding some extra fields to let the `Bundle` reconciler know that this is not a regular `Bundle` coming from a `GitRepo`.

Similarly to what we did for OCI storage, the `Bundle` created from a `HelmApp` does not contain resources. The helm chart to be deployed is downloaded by the agent.

Code for downloading the helm chart is reused from gitops, so the same formats are supported. Insecure TLS skipping was added to the `ChartURL` and `LoadDirectory` functions in order to support this for gitops and helmops.

If we need a secret to access the helm repository, we can use the `helmSecretName` field. This secret will be cloned to secrets under the `BundleDeployment` namespace (same as we did for the OCI storage secret handling).

The PR includes unit tests, integration tests (most of the code is tested this way) and a single e2e test so far, just to test the whole feature together in a real cluster.

Note: This is an experimental feature. In order to activate the `HelmApp` reconciling and `Bundle` deployment, you need to set the environment variable: `EXPERIMENTAL_HELM_OPS=true`

Refers to: #2962

* Add Insecure TLS option when downloading from OCI registry
* Upgrades zot to version 2.1.1 so we can enable the UI and browse artifacts
* Adds metrics e2e tests for HelmApps
* Removes BundleSpecBase as it was not compatible when building rancher
* Add unit test case when getting an error retrieving the secret
* changes after 2nd review, fix flaky test

Signed-off-by: Xavi Garcia <[email protected]>

* remove init() for setting up test data per file, as test data is shared across files now.
* helpers don't Expect so missing resources are retried

This enables agent worker counts to be configured when installing the `fleet` chart, which is easier than tweaking individual releases of the `fleet-agent` chart. This still needs work to enable worker count updates through `helm upgrade --reuse-values` though, as this updates the `fleet-agent` `StatefulSet` _twice_, the second time with default values (50 workers per reconciler).

* Exports Metrics URLs to test with external IPs

It also deletes the HELM_PATH env variable as it is no longer used.

Fixes errors in metrics tests when the test is trying to check for metrics while the service exporting them is still not fully up.

Adds an extra check in OCI e2e tests to verify that the `CI_OCI_USERNAME` and `CI_OCI_PASSWORD` are set.

Deletes the `GI_GIT_REPO_URL` env variable and uses the already existing `external_ip`.

---------

Signed-off-by: Xavi Garcia <[email protected]>
Co-authored-by: Mario Manno <[email protected]>

* k3d-act-clean cleanup

- Clean up downstream clusters if FLEET_E2E_DS_CLUSTER_COUNT is set
- Remove cleaning up any docker containers by name, including from nektos/act, therefore renaming k3d-act-clean to k3d-clean.

* Keep compatibility with previous version of setup-k3ds

3f61ba5 removed the usage of the HELM_PATH environment variable.

* Separate k3d dev scripts for upstream/downstream
* support PORT_OFFSET for upstream in both, to avoid conflicts with host ports
* for simplicity downstreams always have a number in their name
* fixup! Separate k3d dev scripts for upstream/downstream
* fixup! fixup! Separate k3d dev scripts for upstream/downstream

Signed-off-by: Xavi Garcia <[email protected]>

* Import v1alpha1 package as fleet

* Show bundle errors in Bundle and GitRepo

Refers to #2943

* Add E2E tests

Refers to #2943

Same tests run in e2e-nightly

If gitrepos are lost from the requeueAfter polling, resync should add them again.

p-se and others added 30 commits November 6, 2024 17:21

Deduplicate status messages (#3042)

a8cafe4

[updatecli] Bump Fleet version used within installation documentation…

78024f5

… to 0.10.5 (#3053)

* chore: Update Fleet asset URL

* chore: Update Fleet CRD asset URL

Made with ❤️️ by updatecli

---------

Co-authored-by: fleet-bot <[email protected]>

Rename target customization test

c5d75d4

fixup! Rename target customization test

12cb7e4

Remove unused function (#3058)

a42f7aa

This fixes a linter error.

Update module github.com/rancher/fleet/pkg/apis to v0.11.0

9c2171b

Un-Rc Wrangler 3.1

f05abf2

Update module golang.org/x/sync to v0.9.0

f2a7f5c

Update module github.com/onsi/gomega to v1.35.1

695f2cc

Update module golang.org/x/crypto to v0.29.0

7aee47c

chore: Update Fleet CRD asset URL

6285fd9

Made with ❤️️ by updatecli

chore: Update Fleet asset URL

a471aab

Made with ❤️️ by updatecli

Update gomod-sigsk8sio-dependencies

226d54f

Run go generate

5f1d733

chore(deps): update module helm.sh/helm/v3 to v3.16.3 (#3076)

b28fe41

Co-authored-by: renovate-rancher[bot] <119870437+renovate-rancher[bot]@users.noreply.github.com>

chore(deps): update module github.com/onsi/ginkgo/v2 to v2.21.0

b4222ff

chore(deps): update module sigs.k8s.io/controller-runtime to v0.19.2

4e874eb

chore(deps): update module github.com/stretchr/testify to v1.10.0

fc67981

chore(deps): update module github.com/onsi/ginkgo/v2 to v2.22.0

fb3d19c

chore(deps): update module github.com/masterminds/semver/v3 to v3.3.1

4704691

chore(deps): update github.com/rancher/lasso digest to cbc3210

bc3c4f0

chore: Update Fleet CRD asset URL

1c93a1c

Made with ❤️️ by updatecli

chore: Update Fleet asset URL

f92da98

Made with ❤️️ by updatecli

chore: Update Fleet asset URL

f7af075

Made with ❤️️ by updatecli

p-se and others added 27 commits December 16, 2024 12:59

Fix fleet-agent chart validation (#3150)

df4783e

This fixes error messages shown in case of namespace or release name mismatch.

Cleanup agent integrationtests (#3142)

93801e5

* remove init() for setting up test data per file, as test data is shared across files now.
* helpers don't Expect so missing resources are retried

chore(deps): update module helm.sh/helm/v3 to v3.16.4

8ba46bb

Remove HELM_PATH, tests no longer use it (#3165)

12cfa79

3f61ba5 removed the usage of the HELM_PATH environment variable.

chore(deps): update module github.com/onsi/ginkgo/v2 to v2.22.1

883dbc4

Fix typo in targetsyaml.go

67bb1e0

chore(deps): update module github.com/onsi/ginkgo/v2 to v2.22.2

0463909

chore(deps): update module github.com/docker/docker to v27.4.1+incomp…

6ee98e3

…atible

chore(deps): update module github.com/go-git/go-git/v5 to v5.13.1

74a5b02

chore(scripts): Sort keys in release.yaml after updating chart versions

8f08aed

chore(scripts): Add validation step for charts release

6ef14b9

chore(deps): update module golang.org/x/crypto to v0.32.0

b353dab

chore(deps): update module github.com/otiai10/copy to v1.14.1

597b44b

Update golang.org/x/net to v0.33.0 in pkg/apis

c41a446

Adds logs for new commit or error checking for the latest (#3182)

42f0d20

Signed-off-by: Xavi Garcia <[email protected]>

[SURE-9137] Add template errors to bundle and gitrepo status (#3114)

235e8ef

* Import v1alpha1 package as fleet

* Show bundle errors in Bundle and GitRepo

Refers to #2943

* Add E2E tests

Refers to #2943

Remove AKS/GKE workflows (#3192)

b98f0e0

Same tests run in e2e-nightly

Add jitter to the pollingInterval of GitRepos

e9f095d

Gitops controller uses shorter resync interval

db9c50a

If gitrepos are lost from the requeueAfter polling, resync should add them again.

GenerationChangedPredicate prevented Cache Sync to Trigger Reconciler

440d246

manno requested a review from a team as a code owner January 9, 2025 14:50

manno closed this Jan 9, 2025

manno deleted the add-jitter-to-polling branch January 9, 2025 15:03
