Experimental support of drift detection #617

Merged
merged 9 commits into from
Mar 1, 2023
17 changes: 1 addition & 16 deletions api/v2beta1/helmrelease_types.go
@@ -83,7 +83,7 @@ type HelmReleaseSpec struct {
// a controller level fallback for when HelmReleaseSpec.ServiceAccountName
// is empty.
// +optional
- KubeConfig *KubeConfig `json:"kubeConfig,omitempty"`
+ KubeConfig *meta.KubeConfigReference `json:"kubeConfig,omitempty"`

// Suspend tells the controller to suspend reconciliation for this HelmRelease,
// it does not apply to already started reconciliations. Defaults to false.
@@ -215,21 +215,6 @@ func (in HelmReleaseSpec) GetUninstall() Uninstall {
return *in.Uninstall
}

// KubeConfig references a Kubernetes secret that contains a kubeconfig file.
type KubeConfig struct {
// SecretRef holds the name to a secret that contains a key with
// the kubeconfig file as the value. If no key is specified the key will
// default to 'value'. The secret must be in the same namespace as
// the HelmRelease.
// It is recommended that the kubeconfig is self-contained, and the secret
// is regularly updated if credentials such as a cloud-access-token expire.
// Cloud specific `cmd-path` auth helpers will not function without adding
// binaries and credentials to the Pod that is responsible for reconciling
// the HelmRelease.
// +required
SecretRef meta.SecretKeyReference `json:"secretRef,omitempty"`
}

// HelmChartTemplate defines the template from which the controller will
// generate a v1beta2.HelmChart object in the same namespace as the referenced
// v1beta2.Source.
18 changes: 1 addition & 17 deletions api/v2beta1/zz_generated.deepcopy.go

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

18 changes: 10 additions & 8 deletions config/crd/bases/helm.toolkit.fluxcd.io_helmreleases.yaml
@@ -274,14 +274,14 @@ spec:
is empty.
properties:
secretRef:
- description: SecretRef holds the name to a secret that contains
- a key with the kubeconfig file as the value. If no key is specified
- the key will default to 'value'. The secret must be in the same
- namespace as the HelmRelease. It is recommended that the kubeconfig
- is self-contained, and the secret is regularly updated if credentials
- such as a cloud-access-token expire. Cloud specific `cmd-path`
- auth helpers will not function without adding binaries and credentials
- to the Pod that is responsible for reconciling the HelmRelease.
+ description: SecretRef holds the name of a secret that contains
+ a key with the kubeconfig file as the value. If no key is set,
+ the key will default to 'value'. It is recommended that the
+ kubeconfig is self-contained, and the secret is regularly updated
+ if credentials such as a cloud-access-token expire. Cloud specific
+ `cmd-path` auth helpers will not function without adding binaries
+ and credentials to the Pod that is responsible for reconciling
+ Kubernetes resources.
properties:
key:
description: Key in the Secret, when not specified an implementation-specific
@@ -293,6 +293,8 @@ spec:
required:
- name
type: object
required:
- secretRef
type: object
maxHistory:
description: MaxHistory is the number of revisions saved by Helm for
52 changes: 48 additions & 4 deletions controllers/helmrelease_controller.go
@@ -38,6 +38,7 @@ import (
"k8s.io/client-go/rest"
kuberecorder "k8s.io/client-go/tools/record"
"k8s.io/client-go/tools/reference"
"sigs.k8s.io/cli-utils/pkg/kstatus/polling"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/builder"
"sigs.k8s.io/controller-runtime/pkg/client"
@@ -53,13 +54,15 @@ import (
eventv1 "github.com/fluxcd/pkg/apis/event/v1beta1"
"github.com/fluxcd/pkg/apis/meta"
"github.com/fluxcd/pkg/runtime/acl"
- fluxClient "github.com/fluxcd/pkg/runtime/client"
+ runtimeClient "github.com/fluxcd/pkg/runtime/client"
"github.com/fluxcd/pkg/runtime/metrics"
"github.com/fluxcd/pkg/runtime/predicates"
"github.com/fluxcd/pkg/runtime/transform"
sourcev1 "github.com/fluxcd/source-controller/api/v1beta2"

v2 "github.com/fluxcd/helm-controller/api/v2beta1"
+ "github.com/fluxcd/helm-controller/internal/diff"
+ "github.com/fluxcd/helm-controller/internal/features"
"github.com/fluxcd/helm-controller/internal/kube"
"github.com/fluxcd/helm-controller/internal/runner"
"github.com/fluxcd/helm-controller/internal/util"
@@ -83,8 +86,11 @@ type HelmReleaseReconciler struct {
MetricsRecorder *metrics.Recorder
DefaultServiceAccount string
NoCrossNamespaceRef bool
- ClientOpts fluxClient.Options
- KubeConfigOpts fluxClient.KubeConfigOptions
+ ClientOpts runtimeClient.Options
+ KubeConfigOpts runtimeClient.KubeConfigOptions
+ StatusPoller *polling.StatusPoller
+ PollingOpts polling.Options
+ ControllerName string
}

func (r *HelmReleaseReconciler) SetupWithManager(mgr ctrl.Manager, opts HelmReleaseReconcilerOptions) error {
@@ -103,7 +109,7 @@ func (r *HelmReleaseReconciler) SetupWithManager(mgr ctrl.Manager, opts HelmRele
r.requeueDependency = opts.DependencyRequeueInterval

// Configure the retryable http client used for fetching artifacts.
- // By default it retries 10 times within a 3.5 minutes window.
+ // By default, it retries 10 times within a 3.5 minutes window.
httpClient := retryablehttp.NewClient()
httpClient.RetryWaitMin = 5 * time.Second
httpClient.RetryWaitMax = 30 * time.Second
@@ -319,6 +325,44 @@ func (r *HelmReleaseReconciler) reconcileRelease(ctx context.Context,
releaseRevision := util.ReleaseRevision(rel)
valuesChecksum := util.ValuesChecksum(values)
hr, hasNewState := v2.HelmReleaseAttempted(hr, revision, releaseRevision, valuesChecksum)

// Run diff against current cluster state.
if !hasNewState {
if ok, _ := features.Enabled(features.DetectDrift); ok {
differ := diff.NewDiffer(runtimeClient.NewImpersonator(
r.Client,
r.StatusPoller,
r.PollingOpts,
hr.Spec.KubeConfig,
r.KubeConfigOpts,
r.DefaultServiceAccount,
hr.Spec.ServiceAccountName,
hr.GetNamespace(),
), r.ControllerName)

changeSet, drift, err := differ.Diff(ctx, rel)
if err != nil {
if changeSet == nil {
msg := "failed to diff release against cluster resources"
r.event(ctx, hr, rel.Chart.Metadata.Version, eventv1.EventSeverityError, err.Error())
return v2.HelmReleaseNotReady(hr, "DiffFailed", fmt.Sprintf("%s: %s", msg, err.Error())), err
}
log.Error(err, "diff of release against cluster resources finished with error")
}

msg := "no diff in cluster resources compared to release"
if drift {
hasNewState = true
msg = "diff in cluster resources compared to release"
}
if changeSet != nil {
msg = fmt.Sprintf("%s:\n\n%s", msg, changeSet.String())
r.event(ctx, hr, rel.Chart.Metadata.Version, eventv1.EventSeverityInfo, msg)
}
log.Info(msg)
}
}

if hasNewState {
hr = v2.HelmReleaseProgressing(hr)
if updateStatusErr := r.patchStatus(ctx, &hr); updateStatusErr != nil {
50 changes: 4 additions & 46 deletions docs/api/helmrelease.md
@@ -99,8 +99,8 @@ Kubernetes meta/v1.Duration
<td>
<code>kubeConfig</code><br>
<em>
- <a href="#helm.toolkit.fluxcd.io/v2beta1.KubeConfig">
- KubeConfig
+ <a href="https://godoc.org/github.com/fluxcd/pkg/apis/meta#KubeConfigReference">
+ github.com/fluxcd/pkg/apis/meta.KubeConfigReference
</a>
</em>
</td>
@@ -823,8 +823,8 @@ Kubernetes meta/v1.Duration
<td>
<code>kubeConfig</code><br>
<em>
- <a href="#helm.toolkit.fluxcd.io/v2beta1.KubeConfig">
- KubeConfig
+ <a href="https://godoc.org/github.com/fluxcd/pkg/apis/meta#KubeConfigReference">
+ github.com/fluxcd/pkg/apis/meta.KubeConfigReference
</a>
</em>
</td>
@@ -1460,48 +1460,6 @@ no retries remain. Defaults to &lsquo;false&rsquo;.</p>
</table>
</div>
</div>
<h3 id="helm.toolkit.fluxcd.io/v2beta1.KubeConfig">KubeConfig
</h3>
<p>
(<em>Appears on:</em>
<a href="#helm.toolkit.fluxcd.io/v2beta1.HelmReleaseSpec">HelmReleaseSpec</a>)
</p>
<p>KubeConfig references a Kubernetes secret that contains a kubeconfig file.</p>
<div class="md-typeset__scrollwrap">
<div class="md-typeset__table">
<table>
<thead>
<tr>
<th>Field</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>
<code>secretRef</code><br>
<em>
<a href="https://godoc.org/github.com/fluxcd/pkg/apis/meta#SecretKeyReference">
github.com/fluxcd/pkg/apis/meta.SecretKeyReference
</a>
</em>
</td>
<td>
<p>SecretRef holds the name to a secret that contains a key with
the kubeconfig file as the value. If no key is specified the key will
default to &lsquo;value&rsquo;. The secret must be in the same namespace as
the HelmRelease.
It is recommended that the kubeconfig is self-contained, and the secret
is regularly updated if credentials such as a cloud-access-token expire.
Cloud specific <code>cmd-path</code> auth helpers will not function without adding
binaries and credentials to the Pod that is responsible for reconciling
the HelmRelease.</p>
</td>
</tr>
</tbody>
</table>
</div>
</div>
<h3 id="helm.toolkit.fluxcd.io/v2beta1.Kustomize">Kustomize
</h3>
<p>
86 changes: 86 additions & 0 deletions docs/spec/v2beta1/helmreleases.md
@@ -1270,6 +1270,92 @@ spec:
crds: CreateReplace
```

### Drift detection

**Note:** This feature is experimental and can be enabled by setting `--feature-gates=DetectDrift=true`.

When a HelmRelease is in sync with the Helm release object in the storage, the controller will
compare the manifests from the Helm storage with the current state of the cluster using a
[server-side dry-run apply](https://kubernetes.io/docs/reference/using-api/server-side-apply/).
If this comparison detects drift (because a resource would be created or modified during the
dry-run), the controller will perform an upgrade for the release, restoring the desired state.

### Excluding resources from drift detection

The drift detection feature can be configured to exclude certain resources from the comparison
by labeling or annotating them with `helm.toolkit.fluxcd.io/driftDetection: disabled`. Using
[post-renderers](#post-renderers), this can be applied to any resource rendered by Helm.

```yaml
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
name: app
namespace: webapp
spec:
postRenderers:
- kustomize:
patches:
- target:
version: v1
kind: Deployment
name: my-app
patch: |
- op: add
path: /metadata/annotations/helm.toolkit.fluxcd.io~1driftDetection
value: disabled
```
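
The `~1` in the patch path above is JSON Pointer escaping (RFC 6901): a `/` inside a key must be written as `~1`, and a literal `~` as `~0`. The following is a minimal Go sketch of that escaping. The helper names `escapeToken` and `annotationPath` are hypothetical and are not part of helm-controller or Kustomize.

```go
package main

import (
	"fmt"
	"strings"
)

// escapeToken escapes a single JSON Pointer reference token per RFC 6901:
// "~" becomes "~0" and "/" becomes "~1". strings.NewReplacer scans the
// input once, so already substituted text is never escaped a second time.
func escapeToken(token string) string {
	return strings.NewReplacer("~", "~0", "/", "~1").Replace(token)
}

// annotationPath builds the JSON Patch path for an annotation key.
// (Hypothetical helper, for illustration only.)
func annotationPath(key string) string {
	return "/metadata/annotations/" + escapeToken(key)
}

func main() {
	// Prints the same path that is used in the post-renderer patch.
	fmt.Println(annotationPath("helm.toolkit.fluxcd.io/driftDetection"))
}
```

Running it prints `/metadata/annotations/helm.toolkit.fluxcd.io~1driftDetection`, which matches the `path` value in the example.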

**Note:** For some charts, we have observed that drift detection can report spurious
changes because Helm does not properly patch an object, which seems to be related to
[Helm#5915](https://github.com/helm/helm/issues/5915) and similar issues. In this case (and
when possible for your workload), setting `.spec.upgrade.force` to `true` might be a
more fitting solution than ignoring the object entirely.

#### Drift exclusion example: Prometheus Stack

```yaml
---
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
name: kube-prometheus-stack
spec:
interval: 5m
chart:
spec:
version: "45.x"
chart: kube-prometheus-stack
sourceRef:
kind: HelmRepository
name: prometheus-community
interval: 60m
upgrade:
crds: CreateReplace
# Force recreation due to Helm not properly patching Deployment with e.g. added port,
# causing spurious drift detection
force: true
postRenderers:
- kustomize:
patches:
- target:
# Ignore these objects from Flux diff as they are mutated from chart hooks
kind: (ValidatingWebhookConfiguration|MutatingWebhookConfiguration)
name: kube-prometheus-stack-admission
patch: |
- op: add
path: /metadata/annotations/helm.toolkit.fluxcd.io~1driftDetection
value: disabled
- target:
# Ignore these objects from Flux diff as they are mutated at apply time but not
# at dry-run time
kind: PrometheusRule
patch: |
- op: add
path: /metadata/annotations/helm.toolkit.fluxcd.io~1driftDetection
value: disabled
```
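
The examples above rely on the `helm.toolkit.fluxcd.io/driftDetection: disabled` marker being honoured whether it is set as a label or as an annotation. As a rough Go sketch of that rule, assuming an exact, case-sensitive match on the value (the controller's actual check is not shown in this document, and the `excluded` function and constant names are hypothetical):

```go
package main

import "fmt"

const (
	// Key and value as documented for excluding a resource from drift detection.
	driftDetectionKey = "helm.toolkit.fluxcd.io/driftDetection"
	disabledValue     = "disabled"
)

// excluded reports whether an object opts out of drift detection through
// either its labels or its annotations. Indexing a nil map is safe in Go
// and yields the empty string, so callers may pass nil for either argument.
func excluded(labels, annotations map[string]string) bool {
	return labels[driftDetectionKey] == disabledValue ||
		annotations[driftDetectionKey] == disabledValue
}

func main() {
	annotated := map[string]string{driftDetectionKey: "disabled"}
	fmt.Println(excluded(nil, annotated))
	fmt.Println(excluded(map[string]string{"app": "web"}, nil))
}
```

The first call reports an excluded object and the second a regular one.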

## Status

When the controller completes a reconciliation, it reports the result in the status sub-resource.