Spike: Unmanaged Agents #1735

manno · 2023-08-23T12:20:25Z

Fleet supports agent-initiated and manager-initiated cluster registration. Is it possible to register an agent without going through the full registration? It seems we only need a cluster resource, a token for the agent to get access to the upstream cluster's bundledeployments and a running agent binary.

This would be like a "registration 2.0". An unmanaged agent with a "dumb" setup would be easier to maintain manually, whereas the current registration flow is very opinionated and automated. The existing registration is probably fine for small fleets of clusters.

In the future we want to update agents with more control, e.g. in batches for large fleets of clusters. A dumb, unmanaged flow will also help with other use cases, like running the agent out of cluster or extending the agent's deployer.

Is it possible to register an agent/cluster with just kubectl? What if we skip the "big registration" loop that runs before the agent starts?

Note: there is already an unmanaged label for agents. What does it do?

manno · 2023-11-17T10:58:33Z

Idea 1 - Reusing current resources, without the *-initiated registration flow

1. disable https://github.com/rancher/fleet/tree/master/internal/cmd/controller/agentmanagement/controllers
2. kubectl create cluster, sa, role, rb, ... on upstream
3. install agent on downstream with fleet-agent secret (needs token from 2.), bypass agent's registration by creating the right resources before install. This step could use a modified fleet-agent helm chart.

-> upstream doesn't need to connect to downstream, since configuration is external, e.g. a human operator does that.
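A minimal sketch of what step 2 might produce, written as a Python script that prints a JSON list suitable for `kubectl apply -f -`. This is an illustration, not the spike's script. The cluster name, namespaces and service account name are placeholders, the Role's verbs are a guess at the minimum the agent needs to read bundledeployments, and the real agent likely needs more (for example, permission to update status).

```python
import json

RBAC = "rbac.authorization.k8s.io"


def upstream_manifests(cluster_name, fleet_ns, cluster_ns, sa_name):
    """Return the upstream objects for one unmanaged agent as plain dicts.

    All names are hypothetical; the Role rules are an assumption.
    """
    meta = {"name": sa_name, "namespace": cluster_ns}
    return [
        # The Cluster resource lives in the Fleet workspace namespace.
        {"apiVersion": "fleet.cattle.io/v1alpha1", "kind": "Cluster",
         "metadata": {"name": cluster_name, "namespace": fleet_ns}},
        {"apiVersion": "v1", "kind": "ServiceAccount", "metadata": meta},
        # Assumed minimal read access to bundledeployments.
        {"apiVersion": RBAC + "/v1", "kind": "Role", "metadata": meta,
         "rules": [{"apiGroups": ["fleet.cattle.io"],
                    "resources": ["bundledeployments"],
                    "verbs": ["get", "list", "watch"]}]},
        {"apiVersion": RBAC + "/v1", "kind": "RoleBinding", "metadata": meta,
         "roleRef": {"apiGroup": RBAC, "kind": "Role", "name": sa_name},
         "subjects": [{"kind": "ServiceAccount", "name": sa_name,
                       "namespace": cluster_ns}]},
        # Token secret for the service account; the agent's kubeconfig
        # in step 3 would carry this token.
        {"apiVersion": "v1", "kind": "Secret",
         "type": "kubernetes.io/service-account-token",
         "metadata": {"name": sa_name + "-token", "namespace": cluster_ns,
                      "annotations": {
                          "kubernetes.io/service-account.name": sa_name}}},
    ]


manifests = upstream_manifests(
    cluster_name="foo",
    fleet_ns="fleet-default",
    cluster_ns="cluster-fleet-default-foo-1a3d67d0a899",
    sa_name="unmanaged-agent",
)
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests},
                 indent=2))
```

Piping the output into `kubectl apply -f -` against the upstream cluster would create the objects, assuming the two namespaces already exist.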

Future Design for Registration?

What does future automation look like? Scenarios:

Airgap
Rancher with automatic cluster registration: https://github.com/rancher/rancher/blob/release/v2.8/pkg/controllers/provisioningv2/fleetcluster/fleetcluster.go#L163-L179
Update the agent?

manno · 2023-12-18T12:18:59Z

Quick notes:

https://github.com/manno/fleet/tree/registration2.0-spike

Registration always starts with manager by creating resources there
How to update clients?
- Pull (bundle / manageagent.go)
- Push (force redeploy / import.go)
Reduce created RBAC resources per cluster
how would fleetcluster.go use reg2?

manno · 2023-12-21T10:40:20Z

Let's collect some final notes and the scripts and finish this spike. We'll need a design doc for the next stories.

p-se · 2024-01-11T13:19:23Z

Motivation

Research the possibility of unmanaged agents.

Status

These changes introduce a way to disable agent management in both the fleet controller and
the fleet agent in order to test the deployment of an unmanaged agent.

An unmanaged agent is an agent that is created without ever having to
communicate with the fleet controller. It is created without the fleet
controller but is considered valid by the fleet controller and works just as
well as a managed agent. However, it does not have to connect to the fleet
controller and can deploy bundles locally.

Disabling agent management in the fleet controller means preventing the
agentmanagement controller in the fleet controller from starting.
Disabling agent management in the fleet agent means preventing the agent from
starting the initialization container. It also disables the validation of the
apiServerURL, so that agents can be created which do not have to connect to
any fleet controller, which is the basic idea for unmanaged agents. Those should
be able to deploy locally provided bundles without having to connect to any
fleet controller.

For this purpose two files in dev/ have been created:

./dev/test-unmanaged-agent-registration
./dev/create-fleet-secret # this is being used in the former script

Note: Deployment of a bundle has not been tested successfully.

Setup

./dev/test-unmanaged-agent-registration

Description

The dev/create-fleet-secret script takes care of creating the necessary resources
for the registration of a fleet agent without involving a fleet agent. On the
upstream cluster those are:

Creating namespaces
- fleet-default
- cluster-fleet-default-foo-1a3d67d0a899
Static RBAC resources (once per cluster)
- ClusterRole
  - fleet-bundle-deployment
  - fleet-content
Cluster resource
- Role
- RoleBinding
- ClusterRoleBinding
- ServiceAccount
- Secret
There is some potential for optimization here: it would probably be better if
the role were reusable and did not have to be recreated for every downstream
cluster. But that is not the point of this PR, and it may not be the point of
the succeeding work either, since we want to preserve compatibility with the
current implementation until the successor has completely taken over.
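One way the reuse mentioned above could look, as a hedged sketch: a RoleBinding in each cluster namespace may reference a shared ClusterRole instead of a per-cluster Role, so only the binding is created per downstream cluster. Whether the existing `fleet-bundle-deployment` ClusterRole grants exactly what the agent needs is an assumption here, and the service account name and second namespace are hypothetical.

```python
import json

RBAC = "rbac.authorization.k8s.io"


def binding_to_shared_role(cluster_ns, sa_name,
                           cluster_role="fleet-bundle-deployment"):
    """Build a namespaced RoleBinding that points at a shared ClusterRole.

    The binding grants the ClusterRole's rules only inside cluster_ns.
    """
    return {
        "apiVersion": RBAC + "/v1",
        "kind": "RoleBinding",
        "metadata": {"name": sa_name, "namespace": cluster_ns},
        "roleRef": {"apiGroup": RBAC, "kind": "ClusterRole",
                    "name": cluster_role},
        "subjects": [{"kind": "ServiceAccount", "name": sa_name,
                      "namespace": cluster_ns}],
    }


# Two downstream clusters, one shared ClusterRole (namespaces are examples).
bindings = [
    binding_to_shared_role(ns, "unmanaged-agent")
    for ns in ("cluster-fleet-default-foo-1a3d67d0a899",
               "cluster-fleet-default-bar-000000000000")
]
print(json.dumps(bindings, indent=2))
```

With this shape, adding a downstream cluster would add one RoleBinding and one ServiceAccount but no new Role.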

On the downstream cluster the script creates a Secret object which contains a
valid Kubeconfig file to the upstream cluster as well as

clusterName
clusterNamespace and
deploymentNamespace.

https://gist.github.com/p-se/909b02db4d999afa405a644864022f6c
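The downstream Secret described above could be assembled roughly as follows. This is a sketch under assumptions, not the content of dev/create-fleet-secret: the Secret name `fleet-agent` is taken from the earlier comment in this thread, while the namespace `cattle-fleet-system`, the data key `kubeconfig` and the example values are assumptions. The kubeconfig string is a placeholder for a real kubeconfig that points at the upstream cluster.

```python
import base64
import json


def b64(value):
    """Base64-encode a string the way Secret .data fields expect."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def agent_secret(kubeconfig, cluster_name, cluster_ns, deployment_ns,
                 secret_name="fleet-agent",
                 secret_ns="cattle-fleet-system"):
    """Build the downstream Secret as a plain dict (names are assumptions)."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": secret_name, "namespace": secret_ns},
        "data": {
            "kubeconfig": b64(kubeconfig),
            "clusterName": b64(cluster_name),
            "clusterNamespace": b64(cluster_ns),
            "deploymentNamespace": b64(deployment_ns),
        },
    }


secret = agent_secret(
    kubeconfig="<kubeconfig for the upstream cluster>",  # placeholder
    cluster_name="foo",
    cluster_ns="fleet-default",
    deployment_ns="cluster-fleet-default-foo-1a3d67d0a899",
)
print(json.dumps(secret, indent=2))
```

The printed JSON can be piped to `kubectl apply -f -` on the downstream cluster before the agent is installed.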

manno added this to Fleet Aug 23, 2023

manno converted this from a draft issue Aug 23, 2023

manno added the kind/spike label Aug 23, 2023

github-actions bot added team/fleet labels Aug 23, 2023

manno changed the title ~~Draft: Spike: Unmanaged Agents~~ Spike: Unmanaged Agents Aug 24, 2023

manno added this to the 2023-Q4-v2.8x milestone Aug 24, 2023

kkaempf modified the milestones: v2.8.0, 2024-Q1-2.8x Sep 28, 2023

kkaempf added the kind/chore label Oct 25, 2023

p-se self-assigned this Nov 17, 2023

manno moved this from 📋 Backlog to 🏗 In progress in Fleet Nov 24, 2023

manno modified the milestones: 2024-Q1-2.8x, v2.9.0 Nov 27, 2023

manno mentioned this issue Jan 11, 2024

Split agent management into separate controller #1732

Closed

p-se closed this as completed Jan 11, 2024

manno moved this from 🏗 In progress to ✅ Done in Fleet Jan 11, 2024
