Spike: Unmanaged Agents #1735

Closed
manno opened this issue Aug 23, 2023 · 4 comments

@manno
Member

manno commented Aug 23, 2023

Fleet supports agent-initiated and manager-initiated cluster registration. Is it possible to register an agent without going through the full registration? It seems we only need a cluster resource, a token for the agent to get access to the upstream cluster's bundledeployments and a running agent binary.

This would be like a "registration 2.0". An unmanaged agent with a "dumb" setup will be easier to maintain manually, whereas the current registration flow is very opinionated and automated. The existing registration is probably fine for small fleets of clusters.

In the future we want to update agents with more control, e.g. in batches for large fleets of clusters. A dumb, unmanaged flow will also help with other use cases, like running the agent out of cluster or extending the agent's deployer.

Is it possible to register an agent/cluster with just kubectl? What if we skip the "big registration" loop that runs before the agent starts?

Note: there is already an unmanaged label for agents, what does it do?

@manno manno added this to Fleet Aug 23, 2023
@manno manno converted this from a draft issue Aug 23, 2023
@manno manno changed the title Draft: Spike: Unmanaged Agents Spike: Unmanaged Agents Aug 24, 2023
@manno manno added this to the 2023-Q4-v2.8x milestone Aug 24, 2023
@kkaempf kkaempf modified the milestones: v2.8.0, 2024-Q1-2.8x Sep 28, 2023
@manno
Member Author

manno commented Nov 17, 2023

Idea 1 - Reusing current resources, without the *-initiated registration flow

  1. disable https://github.com/rancher/fleet/tree/master/internal/cmd/controller/agentmanagement/controllers
  2. kubectl create cluster, sa, role, rb, ... on upstream
  3. install agent on downstream with fleet-agent secret (needs token from 2.), bypass agent's registration by creating the right resources before install. This step could use a modified fleet-agent helm chart.

-> upstream doesn't need to connect to downstream, since configuration is external, e.g. a human operator does that.
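
As a rough illustration of step 2, the upstream objects could be generated as plain manifests and piped to kubectl. This is only a sketch under assumptions: the `fleet.cattle.io/v1alpha1` Cluster kind is Fleet's own, but the service account naming (`<cluster>-agent`), the Role rules, and the verbs are hypothetical placeholders, not what the spike branch or the existing registration actually creates.

```python
import json


def upstream_manifests(cluster_name, namespace, cluster_namespace):
    """Build the upstream objects from step 2 as plain dicts.

    Names and RBAC rules are illustrative assumptions, not Fleet's real ones.
    """
    sa_name = f"{cluster_name}-agent"  # hypothetical naming scheme
    cluster = {
        "apiVersion": "fleet.cattle.io/v1alpha1",
        "kind": "Cluster",
        "metadata": {"name": cluster_name, "namespace": namespace},
    }
    service_account = {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": sa_name, "namespace": cluster_namespace},
    }
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": sa_name, "namespace": cluster_namespace},
        # Assumed minimal access to bundledeployments; the real agent
        # likely needs more (e.g. status updates).
        "rules": [
            {
                "apiGroups": ["fleet.cattle.io"],
                "resources": ["bundledeployments"],
                "verbs": ["get", "list", "watch"],
            }
        ],
    }
    role_binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": sa_name, "namespace": cluster_namespace},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Role",
            "name": sa_name,
        },
        "subjects": [
            {
                "kind": "ServiceAccount",
                "name": sa_name,
                "namespace": cluster_namespace,
            }
        ],
    }
    return [cluster, service_account, role, role_binding]


if __name__ == "__main__":
    items = upstream_manifests(
        "foo", "fleet-default", "cluster-fleet-default-foo-1a3d67d0a899"
    )
    # A v1 List in JSON form can be fed to `kubectl apply -f -`.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": items}, indent=2))
```

The output could then be applied on the upstream cluster with `kubectl apply -f -`; the token for step 3 would still have to be created for the service account separately.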

Future Design for Registration?

What does future automation look like? Scenarios

@p-se p-se self-assigned this Nov 17, 2023
@manno manno moved this from 📋 Backlog to 🏗 In progress in Fleet Nov 24, 2023
@manno manno modified the milestones: 2024-Q1-2.8x, v2.9.0 Nov 27, 2023
@manno
Member Author

manno commented Dec 18, 2023

Quick notes:

https://github.com/manno/fleet/tree/registration2.0-spike

  • Registration always starts with manager by creating resources there
  • How to update clients?
    • Pull (bundle / manageagent.go)
    • Push (force redeploy / import.go)
  • Reduce created RBAC resources per cluster
  • how would fleetcluster.go use reg2?

@manno
Member Author

manno commented Dec 21, 2023

Let's collect some final notes and the scripts and finish this spike. We'll need a design doc for the next stories.

@p-se
Contributor

p-se commented Jan 11, 2024

Motivation

Research the possibility of unmanaged agents.

Status

These changes introduce a way to disable agent management in both the fleet
controller and the fleet agent, in order to test the deployment of an unmanaged agent.

An unmanaged agent is an agent that is created without ever having to communicate
with the fleet controller. It is created without the fleet controller but is
considered valid by the fleet controller and works just as well as a managed
agent. However, it does not have to connect to the fleet controller and can
deploy bundles locally.

  • Disabling agent management in the fleet controller means preventing the
    agentmanagement controller in the fleet controller from starting.

  • Disabling agent management in the fleet agent means preventing the agent from
    starting the initialization container. It also disables the validation of the
    apiServerURL, so that agents can be created that do not have to connect to
    any fleet controller, which is the basic idea behind unmanaged agents. Such
    agents should be able to deploy locally provided bundles without having to
    connect to any fleet controller.

For this purpose two files in dev/ have been created:

  • ./dev/test-unmanaged-agent-registration
  • ./dev/create-fleet-secret # this is being used in the former script

Note: Deployment of a bundle has not yet been tested successfully.

Setup

./dev/test-unmanaged-agent-registration

Description

The dev/create-fleet-secret script takes care of creating the necessary resources
for the registration of a fleet agent without involving a fleet agent. On the
upstream cluster those are:

  • Creating namespaces

    • fleet-default
    • cluster-fleet-default-foo-1a3d67d0a899
  • Static RBAC resources (once per cluster)

    • ClusterRole
      • fleet-bundle-deployment
      • fleet-content
  • Cluster resource

    • Role
    • RoleBinding
    • ClusterRoleBinding
    • ServiceAccount
    • Secret

    There is some potential for optimization here: it would probably be better
    if the role were reusable and did not have to be recreated for every
    downstream cluster. But that is not the point of this PR, and it may not be
    the point of the succeeding work either, since we want to preserve
    compatibility with the current implementation until the successor has
    completely taken over.

On the downstream cluster the script creates a Secret object which contains a
valid kubeconfig for the upstream cluster as well as

  • clusterName
  • clusterNamespace and
  • deploymentNamespace.
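
To make the shape of that downstream Secret concrete, here is a minimal sketch that builds such a manifest. The keys clusterName, clusterNamespace and deploymentNamespace come from the description above; the `kubeconfig` key name, the secret name `fleet-agent`, and the namespace `cattle-fleet-system` are assumptions for illustration, so check the linked gist and the dev/create-fleet-secret script for the real values.

```python
import base64
import json


def b64(value):
    """Base64-encode a string, as required for Secret `data` values."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def agent_secret(
    kubeconfig,
    cluster_name,
    cluster_namespace,
    deployment_namespace,
    secret_name="fleet-agent",  # assumed secret name
    secret_namespace="cattle-fleet-system",  # assumed agent namespace
):
    """Build the downstream Secret manifest as a plain dict."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "type": "Opaque",
        "metadata": {"name": secret_name, "namespace": secret_namespace},
        "data": {
            "kubeconfig": b64(kubeconfig),  # assumed key name
            "clusterName": b64(cluster_name),
            "clusterNamespace": b64(cluster_namespace),
            "deploymentNamespace": b64(deployment_namespace),
        },
    }


if __name__ == "__main__":
    secret = agent_secret(
        kubeconfig="<kubeconfig for the upstream cluster>",  # placeholder
        cluster_name="foo",
        cluster_namespace="fleet-default",
        deployment_namespace="cluster-fleet-default-foo-1a3d67d0a899",
    )
    print(json.dumps(secret, indent=2))
```

The JSON output could be applied on the downstream cluster with `kubectl apply -f -` before the agent is installed.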

https://gist.github.com/p-se/909b02db4d999afa405a644864022f6c

@p-se p-se closed this as completed Jan 11, 2024
@manno manno moved this from 🏗 In progress to ✅ Done in Fleet Jan 11, 2024