Prow: Alerting #930

Open
lentzi90 opened this issue Dec 13, 2024 · 1 comment
Labels
triage/accepted Indicates an issue is ready to be actively worked on.

Comments

@lentzi90
Member

Split out from #896

Once #896 is done, we will have the basics in place to define alerts. This issue tracks that work.
Define some basic alerts (or use the ones that come with kube-prometheus) and set up receivers for Slack and/or email.
Useful alerts would be, for example:

  • generic Kubernetes issues like unhealthy nodes (see this)
  • unhealthy workloads (except tests, which can be expected to fail), for example, prow,
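
To make the two alert ideas above a bit more concrete, here is a rough sketch that builds a `PrometheusRule` manifest (the CRD that kube-prometheus uses for rules) and prints it as JSON, which `kubectl apply -f` also accepts. Everything specific in it is an assumption for illustration: the alert names, the `prow` and `monitoring` namespaces, the `for` durations, and the severities. The expressions rely on the kube-state-metrics series `kube_node_status_condition`, `kube_deployment_spec_replicas` and `kube_deployment_status_replicas_available`. The rules bundled with kube-prometheus may already cover the node case, so this is not meant as the final set.

```python
import json


def alert_rule(name, expr, duration, severity, summary):
    """Build one Prometheus alerting rule as a plain dict."""
    return {
        "alert": name,
        "expr": expr,
        "for": duration,
        "labels": {"severity": severity},
        "annotations": {"summary": summary},
    }


def prometheus_rule(name, namespace, rules):
    """Wrap alerting rules in a PrometheusRule custom resource."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"groups": [{"name": f"{name}.rules", "rules": rules}]},
    }


rules = [
    # Generic Kubernetes issue: a node that is not Ready.
    alert_rule(
        "NodeNotReady",  # hypothetical alert name
        'kube_node_status_condition{condition="Ready",status="true"} == 0',
        "15m",  # assumed duration
        "warning",
        "Node {{ $labels.node }} has not been Ready for 15 minutes",
    ),
    # Unhealthy workload: only Deployments in the (assumed) "prow"
    # namespace are matched, so failing test pods do not trigger it.
    alert_rule(
        "ProwDeploymentReplicasUnavailable",  # hypothetical alert name
        'kube_deployment_status_replicas_available{namespace="prow"}'
        ' < kube_deployment_spec_replicas{namespace="prow"}',
        "10m",  # assumed duration
        "critical",
        "Deployment {{ $labels.deployment }} has unavailable replicas",
    ),
]

# "prow-basic-alerts" and "monitoring" are placeholder names.
manifest = prometheus_rule("prow-basic-alerts", "monitoring", rules)
print(json.dumps(manifest, indent=2))
```

Routing these alerts to Slack or email would be handled separately in the Alertmanager configuration and is not shown here.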
@metal3-io-bot metal3-io-bot added the needs-triage Indicates an issue lacks a `triage/foo` label and requires one. label Dec 13, 2024
@lentzi90
Member Author

/triage accepted

@metal3-io-bot metal3-io-bot added triage/accepted Indicates an issue is ready to be actively worked on. and removed needs-triage Indicates an issue lacks a `triage/foo` label and requires one. labels Dec 13, 2024