Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

webhook: add options to disable resource_namespace tag in metrics #2931

Merged

Conversation

zhouhaibing089
Copy link
Contributor

To add some context, historically, resource_name was removed from this tag list due to its high potential of causing high metrics cardinality. See knative/pkg#1464 for more information.

While that's great, but it might not be sufficient for large scale use cases where namespaces can be super dynamic (with generateName, too) or grows fase enough. There is an issue report from
tektoncd/pipeline#3171 which talks about this.

This proposal makes it possible to disable resource_namespace tag via an option function. The default behavior is not changed, so no user impact if any of existing users rely on this tag. There is no API contract change either due to the beauty of variadic functions.

Now downstream projects can consume this by override StatsReporter in webhook context options with their own preference. As a caveat here, if downstream project does choose to override StatsReporter, the default ReportMetrics function shouldn't be called by default as they may now have a different set of tag keys to report. As such, this function is only called if the default StatsReporter is used.

webhook: add options to disable resource_namespace tag in metrics

Copy link

knative-prow bot commented Jan 10, 2024

Welcome @zhouhaibing089! It looks like this is your first PR to knative/pkg 🎉

@knative-prow knative-prow bot requested review from kauana and Leo6Leo January 10, 2024 18:33
@knative-prow knative-prow bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 10, 2024
Copy link

knative-prow bot commented Jan 10, 2024

Hi @zhouhaibing089. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dprotaso
Copy link
Member

/hold we have a release in the next week

Will revisit this after

@knative-prow knative-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 14, 2024
@dprotaso
Copy link
Member

/ok-to-test

@knative-prow knative-prow bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jan 14, 2024
@zhouhaibing089
Copy link
Contributor Author

/retest

(The failed tests reported seem to be fine locally)

@dprotaso
Copy link
Member

(The failed tests reported seem to be fine locally)

They're consistent in CI - try using a docker container with ubuntu?

@zhouhaibing089 zhouhaibing089 force-pushed the options-disable-namespace-tag branch from b85215c to 57ecce6 Compare January 18, 2024 17:16
@knative-prow knative-prow bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 18, 2024
Copy link

codecov bot commented Jan 18, 2024

Codecov Report

Attention: Patch coverage is 95.74468% with 4 lines in your changes are missing coverage. Please review.

Project coverage is 78.75%. Comparing base (6b13f01) to head (46a2e06).
Report is 20 commits behind head on main.

Files Patch % Lines
webhook/stats_reporter.go 95.69% 2 Missing and 2 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2931      +/-   ##
==========================================
+ Coverage   78.68%   78.75%   +0.07%     
==========================================
  Files         188      188              
  Lines       11051    11107      +56     
==========================================
+ Hits         8695     8747      +52     
- Misses       2092     2094       +2     
- Partials      264      266       +2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@zhouhaibing089
Copy link
Contributor Author

Kindly ping, may I get some attention here?

@skonto
Copy link
Contributor

skonto commented Mar 6, 2024

Hi @zhouhaibing089 sorry for the delay, just to clarify here we have two changes:
a) avoid resourceNamespaceKey if someone sets a custom reporter using WithoutResourceNamespace() (in Webhook options)
b) register metrics only when the default reporter is used.

I think it would be more flexible if we can exclude an arbitrary tag key instead of having if statements like:

if r.opts.resourceNamespace {
	mutators = append(mutators, tag.Insert(resourceNamespaceKey, req.Namespace))
}

We could use:

type options struct {
	tagsToExclude    []tag.Key
}

wdyth?

@zhouhaibing089 zhouhaibing089 force-pushed the options-disable-namespace-tag branch 2 times, most recently from 707ab02 to bb36b9d Compare March 6, 2024 22:17
@zhouhaibing089
Copy link
Contributor Author

I think it would be more flexible if we can exclude an arbitrary tag key..

Thanks. I completely agree (PR updated with your suggestion).

webhook/stats_reporter.go Outdated Show resolved Hide resolved
webhook/webhook_test.go Outdated Show resolved Hide resolved
@zhouhaibing089 zhouhaibing089 force-pushed the options-disable-namespace-tag branch 2 times, most recently from d3a896c to cb50ed6 Compare March 10, 2024 06:45
@zhouhaibing089
Copy link
Contributor Author

/retest

webhook/stats_reporter.go Outdated Show resolved Hide resolved
webhook/stats_reporter.go Outdated Show resolved Hide resolved
webhook/stats_reporter.go Outdated Show resolved Hide resolved
webhook/stats_reporter.go Outdated Show resolved Hide resolved
webhook/stats_reporter.go Outdated Show resolved Hide resolved
@dprotaso
Copy link
Member

Overall looks great thanks for the changes - some minor stuff

@zhouhaibing089 zhouhaibing089 force-pushed the options-disable-namespace-tag branch 3 times, most recently from b7ac4a9 to e153848 Compare March 31, 2024 23:15
To add some context, historically, `resource_name` was removed from this
tag list due to its high potential of causing high metrics cardinality.
See [knative#1464][1] for more information.

While that's great, but it might not be sufficient for large scale use
cases where namespaces can be super dynamic (with generateName, too) or
grows fase enough. There is an issue report from
[tektoncd/pipeline#3171][2] which talks about this.

This proposal makes it possible to disable `resource_namespace` tag via
an option function. The default behavior is not changed, so no user
impact if any of existing users rely on this tag. There is no API
contract change either due to the beauty of variadic functions.

Now downstream projects can consume this by override `StatsReporter` in
webhook context options with their own preference. As a caveat here, if
downstream project does choose to override `StatsReporter`, the default
`ReportMetrics` function shouldn't be called by default as they may now
have a different set of tag keys to report. As such, this function is
only called if the default `StatsReporter` is used.

[1]: knative#1464
[2]: tektoncd/pipeline#3171
@zhouhaibing089 zhouhaibing089 force-pushed the options-disable-namespace-tag branch from e153848 to 927e9ff Compare March 31, 2024 23:45
@dprotaso
Copy link
Member

dprotaso commented Apr 1, 2024

/hold cancel

@knative-prow knative-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 1, 2024
Copy link
Member

@dprotaso dprotaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the changes - just some minor nil checks left

injection/sharedmain/main.go Show resolved Hide resolved
webhook/webhook.go Show resolved Hide resolved
There are two ways to customize StatsReporter:

1. Use a whole new StatsReporter implementation.
1. Or pass Option funcs to customize the default StatsReporter.

Option 1 is less practical at this time due to the metrics registration
conflict. `webhook.RegisterMetrics()` is called regardless which
StatsReporter implementation is used (which is a problem by itself). The
second option is more practical since it works well without dealing with
metrics conflicts.

The `webhook.Option` in particular allows people to discard certain
metrics tags.
@zhouhaibing089 zhouhaibing089 force-pushed the options-disable-namespace-tag branch from 927e9ff to 46a2e06 Compare April 1, 2024 18:39
@dprotaso
Copy link
Member

dprotaso commented Apr 1, 2024

/lgtm
/approve

thanks @zhouhaibing089 🎉

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Apr 1, 2024
Copy link

knative-prow bot commented Apr 1, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprotaso, zhouhaibing089

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 1, 2024
@knative-prow knative-prow bot merged commit 03bf3de into knative:main Apr 1, 2024
39 checks passed
@zhouhaibing089 zhouhaibing089 deleted the options-disable-namespace-tag branch April 1, 2024 20:36
zhouhaibing089 added a commit to zhouhaibing089/pkg that referenced this pull request Apr 2, 2024
This is a followup from [knative#2931][1]. RegisterMetrics may
register unwanted metrics if the default stats reporter is not used.

[1]: knative#2931
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants