Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

h2: Rapid reset mitigations (7.3) #4011

Merged
merged 29 commits into from
Oct 24, 2023
Merged

Conversation

dridi
Copy link
Member

@dridi dridi commented Oct 18, 2023

Port of #3997, #3998, #3999 and adjacent commits to the 7.3 branch.

One trivial conflict, and a more consequential conflict for the lack of $Restrict in this branch.

nigoroll and others added 19 commits October 18, 2023 18:50
This adds parameters h2_rst_allowance and h2_rst_allowance_period,
which govern the rate of which we allow clients to reset h/2 streams.

If the limit is exceeded the connection is closed.

Mitigates: varnishcache#3996
Only RST frames received earlier than this duration will be considered
rapid.
It was particularly hard to follow once we reach client c3.
The goal is for top-level transports to report whether the client is
still present or not.
Once a client is reportedly gone, processing its VCL task(s) is just a
waste of resources. The execution of client-facing VCL is intercepted
and an artificial return(fail) is returned in that scenario.

Thanks to the introduction of the universal return(fail) proper error
handling and resource tear down is already in place, which makes this
change safe modulus unknown bugs. This adds a circuit breaker anywhere
in the client state machine where there is VCL execution.

A new Reset time stamp is logged to convey when a task does not complete
because the client is gone. This is a good complement to the walk away
feature and its original circuit breaker for the waiting list, but this
has not been integrated yet.

While the request is technically failed, it won't increase the vcl_fail
counter, and a new req_reset counter is incremented. This new behavior
is guarded by a new vcl_req_reset feature flag, enabled by default.

Refs varnishcache#3835
Refs 61a15cb
Refs e5efc2c
Refs ba54dc9
Refs 6f50a00
Refs b881699

Conflicts:
	include/tbl/feature_bits.h
The error check is not performed in a critical section to avoid
contention, at the risk of not seeing the error until the next
transport poll.
With varnishcache#3998 we need to ensure streams are not going to skip vcl_recv if
reset faster than reaching this step for the request task.

The alternative to prevent the vcl_req_reset feature from interfering
is to simply disable it.
Noticed while porting varnishcache#3998 to the 6.0 branch with a varnishtest more
sensitive to timing.
This will allow per-session adjustments and also significantly
lower the risk of inconsistent calculations in the rate limit
code during parameter changes.

Ref varnishcache#3996
Conflicts:
	vmod/automake_boilerplate_h2.am
	vmod/vmod_h2.vcc
we can not make the parameter const because API.
Copy link
Member

@nigoroll nigoroll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally approve, the missing documentation detail is minor

vmod/vmod_h2.vcc Show resolved Hide resolved
bsdphk and others added 6 commits October 23, 2023 19:18
Control characters will be caught by vct_ishdrval() anyways, but this
condition would also reject allowed obs-text non-ASCII characters.

Signed-off-by: Dridi Boukelmoune <[email protected]>
walid-git and others added 2 commits October 23, 2023 19:18
Adds coverage for tab characters at start/end of field value.

Regarding the "fo o" " bar" header, it cumulates an error in the name
and another in the value, but only one of them will trigger the expected
PROTOCOL_ERROR. Only the invalid "fo o" is checked now, and the other
error is part of the new coverage.

Signed-off-by: Dridi Boukelmoune <[email protected]>
@dridi
Copy link
Member Author

dridi commented Oct 23, 2023

Added patch series containing the patch requested in #3996 (comment).

@dridi dridi merged commit a6223de into varnishcache:7.3 Oct 24, 2023
1 check passed
@dridi dridi deleted the rapid_reset_7.3 branch October 24, 2023 15:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants