expire: Force a cache MISS when req.ttl = 0s #4041

AlveElde · 2024-01-18T15:12:54Z

This commit changes the undefined behavior of setting req.ttl = 0s. Previously, setting req.ttl = 0s caused req.ttl to have no effect, which is inconsistent with setting req.grace = 0s, which does have an effect.

The use case is to achieve hash_always_miss semantics but with coalescing (waitinglist) (edit @nigoroll during bugwash)

The new behavior of setting req.ttl = 0s is similar to setting req.hash_always_miss = true; the main difference is that req.ttl allows for request coalescing. This is useful for short-lived non-private objects.

I would like to define this behavior in the documentation somewhere, but didn't find a good place to put it.

Both req.ttl and req.grace are initialized to -1, which is why this change does not break everything. But setting req.ttl/grace to -1s in VCL ends up setting them to 0s, as these two req attributes have extra logic in their respective definitions:

varnish-cache/bin/varnishd/cache/cache_vrt_var.c

Lines 566 to 569 in 5f67d3a

    
    REQ_VAR_L(ttl, d_ttl, VCL_DURATION, if (!(arg>0.0)) arg = 0;)
    REQ_VAR_R(ttl, d_ttl, VCL_DURATION)
    REQ_VAR_L(grace, d_grace, VCL_DURATION, if (!(arg>0.0)) arg = 0;)
    REQ_VAR_R(grace, d_grace, VCL_DURATION)
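
For illustration, the extra logic passed to REQ_VAR_L can be read in isolation as a clamp. The sketch below is a standalone approximation, not varnishd code: clamp_duration() is a hypothetical name, and plain double stands in for VCL_DURATION. Because the condition is written as a negated comparison, it maps NAN to zero as well as negative values, since every ordered comparison with NAN is false.

```c
#include <assert.h>
#include <math.h>

/*
 * Hypothetical stand-in for the snippet given to REQ_VAR_L:
 *     if (!(arg>0.0)) arg = 0;
 * Anything that is not strictly positive becomes zero. That covers
 * negative values, zero itself, and NAN (arg > 0.0 is false for NAN).
 */
double
clamp_duration(double arg)
{
	if (!(arg > 0.0))
		arg = 0;
	/* After the clamp the value is a non-negative number, never NAN. */
	assert(arg >= 0.0);
	return (arg);
}
```

With this logic, setting -1s from VCL stores 0s, which is why the initial value of -1 cannot be restored through assignment.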

Since this commit effectively prevents VCL writers from "clearing" req.ttl by setting it to 0s, maybe we should consider allowing users to clear req.ttl/grace by setting them to -1s?

dridi · 2024-01-18T16:14:22Z

> Since this commit effectively prevents VCL writers from "clearing" req.ttl by setting it to 0s, maybe we should consider allowing users to clear req.ttl/grace by setting them to -1s?

Maybe this is for the best? What about this instead?

unset req.ttl;
unset req.grace;

nigoroll · 2024-01-22T15:02:13Z

My take: Integrate Dridi's suggestion and update the docs, then it's a good improvement!

dridi · 2024-01-25T14:20:50Z

A thought occurred to me: shouldn't we use NAN when req.{ttl,grace} are unset?

hermunn · 2024-01-31T09:20:09Z

The problem with this PR is that it does not do what the title says. When there is a waiting list, that is, a queue of clients waiting for an object, a newly inserted object will be "fresh" even if the waiting-list clients all have req.ttl = 0s; and req.grace = 0s;. These clients will therefore get hits on the inserted object (the one they were waiting for). This should be documented in a test case. Of course, the user may or may not want this behavior, which is a problem.

The question is whether we should special-case req.ttl = 0s; and req.grace = 0s; to mean "I don't want a hit, grace or not, but do go on the waiting list if there is a busy object, and do grab the stale_oc if you find something". I don't like the idea of adding a new symbol for this behavior, the way we have hash_always_miss and hash_ignore_busy, but it might be the "most correct" solution.

dridi · 2024-01-31T09:34:07Z

I think it would be more accurate to say "consider all objects stale when req.ttl = 0s".

Aren't we creating a perpetual waiter if the client disregards both TTL and grace? That is, when there is a busy object, the outcome will always be to disembark the request. So until the request wins the miss race, it will compete with other "req.ttl = req.grace = 0s" requests.

This is waiting list serialization once again, except that we swapped beresp with req to trigger it. I can try combining this change with #4032 to check whether it would eventually be mitigated.

dridi · 2024-01-31T10:00:49Z

> Aren't we creating a perpetual waiter if the client disregards both TTL and grace?

We are not, since we treat zero as the lack of a TTL limit for the client. Let's revisit that specific point once unset is implemented and we treat NAN as the lack of limit instead.

hermunn · 2024-01-31T10:14:39Z

> This is waiting list serialization once again

Yeah, you are right, my comment does not add much, and I think we should simply go with the PR as it is now, with all the timestamp confusion that exists.

hermunn · 2024-01-31T10:16:04Z

Sorry, strike that, I still think we need a test case demonstrating how you can get a hit, even when the request set both req.ttl and req.grace to 0s because of the waiting list and timestamps.

hermunn

I have looked at the test case again, and I think it does what it should do. I think we can merge this.

AlveElde · 2024-01-31T10:56:20Z

If you, like me, were wondering why the test case provided does not return a grace hit, this is the reason: #2422. The object in cache is still fresh, and since we ignore req.ttl, we get no candidates for a grace hit. Whether this is the correct behavior is a subject for future discussion.

dridi · 2024-01-31T11:21:27Z

I think the trade-off from #2422 can be let go in favor of the trade-off from #4032. You should end up with cache hits again, for a different reason.

nigoroll

Overall, I am happy, but the one question I have might be relevant.

nigoroll · 2024-01-31T11:58:26Z

bin/varnishd/cache/cache_vrt_var.c

+	ctx->req->elem = val;						\
+	ctx->req->elem = val;						\
+}
+
 REQ_VAR_R(backend_hint, director_hint, VCL_BACKEND)

 REQ_VAR_L(ttl, d_ttl, VCL_DURATION, if (!(arg>0.0)) arg = 0;)


Do we still want to set negative arguments to zero? Could it be a better option to fail?

IMO it's reasonable to expect that req.ttl can be set to a negative duration, we even do so in a test case:

varnish-cache/bin/varnishtest/tests/v00020.vtc

Line 150 in e0b164f

set req.ttl = -1s;

Failing on a negative duration has a good chance of breaking existing VCL.

That is a coverage test.
What are the semantics of a negative duration for req.ttl and req.grace?

The same as beresp.ttl and beresp.grace:

varnish-cache/bin/varnishd/cache/cache_vrt_var.c

Lines 743 to 744 in e0b164f

if (a < 0.0) \
	a = 0.0; \

I understand that this "has always been like that", but also please remember that for quite some time, varnish had no way of properly failing at runtime (VRT_fail()).

The particular lines you are referring to are from 10 years ago and I am still wondering what a negative ttl/grace is supposed to mean. ttl: already expired before it even went into the cache? If those were the semantics, then we would not be allowed to take the respective objects as "just refreshed". grace: I would think a negative value should move an object into grace mode earlier than ttl.

So, as we come across these questions, we should answer them.

I totally agree that "has always been like that" is a weak argument, but I personally prefer to limit breakage if I can. It's not obvious to me what a negative TTL/grace should mean, and I think that question is bigger than the scope of this PR.

Let's try to cover this quickly during the next bugwash. I am also fine with continuing to ignore negative values, but I'd like to at least ask for opinions.

I think the point here is to cap the TTL, and once we run out of TTL the object goes in grace mode if applicable, so once we run out of TTL req.grace applies.

This is why in my opinion req.ttl should only limit the ability to look up a fresh object, and that negative values that could be the result of some arithmetic expression should be bumped to zero.

@dridi maybe I do not understand your answer, but why should we not fail immediately if an attempt is made to set any of the timers to a negative value?

If you use an expression instead of a constant to compute the acceptable TTL limit and the result is negative, I think it would be too harsh to fail the transaction.

If your "clever" req.ttl formula can yield something negative, then negative should be considered as allowing no TTL, just like zero.

AlveElde · 2024-01-31T15:58:02Z

I force pushed for two reasons:

Squash the squashme commit
Drop the commit changing the default value of req.ttl and req.grace to NAN

CCI alerted me to the fact that a VTC was failing due to a NAN being converted to string in the following line:

set resp.http.X-req-grace = req.grace;

Unless we want to convert NAN to some number when reading req.ttl and req.grace, I suggest we stick with -1 as the default value.

nigoroll · 2024-01-31T16:38:43Z

> Unless we want to convert NAN to some number when reading req.ttl and req.grace, I suggest we stick with -1 as the default value.

I think we should print NAN

dridi · 2024-03-05T09:05:24Z

This pull request didn't strictly need #4069 to move forward, but #4069 cemented the semantics for NAN in a vtim_dur. We should use that for req.ttl and req.grace now.

AlveElde · 2024-07-15T19:01:26Z

Sorry for the noise, ready for review

dridi

LGTM, see small suggestions too.

edit: and conflicts to resolve.

dridi · 2024-09-16T13:17:00Z

bin/varnishd/cache/cache_vrt_var.c

+	ctx->req->elem = val;						\
+	ctx->req->elem = val;						\
+}
+
 REQ_VAR_R(backend_hint, director_hint, VCL_BACKEND)

 REQ_VAR_L(ttl, d_ttl, VCL_DURATION, if (!(arg>0.0)) arg = 0;)


I think the point here is to cap the TTL, and once we run out of TTL the object goes in grace mode if applicable, so once we run out of TTL req.grace applies.

This is why in my opinion req.ttl should only limit the ability to look up a fresh object, and that negative values that could be the result of some arithmetic expression should be bumped to zero.

dridi · 2024-09-16T13:17:04Z

doc/sphinx/reference/vcl_var.rst

 	During lookup the minimum of req.grace and the object's stored
-	grace value will be used as the object's grace.
+	grace value will be used as the object's grace. Setting req.grace
+	to 0s prevents hits on stale objects.


Maybe now would be a good time to document that negative values are swallowed.

dridi · 2024-09-16T13:18:12Z

doc/sphinx/reference/vcl_var.rst

+	Unsetable from: client

 	Upper limit on the object age for cache lookups to return hit.
+	Lookups always miss when req.ttl is set to 0s.


Maybe now would be a good time to document that negative values are swallowed.

If. Or document that they are invalid and will trigger a failure.

dridi · 2024-09-16T13:19:36Z

bin/varnishd/cache/cache_expire.c

-	if (req != NULL && req->d_ttl >= 0. && req->d_ttl < r)
+	if (req != NULL && !isnan(req->d_ttl) && req->d_ttl < r)
 		r = req->d_ttl;


assert(req->d_ttl >= 0.); ?

dridi · 2024-09-16T13:19:48Z

bin/varnishd/cache/cache_expire.c

-	if (req != NULL && req->d_grace >= 0. && req->d_grace < g)
+	if (req != NULL && !isnan(req->d_grace) && req->d_grace < g)
 		g = req->d_grace;


assert(req->d_grace >= 0.); ?
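
As a rough illustration of the two hunks above, the capping logic with NAN as the "unset" marker can be sketched on its own. This is a simplified approximation, not the cache_expire.c code: cap_limit() is a hypothetical helper, plain doubles replace the request and object fields, and the req != NULL check is left out. The assertion mirrors the one suggested in the review and assumes the setters have already clamped negative values to zero.

```c
#include <assert.h>
#include <math.h>

/*
 * Hypothetical helper: obj_limit is the object's stored ttl or grace,
 * req_limit is req.ttl or req.grace, where NAN means "unset".
 */
double
cap_limit(double obj_limit, double req_limit)
{
	if (isnan(req_limit))
		return (obj_limit);	/* unset: the request imposes no cap */
	assert(req_limit >= 0.);	/* assumed: setters clamp negatives */
	if (req_limit < obj_limit)
		return (req_limit);	/* the request tightens the limit */
	return (obj_limit);
}
```

Under this sketch a request limit of 0s caps the result at zero, while an unset limit leaves the object's own value untouched.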

dridi · 2024-09-16T13:23:48Z

bin/varnishd/cache/cache_vrt.c

+	if (isnan(num))
+		return (WS_Printf(ctx->ws, "NAN"));


Why allocate a static string from the workspace instead of returning a static string?

how could my efficiency fetish miss this ;)
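
The point about static strings can be sketched outside of varnishd. The example below is only an illustration under assumptions: duration_to_string() is a hypothetical function, a caller-provided buffer stands in for the workspace, and the "%.3f" format is chosen for the example rather than taken from the source. For NAN it returns a string literal, so nothing is written or allocated.

```c
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical illustration: format a duration into buf, except for
 * NAN, where a string literal with static storage is returned and the
 * buffer is left untouched.
 */
const char *
duration_to_string(double num, char *buf, size_t len)
{
	if (isnan(num))
		return ("NAN");		/* no buffer space consumed */
	assert(buf != NULL && len > 0);
	snprintf(buf, len, "%.3f", num);
	return (buf);
}
```

The caller cannot tell the two cases apart by type, so the returned pointer has to be treated as read-only in both.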

gquintard · 2024-09-23T19:58:58Z

doc/sphinx/reference/vcl_var.rst

@@ -510,8 +512,10 @@ req.ttl

 	Writable from: client

+	Unsetable from: client


Suggested change

-	Unsetable from: client
+	Unsettable from: client

AlveElde · 2024-10-04T14:31:54Z

I think I have addressed all review comments except @nigoroll's request to fail the transaction when either req.ttl or req.grace is set to a negative value. The patch series currently treats negative values as 0s. It is not obvious to me that we should fail when a negative value is set, but if that gets the PR merged, I'm fine with it ;)

dridi

LGTM!

I think you could squash the 4th commit in the 2nd and move the result last to reduce diff churn.

dridi · 2024-10-04T15:54:09Z

@nigoroll the sanitizer job is failing on a legit leak in the VDP stack unrelated to this change.

AlveElde · 2024-10-08T09:05:43Z

Squashed ✔️

hermunn

I think this is a good change which represents a necessary and positive improvement of the code. The heading in the first commit might be slightly misleading (the waiting list is an option, not only a miss), but I will leave it to the author to decide if this should be amended.

This commit changes the undefined behavior of setting req.ttl = 0s. Previously, setting req.ttl = 0s caused req.ttl to have no effect, which is inconsistent with setting req.grace = 0s, which does have an effect. The new behavior of setting req.ttl = 0s is similar to setting req.hash_always_miss = true; the main difference is that req.ttl allows for request coalescing. This is useful for short-lived non-private objects. This aligns the behavior of setting req.ttl = 0s with req.grace = 0s.

req.ttl and req.grace are NAN when unset

nigoroll · 2024-11-18T14:54:06Z

@AlveElde @dridi pointed out that I pushed a change which created a merge conflict with this PR, and I am sorry for having done that unintentionally. I see no semantic conflict, though: whatever the value of "unset" is, we can have it with or without this PR.
That said, I take review of this ticket as homework for next week.

hermunn approved these changes Jan 31, 2024

nigoroll requested changes Jan 31, 2024

AlveElde force-pushed the req-ttl-0s branch from dadd3f1 to 6a108fd Compare January 31, 2024 15:51

nigoroll added the a=Bugwash Today label Jan 31, 2024

dridi mentioned this pull request Feb 9, 2024

Setting backend timeouts to zero does not wait forever #3045

Closed

nigoroll removed the a=Bugwash Today label Jul 15, 2024

AlveElde force-pushed the req-ttl-0s branch 3 times, most recently from b433ea1 to 7309008 Compare July 15, 2024 18:56

dridi approved these changes Sep 16, 2024

dridi reviewed Sep 16, 2024

gquintard reviewed Sep 23, 2024

AlveElde force-pushed the req-ttl-0s branch from 7309008 to 6f9f43f Compare October 4, 2024 14:23

dridi approved these changes Oct 4, 2024

AlveElde force-pushed the req-ttl-0s branch from 6f9f43f to ecf9223 Compare October 8, 2024 09:04

hermunn reviewed Oct 8, 2024

hermunn reviewed Oct 8, 2024

AlveElde added 3 commits October 9, 2024 11:42

cache_vrt: Print NAN when converting NAN to string

596d08a

vrt_var: Make req.ttl and req.grace unsettable

b8a4e18

req.ttl and req.grace are NAN when unset

AlveElde force-pushed the req-ttl-0s branch from ecf9223 to b8a4e18 Compare October 9, 2024 09:45
