Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Throttle config updates #66

Open
L3tum opened this issue Mar 25, 2021 · 4 comments
Open

Throttle config updates #66

L3tum opened this issue Mar 25, 2021 · 4 comments
Labels
enhancement New feature or request help-wanted Extra attention is needed

Comments

@L3tum
Copy link

L3tum commented Mar 25, 2021

Is your feature request related to a problem? Please describe.
Aggressive autoscaling can trigger config updates multiple times rendering Varnish unable to respond.

Describe the solution you'd like
Have a configurable grace period before issuing a config update and discard old ones if another change was registered.

Describe alternatives you've considered
Less aggressive autoscaling or just living with it, but both don't seem that hot.

Additional context
We have pretty aggressive autoscaling since we have to get from 0-10000 Requests per Second in 2 Minutes. However, because that autoscaling is so aggressive, new varnish instances keep coming up, which results in config updates. Each config update adds a bit of latency, and if a lot of varnishes go online in a short period, the latency can shoot up to ~1s, while Varnish itself can serve the request in ~3ms.

@L3tum L3tum added the enhancement New feature or request label Mar 25, 2021
@mittwald-machine
Copy link
Collaborator

There has not been any activity to this issue in the last 14 days. It will automatically be closed after 7 more days. Remove the stale label to prevent this.

@martin-helmich
Copy link
Member

Sorry for the late reaction! That's a great suggestion and would certainly be useful in highly dynamic environments. 👍 I cannot make any promises as to when (or even if) any of us might get on this, but in the meantime I'd happy review any PR coming my way concerning this feature.

@martin-helmich martin-helmich added the help-wanted Extra attention is needed label Apr 9, 2021
@L3tum
Copy link
Author

L3tum commented Apr 11, 2021

Hey, that's good to hear! I'm not really the Go expert, but I'll see if I can take a jab at it.
Does Go offer any niceties for this behaviour? In C# for example I'd start a background thread that is sleeping and checking on a shared object every ~100ms (or more or less often, depending how well this works) and the endpoint listener would then update that shared object with the new endpoint configuration (and maybe signal the background thread with a boolean that it was updated).

@martin-helmich
Copy link
Member

I think the best point to start would probably be the VarnishController.watchConfigUpdated and/or VarnishController.rebuildConfig functions. The former contains the main loop which single-threadedly joins the various configuration updates from multiple sources and would probably be a reasonable place to introduce a rate-limiting of config updates, whereas the latter contains the actual VCL update logic.

func (v *VarnishController) watchConfigUpdates(ctx context.Context, c *exec.Cmd, errors chan<- error) {
i := 0
for {
i++
select {
case tmplContents := <-v.vclTemplateUpdates:
glog.Infof("VCL template was updated")
tmpl, err := template.New("vcl").Parse(string(tmplContents))
if err != nil {
errors <- err
continue
}
v.vclTemplate = tmpl
errors <- v.rebuildConfig(ctx, i)
case newConfig := <-v.frontendUpdates:
glog.Infof("received new frontend configuration: %+v", newConfig)
v.frontend = newConfig
if v.varnishSignaller != nil {
v.varnishSignaller.SetEndpoints(v.frontend)
}
errors <- v.rebuildConfig(ctx, i)
case newConfig := <-v.backendUpdates:
glog.Infof("received new backend configuration: %+v", newConfig)
v.backend = newConfig
errors <- v.rebuildConfig(ctx, i)
case <-ctx.Done():
errors <- ctx.Err()
return
}
}
}

For the actual rate limiting, a structure like rate.Limiter looks like it might do the job. However, any implemented solution should assert that...

  1. ...it doesn't block the upstream config updates. For that reason, simply introducing a time.Sleep in the main loop won't work, as that will clog the channels that feed into that function, and also the goroutines/threads that write into those.
  2. ...it doesn't drop any configuration updates entirely (i.e. using the ratelimiter to arbitrarily drop configuration updates when the rate limit is exceeded), since there's no guarantee that there'll ever be a subsequent configuration update within an acceptable time frame (which might result in new configuration not being applied at all). Luckily, the updated configuration will always be snapshotted within the VarnishController itself (since all updated structures are copied there within the main loop, which should thus remain unthrottled), meaning that even when a rebuildConfig call is delayed, it will have a current config when eventually executed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help-wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

3 participants