VR Healthcheck Fails when a VPC is created, but there are no VMs created in the VPC yet. #10088

Open
btzq opened this issue Dec 11, 2024 · 12 comments

btzq commented Dec 11, 2024

ISSUE TYPE
  • Bug Report
COMPONENT NAME
Virtual Router
CLOUDSTACK VERSION
4.19.1.1
CONFIGURATION
OS / ENVIRONMENT
SUMMARY

When a VPC is created, a VR is spawned.
But, when the VPC does not have any VMs created in it yet, the VR Healthcheck fails with the following error: webserver.service service down at last check

This gives lots of false alarms for the Cloud Operator, because we can't control whether or when a user creates a network or VM. The healthcheck failure should only fire if there really is an issue with the router.

STEPS TO REPRODUCE

EXPECTED RESULTS
- Create any VPC
- Create any subnet (Eg 1 Tier)
- Do not create any VM
- Monitor alerts, will see VR Healthcheck fail: webserver.service service down at last check 
ACTUAL RESULTS
- Should not fire healthcheck fail unless there is a real issue.
@btzq changed the title from "VR Healthcheck Fails when a VPC is created, but there are no network tiers and/or VMs created in the VPC yet." to "VR Healthcheck Fails when a VPC is created, but there are no VMs created in the VPC yet." on Dec 11, 2024
@DaanHoogland added this to the 4.19.2 milestone on Dec 11, 2024
@DaanHoogland
Contributor

@btzq, did you fix a healthcheck script for this?

@btzq
Author

btzq commented Dec 11, 2024

@DaanHoogland what do you mean by that?

@DaanHoogland
Contributor

> @DaanHoogland what do you mean by that?

Sorry for the hasty, vague question.

What is the health check you saw an error in?
Did you hack the applicable check-script to make it work/ignore the condition?

@weizhouapache
Member

It looks like "Expected results" and "Actual results" should be swapped.

webserver/apache2 (used for the metadata/userdata service) is not started if there are no VPC tiers.
It should not be checked in that case.
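
To make the suggested behaviour concrete, here is a minimal sketch of the guard, not the actual CloudStack check script. The function name `services_to_check` and the `tier_count` parameter are hypothetical; the service names are the ones mentioned in this thread.

```python
# Hypothetical sketch: skip the webserver check when the VPC has no tiers.
# This is NOT the real VR health-check code; names here are illustrative only.

def services_to_check(tier_count, monitored=("ssh", "dnsmasq", "apache2")):
    """Return the services a health check should inspect.

    apache2 only serves metadata/userdata once a tier exists, so it is
    left out of the check while the VPC has no tiers.
    """
    if tier_count == 0:
        return [name for name in monitored if name != "apache2"]
    return list(monitored)


if __name__ == "__main__":
    print(services_to_check(0))  # VPC without tiers
    print(services_to_check(1))  # VPC with one tier
```

With a guard like this, an empty VPC would still be checked for ssh and dnsmasq, and the webserver check would resume once the first tier is created.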

@DaanHoogland
Contributor

> webserver/apache2 (used for the metadata/userdata service) is not started if there are no VPC tiers. It should not be checked in that case.

Yes, it sounds like at least one of the check scripts could be improved. I am trying to figure out which one, though. (Maybe it is the start of the entire run from the Java code that should be inhibited.)

@btzq
Author

btzq commented Dec 15, 2024

@DaanHoogland no, we didn't make any changes to the healthcheck script.

Is there any workaround for us at the moment? We’re getting lots of false alarms.

@DaanHoogland
Contributor

@btzq, that would involve finding the right check script and adjusting it so that it does not raise an alarm, or disabling the specific check altogether.

@weizhouapache
Member

> @btzq, that would involve finding the right check script and adjusting it so that it does not raise an alarm, or disabling the specific check altogether.

Yes, there is a global setting, network.router.EnableServiceMonitoring, which defaults to true.
Try disabling it.
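
For anyone who wants to script this change instead of using the UI, here is a rough sketch that builds a signed `updateConfiguration` API URL. It follows the request-signing steps in the CloudStack developer documentation (sort the parameters, lowercase the query string, HMAC-SHA1 with the secret key, Base64), but it has not been run against a real management server. The endpoint, API key, and secret key below are placeholders, and the helper name is made up for this example.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def signed_update_configuration_url(endpoint, api_key, secret_key, name, value):
    """Build a signed URL for the updateConfiguration API call (sketch only)."""
    params = {
        "command": "updateConfiguration",
        "name": name,
        "value": value,
        "response": "json",
        "apikey": api_key,
    }
    # Sort by parameter name and URL-encode each value.
    query = "&".join(
        f"{key}={quote(str(val), safe='')}" for key, val in sorted(params.items())
    )
    # Sign the lowercased query string with HMAC-SHA1, then Base64-encode it.
    digest = hmac.new(
        secret_key.encode("utf-8"), query.lower().encode("utf-8"), hashlib.sha1
    ).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return f"{endpoint}?{query}&signature={quote(signature, safe='')}"


if __name__ == "__main__":
    # Placeholder endpoint and credentials; replace with real ones.
    print(
        signed_update_configuration_url(
            "http://mgmt.example:8080/client/api",
            "API_KEY",
            "SECRET_KEY",
            "network.router.EnableServiceMonitoring",
            "false",
        )
    )
```

The script only prints the URL; sending the request, and checking whether the routers pick up the new value without further action, is left to the operator.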

@btzq
Author

btzq commented Dec 16, 2024

> > @btzq, that would involve finding the right check script and adjusting it so that it does not raise an alarm, or disabling the specific check altogether.
>
> Yes, there is a global setting, network.router.EnableServiceMonitoring, which defaults to true. Try disabling it.

@weizhouapache, will this disable all my healthchecks? I would prefer to keep the healthchecks, with only the webserver one adjusted.

@weizhouapache
Member

> > > @btzq, that would involve finding the right check script and adjusting it so that it does not raise an alarm, or disabling the specific check altogether.
> >
> > Yes, there is a global setting, network.router.EnableServiceMonitoring, which defaults to true. Try disabling it.
>
> @weizhouapache, will this disable all my healthchecks? I would prefer to keep the healthchecks, with only the webserver one adjusted.

@btzq
It will disable the monitoring of services such as ssh, dnsmasq, and webserver.
You can check the file /etc/monitor.conf in the VRs.

PS: I have not tested it yet.
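
If it helps when inspecting that file, here is a small sketch that lists the monitored services from a copy of /etc/monitor.conf. It assumes the file is INI-style with one section per service; that layout, the sample contents, and the `servicename` key are assumptions for illustration and should be checked against a real VR.

```python
import configparser


def monitored_services(conf_text):
    """Return the section names found in an INI-style monitor config."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.sections()


# Made-up sample contents; a real /etc/monitor.conf may look different.
SAMPLE_CONF = """
[ssh]
servicename = ssh

[dnsmasq]
servicename = dnsmasq

[webserver]
servicename = apache2
"""

if __name__ == "__main__":
    for service in monitored_services(SAMPLE_CONF):
        print(service)
```

On a router, the same function could be fed the text of the real file to see whether the webserver entry is present while the VPC has no tiers.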

@btzq
Author

btzq commented Dec 16, 2024

Ah, that won't do. Healthchecks are actually pretty useful to us, because when one fires, it usually does indicate that something is wrong. It's only the webserver part that's misleading to us right now, so we can't afford to turn off all health checks.

@weizhouapache
Member

> Ah, that won't do. Healthchecks are actually pretty useful to us, because when one fires, it usually does indicate that something is wrong. It's only the webserver part that's misleading to us right now, so we can't afford to turn off all health checks.

@btzq
It will not turn off all health checks; it only skips the check on the status of services (ssh, dnsmasq, apache2).
