CNF Certification List of Tests - v1.0-beta

Summary

This document provides a summary of the tests included in the CNF Certification. Each test lists a general overview of what the test does, a link to the test code for that test, and links to additional information when relevant/available.

To learn how to run these tests, see the "Instructions". For further details, see the USAGE guide.

To learn why these tests were written, see RATIONALE.md.

Types of Tests (Currently 57 total for the Certification)

  • Essential: 15 total
  • Normal: 24 total
  • Bonus: 18 total

The first level of certification requires the passing of 10 of the 15 total essential tests.

The list of essential tests is:

List of Workload Tests

Compatibility, Installability, and Upgradability Category

  • Added to CNF Certification in v1.0
  • Expectation: The number of replicas for a Pod increases, then the number of replicas for a Pod decreases

What's tested: The number of replicas for the CNF image or release being tested is increased to 3. After increase_capacity raises the replicas to 3, the count is decreased back to 1.

Background for the increase and decrease capacity tests: the HPA (horizontal pod autoscaler) scales the number of replicas up when CPU, memory, or other configured metrics increase. This prevents disruption by spreading requests, and therefore utilisation, across more pods.

Decreasing replicas works the same way in reverse: when traffic decreases, the number of replicas is scaled down to the number of pods needed to handle the requests.

You can read more about horizontal pod autoscaling to create replicas here and in the K8s scaling cheatsheet.
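The replica arithmetic behind this behaviour can be sketched with the scaling rule given in the Kubernetes HorizontalPodAutoscaler documentation: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The Python below is an illustrative sketch only. The function name, the min/max defaults, and the sample numbers are assumptions, and the sketch ignores the autoscaler's tolerance and stabilisation behaviour.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Apply the documented HPA ratio rule, then clamp to the configured bounds.

    min_replicas and max_replicas are hypothetical defaults for illustration.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Load is 2.5x the target on a single replica, so the sketch scales up.
scaled_up = desired_replicas(current_replicas=1, current_metric=250, target_metric=100)

# Load later falls to a third of the target across 3 replicas, so it scales down.
scaled_down = desired_replicas(current_replicas=3, current_metric=20, target_metric=60)
```

With these sample values the count moves from 1 replica up to 3 and back down to 1, which mirrors the sequence the increase and decrease capacity tests exercise.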

  • Added to CNF Certification in v1.0
  • Expectation: Helm chart is published

What's tested: Checks if a Helm chart is published

  • Added to CNF Certification in v1.0
  • Expectation: Helm chart is valid

What's tested: This runs helm lint against the Helm chart being tested. You can read more about the helm lint command at helm.sh.

  • Added to CNF Certification in v1.0
  • Expectation: Helm deploy is successful

What's tested: This checks if the CNF was deployed using Helm

  • Added to CNF Certification in v1.0
  • Expectation: The CNF can perform a rolling update, a rolling version change, and a rolling downgrade

What's tested: TBD

  • Added to CNF Certification in v1.0
  • Expectation: CNF rollback is successful

What's tested: Checks if a CNF version can be rolled back.

  • Added to CNF Certification in v1.0
  • Expectation: CNF should be compatible with multiple and different CNIs

What's tested: This creates temporary kind clusters and tests the CNF against both the Calico and Cilium CNIs.

Microservice Category

  • Added to CNF Certification in v1.0
  • Expectation: CNF image size is under 5 GB

What's tested: Checks the size of the image used.

  • Added to CNF Certification in v1.0
  • Expectation: CNF starts up in under 30 seconds

What's tested: This counts how many seconds it takes for the CNF to start up.

  • Added to CNF Certification in v1.0
  • Expectation: CNF container has one process type

What's tested: This verifies that there is only one process type within one container. Child processes do not count against this. For example, nginx or httpd could each have a parent process and 10 child processes and still pass, but if both nginx and httpd were running in the same container, this test would fail.
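The nginx/httpd example can be expressed as a small check over a container's process list. This is a sketch under stated assumptions, not the test suite's implementation: it assumes a "process type" can be approximated by the command name, and the record layout (pid, ppid, name) and both function names are hypothetical.

```python
def process_types(processes):
    """Return the distinct command names among a container's processes."""
    return {proc["name"] for proc in processes}


def has_single_process_type(processes):
    """True when every process, parent or child, shares one command name."""
    return len(process_types(processes)) <= 1


# One nginx parent with two children: a single process type.
nginx_only = [
    {"pid": 1, "ppid": 0, "name": "nginx"},
    {"pid": 7, "ppid": 1, "name": "nginx"},
    {"pid": 8, "ppid": 1, "name": "nginx"},
]

# Adding httpd to the same container introduces a second process type.
nginx_and_httpd = nginx_only + [{"pid": 20, "ppid": 0, "name": "httpd"}]
```

Here `has_single_process_type(nginx_only)` holds, while `has_single_process_type(nginx_and_httpd)` does not.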

  • Added to CNF Certification in v1.0
  • Expectation: CNFs should not expose their containers as a service

What's tested: This tests and checks if a container for the CNF has services exposed. Application access for microservices within a cluster should be exposed via a Service. Read more about K8s Service here.

  • Added to CNF Certification in v1.0
  • Expectation: Multiple microservices should not share the same database.

What's tested: This tests if multiple CNFs are using the same database.

State Category

  • Added to CNF Certification in v1.0
  • Expectation: A node will be drained and rescheduled onto other available node(s).

What's tested: A node is drained and the CNF's pods are rescheduled onto another node, passing with a liveness and readiness check. This test is skipped when the cluster only has a single node.

  • Added to CNF Certification in v1.0
  • Expectation: Local storage should not be used or configured.

What's tested: This tests if local volumes are being used for the CNF.

  • Added to CNF Certification in v1.0
  • Expectation: Elastic persistent volumes should be configured for statefulness.

What's tested: This checks for elastic persistent volumes in use by the CNF.

Reliability, Resilience and Availability Category

  • Added to CNF Certification in v1.0
  • Expectation: The CNF should continue to function when network latency occurs

What's tested: This experiment causes network degradation without the pod being marked unhealthy/unworthy of traffic by kube-proxy (unless you have a liveness probe of sorts that measures latency and restarts/crashes the container). The idea of this experiment is to simulate issues within your pod network OR microservice communication across services in different availability zones/regions etc.

The applications may stall or get corrupted while they wait endlessly for a packet. The experiment limits the impact (blast radius) to only the traffic you want to test by specifying IP addresses or application information. This experiment will help to improve the resilience of your services over time.

  • Added to CNF Certification in v1.0
  • Expectation: The CNF should continue to function when disk fill occurs

What's tested: Stressing the disk with continuous and heavy IO can, for example, degrade reads and writes by other microservices that use the same shared disk. Modern storage solutions for Kubernetes, for instance, use the concept of storage pools out of which virtual volumes/devices are carved. Another issue is the amount of scratch space eaten up on a node, which leaves too little space for newer containers to get scheduled (Kubernetes too gives up by applying an "eviction" taint like "disk-pressure") and causes a wholesale movement of all pods to other nodes.

Similarly with CPU chaos: by injecting a rogue process into a target container, we starve the main microservice process (typically PID 1) of the resources allocated to it (where limits are defined), causing slowness in application traffic. In other cases, unrestrained use can cause the node to exhaust resources, leading to the eviction of all pods. This category of chaos experiment helps to build the immunity of the application undergoing any such stress scenario.

  • Added to CNF Certification in v1.0
  • Expectation: The CNF should continue to function when pod delete occurs

What's tested: This experiment simulates forced/graceful pod failure on specific or random replicas of an application resource and checks the deployment sanity (replica availability & uninterrupted service) and recovery workflow of the application.

  • Added to CNF Certification in v1.0
  • Expectation: The CNF should continue to function when pod memory hog occurs

What's tested: The pod-memory hog experiment launches a stress process within the target container - which can cause either the primary process in the container to be resource constrained in cases where the limits are enforced OR eat up available system memory on the node in cases where the limits are not specified.

  • Added to CNF Certification in v1.0
  • Expectation: The CNF should continue to function when pod io stress occurs

What's tested: This test stresses the disk with continuous and heavy IO to cause degradation in reads/writes by other microservices that use this shared disk.

  • Added to CNF Certification in v1.0
  • Expectation: The CNF should continue to function when pod network corruption occurs

What's tested: This test uses the LitmusChaos pod_network_corruption experiment.

  • Added to CNF Certification in v1.0
  • Expectation: The CNF should continue to function when pod network duplication occurs

What's tested: This test uses the LitmusChaos pod_network_duplication experiment.

  • Added to CNF Certification in v1.0
  • Expectation: A liveness probe should be found in the CNF cluster

What's tested: This test checks for livenessProbe in the resource and container

  • Added to CNF Certification in v1.0
  • Expectation: A readiness probe should be found in the CNF cluster

What's tested: This test checks for readinessProbe in the resource and container
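The liveness and readiness checks both come down to looking for a probe field on each container. The sketch below shows that idea on a pod spec loaded into a Python dict. The function name and the sample spec are hypothetical, and the real tests may inspect more than the containers list.

```python
def containers_missing_probe(pod_spec, probe_field):
    """Return names of containers that do not define the given probe field.

    probe_field is expected to be "livenessProbe" or "readinessProbe".
    """
    return [
        container.get("name", "<unnamed>")
        for container in pod_spec.get("containers", [])
        if not container.get(probe_field)
    ]


# Hypothetical pod spec: "web" defines both probes, "sidecar" only liveness.
sample_spec = {
    "containers": [
        {
            "name": "web",
            "livenessProbe": {"httpGet": {"path": "/healthz", "port": 8080}},
            "readinessProbe": {"tcpSocket": {"port": 8080}},
        },
        {
            "name": "sidecar",
            "livenessProbe": {"exec": {"command": ["true"]}},
        },
    ]
}

missing_liveness = containers_missing_probe(sample_spec, "livenessProbe")
missing_readiness = containers_missing_probe(sample_spec, "readinessProbe")
```

In this sample every container has a liveness probe, while "sidecar" would be reported as missing a readiness probe.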

Observability and Diagnostic Category

  • Added to CNF Certification in v1.0
  • Expectation: Resource output logs should be sent to STDOUT/STDERR

What's tested: This checks and verifies that STDOUT/STDERR is configured for logging.

For example, running kubectl logs returns useful information for diagnosing or troubleshooting issues.

  • Added to CNF Certification in v1.0
  • Expectation: Prometheus is being used for the cluster and CNF for metrics.

What's tested: Tests for the presence of Prometheus and whether the CNF emits Prometheus traffic.

  • Added to CNF Certification in v1.0
  • Expectation: Fluentd is capturing logs.

What's tested: Checks for the presence of Fluentd and whether logs are being captured by Fluentd.

  • Added to CNF Certification in v1.0
  • Expectation: CNF should emit OpenMetrics compatible traffic.

What's tested: Checks if OpenMetrics is being used and/or the emitted traffic is OpenMetrics compatible.

  • Added to CNF Certification in v1.0
  • Expectation: The CNF should use tracing

What's tested: Checks if Jaeger is configured and tracing is being used.

Security Category

  • Added to CNF Certification in v1.0
  • Expectation: Container engine daemon sockets should not be mounted as volumes

What's tested: This test uses the Kyverno policy called Disallow CRI socket mounts.

[Sysctls test]

  • Added to CNF Certification in v1.0
  • Expectation: TBD

What's tested: TBD

  • Added to CNF Certification in v1.0
  • Expectation: A CNF should not run services with external IPs

What's tested: Checks if the CNF has services with external IPs configured

  • Added to CNF Certification in v1.0
  • Expectation: Containers should not allow for privilege escalation

What's tested: Checks that the allowPrivilegeEscalation field in the securityContext of each container is set to false.

See more at ARMO-C0016
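As a rough illustration of the allowPrivilegeEscalation check, the sketch below walks a pod spec held as a Python dict and reports containers where the field is not explicitly false. The function name and sample data are hypothetical. Treating an unset field as a failure is an assumption that follows from the wording "set to false".

```python
def containers_allowing_escalation(pod_spec):
    """Return names of containers whose allowPrivilegeEscalation is not False.

    An absent securityContext or an absent field counts as a failure here,
    because the check asks for the field to be explicitly set to false.
    """
    offenders = []
    for kind in ("initContainers", "containers"):
        for container in pod_spec.get(kind, []):
            security_context = container.get("securityContext") or {}
            if security_context.get("allowPrivilegeEscalation") is not False:
                offenders.append(container.get("name", "<unnamed>"))
    return offenders


# Hypothetical pod spec with one compliant and two non-compliant containers.
escalation_sample = {
    "containers": [
        {"name": "hardened", "securityContext": {"allowPrivilegeEscalation": False}},
        {"name": "unset"},
        {"name": "explicit", "securityContext": {"allowPrivilegeEscalation": True}},
    ]
}

escalation_offenders = containers_allowing_escalation(escalation_sample)
```

Only "hardened" passes in this sample. The other two containers are reported.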

  • Added to CNF Certification in v1.0
  • Expectation: No containers allow a symlink attack

What's tested: This control checks the vulnerable versions and the actual usage of the subPath feature in all Pods in the cluster.

See more at ARMO-C0058

  • Added to CNF Certification in v1.0
  • Expectation: Application credentials should not be found in configuration files

What's tested: Check if the pod has sensitive information in environment variables, by using list of known sensitive key names. Check if there are configmaps with sensitive information.

See more at ARMO-C0012
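The environment-variable half of this check can be sketched as a name match against a list of sensitive key fragments. Everything specific below is an assumption for illustration: the fragment list is hypothetical and is not the list the control uses, and skipping variables populated through valueFrom (for example a secretKeyRef) is a design choice of this sketch.

```python
# Hypothetical fragments; the real control ships its own list of key names.
SENSITIVE_KEY_PARTS = ("password", "passwd", "secret", "token", "api_key", "apikey")


def sensitive_env_names(pod_spec, key_parts=SENSITIVE_KEY_PARTS):
    """Return (container, variable) pairs whose names look like credentials."""
    hits = []
    for container in pod_spec.get("containers", []):
        for env in container.get("env", []):
            # Values pulled in via valueFrom are not stored inline in the spec.
            if "valueFrom" in env:
                continue
            name = env.get("name", "")
            if any(part in name.lower() for part in key_parts):
                hits.append((container.get("name", "<unnamed>"), name))
    return hits


credentials_sample = {
    "containers": [
        {
            "name": "app",
            "env": [
                {"name": "DB_PASSWORD", "value": "hunter2"},
                {"name": "LOG_LEVEL", "value": "debug"},
                {"name": "API_TOKEN",
                 "valueFrom": {"secretKeyRef": {"name": "creds", "key": "token"}}},
            ],
        }
    ]
}

credential_hits = sensitive_env_names(credentials_sample)
```

In this sample only DB_PASSWORD is flagged, since API_TOKEN is sourced from a Secret rather than written inline.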

  • Added to CNF Certification in v1.0
  • Expectation: Pods should not have access to the host system's network.

What's tested: Checks if there is a host network attached to a pod. See more at ARMO-C0041

  • Added to CNF Certification in v1.0
  • Expectation: The automatic mapping of service account tokens should be disabled.

What's tested: Check if service accounts are automatically mapped. See more at ARMO-C0034.

What's tested: Checks Ingress and Egress traffic policy

  • Added to CNF Certification in v1.0
  • Expectation: Containers should not run in privileged mode

What's tested: Check in POD spec if securityContext.privileged == true. Read more at ARMO-C0057

  • Added to CNF Certification in v1.0
  • Expectation: Containers should not have insecure capabilities enabled

What's tested: Checks for insecure capabilities. See more at ARMO-C0046

This test checks against a denylist of insecure capabilities.

  • Added to CNF Certification in v1.0
  • Expectation: Containers should not have dangerous capabilities enabled

What's tested: This test checks against a denylist of dangerous capabilities.

See more at ARMO-C0028

  • Added to CNF Certification in v1.0
  • Expectation: Namespaces should have network policies defined

What's tested: Checks if network policies are defined for namespaces. Read more at ARMO-C0011.

  • Added to CNF Certification in v1.0
  • Expectation: Containers should run with non-root user with non-root group membership

What's tested: Checks if containers are running with a non-root user with non-root group membership. Read more at ARMO-C0013

  • Added to CNF Certification in v1.0
  • Expectation: Containers should not have hostPID and hostIPC privileges

What's tested: Checks if containers are running with hostPID or hostIPC privileges. Read more at ARMO-C0038

[SELinux options]

  • Added to CNF Certification in v1.0
  • Expectation: SELinux options should not be used

What's tested: Checks if CNF resources use custom SELinux options that allow privilege escalation (selinux_options)

  • Added to CNF Certification in v1.0
  • Expectation: Security services are being used to harden application

What's tested: Checks if security services are being used to harden the application. Read more at ARMO-C0055

  • Added to CNF Certification in v1.0
  • Expectation: Containers should have resource limits defined

What's tested: Check for each container if there is a ‘limits’ field defined. Check for each limitrange/resourcequota if there is a max/hard field defined, respectively. Read more at ARMO-C0009.
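The per-container half of this check can be sketched as follows, assuming the pod spec has been loaded into a Python dict. The function name and sample data are hypothetical, and the LimitRange/ResourceQuota half of the check is not covered.

```python
def containers_without_limits(pod_spec):
    """Return names of containers that have no non-empty resources.limits."""
    missing = []
    for container in pod_spec.get("containers", []):
        resources = container.get("resources") or {}
        if not resources.get("limits"):
            missing.append(container.get("name", "<unnamed>"))
    return missing


# Hypothetical pod spec: only "api" declares limits.
limits_sample = {
    "containers": [
        {"name": "api",
         "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}},
        {"name": "worker",
         "resources": {"requests": {"cpu": "100m"}}},
        {"name": "bare"},
    ]
}

limits_missing = containers_without_limits(limits_sample)
```

Note that "worker" is reported even though it sets requests, because requests alone do not cap what the container may consume.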

  • Added to CNF Certification in v1.0
  • Expectation: Containers should have immutable file system

What's tested: Checks whether the readOnlyRootFilesystem field in the SecurityContext is set to true. Read more at ARMO-C0017

  • Added to CNF Certification in v1.0
  • Expectation: Containers should not have hostPath mounts

What's tested: TBD. Read more at ARMO-C0045

[Default namespaces]

  • Added to CNF Certification in v1.0
  • Expectation: Resources of the CNF should not be in the default namespace

What's tested: TBD

Configuration Category

[Latest tag]

  • Added to CNF Certification in v1.0
  • Expectation: Checks if a CNF is using 'latest' tag instead of a version.

What's tested: TBD
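Until the description is filled in, the expectation can be illustrated with a small image-reference check. This is a sketch, not the test's implementation: the function name is hypothetical, and treating an untagged image as 'latest' (the default a container runtime applies when no tag is given) and a digest-pinned image as acceptable are assumptions of this example.

```python
def uses_latest_tag(image):
    """True if the image reference resolves to the 'latest' tag.

    A reference with no tag counts as 'latest'. A reference pinned by digest
    (containing '@') is treated as versioned.
    """
    if "@" in image:
        return False
    # Only look for a tag in the final path component, so that a registry
    # port such as localhost:5000 is not mistaken for a tag.
    last_component = image.rsplit("/", 1)[-1]
    if ":" not in last_component:
        return True
    return last_component.rsplit(":", 1)[1] == "latest"


flagged = [ref for ref in ("nginx", "nginx:latest", "nginx:1.25.3",
                           "localhost:5000/nginx")
           if uses_latest_tag(ref)]
```

With these sample references, everything except nginx:1.25.3 is flagged.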

  • Added to CNF Certification in v1.0
  • Expectation: Checks if pods are using the 'app.kubernetes.io/name' label

What's tested: TBD

  • Added to CNF Certification in v1.0
  • Expectation: Checks for configured node ports in the service configuration.

What's tested: TBD

  • Added to CNF Certification in v1.0
  • Expectation: Checks for configured host ports in the service configuration.

What's tested: TBD

  • Added to CNF Certification in v1.0
  • Expectation: Checks for hardcoded IP addresses or subnet masks in the K8s runtime configuration.

What's tested: TBD
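One way to picture this kind of check is a scan of rendered configuration text for IPv4 literals. The sketch below is an assumption-laden illustration, not the test's method: the function name and the allow-list of addresses that are tolerated are hypothetical, IPv6 is not handled, and a four-part version string would be reported as an address.

```python
import ipaddress
import re

# Four dot-separated groups of 1-3 digits, optionally followed by a /prefix.
_IPV4_CANDIDATE = re.compile(
    r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?:/\d{1,2})?(?![\d.])"
)


def find_hardcoded_ipv4(text, allowed=("0.0.0.0", "127.0.0.1")):
    """Return IPv4 addresses or CIDR blocks found in text, minus allowed ones."""
    hits = []
    for match in _IPV4_CANDIDATE.finditer(text):
        candidate = match.group(0)
        try:
            # Rejects out-of-range octets and invalid prefix lengths.
            ipaddress.ip_interface(candidate)
        except ValueError:
            continue
        if candidate.split("/")[0] not in allowed:
            hits.append(candidate)
    return hits


sample_config = """
image: nginx:1.25.3
upstream: 10.0.0.5
allowedCidr: 192.168.1.0/24
bind: 0.0.0.0
bogus: 999.1.1.1
"""

hardcoded = find_hardcoded_ipv4(sample_config)
```

In this sample the upstream address and the CIDR block are reported, while the image version, the wildcard bind address, and the out-of-range value are not.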

  • Added to CNF Certification in v1.0
  • Expectation: Checks for K8s secrets.

What's tested: TBD

  • Added to CNF Certification in v1.0
  • Expectation: Checks for K8s version and if immutable configmaps are enabled.

What's tested: TBD

  • Added to CNF Certification in v1.0
  • Expectation: Test if the CNF crashes when pod dns error occurs

What's tested: TBD