GitLab Runner committed Feb 2, 2024
1 parent aeef06c commit ebec785
Showing 16 changed files with 730 additions and 187 deletions.
40 changes: 22 additions & 18 deletions docs/README.md
@@ -1,4 +1,10 @@
# What is Qbo?
# What is QBO?

QBO Kubernetes Engine (QKE) is an AsyncAPI designed to streamline the deployment of Docker in Docker (DinD) Kubernetes services. It empowers users to effortlessly deploy, manage, and scale containerized applications using Kubernetes, eliminating the need for intricate infrastructure management. With QKE, users can unlock the full potential of metal performance, particularly beneficial for compute-intensive workloads demanding high performance and minimal latency, such as databases and AI/ML models.

QBO offers **QBO Kubernetes Engine Community Edition (QKE CE)** at no cost, compatible with Linux or Windows (WSL2), enabling local execution. Additionally, QBO provides a fully managed cloud solution through **QBO Cloud (QKE Cloud)**.

# Technology

> Designed for AI and ML
@@ -7,45 +13,43 @@ Bare metal with the flexibility of the cloud, powered by qbo Kubernetes, excels

> Pure containers
Qbo operates as an AsyncAPI that oversees Docker in Docker (DinD) Kubernetes clusters. In this unique setup, the traditional notion of a Kubernetes node, typically associated with a virtual or physical machine, is redefined as a Docker container within the Qbo framework. Containerd runs within Docker in the Qbo environment, facilitating the deployment of an entire Kubernetes infrastructure as a self-contained process directly on the hardware.
QBO operates as an AsyncAPI that oversees Docker in Docker (DinD) Kubernetes clusters. In this unique setup, the traditional notion of a Kubernetes node, typically associated with a virtual or physical machine, is redefined as a Docker container within the QBO framework. Containerd runs within Docker in the QBO environment, facilitating the deployment of an entire Kubernetes infrastructure as a self-contained process directly on the hardware.

> Metal Performance
This deployment method offers an instant boost in performance, leveraging direct access to all hardware resources, including CPU, memory, and disk. Tasks like cluster creation, deletion, starting, stopping, or scaling demonstrate remarkable speed compared to virtual machines. Qbo sustains the efficiency of metal performance while retaining the benefits of resource isolation through pure container technology.
This deployment method offers an instant boost in performance, leveraging direct access to all hardware resources, including CPU, memory, and disk. Tasks like cluster creation, deletion, starting, stopping, or scaling demonstrate remarkable speed compared to virtual machines. QBO sustains the efficiency of metal performance while retaining the benefits of resource isolation through pure container technology.

> Native Kernel Functions
Administering a single kernel for each host, encompassing the Kubernetes nodes, offers a distinct advantage in terms of security, network, and storage operations. All tasks can be seamlessly executed through a unified interface—the Linux Kernel. Qbo harnesses these capabilities by utilizing native kernel features such as IPVS, netfilter, iproute2, and eBPF. Achieving observability and enhancing security becomes feasible by instrumenting a single kernel through eBPF.
Administering a single kernel for each host, encompassing the Kubernetes nodes, offers a distinct advantage in terms of security, network, and storage operations. All tasks can be seamlessly executed through a unified interface—the Linux Kernel. QBO harnesses these capabilities by utilizing native kernel features such as IPVS, netfilter, iproute2, and eBPF. Achieving observability and enhancing security becomes feasible by instrumenting a single kernel through eBPF.

> Unified AsyncAPI
Qbo operates as a proxy AsyncAPI, overseeing not just Kubernetes clusters but also various cloud components. It boasts a high-performance AsyncAPI capable of real-time reporting on the status of Kubernetes nodes (represented as Docker containers), pods, processes, and threads within the host. The data structure is formatted in JSON, and interactions are executed through commands.
QBO operates as a proxy AsyncAPI, overseeing not just Kubernetes clusters but also various cloud components. It boasts a high-performance AsyncAPI capable of real-time reporting on the status of Kubernetes nodes (represented as Docker containers), pods, processes, and threads within the host. The data structure is formatted in JSON, and interactions are executed through commands.

> Real Time Communication System
Given the dynamic nature of operations and the multitude of data sources involved, websockets play a vital role in capturing real-time system states. Qbo introduces the concept of 'mirrors' as focal points for websocket messages, enabling subscribers within the same 'mirror' to receive updates from relevant systems. Regardless of data origin — be it Kubernetes, Docker, Linux OS, threads, processes, registries, or authentication providers — Qbo consolidates pertinent data for mirror subscribers, ensuring an accurate real-time system representation. For example, users monitoring Kubernetes pod operations through a 'mirror' can observe instantaneous updates. Similarly, actions such as pthread or process executions are immediately visible to relevant mirror subscribers. Thus, Qbo facilitates real-time communication via a proxy AsyncAPI, catering to all cloud components.
Given the dynamic nature of operations and the multitude of data sources involved, websockets play a vital role in capturing real-time system states. QBO introduces the concept of 'mirrors' as focal points for websocket messages, enabling subscribers within the same 'mirror' to receive updates from relevant systems. Regardless of data origin — be it Kubernetes, Docker, Linux OS, threads, processes, registries, or authentication providers — QBO consolidates pertinent data for mirror subscribers, ensuring an accurate real-time system representation. For example, users monitoring Kubernetes pod operations through a 'mirror' can observe instantaneous updates. Similarly, actions such as pthread or process executions are immediately visible to relevant mirror subscribers. Thus, QBO facilitates real-time communication via a proxy AsyncAPI, catering to all cloud components.

# Kubernetes Engine

> Conformance
> Watch the video
Qbo Kubernetes aligns with the Cloud Native Computing Foundation (CNCF) standards, ensuring adherence to best practices in cloud-native computing. This conformance establishes a solid foundation for scalability, interoperability, and performance, making qbo Kubernetes a reliable choice for AI workloads.
[![Watch the video](https://i.ytimg.com/vi/s2pItFe8IwU/maxresdefault.jpg)](https://www.youtube.com/embed/s2pItFe8IwU?rel=0)


[![CNCF Certified](img/certified-kubernetes-color.svg)](https://landscape.cncf.io/card-mode?category=certified-kubernetes-distribution,certified-kubernetes-hosted,certified-kubernetes-installer&grouping=category&selected=qbo)
## Conformance

Compatible with `Kind` images
[https://hub.docker.com/r/kindest/node/tags](https://hub.docker.com/r/kindest/node/tags)
QBO Kubernetes aligns with the Cloud Native Computing Foundation (CNCF) standards, ensuring adherence to best practices in cloud-native computing. This conformance establishes a solid foundation for scalability, interoperability, and performance, making QBO Kubernetes a reliable choice for AI workloads.

# Kubernetes Engine

> Watch the video
[![Watch the video](https://i.ytimg.com/vi/s2pItFe8IwU/maxresdefault.jpg)](https://www.youtube.com/embed/s2pItFe8IwU?rel=0)

[![CNCF Certified](img/certified-kubernetes-color.svg)](https://landscape.cncf.io/card-mode?category=certified-kubernetes-distribution,certified-kubernetes-hosted,certified-kubernetes-installer&grouping=category&selected=qbo)

Compatible with `Kind` images
[https://hub.docker.com/r/kindest/node/tags](https://hub.docker.com/r/kindest/node/tags)


# Features
## Features

?>
○ Multi cluster management<br>
@@ -88,7 +92,7 @@ Compatible with `Kind` images
■ Neural Graphs<br> -->


# Commands
## Commands
|Command | Argument | Options | Parameter | Admin | Example | Description |
|-------------------|-------------------------------------|----------|------------|-------|-----------------------------------------------------------------|------------------------------|
| qbo add cluster | char[64] | -i | char[64] | N | qbo add cluster `alex` -i `hub.docker.com/kindest/node:v1.27.2` | Add cluster |
2 changes: 1 addition & 1 deletion docs/_coverpage.md
@@ -7,7 +7,7 @@
[![Tag](https://img.shields.io/badge/prod-4.3.2--49c0db762-black)](https://github.com/alexeadem/qbo-docs/tags)


> QBO CLOUD DOCUMENTATION
> QBO DOCUMENTATION
- Unlocking the power of cloud computing for anyone, anywhere.


27 changes: 17 additions & 10 deletions docs/_sidebar.md
@@ -1,17 +1,24 @@
&nbsp;&nbsp;&nbsp;&nbsp; [![Version](https://img.shields.io/badge/qbo-cloud-blue)](https://github.com/alexeadem/qbo-docs/blob/main/LICENSE)
&nbsp;&nbsp;&nbsp;&nbsp; [![Version](https://img.shields.io/badge/qbo-docs-blue)](https://github.com/alexeadem/qbo-docs/blob/main/LICENSE)
[![Tag](https://img.shields.io/badge/prod-4.3.2--49c0db762-black)](https://github.com/alexeadem/qbo-docs/tags)

- Getting started
- [Quick start](quick_start.md)
- Kubernetes Engine
- [Authentication](auth.md)
- [Overview](README.md)
- [Getting started](quick_start.md)
- Kubernetes Engine (QKE)
- CE
- [Installation](installation.md)
- [Registry](registry.md)
- [Custom Images](custom_images.md)
- CLOUD
- [Authentication](auth.md)
- [User Interface](user_iface.md)
- [CLI](cli.md)
- [API](https://spec.qbo.io/)
- Tutorials
- [AI & ML](https://docs.qbo.io/#/ai_and_ml)
- [Service Mesh](https://docs.qbo.io/#/istio)
- [Ingress Controller](https://docs.qbo.io/#/nginx)
- [Persisten Storage](https://docs.qbo.io/#/persistent_storage)
- [Quick Start](cluster_ops.md)
- [QBOT](qbot.md)
- [AI & ML](ai_and_ml.md)
- [Service Mesh](istio.md)
- [Ingress Controller](nginx.md)
- [Persistent Storage](persistent_storage.md)
- Conformance
- [CNCF](https://docs.qbo.io/#/conformance)
- [CNCF](conformance.md)
182 changes: 154 additions & 28 deletions docs/ai_and_ml.md
@@ -1,39 +1,40 @@
# Add Vector with NVIDIA GPU

## Requirements
* [CLI](cli.md) Configuration
* [NVIDIA GPU Operator version v23.9.1](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/platform-support.html#operator-platform-support)
* [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)
* [NVIDIA Driver 535.129.03](https://www.nvidia.com/download/driverResults.aspx/213194/en-us/)

## qbot
```
git clone https://github.com/alexeadem/qbot
cd qbot
## Nvidia GPU Operator
### Prerequisites

<!-- * [CLI](cli.md) Configuration -->
| Dependency | Validated or Included Version(s) | Notes
|-----------|----------|---|
|[Kubernetes](https://kubernetes.io/docs/home/) | [v1.25.11](https://github.com/kubernetes/kubernetes/tree/release-1.25) | |
|[NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)| [v1.14.3](https://github.com/NVIDIA/nvidia-container-toolkit/releases/tag/v1.14.3) ||
|[NVIDIA GPU Operator](https://docs.nvidia.com/datacenter/cloud-native/gpu-operator/latest/index.html) | [v23.9.1](https://github.com/NVIDIA/gpu-operator/releases/tag/v23.9.1) ||
|NVIDIA Driver | [535.129.03](https://www.nvidia.com/download/driverResults.aspx/213194/en-us/), [546.01](https://www.nvidia.com/download/driverResults.aspx/216365/en-us/)||
|[NVIDIA CUDA](https://docs.nvidia.com/cuda/)| [12.2](https://developer.nvidia.com/cuda-12-2-0-download-archive)||
|OS | Linux, Windows 10, 11 (WSL2)| |

### qbot
[1. Install qbot](qbot)

#### 2. Run qbot
```bash
./qbot gpu-operator
```

> Start qbot Nvidia demo
```
./qbot
>>> ./qbot {istio | nginx | kubeconfig | nvidia} -- Demo to run
./qbot nvidia
```

## Create K8s Cluster
### 1. Create K8s Cluster

> For this tutorial we are using `NVIDIA` as our cluster name
> For this tutorial we are using `nvidia` as our cluster name
```bash
export NAME=nvidia
export NAME=nvidia
```

> Get qbo version to make sure we have access to qbo API
```bash
qbo version | jq .version[]?
```

> Add a K8s cluster with default image v1.27.3
> Add a K8s cluster with image v1.25.11. See [Kubeflow compatibility](ai_and_ml?id=kubeflow)
```bash
qbo add cluster $NAME | jq
qbo add cluster $NAME -i hub.docker.com/kindest/node:v1.25.11 | jq
```
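
The image is pinned because, per the Kubeflow v1.7.0 prerequisites later on this page, `autoscaling/v2beta2` is no longer served as of Kubernetes v1.26. As a minimal sketch, a pre-flight check along these lines could guard the image tag before the cluster is created. `serves_hpa_v2beta2` is a hypothetical helper, not part of qbo or kubectl, and it only checks the upper bound stated in that note.

```bash
# Hypothetical helper (an assumption, not a qbo or kubectl command):
# succeed if the given Kubernetes version is older than v1.26, i.e. it
# still serves autoscaling/v2beta2 according to the Kubeflow note.
serves_hpa_v2beta2() {
  local version major rest minor
  version=${1#v}          # strip a leading "v": v1.25.11 -> 1.25.11
  major=${version%%.*}    # 1
  rest=${version#*.}      # 25.11
  minor=${rest%%.*}       # 25
  [ "$major" -eq 1 ] && [ "$minor" -lt 26 ]
}
```

For example, `serves_hpa_v2beta2 v1.25.11 && qbo add cluster $NAME -i hub.docker.com/kindest/node:v1.25.11` would create the cluster only when the tag is old enough for Kubeflow v1.7.0.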

> Get nodes information using qbo API
@@ -53,8 +54,9 @@ kubectl get nodes
```


## Nvidia GPU Operator
> Deploy Nvidia GPU Operator helm chart
### 2. Deploy Nvidia GPU Operator
#### 2.1 Linux
> Nvidia GPU Operator helm chart
```bash
helm repo add nvidia https://helm.ngc.nvidia.com/nvidia || true
@@ -63,8 +65,29 @@ helm install --wait --generate-name -n gpu-operator --create-namespace nvidia/gp

```

#### 2.2 Windows (WSL2)

## Vector Add Application
#### 2.2.1 Add PCI Labels
```bash
for i in $(kubectl get no --selector '!node-role.kubernetes.io/control-plane' -o json | jq -r '.items[].metadata.name'); do
kubectl label node $i feature.node.kubernetes.io/pci-10de.present=true
done
```
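
`10de` in the label above is NVIDIA's PCI vendor ID. The sketch below shows how that label string is put together, so the same loop can be reused or checked for typos. `pci_present_label` is a hypothetical helper, not part of qbo, kubectl, or the GPU Operator.

```bash
# Hypothetical helper (an assumption for illustration): build the
# "<key>=<value>" node label for a four-digit hexadecimal PCI vendor ID.
pci_present_label() {
  local vendor
  vendor=$(printf '%s' "$1" | tr 'A-F' 'a-f')   # normalize to lowercase
  case "$vendor" in
    [0-9a-f][0-9a-f][0-9a-f][0-9a-f]) ;;        # exactly four hex digits
    *) echo "invalid PCI vendor ID: $1" >&2; return 1 ;;
  esac
  printf 'feature.node.kubernetes.io/pci-%s.present=true\n' "$vendor"
}
```

Inside the loop this could be used as `kubectl label node "$i" "$(pci_present_label 10de)"`, and `kubectl get nodes -l feature.node.kubernetes.io/pci-10de.present=true` lists the nodes that carry the label.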

#### 2.2.2 Deploy Chart Templates
```bash
git clone https://github.com/alexeadem/qbot
cd qbot/gpu-operator
OUT=templates
kubectl apply -f $OUT/gpu-operator/crds.yaml
kubectl apply -f $OUT/gpu-operator/templates/
kubectl apply -f $OUT/gpu-operator/charts/node-feature-discovery/templates/
watch kubectl get pods

```


### 3. Deploy Vector Add
```
cat cuda/vectoradd.yaml
```
@@ -85,13 +108,116 @@ spec:

```

> Deploy vector add application
```bash
kubectl apply -f cuda/vectoradd.yaml
```

> Verify that operation was successful
### 4. Get Vector Add Logs

```bash
kubectl logs cuda-vectoradd
```
```
[Vector addition of 50000 elements]
Copy input data from the host memory to the CUDA device
CUDA kernel launch with 196 blocks of 256 threads
Copy output data from the CUDA device to the host memory
Test PASSED
Done
```
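
The launch line in the log follows from the element count: 50000 elements at 256 threads per block need ceil(50000 / 256) = 196 blocks (196 × 256 = 50176). A small sketch, assuming only the log format shown above, reproduces that arithmetic and checks the result line. `grid_blocks` and `vectoradd_passed` are hypothetical helper names.

```bash
# Number of thread blocks needed to cover <elements> with <threads> per
# block, using integer ceiling division.
grid_blocks() {
  local elements=$1 threads=$2
  echo $(( (elements + threads - 1) / threads ))
}

# Read pod logs on stdin; succeed only if the sample printed "Test PASSED".
vectoradd_passed() {
  grep -q 'Test PASSED'
}
```

A scripted check could then be `kubectl logs cuda-vectoradd | vectoradd_passed && echo "GPU vector add OK"`.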


## Kubeflow
### Prerequisites
#### Kubeflow v1.7.0 with Nvidia GPU support

| Dependency | Validated or Included Version(s) | Notes
|-----------|----------|---|
|[Kubernetes](https://github.com/kubernetes/kubernetes/tree/v1.25.11) | v1.25.11 | |
|[Kubeflow](https://www.kubeflow.org/docs/releases/kubeflow-1.7/) | v1.7.0 | The autoscaling/v2beta2 API version of HorizontalPodAutoscaler is no longer served as of Kubernetes v1.26. Migrate manifests and API clients to the autoscaling/v2 API version, available since v1.23. All existing persisted objects are accessible via the new API. Kubernetes v1.25 is therefore used here; see [HorizontalPodAutoscaler not found on minikube when installing kubeflow](https://stackoverflow.com/questions/76502195/horizontalpodautoscaler-not-found-on-minikube-when-installing-kubeflow)|
|OS | Linux, Windows 10, 11 (WSL2)| |


#### Kubeflow v1.8.0 with Nvidia GPU support


| Dependency | Validated or Included Version(s) | Notes
|-----------|----------|---|
|[Kubernetes](https://github.com/kubernetes/kubernetes/tree/v1.25.11) | v1.25.11 | |
|[Kubeflow](https://www.kubeflow.org/docs/releases/kubeflow-1.8/) | v1.8.0 | [GPU Vendor not available error #7273](https://github.com/kubeflow/kubeflow/issues/7273)|
|OS | Linux, Windows 10, 11 (WSL2)| |

### qbot
#### [1. Install qbot](qbot)

#### 2. Run qbot
```bash
./qbot kubeflow v1.7.0
```


### [1. Install Nvidia GPU Operator](ai_and_ml?id=nvidia-gpu-operator)

### 2. Install Kubeflow

```bash
cd $HOME
git clone https://github.com/kubeflow/manifests.git
cd manifests/
```

```bash
git checkout v1.7.0
curl -s "https://raw.githubusercontent.com/kubernetes-sigs/kustomize/master/hack/install_kustomize.sh" | bash
while ! ./kustomize build example | kubectl apply -f -; do echo "Retrying to apply resources"; sleep 10; done
```
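
The `while` loop above retries forever, which can hide a persistent failure. As a hedged alternative, a bounded retry could look like the sketch below. `retry` and `apply_kubeflow` are hypothetical helper names, and the attempt count and delay are arbitrary choices.

```bash
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds after each failure. Returns non-zero if all attempts fail.
retry() {
  local attempts=$1 delay=$2 n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "attempt $n failed; retrying in ${delay}s" >&2
    n=$((n + 1))
    sleep "$delay"
  done
}

# Wrap the pipeline from the step above in a function so it can be retried.
apply_kubeflow() {
  ./kustomize build example | kubectl apply -f -
}

# Usage (run from the manifests/ directory):
#   retry 10 10 apply_kubeflow
```

Wrapping the pipeline in a function keeps the retry helper generic, since a pipeline cannot be passed directly as command arguments.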
### 3. Configure Kubeflow
#### 3.1 Patch Deployment for DinD
> Once this finishes, we also need to patch the Kubeflow Pipelines service so it does not use Docker; otherwise pipelines get stuck and report Docker socket errors. Although the nodes run in Docker, the Docker socket is not made available inside the kind cluster, so from Kubeflow's perspective we are using containerd directly instead of Docker.
```bash
./kustomize build apps/pipeline/upstream/env/platform-agnostic-multi-user-pns | kubectl apply -f -
watch kubectl get pods -A
```
<!-- > Node port
> CORS issue
```
kubectl patch svc istio-ingressgateway --type='json' -p '[{"op":"replace","path":"/spec/type","value":"NodePort"}]' -n istio-system
``` -->
### 4. Access Kubeflow UI
> Port forward
```bash
kubectl port-forward svc/istio-ingressgateway -n istio-system 8080:80
```
#### 4.1 Linux
> You can then open your browser and navigate to http://127.0.0.1:8080 and login with the default credentials
#### 4.2 Windows (WSL2)
> Under Windows Subsystem for Linux (WSL) you can install Google Chrome to access the product page
>
```bash
wget -O $HOME/google-chrome-stable_current_amd64.deb https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo apt install $HOME/google-chrome-stable_current_amd64.deb
```
```
wsl.exe -e google-chrome http://127.0.0.1:8080
```
#### 4.3 Login
> Default Credentials
>
> username: `user@example.com`
>
> password: `12341234`
![kubeflow nvidia-smi](img/kubeflow_nvidia_smi.png)
Ref.
[Running Kubeflow inside Kind with GPU support](https://jacobtomlinson.dev/posts/2022/running-kubeflow-inside-kind-with-gpu-support/)
8 changes: 4 additions & 4 deletions docs/auth.md
@@ -1,11 +1,11 @@
# Authentication

Qbo can authenticate access to the API using a web interface or a CLI interface. The web interface uses OAuth2 Google authentication, and the CLI uses either a temporary universally unique token (OAuth2 authentication) or a service account. Both methods are described below.
QBO can authenticate access to the API using a web interface or a CLI interface. The web interface uses OAuth2 Google authentication, and the CLI uses either a temporary universally unique token (OAuth2 authentication) or a service account. Both methods are described below.

> Note that when you login to the Web console @ https://console.cloud.qbo.io the web console is already configured and there is no further configuration needed for authentication. See [Configuration priority](#configuration-priority) for more info.

> If you are using a Linux, Mac or Windows shell outside Qbo's web console you can authenticate in two ways:
> If you are using a Linux, Mac or Windows shell outside QBO's web console you can authenticate in two ways:
## Temporary web token
> This token is generated upon successful authentication with your Google account
@@ -34,7 +34,7 @@ export QBO_UID=33820cc1-d513-4fa8-88ac-1adb008c3864

## Service Account

> Qbo service accounts use elliptic-curve cryptography (ECC) with the P-521 curve for encryption. A public key in compact JSON format, `qbo_uid`, is shared along with an auxiliary token, `qbo_aux`, for authentication.
> QBO service accounts use elliptic-curve cryptography (ECC) with the P-521 curve for encryption. A public key in compact JSON format, `qbo_uid`, is shared along with an auxiliary token, `qbo_aux`, for authentication.
> Before we can obtain and configure a service account a temporary web token is needed as described in [Temporary web token](#temporary-web-token) to retrieve the service account.
@@ -56,7 +56,7 @@ qbo version | jq .version[]?


> Download the CLI by cloning the repo @ https://git.eadem.com/alex/qbo-cli.git.
> Qbo CLI runs inside a container and can be used in Linux, Mac and Windows OSes.
> QBO CLI runs inside a container and can be used in Linux, Mac and Windows OSes.

```bash