Improved Computer Vision webinar logic / code #106

Merged · May 7, 2024 · 138 commits
8a5ec3d
Addded CV project
htahir1 Apr 22, 2024
95c2dc8
Added all the logic for splitting
htahir1 Apr 22, 2024
c52b8eb
Added more logic and formatted
htahir1 Apr 22, 2024
852445b
latest
htahir1 Apr 22, 2024
d3b8995
latest
htahir1 Apr 22, 2024
5581aaa
Added matrix:
htahir1 Apr 22, 2024
f62459f
More
htahir1 Apr 22, 2024
570390f
Utils
htahir1 Apr 22, 2024
aaf43e7
Latest
htahir1 Apr 22, 2024
69e7639
Latest
htahir1 Apr 22, 2024
b523d6c
Initial implementation for separate data loader pipeline
AlexejPenner Apr 22, 2024
2c4f280
add ultralytics to requirements
strickvl Apr 22, 2024
feb070c
fix
strickvl Apr 22, 2024
ac21d91
fix data export and materialization
strickvl Apr 22, 2024
acd1aeb
Split training pipeline up into steps
AlexejPenner Apr 22, 2024
ae505b5
extra cleaning
strickvl Apr 22, 2024
b8cd264
Merge branch 'project/cv-webinar' of https://github.com/zenml-io/zenm…
strickvl Apr 22, 2024
84241d7
Working data loader pipeline
AlexejPenner Apr 22, 2024
58f1014
Training works
AlexejPenner Apr 22, 2024
5e4bb28
Fixed config files
AlexejPenner Apr 22, 2024
424f877
add notebooks
strickvl Apr 23, 2024
111ff94
remove aws requirements
strickvl Apr 23, 2024
881cd4a
update notebooks
strickvl Apr 23, 2024
009bc74
add env vars
strickvl Apr 23, 2024
e66a23e
updates
strickvl Apr 23, 2024
c2abece
Update environment variables and add new model configuration
strickvl Apr 23, 2024
e7a1101
add inference and add to README
strickvl Apr 23, 2024
4fc5c6f
update inference
strickvl Apr 23, 2024
54363c6
formatting
strickvl Apr 23, 2024
1754c26
update inference
strickvl Apr 23, 2024
32e63c2
update readme and requirements
strickvl Apr 23, 2024
adec3b3
update notebook
strickvl Apr 23, 2024
02b102c
Further progress
AlexejPenner Apr 23, 2024
c135764
Merge branch 'project/cv-webinar' of github.com:zenml-io/zenml-projec…
AlexejPenner Apr 23, 2024
ddf2e21
formatting
strickvl Apr 23, 2024
026fb1d
add inference to run.py
strickvl Apr 23, 2024
34fefde
update inference
strickvl Apr 23, 2024
cbe3588
read becomes load
strickvl Apr 23, 2024
ce04c8c
remove unneccessary temp file
strickvl Apr 23, 2024
b406149
add constants
strickvl Apr 23, 2024
3d7c714
pull out constants and remove unnecessary image call
strickvl Apr 23, 2024
ad08269
Further steps towards step operator
AlexejPenner Apr 23, 2024
0fc8e28
Merge branch 'project/cv-webinar' of github.com:zenml-io/zenml-projec…
AlexejPenner Apr 23, 2024
f281332
constants
strickvl Apr 23, 2024
27d1b1d
Merge branch 'project/cv-webinar' of https://github.com/zenml-io/zenm…
strickvl Apr 23, 2024
11428d6
formatting
strickvl Apr 23, 2024
10f637c
run.py becomes click script
strickvl Apr 23, 2024
689107a
fix inference import issues
strickvl Apr 23, 2024
e93e336
inference cache
strickvl Apr 23, 2024
922348c
update fiftyone inference
strickvl Apr 23, 2024
766503b
Adjusted configs
AlexejPenner Apr 23, 2024
ee85356
Merge branch 'project/cv-webinar' of github.com:zenml-io/zenml-projec…
AlexejPenner Apr 23, 2024
adb9437
Parametrized training
AlexejPenner Apr 23, 2024
c5133c1
Remove label-studio dependency in training
AlexejPenner Apr 23, 2024
37d1484
update dataloading script
strickvl Apr 23, 2024
fc4c168
add initial dataset location
strickvl Apr 23, 2024
5327ce2
First running pipeline on vertex
AlexejPenner Apr 23, 2024
21aa3b7
Merge branch 'project/cv-webinar' of github.com:zenml-io/zenml-projec…
AlexejPenner Apr 23, 2024
7c99630
Flags for run.py
AlexejPenner Apr 23, 2024
9f3447e
Fix bug in login functionality
strickvl Apr 23, 2024
f2c3184
Merge branch 'project/cv-webinar' of https://github.com/zenml-io/zenm…
strickvl Apr 23, 2024
5a0a512
Update pipeline options in run.py
strickvl Apr 23, 2024
3a8019f
Add cloud inference pipeline
strickvl Apr 23, 2024
efe15bd
formatting and cloud inference
strickvl Apr 23, 2024
b7ff028
fix reference to yaml config
strickvl Apr 23, 2024
e2bb549
add local fiftyone pipeline + step
strickvl Apr 23, 2024
718eb99
everyone gets their own stack
strickvl Apr 23, 2024
b38f87f
format
strickvl Apr 23, 2024
fd2c31c
add fiftyone step
strickvl Apr 23, 2024
c2df775
add licenses
strickvl Apr 23, 2024
5137c85
update scripts
strickvl Apr 23, 2024
b9d9b77
formatting and docstrings
strickvl Apr 23, 2024
afa18ba
renaming for diagram
strickvl Apr 23, 2024
9e9be2a
renaming task_ids -> new_ids
strickvl Apr 23, 2024
397d713
renaming train_model -> training
strickvl Apr 23, 2024
9fb7e53
final rename
strickvl Apr 23, 2024
abd1405
initial README skeleton
strickvl Apr 23, 2024
c86d016
Update README formatting
strickvl Apr 23, 2024
ff4b969
Update fifty_one_launcher function
strickvl Apr 23, 2024
f8a1095
add link
strickvl Apr 23, 2024
f0da68c
refactor using annotator method
strickvl Apr 23, 2024
a608d72
update data export
strickvl Apr 23, 2024
dfd280f
Implemented data ingestion
AlexejPenner Apr 23, 2024
d45458e
Merge branch 'project/cv-webinar' of github.com:zenml-io/zenml-projec…
AlexejPenner Apr 23, 2024
a92e90e
Small improvements
AlexejPenner Apr 23, 2024
eb63fbe
add fileio copying for gcp
strickvl Apr 23, 2024
1ca6814
Merge branch 'project/cv-webinar' of https://github.com/zenml-io/zenm…
strickvl Apr 23, 2024
02fbc30
formatting
strickvl Apr 23, 2024
3fc94e2
Work in progress on ingestion
AlexejPenner Apr 23, 2024
3ad948e
Merge branch 'project/cv-webinar' of github.com:zenml-io/zenml-projec…
AlexejPenner Apr 23, 2024
15d6b6f
Strides towards running with gcp data source
AlexejPenner Apr 23, 2024
365074f
Some more cleanup
AlexejPenner Apr 23, 2024
acc52b4
fixes
strickvl Apr 24, 2024
2bf94c6
Smaller changes
AlexejPenner Apr 24, 2024
acef07e
Merge branch 'project/cv-webinar' of github.com:zenml-io/zenml-projec…
AlexejPenner Apr 24, 2024
9a589bd
Working training
AlexejPenner Apr 24, 2024
0bc9f49
Get a pipeline to run
AlexejPenner Apr 24, 2024
175d78d
Putting this to the side as it might introduce to much complexity
AlexejPenner Apr 24, 2024
1244392
Further small improvements
AlexejPenner Apr 24, 2024
67de170
Attempt upload of test dataset
AlexejPenner Apr 24, 2024
ae6b3a5
Get inference to work
AlexejPenner Apr 24, 2024
4e87f8b
Raw upload works
AlexejPenner Apr 24, 2024
36a6fbe
docstrings
strickvl Apr 24, 2024
92d8e62
formatting
strickvl Apr 24, 2024
023a863
add default values
strickvl Apr 24, 2024
585c09e
split for breakpoint
strickvl Apr 24, 2024
304b646
fix local training yaml
strickvl Apr 24, 2024
c045bf6
Small local changes
AlexejPenner Apr 24, 2024
94e40cf
add dill
strickvl Apr 24, 2024
5be68c0
update training yaml
strickvl Apr 24, 2024
6946355
Update YOLO materializer to use ONNX format
strickvl Apr 24, 2024
3de41ea
scratch
strickvl Apr 24, 2024
429d2ee
add logging to train step
strickvl Apr 24, 2024
3c75cbd
add number of classes
strickvl Apr 24, 2024
7ba998f
revert materializer
strickvl Apr 24, 2024
6283f66
Add export metadata
AlexejPenner Apr 24, 2024
96afd30
Merge branch 'project/cv-webinar' of github.com:zenml-io/zenml-projec…
AlexejPenner Apr 24, 2024
ca2f13c
Webinar settings
AlexejPenner Apr 25, 2024
80e1560
Initial working version for tiling
AlexejPenner Apr 25, 2024
a769b6f
Data ingestion now in 1000pxl tiles
AlexejPenner Apr 26, 2024
1e7536f
Merge branch 'project/cv-webinar' into project/cv-webinar-improved-da…
AlexejPenner Apr 26, 2024
763ff7f
Cleanup progressing
AlexejPenner Apr 26, 2024
daa461f
Further small improvements
AlexejPenner Apr 26, 2024
9230ff1
Structured README
AlexejPenner Apr 26, 2024
3d5adf5
Fixed small issues
AlexejPenner Apr 26, 2024
1fbcb3a
COntinued on readme
AlexejPenner Apr 29, 2024
8cb0ba8
add logging for download
strickvl Apr 29, 2024
bd76c52
formatting
strickvl Apr 29, 2024
09393e6
Removed paths, improved README
AlexejPenner Apr 29, 2024
e95387f
Merge branch 'project/cv-webinar-improved-datagen' of github.com:zenm…
AlexejPenner Apr 29, 2024
4b87eb9
Merge branch 'main' into project/cv-webinar-improved-datagen
AlexejPenner May 3, 2024
c69827d
Fixed configuration files and README
AlexejPenner May 3, 2024
71914fb
fix typos
strickvl May 6, 2024
21c55df
fix README
strickvl May 6, 2024
6b985df
Update end-to-end-computer-vision/steps/process_hf_dataset.py
strickvl May 6, 2024
2e62f9b
Merge branch 'main' into project/cv-webinar-improved-datagen
strickvl May 6, 2024
c26af1b
Added iamge
AlexejPenner May 7, 2024
bbb9f47
Merge branch 'project/cv-webinar-improved-datagen' of github.com:zenm…
AlexejPenner May 7, 2024
14 changes: 13 additions & 1 deletion .typos.toml
@@ -1,5 +1,14 @@
[files]
extend-exclude = [
"*.csv",
"sign-language-detection-yolov5/*",
"orbit-user-analysis/steps/report.py",
"customer-satisfaction/pipelines/deployment_pipeline.py",
"customer-satisfaction/streamlit_app.py",
"nba-pipeline/Building and Using An MLOPs Stack With ZenML.ipynb",
"customer-satisfaction/tests/data_test.py",
"end-to-end-computer-vision/**/*.ipynb"
]

[default.extend-identifiers]
# HashiCorp = "HashiCorp"
@@ -14,6 +23,9 @@
lenght = "lenght"
preprocesser = "preprocesser"
Preprocesser = "Preprocesser"
Implicitly = "Implicitly"
fo = "fo"
mapp = "mapp"
polution = "polution"

[default]
locale = "en-us"
1 change: 1 addition & 0 deletions end-to-end-computer-vision/.gitignore
@@ -3,3 +3,4 @@
data/
runs/
**/tmp*
runs_dir
226 changes: 211 additions & 15 deletions end-to-end-computer-vision/README.md
@@ -4,6 +4,8 @@
This is a project that demonstrates an end-to-end computer vision pipeline using
ZenML. The pipeline is designed to be modular and flexible, allowing for easy
experimentation and extension.

![diagram.png](_assets/diagram.png)

The project showcases the full lifecycle of a computer vision project, from data
collection and preprocessing to model training and evaluation. The pipeline also
incorporates a human-in-the-loop (HITL) component, where human annotators can
@@ -12,36 +14,230 @@
label images to improve the model's performance, as well as feedback using

The project uses the [Ship Detection
dataset](https://huggingface.co/datasets/datadrivenscience/ship-detection) from
[DataDrivenScience](https://datadrivenscience.com/) on the Hugging Face Hub,
which contains images of ships in satellite imagery. The goal is to train a
model to detect ships in the images. Note that this isn't something that our
YOLOv8 model is particularly good at out of the box, so it serves as a good
example of how to build a pipeline that can be extended to other use cases.

This project needs some infrastructure and tool setup to work. Here is a list of
things that you'll need to do.

## ZenML

We recommend using our [ZenML Cloud offering](https://cloud.zenml.io/) to get a
deployed instance of ZenML.

### Set up your environment

```bash
pip install -r requirements.txt
zenml integration install label_studio torch gcp mlflow -y
pip uninstall wandb # This comes in automatically
```

And to use the Albumentations and annotation plugins in the last step, you'll
need to install them:

```bash
fiftyone plugins download https://github.com/jacobmarks/fiftyone-albumentations-plugin

fiftyone plugins download https://github.com/voxel51/fiftyone-plugins --plugin-names @voxel51/annotation
```

You should also set the following environment variables:

```bash
export DATA_UPLOAD_MAX_NUMBER_FILES=1000000
export WANDB_DISABLED=True
```

### Connect to your deployed ZenML instance

```bash
zenml connect --url <INSERT_ZENML_URL_HERE>
```

## Cloud Provider

We will use GCP in the commands listed below, but the same steps work with
other cloud providers.

### Follow our guide to set up your credentials for GCP

[Set up a GCP service
connector](https://docs.zenml.io/stacks-and-components/auth-management/gcp-service-connector)

### Set up a bucket to persist your training data

### Set up a bucket to use as artifact store within ZenML

[Learn how to set up a GCP artifact store stack component within ZenML
here](https://docs.zenml.io/stacks-and-components/component-guide/artifact-stores)

### Set up Vertex for pipeline orchestration

[Learn how to set up a Vertex orchestrator stack component within ZenML
here](https://docs.zenml.io/stacks-and-components/component-guide/orchestrators/vertex)

### For training on accelerators like GPUs/TPUs, set up a Vertex step operator

[Learn how to set up a Vertex step operator stack component within ZenML
here](https://docs.zenml.io/stacks-and-components/component-guide/step-operators/vertex)

### Set up a container registry

[Learn how to set up a Google Cloud container registry component within ZenML
here](https://docs.zenml.io/stacks-and-components/component-guide/container-registries/gcp)

## Label Studio

### [Start Label Studio locally](https://labelstud.io/guide/start)
### [Follow these ZenML instructions to set up Label Studio as a stack component](https://docs.zenml.io/stacks-and-components/component-guide/annotators/label-studio)
### Create a project within Label Studio and name it `ship_detection_gcp`
### [Set up Label Studio to use external storage](https://labelstud.io/guide/storage)

Use the first bucket that you created for data persistence.

## ZenML Stacks

### Local Stack

The local stack should use the `default` orchestrator, a remote GCP artifact
store that we'll call `gcp_artifact_store` here, and a local Label Studio
annotator that we'll refer to as `label_studio_local`.

```bash
# Make sure to replace the names with the names that you choose for your setup
zenml stack register <local_stack> -o default -a <gcp_artifact_store> -an <label_studio_local>
```

### Remote Stack

The remote stack should use the `vertex_orchestrator`, a `gcp_artifact_store`,
a `gcp_container_registry`, and a `vertex_step_operator`.


```bash
# Make sure to replace the names with the names that you choose for your setup
zenml stack register <gcp_stack> -o <vertex_orchestrator> -a <gcp_artifact_store> -c <gcp_container_registry> -s <vertex_step_operator>
```

The project consists of the following pipelines:

## data_ingestion_pipeline

This pipeline downloads the [Ship Detection
dataset](https://huggingface.co/datasets/datadrivenscience/ship-detection). This
dataset contains some truly huge images with a few hundred million pixels. In
order to make these usable, we break down all source images into manageable
tiles with a maximum height/width of 1000 pixels. After this preprocessing is
done, the images are uploaded into a cloud bucket and the ground truth
annotations are uploaded to a local Label Studio instance.
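The tile-grid computation described above can be sketched in a few lines. This is an illustrative reimplementation, not the project's actual ingestion code; the tile size matches the 1000-pixel maximum mentioned in the text, but the edge handling (shorter tiles at the right/bottom borders) is an assumption.

```python
# Sketch: split a large source image into tiles whose height and width
# never exceed 1000 px. Returns (left, top, right, bottom) pixel boxes
# that together cover the whole image.
from typing import List, Tuple

MAX_TILE = 1000  # maximum tile height/width in pixels


def tile_boxes(
    width: int, height: int, max_tile: int = MAX_TILE
) -> List[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) boxes covering the full image."""
    boxes = []
    for top in range(0, height, max_tile):
        for left in range(0, width, max_tile):
            right = min(left + max_tile, width)
            bottom = min(top + max_tile, height)
            boxes.append((left, top, right, bottom))
    return boxes


# A 2500 x 1200 source image yields a 3 x 2 grid of tiles.
tiles = tile_boxes(2500, 1200)
print(len(tiles))  # → 6
```

The ground-truth boxes then have to be shifted into each tile's local coordinate system before upload, which is the fiddly part the pipeline handles for you.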

### Configure this pipeline

The configuration file for this pipeline lives at `./configs/ingest_data.yaml`.
In particular, make sure `data_source` points at the GCP bucket that is
dedicated as the storage location for the data, and adjust `ls_project_id` to
match the ID of your project within Label Studio.

### Run this pipeline

Label Studio should be up and running for the whole duration of this pipeline
run.

```bash
zenml stack set <local_stack>
python run.py --ingest
```

## data_export_pipeline

This pipeline exports the annotations from Label Studio and loads them into the
ZenML artifact store to make them accessible to downstream pipelines.
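To give a flavor of what such an export involves: Label Studio stores rectangle labels as top-left x/y/width/height in percent of the image size, while YOLO expects a normalized, center-based `class cx cy w h` line. The helper below is a minimal sketch of that coordinate conversion, not the project's actual export code.

```python
# Sketch: convert a Label Studio rectangle (percent coordinates,
# top-left origin) into a YOLO-format label line (normalized 0-1,
# center-based).
def ls_rect_to_yolo(x: float, y: float, w: float, h: float, class_id: int) -> str:
    """Convert Label Studio percent coords to a YOLO label line."""
    cx = (x + w / 2) / 100.0  # box center, normalized
    cy = (y + h / 2) / 100.0
    return f"{class_id} {cx:.6f} {cy:.6f} {w / 100.0:.6f} {h / 100.0:.6f}"


# A box covering the central quarter of the image:
print(ls_rect_to_yolo(25.0, 25.0, 50.0, 50.0, 0))
# → 0 0.500000 0.500000 0.500000 0.500000
```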

### Configure this pipeline

The configuration file for this pipeline lives at `./configs/data_export.yaml`.
In particular, make sure `dataset_name` reflects the name of the dataset within
Label Studio.

### Run this pipeline

Label Studio should be up and running for the whole duration of this pipeline
run.

```bash
zenml stack set <local_stack>
python run.py --export
```

## training_pipeline

This pipeline trains a YOLOv8 object detection model.

### Configure this pipeline

You can choose to run this pipeline locally or on the cloud. These two options
use two different configuration files. For local training:
`./configs/training_pipeline.yaml`. For training on the cloud:
`./configs/training_pipeline_remote_gpu.yaml`. Make sure `data_source` points to
your cloud storage bucket.
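The local/remote switch above boils down to selecting one of the two configuration files. The helper below is only an illustration of that selection logic using the paths named in this README; the real `run.py` implements it with Click flags.

```python
# Sketch: pick the training pipeline config based on whether training
# runs locally or remotely (on Vertex). Paths are the ones documented
# in this README.
def training_config_path(local: bool) -> str:
    """Return the config file for a local or remote training run."""
    return (
        "./configs/training_pipeline.yaml"
        if local
        else "./configs/training_pipeline_remote_gpu.yaml"
    )


print(training_config_path(True))  # → ./configs/training_pipeline.yaml
```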

### Run this pipeline

This pipeline requires the associated model (see the model section of the
configuration YAML file) to have a version in the `staging` stage. To promote
the model produced by the latest run of the `data_export_pipeline`, run the
following command:

```bash
zenml model version update <MODEL_NAME> latest -s staging
```

For local training run the following code:

```bash
zenml stack set <local_stack>
python run.py --training --local
```

For remote training run the following code:

```bash
zenml stack set <remote_stack>
python run.py --training
```

## inference_pipeline

This pipeline performs inference on the object detection model.

### Configure this pipeline

You can configure this pipeline in the following YAML file:
`./configs/inference_pipeline.yaml`. Make sure `data_source` points to the
cloud storage bucket that contains the images you want to run batch inference
on.

### Run this pipeline

This pipeline requires the associated model (see the model section of the
configuration YAML file) to have a version in the `production` stage. To
promote the model produced by the latest run of the `training_pipeline`, run
the following command:

```bash
zenml model version update <MODEL_NAME> staging -s production
```

Then run the inference pipeline:

```bash
zenml stack set <local_stack>
python run.py --inference
```
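When inspecting batch-inference results against ground truth, the standard first metric is the IoU (intersection over union) between two boxes. The standalone helper below is a sketch for that kind of analysis; it is not part of the project's pipeline code, and the pixel-box convention `(left, top, right, bottom)` is an assumption.

```python
# Sketch: IoU between two axis-aligned boxes given as
# (left, top, right, bottom) in pixels.
from typing import Tuple

Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two boxes; 0.0 when they don't overlap."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))  # overlap width
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))  # overlap height
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0


print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # → 0.14285714285714285
```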


## Analyze and Curate your data through FiftyOne

Now to close the loop, we will import the predictions into FiftyOne. All you'll
need to do is run:

```bash
python run.py --fiftyone
```

Within FiftyOne, you can now analyze all the predictions and export them back to
Label Studio for fine-tuned labeling and retraining.
Binary file added end-to-end-computer-vision/_assets/diagram.png
Binary file added end-to-end-computer-vision/bus.jpg
@@ -2,7 +2,7 @@
enable_cache: False

# pipeline configuration
parameters:
dataset_name: "ship_detection_gcp" # This is the name of the dataset in Label Studio

# Configuration of the Model Control Plane
model:
12 changes: 0 additions & 12 deletions end-to-end-computer-vision/configs/data_export_alex.yaml

This file was deleted.

22 changes: 0 additions & 22 deletions end-to-end-computer-vision/configs/fiftyone.yaml

This file was deleted.

@@ -19,7 +19,7 @@
steps:
enable_cache: False
enable_step_logs: False
parameters:
inference_data_source: # Insert your bucket path here where the inference images live e.g. "gs://foo/bar"

# configuration of the Model Control Plane
model:
5 changes: 3 additions & 2 deletions end-to-end-computer-vision/configs/ingest_data.yaml
@@ -1,10 +1,11 @@

steps:
download_and_tile_dataset_from_hf:
enable_cache: True
enable_step_logs: False
parameters:
dataset: "datadrivenscience/ship-detection"
data_source: # Insert your bucket path here where the training images will live e.g. "gs://foo/bar"
upload_labels_to_label_studio:
enable_cache: False
parameters:
@@ -7,10 +7,10 @@
steps:
train_model:
enable_cache: False
parameters:
data_source: # Insert your bucket path here where the training images lives e.g. "gs://foo/bar"
batch_size: 8
imgsz: 720
epochs: 1

settings:
docker:
@@ -21,14 +21,14 @@
steps:
step_operator: gcp_a100
enable_step_logs: False
parameters:
data_source: # Insert your bucket path here where the training images lives e.g. "gs://foo/bar"
batch_size: 8
imgsz: 720
epochs: 50000
is_quad_gpu_env: True
settings:
step_operator.vertex:
accelerator_type: NVIDIA_TESLA_T4 # see https://cloud.google.com/vertex-ai/docs/reference/rest/v1/MachineSpec#AcceleratorType
accelerator_count: 4
disk_size_gb: 25
docker:
4 changes: 4 additions & 0 deletions end-to-end-computer-vision/data/.gitignore
@@ -0,0 +1,4 @@
# Ignore everything in this directory
*
# Except this file
!.gitignore
1 change: 1 addition & 0 deletions end-to-end-computer-vision/data/README.md
@@ -0,0 +1 @@
This directory serves as a place to store and access temporary data files.