
feat(ml): ML on Rockchip NPUs #15241

Open · wants to merge 78 commits into main

Conversation


@yoni13 (Contributor) commented on Jan 11, 2025:

Goals: ML on Rockchip NPUs.
Testing on board: #13243 (reply in thread)

TODO:

  • It works on my OrangePi 3B (RK3566).
  • Build Docker images.
  • Build models for more SoCs.
  • Allow setting the number of threads per model type (e.g. visual -> 2 threads) via environment variables.
  • NPU core masks for RK3576/RK3588.
  • Decide on the model path (immich-app/ViT-B-32__openai/textual/rk3566.rknn).
  • Export script that accepts CLI arguments for the Immich model name and SoC, then exports the model.
  • Maximize NPU usage by using rknnpool.py.
  • Documentation.
  • Write tests.
  • Upload models to Hugging Face.
  • Make Docker images work out of the box (without downloading models manually).
  • Test on PC and other ARM-based boards to ensure nothing breaks.
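The rknnpool.py item in the TODO list above is about keeping one runtime per worker thread and handing inference requests out round-robin to maximize NPU usage. A minimal pure-Python sketch of that idea, with a dummy runtime standing in for RKNNLite and an illustrative environment variable for the per-model-type thread count (neither name is Immich's actual API):

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor


class RknnPool:
    """Round-robin pool of inference runtimes, one per worker thread.

    `make_runtime` stands in for creating an RKNNLite instance and
    loading a .rknn model; a real pool would also pin each runtime
    to an NPU core via a core mask on RK3576/RK3588.
    """

    def __init__(self, make_runtime, num_threads):
        self.runtimes = [make_runtime() for _ in range(num_threads)]
        self.executor = ThreadPoolExecutor(max_workers=num_threads)
        self._lock = threading.Lock()
        self._next = 0

    def _pick(self):
        # Rotate through the runtimes so concurrent requests
        # spread across threads (and, on real hardware, NPU cores).
        with self._lock:
            runtime = self.runtimes[self._next]
            self._next = (self._next + 1) % len(self.runtimes)
            return runtime

    def submit(self, inputs):
        return self.executor.submit(lambda: self._pick().run(inputs))


# Hypothetical env var for the "visual -> 2 threads" idea above.
visual_threads = int(os.environ.get("MACHINE_LEARNING_RKNN_VISUAL_THREADS", "2"))


class DummyRuntime:
    def run(self, inputs):
        return [x * 2 for x in inputs]


pool = RknnPool(DummyRuntime, visual_threads)
results = [f.result() for f in [pool.submit([i]) for i in range(4)]]
# results == [[0], [2], [4], [6]]
```

Keeping one runtime per thread avoids sharing a single runtime across threads, at the cost of loading the model once per worker (which is part of the RAM trade-off noted under Issues below).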

Nice to have:

  • Rebase my commits (sorry for the ugly commit messages).
  • Test whether it works on RK3588 (I don't have one).
  • Support more models.

Issues:

  • Higher RAM usage because we also need to load the ONNX model for the input and output handling of facial models.

#13243

@yoni13 commented on Jan 11, 2025:

Docker launch command:

```shell
docker run --security-opt systempaths=unconfined --security-opt apparmor=unconfined \
  --device /dev/dri --device /dev/dma_heap --device /dev/rga --device /dev/mpp_service \
  -v /cache:/cache:ro -v /sys/kernel/debug/:/sys/kernel/debug/:ro \
  -p 3004:3003 --name rknnimmich_name -d rknnimmich
```

and it works (provided you download the model to the cache first, of course).
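The same devices and mounts from the docker run command above can be expressed as a compose service; a sketch (the service and image names follow the command above, everything else is an assumption rather than Immich's shipped compose file):

```yaml
services:
  rknnimmich:
    image: rknnimmich
    security_opt:
      - systempaths=unconfined
      - apparmor=unconfined
    devices:
      - /dev/dri
      - /dev/dma_heap
      - /dev/rga
      - /dev/mpp_service
    volumes:
      - /cache:/cache:ro
      - /sys/kernel/debug/:/sys/kernel/debug/:ro
    ports:
      - 3004:3003
```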

ViT-B-32 and buffalo_l loaded with two threads each while re-running jobs: 2.7 GB RAM, peaking at 3.5 GB (it's like running four models at the same time).
Update: these numbers are from before we switched to loading ONNX only when required; I will update the memory usage when I have time.

@mertalev mertalev left a comment


Nice work! This is a lot to get through and will need to go through more testing, but the core prediction logic looks pretty simple.

Files with review comments:

  • machine-learning/app/models/base.py
  • machine-learning/app/sessions/rknn.py
  • docker/hwaccel.ml.yml
  • machine-learning/rknn/rknnpool.py
  • machine-learning/app/config.py
  • machine-learning/Dockerfile
  • machine-learning/rknn/export/build_rknn.py
@yoni13 commented on Jan 19, 2025:

Most issues mentioned in the review should be fixed now.

@yoni13 yoni13 requested a review from mertalev January 19, 2025 14:15
@yoni13 commented on Jan 21, 2025:

Wrote a matrix build script to see which models are not supported; all of the unsupported ones are nllb and XLM models.
https://github.com/yoni13/immich_to_rknn2/actions/runs/12892750773
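A matrix build like the one linked above can be expressed with a GitHub Actions strategy matrix; a sketch (the export.py entry point and the model/SoC lists here are illustrative assumptions, not the actual workflow):

```yaml
jobs:
  export:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false  # keep going so every model/SoC combination is reported
      matrix:
        model: [ViT-B-32__openai, buffalo_l]
        soc: [rk3566, rk3568, rk3588]
    steps:
      - uses: actions/checkout@v4
      - run: python export.py --model ${{ matrix.model }} --soc ${{ matrix.soc }}
```

With fail-fast disabled, each unsupported combination shows up as its own failed job rather than cancelling the rest of the matrix.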

@mertalev

> Wrote a matrix build script to see which models are not supported; all of the unsupported ones are nllb and XLM models. yoni13/immich_to_rknn2/actions/runs/12892750773

Oh wow, that is pretty cool!

I have an RK3588, so I can do some testing with this PR soon.

Labels: changelog:feature · documentation · 🧠machine-learning