
Efficient Track Anything

[📕Project][🤗Gradio Demo][📕Paper][🤗Checkpoints]

The Efficient Track Anything Model (EfficientTAM) uses a vanilla lightweight ViT image encoder. An efficient memory cross-attention is proposed to further improve efficiency. EfficientTAMs are trained on the SA-1B (image) and SA-V (video) datasets. EfficientTAM achieves performance comparable to SAM 2 with improved efficiency, and can run at more than 10 frames per second with reasonable video segmentation performance on an iPhone 15. Try our demo with a family of EfficientTAMs at [🤗Gradio Demo].

Figure: Efficient Track Anything design

News

[Jan.5 2025] We added support for running Efficient Track Anything on Macs with the MPS backend. See the example in app.py.

[Jan.3 2025] We updated the Efficient Track Anything codebase, adapted from the latest SAM 2 codebase with improved inference efficiency. See the SAM 2 update of Dec. 11 2024 for details. Thanks to the SAM 2 team!

Figure: Efficient Track Anything speed update

[Dec.22 2024] We release 🤗Efficient Track Anything Checkpoints.

[Dec.4 2024] 🤗Efficient Track Anything for segment everything. Thanks to @SkalskiP!

[Dec.2 2024] We provide the preliminary version of Efficient Track Anything for demonstration.

Online Demo & Examples

The online demo and examples can be found on the project page.

EfficientTAM Video Segmentation Examples

Figure rows: SAM 2, EfficientTAM

EfficientTAM Image Segmentation Examples

Figure columns: Input Image, SAM, EfficientSAM, SAM 2, EfficientTAM
Figure rows: point-prompt, box-prompt, segment everything

Model

EfficientTAM checkpoints are available at the Hugging Face Space.

Getting Started

Installation

git clone https://github.com/yformer/EfficientTAM.git
cd EfficientTAM
conda create -n efficient_track_anything python=3.12
conda activate efficient_track_anything
pip install -e .
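
After the editable install, a quick sanity check can confirm that the package is visible to the active conda environment. This is a minimal sketch and not part of the repository; the helper name `is_installed` is hypothetical, and the package name is taken from the import shown in the "Building Efficient Track Anything" section.

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if `package` can be found on the current import path."""
    return importlib.util.find_spec(package) is not None


if __name__ == "__main__":
    # Package name as used in the import statement later in this README.
    name = "efficient_track_anything"
    status = "found" if is_installed(name) else "not found; re-run `pip install -e .`"
    print(f"{name}: {status}")
```

If the package is reported as not found, the usual cause is that the install ran in a different environment than the one currently activated.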

Download Checkpoints

cd checkpoints
./download_checkpoints.sh
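
Once the script finishes, it can help to verify that the expected files actually landed in the checkpoints directory. The sketch below is an illustration only: the helper `missing_checkpoints` is hypothetical, and `efficienttam_s.pt` is the only checkpoint file name this README mentions, so any other names you pass are your own assumption. Run it from the repository root.

```python
from pathlib import Path


def missing_checkpoints(checkpoint_dir, expected_names):
    """Return the expected checkpoint file names that are absent from checkpoint_dir."""
    directory = Path(checkpoint_dir)
    return [name for name in expected_names if not (directory / name).is_file()]


if __name__ == "__main__":
    # Only efficienttam_s.pt is named in this README; extend the list as needed.
    missing = missing_checkpoints("./checkpoints", ["efficienttam_s.pt"])
    print("missing:", missing if missing else "none")
```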

You can benchmark the FPS of Efficient Track Anything models on GPUs and report their model sizes.

FPS Benchmarking and Model Size

cd ..
python efficient_track_anything/benchmark.py
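
For readers who want to time their own pipeline, the general idea behind an FPS measurement can be sketched as follows. This is not the repository's benchmark.py; `measure_fps` and its parameters are hypothetical names, and the workload is a stand-in for a function that processes one video frame.

```python
import time


def measure_fps(step, n_warmup=5, n_iters=50):
    """Call `step()` repeatedly and return the average number of calls per second.

    Warm-up calls are excluded from timing so one-off costs do not skew the result.
    """
    for _ in range(n_warmup):
        step()
    start = time.perf_counter()
    for _ in range(n_iters):
        step()
    elapsed = time.perf_counter() - start
    return n_iters / elapsed


if __name__ == "__main__":
    # Stand-in workload; replace with a function that processes one video frame.
    fps = measure_fps(lambda: sum(range(10_000)))
    print(f"{fps:.1f} FPS")
```

When timing GPU work with PyTorch, `step` should end with `torch.cuda.synchronize()`, because CUDA kernels are launched asynchronously and the wall clock would otherwise stop before the work completes.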

Launching Gradio Demo Locally

For Efficient Track Anything video segmentation, run

python app.py

For Efficient Track Anything image segmentation, run

python app_image.py

Building Efficient Track Anything

You can build an Efficient Track Anything model from a config and initialize it with a checkpoint:

import torch

from efficient_track_anything.build_efficienttam import (
    build_efficienttam_video_predictor,
)

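# Paths to the downloaded checkpoint and its matching model config
checkpoint = "./checkpoints/efficienttam_s.pt"
model_cfg = "configs/efficienttam/efficienttam_s.yaml"

# Build the video predictor and load the checkpoint weights
predictor = build_efficienttam_video_predictor(model_cfg, checkpoint)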
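
Since the codebase supports both GPUs and Macs with the MPS backend, a small device-selection helper may be useful before building the predictor. The sketch below uses only standard PyTorch availability checks; the helper name `pick_device` is hypothetical and does not come from the repository.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports as available."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch is not installed, so no accelerator can be used
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is only present in newer PyTorch releases
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print("selected device:", pick_device())
```

Passing the result to the builder (for example as a `device=` keyword argument) is an assumption based on the SAM 2 builders this codebase is adapted from; check the signature in build_efficienttam.py before relying on it.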

Efficient Track Anything Notebook Example

The notebook is shared here

License

The Efficient Track Anything checkpoints and codebase are licensed under Apache 2.0.

Acknowledgement

If you're using Efficient Track Anything in your research or applications, please cite using this BibTeX:

@article{xiong2024efficienttam,
  title={Efficient Track Anything},
  author={Yunyang Xiong and Chong Zhou and Xiaoyu Xiang and Lemeng Wu and Chenchen Zhu and Zechun Liu and Saksham Suri and Balakrishnan Varadarajan and Ramya Akula and Forrest Iandola and Raghuraman Krishnamoorthi and Bilge Soran and Vikas Chandra},
  journal={preprint arXiv:2411.18933},
  year={2024}
}