This document describes how to reproduce the HASSOD pseudo-labels, model training, and evaluation results.
You can download the pseudo-labels generated by the first stage of HASSOD here, and move them into `HASSOD/datasets/coco/annotations/hassod/` for training the object detector later. Or, if you want to redo this stage on your own, follow these steps:
- First, we use the hierarchical adaptive clustering (HAC) algorithm (`hac/hac.py`) to cluster image patches into regions with different merging thresholds. GPUs are required to extract DINO features in this step. We split the whole dataset into batches, so that multiple jobs can be launched in parallel to reduce the total processing time. `example_jobs/1_hac.sh` shows one example of the batch jobs with more detailed explanation.
- After that, we merge the JSON files (`hac/merge_jsons.py`) produced by the batch jobs into one annotation file for each merging threshold. For each image, we also need to ensemble the results from multiple merging thresholds and analyze the hierarchical levels of annotations (`hac/post_process.py`). These steps would only require CPUs. Check the example in `example_jobs/2_post_process.sh` for more details.
- Finally, when the pseudo-label annotation file (`coco_fixsize480_thresh0.1_0.2_0.4_hier.json`) is produced, don't forget to move it into `HASSOD/datasets/coco/annotations/hassod/`.
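
To make the merging step more concrete, here is a minimal sketch of combining per-batch JSON files into one annotation file. It is not the code in `hac/merge_jsons.py`. It assumes that each batch job writes a COCO-style file with `images`, `annotations`, and `categories` keys and that the batches cover disjoint sets of images. The function name `merge_coco_shards` is hypothetical.

```python
import json
from pathlib import Path


def merge_coco_shards(shard_paths, output_path):
    """Merge COCO-style JSON shards into a single annotation file.

    Assumption: shards cover disjoint sets of images, so image entries
    can simply be concatenated.
    """
    merged = {"images": [], "annotations": [], "categories": []}
    next_ann_id = 1
    for path in sorted(shard_paths):
        with open(path) as f:
            shard = json.load(f)
        # Take the category list from the first shard that provides one.
        if not merged["categories"]:
            merged["categories"] = shard.get("categories", [])
        merged["images"].extend(shard.get("images", []))
        for ann in shard.get("annotations", []):
            # Re-number annotations so that ids stay unique after merging.
            merged["annotations"].append({**ann, "id": next_ann_id})
            next_ann_id += 1
    # Create the output directory if it does not exist yet.
    Path(output_path).parent.mkdir(parents=True, exist_ok=True)
    with open(output_path, "w") as f:
        json.dump(merged, f)
    return merged
```

Annotation ids are re-numbered because independent batch jobs would typically each start counting from the same value, and COCO-style tools expect ids to be unique within a file.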
You can download the trained Cascade Mask R-CNN object detector here. To reproduce the object detector training, follow these two steps:
- We first directly train an object detector solely using the pseudo-labels generated in Stage 1. `example_jobs/3_detector_training.sh` shows an example of the training configuration that we use.
- Then, we leverage Mean Teacher training with adaptive targets to refine the object detector by itself. In Mean Teacher, we need to learn a pair of teacher and student models (both initialized from the detector trained above), so it is necessary to convert the model checkpoint (`detector_training/convert_model.py`) before and after the Mean Teacher training. `example_jobs/4_mean_teacher.sh` illustrates the detailed instructions.
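
For readers unfamiliar with Mean Teacher, the sketch below illustrates its core idea: the teacher's weights are an exponential moving average (EMA) of the student's weights, and both start from the same checkpoint. This is a toy illustration on plain Python dictionaries, not the HASSOD training code. The function name `ema_update` and the momentum values are assumptions chosen for the example.

```python
def ema_update(teacher, student, momentum=0.999):
    """Move each teacher parameter toward the matching student parameter.

    teacher and student map parameter names to numbers. The default
    momentum is an illustrative value, not the one used by HASSOD.
    """
    for name, student_value in student.items():
        teacher[name] = momentum * teacher[name] + (1.0 - momentum) * student_value
    return teacher


# Both models start from the same detector weights.
initial = {"w": 1.0, "b": 0.0}
teacher = dict(initial)
student = dict(initial)

# Pretend one gradient step changed the student, then update the teacher.
student["w"] = 2.0
ema_update(teacher, student, momentum=0.9)
```

After the update, the teacher has moved only a small fraction of the way toward the student, which is why the teacher's predictions tend to be smoother than the student's.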
You can evaluate the trained object detectors using a script similar to the one in Stage 1, and additionally set `--eval-only`, `MODEL.WEIGHTS`, `DATASETS.TEST`, etc. You may check the example in `example_jobs/5_evaluate.sh`.
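
As a rough illustration, the sketch below appends the evaluation options to an existing detector command. It assumes Detectron2-style configuration overrides, where `DATASETS.TEST` is passed as a Python tuple literal. The script path, checkpoint path, and dataset name are placeholders, and `to_eval_command` is a hypothetical helper; refer to `example_jobs/5_evaluate.sh` for the actual command.

```python
import shlex


def to_eval_command(train_command, weights, test_datasets):
    """Append evaluation options to a detector command and return a shell string.

    Assumption: train_command contains no trailing KEY VALUE overrides,
    so that --eval-only is still parsed as a flag.
    """
    args = list(train_command)
    args.append("--eval-only")
    args += ["MODEL.WEIGHTS", weights]
    # Assumed format: the dataset list is written as a tuple literal.
    args += ["DATASETS.TEST", repr(tuple(test_datasets))]
    return shlex.join(args)


command = to_eval_command(
    ["python", "path/to/train_script.py"],  # placeholder script
    "path/to/model_checkpoint.pth",         # placeholder checkpoint
    ["placeholder_val_dataset"],            # placeholder dataset name
)
print(command)
```

`shlex.join` quotes the tuple literal so that the shell passes it to the script as a single argument.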