diff --git a/doc/jsk_perception/nodes/vqa_node.md b/doc/jsk_perception/nodes/vqa_node.md
index f31f0c0498..aa32d6667d 100644
--- a/doc/jsk_perception/nodes/vqa_node.md
+++ b/doc/jsk_perception/nodes/vqa_node.md
@@ -74,10 +74,12 @@ make
 In the remote GPU machine,
 ```shell
 cd jsk_recognition/jsk_perception/docker
-./run_jsk_vil_api --port (Your vacant port) --ofa_task caption --ofa_model_scale huge
+./run_jsk_vil_api ofa --port (Your vacant port) --ofa_task caption --ofa_model_scale huge
 ```
+You need to specify the model as the first argument; it should be either `ofa` or `clip`.
+
 `--ofa_task` should be `caption` or `vqa`. Empirically, the output results are more natural for VQA tasks with the Caption model than with the VQA model in OFA.
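
For reference, a concrete invocation of the patched command might look like the sketch below. The port number `8888` is only a placeholder for whichever port is free on the GPU machine, and the bare `clip` invocation assumes the `--ofa_*` options apply only to the OFA model.

```shell
cd jsk_recognition/jsk_perception/docker
# Launch the OFA captioning API server (8888 is a placeholder for a vacant port)
./run_jsk_vil_api ofa --port 8888 --ofa_task caption --ofa_model_scale huge
# Or launch the CLIP API server instead (assumes no OFA-specific options are needed)
./run_jsk_vil_api clip --port 8888
```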