Summary
The vision model fails during inference with the error message: "Pixel values were specified for a non-prompt."
Steps to Reproduce
Use the reproducer code provided below.
Fetch an image using reqwest.
Load the image into memory using the image crate.
Create VisionMessages with an image and a user prompt.
Send a chat request to the model.
Code Reproducer
use image;
use mistralrs::{IsqType, TextMessageRole, VisionLoaderType, VisionMessages, VisionModelBuilder};
use reqwest;

#[tokio::main]
async fn main() {
    let model =
        VisionModelBuilder::new("HuggingFaceTB/SmolVLM-Instruct", VisionLoaderType::Idefics3)
            .with_isq(IsqType::Q4_0)
            .with_logging()
            .build()
            .await
            .expect("Failed to build model");

    let response = reqwest::get("http://farm1.staticflickr.com/32/53895647_9ff594a688_z.jpg")
        .await
        .expect("Failed to fetch image");
    let image = image::load_from_memory(&response.bytes().await.expect("Failed to read bytes"))
        .expect("Failed to load image");

    let messages = VisionMessages::new()
        .add_image_message(
            TextMessageRole::User,
            "What is depicted here? Please describe the scene in detail.",
            image,
            &model,
        )
        .expect("Failed to create vision message");

    let response = model
        .send_chat_request(messages)
        .await
        .expect("Error occurred during inference");

    println!("{}", response.choices[0].message.content.as_ref().unwrap());
}
Observed Behavior
The model logs show that the pixel values for an image are incorrectly flagged as specified for a non-prompt. This leads to a runtime error:
2025-01-02T17:18:31.336956Z ERROR mistralrs_core::engine: prompt step - Model failed with error: Msg("Pixel values were specified for a non-prompt.")
thread 'main' panicked at src/main.rs:33:10:
Error occurred during inference: ChatModelError { msg: "Pixel values were specified for a non-prompt.", incomplete_response: ChatCompletionResponse { ... } }
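While debugging, it may help to report the failure instead of panicking through `.expect(...)`, so the message and any partial response can be inspected. The following is only a minimal sketch of that pattern. `InferenceError` and `report` are hypothetical stand-ins that mirror the field names visible in the log above; they are not the actual mistralrs types, whose shape may differ.

```rust
// Hypothetical stand-in for the error printed in the log above.
// The real error type comes from mistralrs and may be shaped differently.
#[derive(Debug)]
enum InferenceError {
    ChatModelError {
        msg: String,
        // Assumption: a String placeholder for the partial response;
        // the log elides the real contents.
        incomplete_response: String,
    },
    Other(String),
}

// Hypothetical helper: turn an inference result into a printable report
// rather than unwinding with `.expect(...)`.
fn report(result: Result<String, InferenceError>) -> String {
    match result {
        Ok(text) => text,
        Err(InferenceError::ChatModelError { msg, incomplete_response }) => {
            format!("inference failed: {msg} (partial response: {incomplete_response})")
        }
        Err(InferenceError::Other(msg)) => format!("inference failed: {msg}"),
    }
}

fn main() {
    // Simulate the failure from this issue using the stand-in type.
    let failed = Err(InferenceError::ChatModelError {
        msg: "Pixel values were specified for a non-prompt.".to_string(),
        incomplete_response: "{ ... }".to_string(),
    });
    println!("{}", report(failed));

    // Exercise the remaining branches so the sketch covers every variant.
    println!("{}", report(Ok("a description of the scene".to_string())));
    println!("{}", report(Err(InferenceError::Other("unexpected failure".to_string()))));
}
```

In the reproducer, the same idea would mean matching on the result of `send_chat_request` instead of calling `.expect(...)`. This does not fix the underlying bug, it only keeps the process alive long enough to log the details.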