Complex datatypes like List, Dict or Pydantic Objects are not understood by Phidata while registering tools with Gemini. Works for OpenAI. #1759

Open
gauravdhiman opened this issue Jan 11, 2025 · 0 comments

I have a tool with the interface below:

from typing import List

from phi.tools import Toolkit
from pydantic import BaseModel, Field

class Sentence(BaseModel):
    voice: str = Field(..., description="Voice name", examples=["af_sarah", "am_michael"])
    text: str

class Sentences(BaseModel):
    sentences: List[Sentence]

class TTSToolkit(Toolkit):
    ....
    ....
    def text_to_speech(self, sentences: Sentences, output_filename: str = "output.wav") -> str:
        """
        Converts a list of text sentences to a single audio file with given filename.

        Args:
            sentences (List[Dict[str, str]]): Stringified list of dictionaries, where each dictionary contains
                "voice" (str) and "text" (str) keys. voice key can have one of these values: af_sarah, am_michael alternatively.
            output_filename (str, optional): The filename to save the output audio file to. Defaults to "output.wav".
        """

With this tool registered on an agent, the run fails when I use a Gemini model but works when I use an OpenAI model. Nothing else changes, only the model.

I am using this Gemini provider class:

from phi.model.google import Gemini

The Gemini model fails while building the tool definition, with the error below:

    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.InvalidArgument: 400 * GenerateContentRequest.tools[0].function_declarations[0].parameters.properties[sentences].properties: should be non-empty for OBJECT type
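
The error says that a parameter declared with type OBJECT must have non-empty `properties`. To spot such nodes before the request is sent, a small check like the one below could be run on the generated parameters dict. This is a sketch that only mirrors the rule stated in the error message; it is not Gemini's actual validation, and `find_empty_objects` and the sample dict are hypothetical.

```python
from typing import Any, Dict, List


def find_empty_objects(schema: Dict[str, Any], path: str = "parameters") -> List[str]:
    """Return paths of object-typed nodes whose `properties` are empty or missing."""
    problems: List[str] = []
    node_type = str(schema.get("type", "")).lower()
    if node_type == "object":
        props = schema.get("properties") or {}
        if not props:
            problems.append(path)
        # Recurse into each declared property.
        for name, sub in props.items():
            if isinstance(sub, dict):
                problems.extend(find_empty_objects(sub, f"{path}.properties[{name}]"))
    elif node_type == "array":
        # Arrays carry their element schema under `items`.
        items = schema.get("items")
        if isinstance(items, dict):
            problems.extend(find_empty_objects(items, f"{path}.items"))
    return problems


# Hypothetical declaration resembling the one rejected in the error above.
bad_parameters = {
    "type": "object",
    "properties": {
        "sentences": {"type": "object", "properties": {}},
        "output_filename": {"type": "string"},
    },
}
print(find_empty_objects(bad_parameters))  # ['parameters.properties[sentences]']
```

The reported path matches the one in the error, which suggests the nested Pydantic model was reduced to a bare object with no fields during schema generation.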

For debugging, I printed this:

print(f"Adding function {name} from {tool.name} to model. Function: {func}")

I added this print inside the if condition that adds the function to the model, and this is what it prints:

Adding function text_to_speech from TTSToolkit to model. Function: name='text_to_speech' description=None parameters={'type': 'object', 'properties': {}, 'required': []} strict=None entrypoint=<bound method TTSToolkit.text_to_speech of <TTSToolkit name=TTSToolkit functions=['text_to_speech']>> sanitize_arguments=True show_result=False stop_after_tool_call=False pre_hook=None post_hook=None