Image Understanding Tool

Usage

Portia offers both open source tools and a cloud-hosted library of tools to save you development time. You can dig into the specs of the open source tools in our open source repo (SDK repo ↗).

You can import our open source tools into your project using from portia.open_source_tools.registry import open_source_tool_registry and load them into an InMemoryToolRegistry object. You can also combine their use with cloud or custom tools as explained in the docs (Add custom tools ↗).
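As a rough illustration of the import described above, the sketch below wraps it in a small helper. The helper name `load_open_source_registry` is hypothetical and not part of the SDK; only the import path is taken from this page. The helper returns `None` when the `portia` package is not installed, so the snippet can run anywhere.

```python
import importlib.util


def load_open_source_registry():
    """Return Portia's open source tool registry, or None if the SDK is missing.

    Hypothetical helper for illustration; only the import path below
    comes from the documentation.
    """
    # Check for the package first so the sketch does not fail on machines
    # where the Portia SDK has not been installed.
    if importlib.util.find_spec("portia") is None:
        return None

    from portia.open_source_tools.registry import open_source_tool_registry

    return open_source_tool_registry


if __name__ == "__main__":
    registry = load_open_source_registry()
    if registry is None:
        print("Portia SDK not installed; nothing to load.")
    else:
        print("Loaded the open source tool registry.")
```

How the registry is then passed to a Portia client or combined with cloud and custom tools is covered in the docs linked above and is not shown here.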

Tool details

Tool ID: image_understanding_tool

Tool description: Tool for understanding images from a URL. Capable of tasks like object detection, OCR, scene recognition, and image-based Q&A. This tool uses its native capabilities to analyze images and provide insights.

Args schema:

{
  "description": "Input for Image Understanding Tool.",
  "properties": {
    "task": {
      "description": "The task to be completed by the Image tool.",
      "title": "Task",
      "type": "string"
    },
    "image_url": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "Image URL for processing.",
      "title": "Image Url"
    },
    "image_file": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "description": "Image file for processing.",
      "title": "Image File"
    }
  },
  "required": [
    "task"
  ],
  "title": "ImageUnderstandingToolSchema",
  "type": "object"
}
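To make the schema above concrete, here is a minimal sketch of how a caller might assemble and check an argument payload before handing it to the tool. The function `build_image_tool_args` is a hypothetical helper, not part of the Portia SDK. It enforces only what the schema states: `task` is a required string, and `image_url` and `image_file` are each either a string or null. Whether the tool itself also requires one of the two image fields is not stated by the schema, so the sketch does not assume it.

```python
from typing import Optional


def build_image_tool_args(
    task: str,
    image_url: Optional[str] = None,
    image_file: Optional[str] = None,
) -> dict:
    """Build an argument dict matching ImageUnderstandingToolSchema.

    Hypothetical helper for illustration; it mirrors the JSON schema only.
    """
    # "task" is the only required property and must be a string.
    if not isinstance(task, str):
        raise TypeError("task must be a string")

    # Both image fields accept a string or null (None in Python).
    for name, value in (("image_url", image_url), ("image_file", image_file)):
        if value is not None and not isinstance(value, str):
            raise TypeError(f"{name} must be a string or None")

    return {"task": task, "image_url": image_url, "image_file": image_file}


if __name__ == "__main__":
    # The URL is a placeholder, not a real image.
    args = build_image_tool_args(
        "Describe the objects in this image.",
        image_url="https://example.com/cat.png",
    )
    print(args)
```

Omitted image fields fall back to `None`, which corresponds to the `"default": null` entries in the schema.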

Output schema:

('str', "The Image understanding tool's response to the user query about the provided image.")