portia.open_source_tools.browser_tool
Browser tools.
This module contains tools that can be used to navigate to a URL, authenticate the user, and complete tasks.
The browser tool can run locally or on Browserbase. When using Browserbase, a Browserbase API key and project ID are required, and the tool can handle separate end-user authentication.
The browser tool can be used to navigate to a URL and complete tasks. If authentication is required, the tool will return an ActionClarification with the user guidance and login URL. If authentication is not required, the tool will return the task output. It uses [BrowserUse](https://browser-use.com/) for the task navigation.
BrowserToolForUrlSchema Objects
class BrowserToolForUrlSchema(BaseModel)
Input schema for the BrowserToolForUrl.
This schema defines the expected input parameters for the BrowserToolForUrl class.
Attributes:
task
str - The task description that should be performed by the browser tool. This is a required field that specifies what actions should be taken on the predefined URL.
BrowserToolSchema Objects
class BrowserToolSchema(BaseModel)
Input schema for the BrowserTool.
This schema defines the expected input parameters for the BrowserTool class.
Attributes:
url
str - The URL that the browser tool should navigate to. This is a required field specifying the target webpage.
task
str - The task description that should be performed by the browser tool. This is a required field that specifies what actions should be taken on the provided URL.
BrowserTaskOutput Objects
class BrowserTaskOutput(BaseModel)
Output schema for browser task execution.
This class represents the response from executing a browser task, including both the task result and any authentication requirements.
Attributes:
task_output
str - The result or output from executing the requested task.
human_login_required
bool - Indicates if manual user authentication is needed. Defaults to False.
login_url
str, optional - The URL where the user needs to go to authenticate. Only provided when human_login_required is True.
user_login_guidance
str, optional - Instructions for the user on how to complete the login process. Only provided when human_login_required is True.
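How these fields relate to the tool's return value can be sketched as follows. This is a minimal illustration only: it uses standard-library dataclasses as stand-ins for the Pydantic model, and `TaskOutputSketch`, `ClarificationSketch`, and `to_result` are hypothetical names that are not part of Portia.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TaskOutputSketch:
    """Stand-in mirroring the documented BrowserTaskOutput fields."""

    task_output: str
    human_login_required: bool = False
    login_url: str | None = None
    user_login_guidance: str | None = None


@dataclass
class ClarificationSketch:
    """Hypothetical stand-in for an ActionClarification."""

    user_guidance: str
    action_url: str


def to_result(output: TaskOutputSketch) -> str | ClarificationSketch:
    """Return the task output, or a clarification when a login is needed."""
    if output.human_login_required:
        # Both optional fields are expected when a human login is required.
        if not output.login_url or not output.user_login_guidance:
            raise ValueError("login_url and user_login_guidance are required")
        return ClarificationSketch(
            user_guidance=output.user_login_guidance,
            action_url=output.login_url,
        )
    return output.task_output
```

A caller would then branch on the type of the result: a plain string is the finished task output, while a clarification has to be resolved by the user before the plan run resumes.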
BrowserInfrastructureOption Objects
class BrowserInfrastructureOption(Enum)
Enumeration of supported browser infrastructure providers.
This enum defines the available options for running browser automation tasks.
Attributes:
LOCAL
- Uses a local Chrome browser instance for automation. Suitable for development and testing.
BROWSERBASE
- Uses the Browserbase cloud service for automation. Provides better scalability and isolation between users.
BrowserTool Objects
class BrowserTool(Tool[str])
General purpose browser tool. Customizable to user requirements.
This tool is designed to be used for tasks that require a browser. If authentication is required, the tool will return an ActionClarification with the user guidance and login URL. If authentication is not required, the tool will return the task output. It uses [BrowserUse](https://browser-use.com/) for the task navigation.
When using the tool, make sure that once the user has authenticated, they indicate that authentication is complete and the plan run is resumed.
The tool supports both local and Browserbase infrastructure providers for running web-based tasks. If using local, a local Chrome instance will be used, and the tool will not support end_user_id. If using Browserbase, a Browserbase API key is required and the tool can handle separate end users. The infrastructure provider can be specified using the infrastructure_option argument.
Arguments:
id
str, optional - Custom identifier for the tool. Defaults to "browser_tool".
name
str, optional - Display name for the tool. Defaults to "Browser Tool".
description
str, optional - Custom description of the tool's purpose. Defaults to a general description of the browser tool's capabilities.
infrastructure_option
BrowserInfrastructureOption, optional - The infrastructure provider to use. Can be either BrowserInfrastructureOption.LOCAL or BrowserInfrastructureOption.REMOTE. Defaults to BrowserInfrastructureOption.REMOTE.
custom_infrastructure_provider
BrowserInfrastructureProvider, optional - A custom infrastructure provider to use. If not provided, the infrastructure provider will be resolved from the infrastructure_option argument.
infrastructure_provider
@cached_property
def infrastructure_provider() -> BrowserInfrastructureProvider
Get the infrastructure provider instance (cached).
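The resolution order described in the arguments (a custom provider wins, otherwise the option decides) combined with the cached property can be sketched like this. It is a simplified illustration using stand-in classes: `OptionSketch`, `LocalProviderSketch`, `RemoteProviderSketch`, and `ToolSketch` are hypothetical, and the enum values are assumptions rather than Portia's own.

```python
from enum import Enum
from functools import cached_property


class OptionSketch(Enum):
    """Stand-in for BrowserInfrastructureOption (values are assumed)."""

    LOCAL = "local"
    REMOTE = "remote"


class LocalProviderSketch:
    """Placeholder for a local Chrome provider."""


class RemoteProviderSketch:
    """Placeholder for a remote (hosted) provider."""


class ToolSketch:
    """Hypothetical tool that resolves its provider lazily, exactly once."""

    def __init__(self, infrastructure_option=OptionSketch.REMOTE,
                 custom_infrastructure_provider=None):
        self.infrastructure_option = infrastructure_option
        self.custom_infrastructure_provider = custom_infrastructure_provider

    @cached_property
    def infrastructure_provider(self):
        # A custom provider, when supplied, takes precedence over the option.
        if self.custom_infrastructure_provider is not None:
            return self.custom_infrastructure_provider
        if self.infrastructure_option is OptionSketch.LOCAL:
            return LocalProviderSketch()
        return RemoteProviderSketch()
```

Because the property is cached, repeated accesses on one tool instance return the same provider object, so any session state the provider holds is shared across calls.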
run
def run(ctx: ToolRunContext, url: str, task: str) -> str | ActionClarification
Run the BrowserTool.
BrowserToolForUrl Objects
class BrowserToolForUrl(BrowserTool)
Browser tool for a specific URL.
This tool is designed to be used for browser-based tasks on the specified URL. If authentication is required, the tool will return an ActionClarification with the user guidance and login URL. If authentication is not required, the tool will return the task output. It uses [BrowserUse](https://browser-use.com/) for the task navigation.
When using the tool, the developer should ensure that the plan run is resumed once the user has completed authentication.
The tool supports both local and Browserbase infrastructure providers for running web-based tasks. If using local, a local Chrome instance will be used, and the tool will not support end_user_id. If using Browserbase, a Browserbase API key is required and the tool can handle separate end users. The infrastructure provider can be specified using the infrastructure_option argument.
Arguments:
url
str - The URL that this browser tool will navigate to for all tasks.
id
str, optional - Custom identifier for the tool. If not provided, will be generated based on the URL's domain.
name
str, optional - Display name for the tool. If not provided, will be generated based on the URL's domain.
description
str, optional - Custom description of the tool's purpose. If not provided, will be generated with the URL.
infrastructure_option
BrowserInfrastructureOption, optional - The infrastructure provider to use. Can be either BrowserInfrastructureOption.LOCAL or BrowserInfrastructureOption.REMOTE. Defaults to BrowserInfrastructureOption.REMOTE.
custom_infrastructure_provider
BrowserInfrastructureProvider, optional - A custom infrastructure provider to use. If not provided, the infrastructure provider will be resolved from the infrastructure_option argument.
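One way a default id and name could be derived from the URL's domain is sketched below. The exact format Portia generates is not stated here, so the `browser_tool_for_url_` prefix, the underscore-joined domain, and the helper name `default_id_and_name` are all assumptions made for illustration.

```python
from urllib.parse import urlparse


def default_id_and_name(url: str) -> tuple[str, str]:
    """Build an illustrative (id, name) pair from the URL's domain."""
    # netloc is the host part, e.g. "www.example.com".
    domain = urlparse(url).netloc
    # Replace dots so the domain is safe to embed in an identifier.
    slug = domain.replace(".", "_")
    return f"browser_tool_for_url_{slug}", f"Browser Tool for {slug}"
```

Deriving the defaults from the domain keeps several URL-specific tools distinguishable from one another when they are registered side by side.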
__init__
def __init__(
url: str,
id: str | None = None,
name: str | None = None,
description: str | None = None,
model: GenerativeModel | None | str = NotSet,
infrastructure_option: BrowserInfrastructureOption | None = NotSet
) -> None
Initialize the BrowserToolForUrl.
run
def run(ctx: ToolRunContext, task: str) -> str | ActionClarification
Run the BrowserToolForUrl.
BrowserInfrastructureProvider Objects
class BrowserInfrastructureProvider(ABC)
Abstract base class for browser infrastructure providers.
setup_browser
@abstractmethod
def setup_browser(ctx: ToolRunContext) -> Browser
Get a Browser instance.
construct_auth_clarification_url
@abstractmethod
def construct_auth_clarification_url(ctx: ToolRunContext,
sign_in_url: str) -> HttpUrl
Construct the URL for the auth clarification.
step_complete
@abstractmethod
def step_complete(ctx: ToolRunContext) -> None
Call when the step is complete, e.g. to release the session.
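The three abstract methods above define the contract a custom provider has to meet. The following is a rough sketch of that contract using stand-ins only: `ProviderSketch` and `InMemoryProvider` are hypothetical, the context is simplified to a string key, the browser is a plain dictionary, and the URL is returned as `str` rather than `HttpUrl`.

```python
from abc import ABC, abstractmethod


class ProviderSketch(ABC):
    """Stand-in with the same three abstract methods as the documented base class."""

    @abstractmethod
    def setup_browser(self, ctx):
        """Return a browser object for this run."""

    @abstractmethod
    def construct_auth_clarification_url(self, ctx, sign_in_url: str) -> str:
        """Return the URL the user should visit to authenticate."""

    @abstractmethod
    def step_complete(self, ctx) -> None:
        """Release any resources held for this run."""


class InMemoryProvider(ProviderSketch):
    """Hypothetical provider that tracks one placeholder session per run."""

    def __init__(self) -> None:
        self.sessions = {}

    def setup_browser(self, ctx):
        # Reuse the session for this run if one already exists.
        return self.sessions.setdefault(ctx, {"run": ctx})

    def construct_auth_clarification_url(self, ctx, sign_in_url: str) -> str:
        # Nothing to wrap here, so the sign-in URL is passed through.
        return sign_in_url

    def step_complete(self, ctx) -> None:
        # Drop the session so it is not leaked after the step ends.
        self.sessions.pop(ctx, None)
```

An instance of such a subclass is what would be passed as `custom_infrastructure_provider`; the abstract base cannot be instantiated directly.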
BrowserInfrastructureProviderLocal Objects
class BrowserInfrastructureProviderLocal(BrowserInfrastructureProvider)
Browser infrastructure provider for local browser instances.
__init__
def __init__(chrome_path: str | None = None,
extra_chromium_args: list[str] | None = None) -> None
Initialize the BrowserInfrastructureProviderLocal.
setup_browser
def setup_browser(ctx: ToolRunContext) -> Browser
Get a Browser instance.
Note: This provider does not support end_user_id.
Arguments:
ctx
ToolRunContext - The context for the tool run, containing execution context and other relevant information.
Returns:
Browser
- A configured Browser instance for local browser automation.
construct_auth_clarification_url
def construct_auth_clarification_url(ctx: ToolRunContext,
sign_in_url: str) -> HttpUrl
Construct the URL for the auth clarification.
Arguments:
ctx
ToolRunContext - The context for the tool run, containing execution context and other relevant information.
sign_in_url
str - The URL that the user needs to sign in to.
Returns:
HttpUrl
- The URL for the auth clarification, which in this case is simply the sign-in URL passed directly through.
get_chrome_instance_path
def get_chrome_instance_path() -> str
Get the path to the Chrome instance based on the operating system or env variable.
Returns:
str
- The path to the Chrome executable. First checks for the PORTIA_BROWSER_LOCAL_CHROME_EXEC environment variable, then falls back to default locations based on the operating system.
Raises:
RuntimeError
- If the platform is not supported (not macOS, Windows, or Linux) and the env variable isn't set.
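The lookup order described above (environment variable first, then a per-platform default, otherwise an error) can be sketched as follows. The environment variable name comes from the description; the default paths are assumed typical install locations, not values taken from Portia, and `chrome_path_sketch` is a hypothetical helper.

```python
import os
import sys

ENV_VAR = "PORTIA_BROWSER_LOCAL_CHROME_EXEC"

# Assumed typical install locations, keyed by sys.platform prefix.
DEFAULT_PATHS = {
    "darwin": "/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
    "win32": r"C:\Program Files\Google\Chrome\Application\chrome.exe",
    "linux": "/usr/bin/google-chrome",
}


def chrome_path_sketch(environ=None, platform=None) -> str:
    """Resolve a Chrome executable path: env override first, then OS default."""
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform

    # An explicit override always wins, even on unsupported platforms.
    override = environ.get(ENV_VAR)
    if override:
        return override

    for prefix, path in DEFAULT_PATHS.items():
        if platform.startswith(prefix):
            return path

    raise RuntimeError(f"Unsupported platform: {platform}")
```

Accepting `environ` and `platform` as parameters is a choice made for this sketch so the behaviour can be exercised without changing the real environment.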
step_complete
def step_complete(ctx: ToolRunContext) -> None
Call when the step is complete, e.g. to release the session.
get_extra_chromium_args
def get_extra_chromium_args() -> list[str] | None
Get the extra Chromium arguments.
Returns:
list[str] | None
- A list of extra Chromium arguments if the environment variable is set, otherwise None.
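A possible shape for this behaviour is sketched below. Neither the environment variable's name nor the format of its value is given here, so the sketch takes the name as a parameter and assumes a comma-separated value; `extra_args_sketch` is a hypothetical helper.

```python
from __future__ import annotations

import os


def extra_args_sketch(env_var: str, environ=None) -> list[str] | None:
    """Return extra Chromium arguments from an env variable, or None if unset."""
    environ = os.environ if environ is None else environ
    raw = environ.get(env_var)
    if not raw:
        return None
    # Assumption: arguments are comma-separated; blanks are ignored.
    return [arg.strip() for arg in raw.split(",") if arg.strip()]
```

Returning None rather than an empty list lets the caller distinguish "no override configured" from "override configured but empty".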