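Function tools

What are function tools?

When the out-of-the-box tools don't fully meet a specific requirement, developers can create custom function tools. This makes tailored functionality possible, such as connecting to a proprietary database or implementing a unique algorithm.

For example, a function tool named "myfinancetool" might be a function that computes a specific financial metric. ADK also supports long-running functions, so if the computation takes a while, the agent can continue working on other tasks.

ADK offers several ways to create function tools, each suited to a different level of complexity and control:

Function Tool
Long Running Function Tool
Agent-as-a-Tool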

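1. Function Tool

Turning a function into a tool is a straightforward way to integrate custom logic into an agent. This approach offers flexibility and quick integration.

Parameters

Define function parameters using standard JSON-serializable types (e.g., string, integer, list, dictionary). Avoid setting default values for parameters, because the LLM does not currently support interpreting them.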
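
As a minimal sketch of these parameter rules, the hypothetical function below (its name, parameters, and arithmetic are illustrative assumptions, not part of ADK) accepts only JSON-serializable types and declares no default values. The checks at the end use only the Python standard library.

```python
import inspect
import json


def order_total(prices: list[float], tax_rate: float) -> dict:
    """Computes the subtotal and the tax-inclusive total of an order.

    Args:
        prices: Item prices, one entry per item.
        tax_rate: Tax rate as a fraction (e.g., 0.5 for 50%).

    Returns:
        A dict with the keys "subtotal" and "total".
    """
    subtotal = sum(prices)
    return {"subtotal": subtotal, "total": subtotal * (1 + tax_rate)}


# Arguments of these types survive a JSON round trip unchanged.
example_args = {"prices": [10.0, 20.0], "tax_rate": 0.5}
round_tripped = json.loads(json.dumps(example_args))

# No parameter declares a default value.
has_defaults = any(
    p.default is not inspect.Parameter.empty
    for p in inspect.signature(order_total).parameters.values()
)

print(order_total(**round_tripped), has_defaults)
```

If a parameter would naturally have a default, one option is to make it required and describe the usual value in the docstring so the model supplies it explicitly.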

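Return Type

The preferred return type for a Python function tool is a dictionary. This lets you structure the response as key-value pairs, giving the LLM context and clarity. If the function returns a type other than a dictionary, the framework automatically wraps it in a dictionary with a single key named "result".

Make return values as descriptive as possible. For example, instead of returning a numeric error code, return a dictionary with an "error_message" key containing a human-readable explanation. Remember that the LLM, not a piece of code, needs to understand the result. As a best practice, include a "status" key in the returned dictionary to indicate the overall outcome (e.g., "success", "error", "pending"), giving the LLM a clear signal about the state of the operation.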
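
The sketch below illustrates both points under stated assumptions. wrap_tool_result is a hypothetical stand-in that mimics the wrapping behavior described above (it is not the framework's code), and lookup_balance is an invented tool backed by in-memory data.

```python
def wrap_tool_result(value):
    """Hypothetical stand-in: wraps non-dict results under the key "result"."""
    return value if isinstance(value, dict) else {"result": value}


def lookup_balance(account_id: str) -> dict:
    """Returns the balance of an account, or a descriptive error.

    Args:
        account_id: The account identifier, e.g. "acc-1".

    Returns:
        A dict with "status" plus either "balance" or "error_message".
    """
    balances = {"acc-1": 250.0}  # illustrative data only
    if account_id not in balances:
        # A readable explanation instead of a bare numeric error code.
        return {
            "status": "error",
            "error_message": f"No account found with id '{account_id}'.",
        }
    return {"status": "success", "balance": balances[account_id]}


print(wrap_tool_result(42))                       # non-dict value gets wrapped
print(wrap_tool_result(lookup_balance("acc-1")))  # dict passes through unchanged
print(lookup_balance("acc-9"))                    # descriptive error dict
```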

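Docstring

The function's docstring is sent to the LLM as the tool's description. A well-written, comprehensive docstring is therefore crucial for the LLM to understand how to use the tool effectively. Clearly explain the function's purpose, the meaning of its parameters, and the expected return value.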
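
As an illustration, the hypothetical tool below documents its purpose, when to use it, its argument, and its return shape. The function and its data are invented for this sketch. inspect.getdoc is used only to show the text a caller can read from the function; how ADK extracts the description internally is not shown here.

```python
import inspect


def get_order_status(order_id: str) -> dict:
    """Looks up the shipping status of a customer order.

    Use this tool when the user asks where an order is or whether it shipped.

    Args:
        order_id: The order identifier as printed on the receipt, e.g. "ORD-1001".

    Returns:
        A dict with "status" ("success" or "error"). On success it also
        contains "shipping_status"; on error it contains "error_message".
    """
    orders = {"ORD-1001": "shipped"}  # illustrative data only
    if order_id not in orders:
        return {"status": "error", "error_message": f"Unknown order '{order_id}'."}
    return {"status": "success", "shipping_status": orders[order_id]}


# The cleaned docstring is the text that describes the tool.
description = inspect.getdoc(get_order_status)
print(description.splitlines()[0])
```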

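Example

This tool is a Python function that retrieves the stock price for a given stock ticker/symbol.

Note: Before using this tool, install the yfinance library with pip install yfinance.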

from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

import yfinance as yf


APP_NAME = "stock_app"
USER_ID = "1234"
SESSION_ID = "session1234"

def get_stock_price(symbol: str):
    """
    Retrieves the current stock price for a given symbol.

    Args:
        symbol (str): The stock symbol (e.g., "AAPL", "GOOG").

    Returns:
        float: The current stock price, or None if an error occurs.
    """
    try:
        stock = yf.Ticker(symbol)
        historical_data = stock.history(period="1d")
        if not historical_data.empty:
            current_price = historical_data['Close'].iloc[-1]
            return current_price
        else:
            return None
    except Exception as e:
        print(f"Error retrieving stock price for {symbol}: {e}")
        return None


stock_price_agent = Agent(
    model='gemini-2.0-flash',
    name='stock_agent',
    instruction= 'You are an agent who retrieves stock prices. If a ticker symbol is provided, fetch the current price. If only a company name is given, first perform a Google search to find the correct ticker symbol before retrieving the stock price. If the provided ticker symbol is invalid or data cannot be retrieved, inform the user that the stock price could not be found.',
    description='This agent specializes in retrieving real-time stock prices. Given a stock ticker symbol (e.g., AAPL, GOOG, MSFT) or the stock name, use the tools and reliable data sources to provide the most up-to-date price.',
    tools=[get_stock_price],
)


# Session and Runner
session_service = InMemorySessionService()
session = session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)
runner = Runner(agent=stock_price_agent, app_name=APP_NAME, session_service=session_service)


# Agent Interaction
def call_agent(query):
    content = types.Content(role='user', parts=[types.Part(text=query)])
    events = runner.run(user_id=USER_ID, session_id=SESSION_ID, new_message=content)

    for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("Agent Response: ", final_response)

call_agent("stock price of GOOG")

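Because get_stock_price returns a float rather than a dictionary, the framework wraps the return value in a dictionary. For example (illustrative value):

{"result": 123.0}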

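Best Practices

While you have considerable flexibility in defining your function, remember that simplicity improves usability for the LLM. Consider these guidelines:

Fewer parameters are better: Minimize the number of parameters to reduce complexity.
Simple data types: Favor primitive types such as str and int over custom classes whenever possible.
Meaningful names: The function's name and parameter names significantly influence how the LLM interprets and uses the tool. Choose names that clearly reflect the function's purpose and the meaning of its inputs. Avoid generic names like do_stuff().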
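
These guidelines can be turned into a rough self-check. The helper below is a hypothetical sketch, not an ADK feature. The parameter limit of 3, the set of generic names, and the list of "simple" types are all assumptions chosen for illustration.

```python
import inspect

MAX_PARAMS = 3  # assumed threshold, not an ADK rule
SIMPLE_TYPES = (str, int, float, bool, list, dict)
GENERIC_NAMES = {"do_stuff", "run", "process", "handle"}


def review_tool_signature(func) -> list:
    """Returns human-readable warnings about a candidate tool function."""
    warnings = []
    if func.__name__ in GENERIC_NAMES:
        warnings.append(f"name '{func.__name__}' is too generic")
    params = inspect.signature(func).parameters
    if len(params) > MAX_PARAMS:
        warnings.append(f"{len(params)} parameters; consider {MAX_PARAMS} or fewer")
    for name, param in params.items():
        # Unannotated parameters also land here, since their annotation is empty.
        if param.annotation not in SIMPLE_TYPES:
            warnings.append(f"parameter '{name}' is not annotated with a simple type")
    return warnings


def do_stuff(a, b: int, c: str, d: str):
    return None


def get_weather(city: str) -> dict:
    return {"status": "success", "city": city}


print(review_tool_signature(do_stuff))
print(review_tool_signature(get_weather))
```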

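2. Long Running Function Tool

Designed for tasks that require significant processing time without blocking the agent's execution. This tool is a subclass of FunctionTool.

When using a LongRunningFunctionTool, your Python function can start the long-running operation and optionally return intermediate results to keep the model and the user informed about progress. The agent can then continue with other tasks. A typical scenario is a human-in-the-loop flow in which the agent needs human approval before proceeding.

How it Works

Wrap a Python generator function (a function that uses yield) with LongRunningFunctionTool.

Initiation: When the LLM calls the tool, your generator function starts executing.
Intermediate updates (yield): The function should periodically yield intermediate Python objects (typically dictionaries) to report progress. The ADK framework takes each yielded value, wraps it in a FunctionResponse, and sends it back to the LLM, so the LLM can inform the user (e.g., status, percentage complete, messages).
Completion (return): When the task finishes, the generator function uses return to provide the final Python object result.
Framework handling: The ADK framework manages the execution. It sends each yielded value back as an intermediate FunctionResponse. When the generator completes, the framework sends the returned value as the content of the final FunctionResponse, signaling the end of the long-running operation to the LLM.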
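
The yield/return split described above relies on standard Python generator semantics: each next() call produces one yielded value, and the value passed to return arrives as StopIteration.value. The sketch below shows only that mechanism in plain Python. drive_generator and count_to are hypothetical names, and this is not how ADK is implemented internally.

```python
def drive_generator(gen):
    """Collects every yielded update and the final return value of a generator."""
    updates = []
    while True:
        try:
            updates.append(next(gen))  # one intermediate update per yield
        except StopIteration as stop:
            return updates, stop.value  # value given to `return` in the generator


def count_to(n: int):
    """Toy long-running task: yields progress, then returns a final result."""
    for i in range(1, n + 1):
        yield {"status": "pending", "progress": f"{i}/{n}"}
    return {"status": "completed", "result": n}


updates, final = drive_generator(count_to(2))
print(updates)
print(final)
```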

Creating the Tool

Define your generator function and wrap it with the LongRunningFunctionTool class:

from google.adk.tools import LongRunningFunctionTool

# Define your generator function (see example below)
def my_long_task_generator(*args, **kwargs):
    # ... setup ...
    yield {"status": "pending", "message": "Starting task..."} # Framework sends this as FunctionResponse
    # ... perform work incrementally ...
    yield {"status": "pending", "progress": 50}               # Framework sends this as FunctionResponse
    # ... finish work ...
    return {"status": "completed", "result": "Final outcome"} # Framework sends this as final FunctionResponse

# Wrap the function
my_tool = LongRunningFunctionTool(func=my_long_task_generator)

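Intermediate Updates

Yielding structured Python objects (such as dictionaries) is crucial for providing meaningful updates. Include keys such as:

status: e.g., "pending", "running", "waiting_for_input"
progress: e.g., a percentage or the number of steps completed
message: Descriptive text for the user/LLM
estimated_completion_time: If it can be calculated

The framework packages each yielded value into a FunctionResponse and sends it to the LLM.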
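
One way to keep updates consistent is to build them in a single helper. The function below is a hypothetical sketch (its name and parameters are assumptions) that produces a dictionary with the four keys listed above, using integer arithmetic for the percentage.

```python
def make_progress_update(done: int, total: int, seconds_per_step: int) -> dict:
    """Builds one intermediate update for a task with equally sized steps."""
    percent = done * 100 // total
    remaining_seconds = (total - done) * seconds_per_step
    return {
        "status": "pending",
        "progress": f"{percent}%",
        "message": f"Completed {done} of {total} steps.",
        "estimated_completion_time": f"~{remaining_seconds} seconds remaining",
    }


update = make_progress_update(done=2, total=5, seconds_per_step=3)
print(update)
```

A generator function could yield make_progress_update(...) after each step instead of assembling the dictionary inline.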

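Final Result

The Python object returned by the generator function is treated as the final result of the tool execution. The framework packages that value (even if it is None) into the content of the final FunctionResponse sent to the LLM, marking the tool execution as complete.

Example: File Processing Simulation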

import time
from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.tools import LongRunningFunctionTool
from google.adk.sessions import InMemorySessionService
from google.genai import types

# 1. Define the generator function
def process_large_file(file_path: str) -> dict:
    """
    Simulates processing a large file, yielding progress updates.

    Args:
      file_path: Path to the file being processed.

    Returns: 
      A final status dictionary.
    """
    total_steps = 5

    # This dict will be sent in the first FunctionResponse
    yield {"status": "pending", "message": f"Starting processing for {file_path}..."}

    for i in range(total_steps):
        time.sleep(1)  # Simulate work for one step
        progress = (i + 1) / total_steps
        # Each yielded dict is sent in a subsequent FunctionResponse
        yield {
            "status": "pending",
            "progress": f"{int(progress * 100)}%",
            "estimated_completion_time": f"~{total_steps - (i + 1)} seconds remaining"
        }

    # This returned dict will be sent in the final FunctionResponse
    return {"status": "completed", "result": f"Successfully processed file: {file_path}"}

# 2. Wrap the function with LongRunningFunctionTool
long_running_tool = LongRunningFunctionTool(func=process_large_file)

# 3. Use the tool in an Agent
file_processor_agent = Agent(
    # Use a model compatible with function calling
    model="gemini-2.0-flash",
    name='file_processor_agent',
    instruction="""You are an agent that processes large files. When the user provides a file path, use the 'process_large_file' tool. Keep the user informed about the progress based on the tool's updates (which arrive as function responses). Only provide the final result when the tool indicates completion in its final function response.""",
    tools=[long_running_tool]
)


APP_NAME = "file_processor"
USER_ID = "1234"
SESSION_ID = "session1234"

# Session and Runner
session_service = InMemorySessionService()
session = session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)
runner = Runner(agent=file_processor_agent, app_name=APP_NAME, session_service=session_service)


# Agent Interaction
def call_agent(query):
    content = types.Content(role='user', parts=[types.Part(text=query)])
    events = runner.run(user_id=USER_ID, session_id=SESSION_ID, new_message=content)

    for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("Agent Response: ", final_response)

call_agent("Replace with a path to your file...")

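Key aspects of this example

process_large_file: This generator simulates a lengthy operation, yielding intermediate status/progress dictionaries.
LongRunningFunctionTool: Wraps the generator; the framework sends the yielded updates and the final return value as sequential FunctionResponses.
Agent instruction: Directs the LLM to use the tool and to interpret the incoming stream of FunctionResponses (progress versus completion) when updating the user.
Final return: The function returns the final result dictionary, which is sent in the concluding FunctionResponse to indicate completion.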

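3. Agent-as-a-Tool

This powerful feature lets you leverage the capabilities of other agents in your system by calling them as tools. Agent-as-a-Tool lets you invoke another agent to perform a specific task, effectively delegating responsibility. Conceptually, this is similar to creating a Python function that calls another agent and uses that agent's response as the function's return value.

Key difference from sub-agents

It is important to distinguish Agent-as-a-Tool from a sub-agent:

Agent-as-a-Tool: When Agent A calls Agent B as a tool (using Agent-as-a-Tool), Agent B's answer is passed back to Agent A, which then summarizes the answer and generates the response to the user. Agent A retains control and continues to handle subsequent user input.
Sub-agent: When Agent A calls Agent B as a sub-agent, the responsibility for answering the user is transferred entirely to Agent B. Agent A is effectively out of the loop, and all subsequent user input is answered by Agent B.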
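
The difference in who keeps control can be illustrated with a toy model in plain Python. ToyAgent and ToyConversation are hypothetical classes invented for this sketch; they are not ADK classes and do not call any model.

```python
class ToyAgent:
    """Toy stand-in for an agent; it just labels its answers with its name."""

    def __init__(self, name: str):
        self.name = name

    def answer(self, text: str) -> str:
        return f"{self.name} answers: {text}"


class ToyConversation:
    """Tracks which agent is currently responsible for answering the user."""

    def __init__(self, root: ToyAgent):
        self.active = root

    def call_as_tool(self, tool_agent: ToyAgent, text: str) -> str:
        # Tool style: the other agent's output returns to the active agent,
        # which writes the final reply and stays in control.
        tool_output = tool_agent.answer(text)
        return self.active.answer(f"summary of [{tool_output}]")

    def transfer_to(self, sub_agent: ToyAgent, text: str) -> str:
        # Sub-agent style: responsibility moves to the other agent for
        # this turn and for every later turn.
        self.active = sub_agent
        return self.active.answer(text)


agent_a, agent_b = ToyAgent("A"), ToyAgent("B")
conversation = ToyConversation(agent_a)

tool_reply = conversation.call_as_tool(agent_b, "q1")
active_after_tool = conversation.active.name

transfer_reply = conversation.transfer_to(agent_b, "q2")
active_after_transfer = conversation.active.name

print(tool_reply, "| active:", active_after_tool)
print(transfer_reply, "| active:", active_after_transfer)
```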

Usage

To use an agent as a tool, wrap the agent with the AgentTool class.

tools=[AgentTool(agent=agent_b)]

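Customization

The AgentTool class provides the following attribute for customizing its behavior:

skip_summarization: bool: If set to True, the framework skips the LLM-based summarization of the tool agent's response. This is useful when the tool's response is already well formatted and needs no further processing.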

Example

from google.adk.agents import Agent
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.adk.tools.agent_tool import AgentTool
from google.genai import types

APP_NAME="summary_agent"
USER_ID="user1234"
SESSION_ID="1234"

summary_agent = Agent(
    model="gemini-2.0-flash",
    name="summary_agent",
    instruction="""You are an expert summarizer. Please read the following text and provide a concise summary.""",
    description="Agent to summarize text",
)

root_agent = Agent(
    model='gemini-2.0-flash',
    name='root_agent',
    instruction="""You are a helpful assistant. When the user provides a long text, use the 'summarize' tool to get a summary and then present it to the user.""",
    tools=[AgentTool(agent=summary_agent)]
)

# Session and Runner
session_service = InMemorySessionService()
session = session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)
# Run root_agent so it can delegate to summary_agent through the AgentTool
runner = Runner(agent=root_agent, app_name=APP_NAME, session_service=session_service)


# Agent Interaction
def call_agent(query):
    content = types.Content(role='user', parts=[types.Part(text=query)])
    events = runner.run(user_id=USER_ID, session_id=SESSION_ID, new_message=content)

    for event in events:
        if event.is_final_response():
            final_response = event.content.parts[0].text
            print("Agent Response: ", final_response)


long_text = """Quantum computing represents a fundamentally different approach to computation, 
leveraging the bizarre principles of quantum mechanics to process information. Unlike classical computers 
that rely on bits representing either 0 or 1, quantum computers use qubits which can exist in a state of superposition - effectively 
being 0, 1, or a combination of both simultaneously. Furthermore, qubits can become entangled, 
meaning their fates are intertwined regardless of distance, allowing for complex correlations. This parallelism and 
interconnectedness grant quantum computers the potential to solve specific types of incredibly complex problems - such 
as drug discovery, materials science, complex system optimization, and breaking certain types of cryptography - far 
faster than even the most powerful classical supercomputers could ever achieve, although the technology is still largely in its developmental stages."""


call_agent(long_text)

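How it works

When root_agent receives the long text, its instruction tells it to use the 'summarize' tool for long texts.
The framework recognizes 'summarize' as the AgentTool that wraps summary_agent.
Behind the scenes, root_agent calls summary_agent with the long text as input.
summary_agent processes the text according to its instruction and generates a summary.
The response from summary_agent is then passed back to root_agent.
root_agent can then take the summary and formulate its final response to the user (e.g., "Here's a summary of the text: ...").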