Agent Controlled Abilities

An Agent Controlled Ability is one the Agent can trigger autonomously. Rather than relying on the user to invoke it, the Agent reads the user’s request and, when an Agent Controlled Ability is the right fit, triggers it on the user’s behalf and presents the result as part of its own response.

For example, answering “is it raining outside?” needs real-time information the Agent does not have on its own. With a Weather Ability set to Agent Controlled, the Agent recognizes that the request needs live data, triggers the Ability, and answers the user with the current conditions. The Live Web Search Ability works the same way: the user asks something that needs current information, and the Agent autonomously runs a web search to answer it.

This makes the Agent feel like a smart assistant that knows when to reach for the right tool. The user simply speaks naturally, and the Agent decides what is needed to answer well.

Agent Controlled Abilities require trigger words. The Agent uses them to recognize and trigger the Ability when the user’s request calls for it.

The four Ability categories

Every Ability falls into one of four categories: Skill, Agent Controlled, Background Daemon, or Local. See Ability Types for the full breakdown of when to use each.

How it works

The Agent reads the user’s request and, when an Agent Controlled Ability is the right fit, triggers it from that request, so the user does not have to invoke it themselves. It does so only when the request clearly calls for the Ability. Casual conversation, greetings, and acknowledgements do not trigger it.

Once triggered, the Ability runs its normal flow using the same CapabilityWorker methods as any other Ability. While it runs, the Agent mediates between the Ability and the user:

- If the Ability needs more information to do its job, such as which city to check the weather for, the Agent answers the Ability’s question. It provides the answer from the user context it already has, and if that context does not contain the answer, it asks the user. Either way, the conversation keeps flowing, and the user can keep talking to the Agent while the Ability works.
- When the Ability produces a result, the Agent collects it and presents it to the user in its own words, shaped to answer what the user actually asked.

Because the Agent voices the result, your Ability’s spoken output is not delivered word for word. Write it as clear, factual content and let the Agent phrase the final response.

Agent Controlled Abilities are always mediated by the Agent. Even when the Ability is triggered by a trigger word, the Agent controls how and when the result reaches the user and weaves it into the conversation, rather than the Ability replying to the user directly. If you want an Ability that responds to the user directly and immediately when it is invoked, use a Skill category Ability instead.
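
The mediation described above can be pictured with a small toy model. This is only an illustration of the idea, not how the platform implements it: `answer_ability_question`, `user_context`, and `ask_user` are hypothetical names invented for this sketch.

```python
from typing import Callable, Dict, List


def answer_ability_question(
    field: str,
    question: str,
    user_context: Dict[str, str],
    ask_user: Callable[[str], str],
) -> str:
    """Toy model: answer from known user context first, otherwise ask the user."""
    known = user_context.get(field)
    if known:
        return known
    return ask_user(question)


# Record every question that actually reaches the user.
questions_asked: List[str] = []


def fake_ask_user(question: str) -> str:
    # Hypothetical stand-in for the Agent asking the user out loud.
    questions_asked.append(question)
    return "Tokyo"


# The context already holds a city, so the user is not asked.
known_city = answer_ability_question(
    "city", "Which city?", {"city": "San Francisco"}, fake_ask_user
)

# The context is empty, so the question is passed on to the user.
asked_city = answer_ability_question("city", "Which city?", {}, fake_ask_user)
```

In both cases the Ability receives a plain answer to its question and does not need to know where that answer came from.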

Example interaction

A user asks the Agent a question naturally, and the Agent answers using an Agent Controlled Weather Ability.

User: Is it snowing outside?
Agent: In San Francisco, it’s currently a light drizzle, not snowing. It’s about 14 degrees, though it feels closer to 12.

The user never named a city or invoked the Ability. The Agent recognized that the request needed live weather, triggered the Ability, filled in the city it already knew, and answered the actual question of whether it was snowing, all in its own words.

When to use Agent Controlled

The clearest signal is how the user phrases the request. When a user asks a question or states an intent indirectly rather than issuing a direct command, the Agent should recognize the need and act on it. That is what Agent Controlled is for.

The user says…	Best fit	Why
“is it raining outside?”	Agent Controlled	An indirect request. The Agent recognizes the need and triggers the Weather Ability.
“open weather”	Skill	A direct command. The user is explicitly invoking the Ability.

Choose Agent Controlled when:

- The Ability provides something the Agent cannot produce on its own, such as real-time information, data from the user’s connected accounts, or an action in the outside world.
- You want the Agent to decide when the Ability is needed, based on what the user is asking.
- The result should be woven into the Agent’s conversational response rather than spoken by the Ability directly.

Good candidates

Category	Examples
Real-time information the Agent lacks on its own	Weather, web search, news, sports scores, stock or crypto prices
Personal data from connected accounts	Calendar lookups, email, tasks, reminders
Actions and external services	Sending a message, adding a calendar event, updating a list, querying an external API

When not to use Agent Controlled

Some Abilities are a better fit as a Skill. Use a Skill instead when:

- The user invokes it deliberately with a command. For example, “start my morning routine” or “enter focus mode.” The user wants explicit control rather than autonomous triggering.
- The Ability runs a long, interactive session. Examples include a guided flow, a quiz, or a multi-step walkthrough that the user commits to, rather than a single question the Agent answers.
- The Agent can already answer from its own knowledge. General questions that do not need live data or an external action fall here. Routing these through an Ability only adds latency.
- You want an immediate, direct reply when the Ability is triggered. Agent Controlled results are delivered by the Agent as part of the conversation, not as an instant reply from the Ability itself. For a direct, interactive flow that the user triggers with a trigger word, use a Skill.
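
The two checklists above can be condensed into a small decision helper. This is a rough sketch for thinking through the choice, not part of the SDK: `AbilityTraits` and `suggest_category` are hypothetical names, and the rule simply mirrors the criteria listed in this section.

```python
from dataclasses import dataclass


@dataclass
class AbilityTraits:
    """Hypothetical summary of the questions asked in the checklists above."""
    needs_live_data_or_external_action: bool
    invoked_by_direct_command: bool
    long_interactive_session: bool
    needs_immediate_direct_reply: bool


def suggest_category(traits: AbilityTraits) -> str:
    """Return 'Skill' if any Skill signal applies, else 'Agent Controlled'."""
    skill_signals = (
        traits.invoked_by_direct_command
        or traits.long_interactive_session
        or traits.needs_immediate_direct_reply
        or not traits.needs_live_data_or_external_action
    )
    return "Skill" if skill_signals else "Agent Controlled"


# A weather lookup needs live data and is asked about indirectly.
weather = AbilityTraits(True, False, False, False)

# "Enter focus mode" is a deliberate command that expects a direct reply.
focus_mode = AbilityTraits(False, True, False, True)

weather_choice = suggest_category(weather)
focus_choice = suggest_category(focus_mode)
```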

File structure

An Agent Controlled Ability is built from the same files as a Skill. At minimum it is a single main.py that contains the Ability class and its run() logic.

Building an Agent Controlled Ability

You write an Agent Controlled Ability exactly like a Skill, using the same SDK methods, the same main.py structure, and the same lifecycle. There is no special base class or special code. What makes an Ability Agent Controlled is its category and its description.

The description gives the Agent context

The Agent reads your Ability’s description to understand what the Ability does and extracts the context it needs to decide when the Ability is relevant to a user’s request. Write a clear, specific description of the Ability’s purpose and the kind of requests it handles. A vague description gives the Agent too little context, so it may trigger the Ability at the wrong time, or not at all. A strong description states the purpose, the scope, and a few example requests.

Gives real-time weather and current conditions for any city, so users can
just ask about the weather naturally. Anything that needs current weather
for a location can be handled by this ability.

Examples:
- what's the weather in Miami
- is it raining in Tokyo
- how hot is it outside
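
If you keep descriptions for several Abilities in source control, a small helper can make sure each one states a purpose, a scope, and a few example requests. This is a minimal sketch: `build_description` is a hypothetical helper, not an SDK function, and the resulting text is simply what you would paste into the description field.

```python
from typing import List


def build_description(purpose: str, scope: str, examples: List[str]) -> str:
    """Join purpose, scope, and example requests into one description string."""
    if not purpose.strip() or not scope.strip():
        raise ValueError("state both the purpose and the scope")
    if not examples:
        raise ValueError("include a few example requests")
    lines = [f"{purpose.strip()} {scope.strip()}", "", "Examples:"]
    lines += [f"- {example.strip()}" for example in examples]
    return "\n".join(lines)


description = build_description(
    purpose="Gives real-time weather and current conditions for any city.",
    scope="Anything that needs current weather for a location can be handled by this ability.",
    examples=["what's the weather in Miami", "is it raining in Tokyo"],
)
```

The checks only enforce that the three parts are present. Whether the wording is specific enough is still something to verify by testing natural requests against the Ability.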

Example

This Weather Ability reports the current conditions for a city using the free Open-Meteo API, which requires no API key. It is written like any Skill, and the category is what makes it Agent Controlled. When the Ability asks “which city?”, the Agent provides the answer, using the city it already knows from the user’s context when it can and asking the user otherwise. The Ability code is the same either way.

from src.agent.capability import MatchingCapability
from src.main import AgentWorker
from src.agent.capability_worker import CapabilityWorker

GEOCODE_URL = "https://geocoding-api.open-meteo.com/v1/search"
WEATHER_URL = "https://api.open-meteo.com/v1/forecast"

WEATHER_CODES = {
    0: "clear skies", 1: "mainly clear", 2: "partly cloudy", 3: "overcast",
    45: "foggy", 51: "light drizzle", 61: "light rain", 63: "rain", 65: "heavy rain",
    71: "light snow", 80: "rain showers", 95: "thunderstorms",
}


class WeatherCapability(MatchingCapability):
    worker: AgentWorker = None
    capability_worker: CapabilityWorker = None

    #{{register capability}}

    async def run(self):
        try:
            # Ask for the detail the Ability needs. When the Agent triggers this
            # Ability, the Agent supplies the answer, either from what it already
            # knows about the user or by asking them. The Ability code is the same.
            city = await self.capability_worker.run_io_loop(
                "Which city would you like the weather for?"
            )
            if not city or not city.strip():
                await self.capability_worker.speak("I didn't catch a city name.")
                return

            city = city.strip()

            geo = await self.worker.session_tasks.httpx_get_async(GEOCODE_URL, params={
                "name": city, "count": 1, "language": "en", "format": "json"
            })
            results = geo.json().get("results") or []
            if not results:
                await self.capability_worker.speak(f"I couldn't find {city}.")
                return
            place = results[0]

            forecast = await self.worker.session_tasks.httpx_get_async(WEATHER_URL, params={
                "latitude": place["latitude"],
                "longitude": place["longitude"],
                "current": "temperature_2m,weather_code",
            })
            current = forecast.json().get("current") or {}

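            temp = current.get("temperature_2m")
            if temp is None:
                # The response had no current temperature; say so instead of
                # letting round(None) raise a TypeError.
                await self.capability_worker.speak(
                    f"I couldn't get the current weather for {place['name']}."
                )
                return
            condition = WEATHER_CODES.get(current.get("weather_code"), "variable conditions")

            # Return factual content. The Agent voices the result in its own words,
            # shaped to answer what the user actually asked.
            summary = f"In {place['name']}, it's currently {condition} at {round(temp)} degrees Celsius."
            await self.capability_worker.speak(summary)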

        except Exception as error:
            self.worker.editor_logging_handler.error(f"weather error: {error}")
            await self.capability_worker.speak("Something went wrong while checking the weather.")
        finally:
            self.capability_worker.resume_normal_flow()

    def call(self, worker: AgentWorker):
        self.worker = worker
        self.capability_worker = CapabilityWorker(self.worker)
        self.worker.session_tasks.create(self.run())

The Ability uses standard CapabilityWorker methods. For the full method catalog, see Building Abilities and the SDK Reference.
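
One way to keep the spoken content easy to check is to move the formatting step into a pure function that can be tested without the SDK or a network call. The sketch below assumes the same Open-Meteo field names used in the example above. `build_summary` is a hypothetical helper, and the code table is shortened for brevity.

```python
from typing import Optional

# Shortened copy of the example's code table, for illustration only.
WEATHER_CODES = {0: "clear skies", 51: "light drizzle", 71: "light snow"}


def build_summary(place_name: str, current: dict) -> Optional[str]:
    """Return a factual one-line summary, or None if the temperature is missing."""
    temp = current.get("temperature_2m")
    if temp is None:
        return None
    condition = WEATHER_CODES.get(current.get("weather_code"), "variable conditions")
    return f"In {place_name}, it's currently {condition} at {round(temp)} degrees Celsius."


drizzle = build_summary("San Francisco", {"temperature_2m": 13.6, "weather_code": 51})
unknown_code = build_summary("Oslo", {"temperature_2m": -2.2, "weather_code": 999})
missing = build_summary("Miami", {})
```

Inside `run()`, the Ability would pass the returned string to `speak()` and handle the `None` case with a short fallback message.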

Creating an Agent Controlled Ability

1. Open the Dashboard and create a new Ability.
2. Fill in the Ability information. Write a clear, specific description. The Agent reads it to understand what the Ability does and to decide when to trigger it.
3. Under Ability Behavior, set the Category to Agent Controlled.
4. Add Trigger Words. These are required, and the Agent uses them to recognize and trigger the Ability.

Testing your Ability

Because the Agent decides when to trigger an Agent Controlled Ability, test it by speaking naturally rather than by saying an exact command.

1. Phrase a request that matches your Ability’s description.
2. Confirm that the Ability is triggered, performs its action, and the Agent presents the result.
3. If the Ability does not trigger, refine the description so it more clearly states the requests it handles. A clearer, more specific description helps the Agent recognize when the Ability is relevant.

Best practices

Write a clear, specific description

The Agent reads the description to understand what your Ability does and extracts the context it needs to decide when it’s relevant. State the purpose, the scope, and a few example requests so the Agent triggers it at the right time. Keep descriptions distinct, because overlapping descriptions across Abilities make it harder for the Agent to choose.

Set trigger words (required)

Trigger words are required for Agent Controlled Abilities. The Agent uses them to recognize and trigger the Ability, so choose clear, distinct words or phrases.

Write spoken output as content, not final phrasing

The Agent presents your Ability’s result in its own words, tailored to what the user asked, so your text is not spoken verbatim. Return clear, factual content and avoid fixed phrasing such as “Here’s what I found.”

Keep the Ability focused

Agent Controlled works best for a focused task that answers a request, ideally one that needs little or no extra input. The fewer details the Ability has to gather, the smoother the experience.

Always call resume_normal_flow()

Call resume_normal_flow() on every exit path, including success, failure, and early returns, to return control to the Agent.
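
A `try`/`finally` block is a simple way to cover every exit path, as the Weather example does. The sketch below demonstrates the pattern in isolation. `StubCapabilityWorker` is a hypothetical stand-in that only records calls; it is not the real CapabilityWorker, and the `fail` flag exists only to simulate an error.

```python
import asyncio


class StubCapabilityWorker:
    """Hypothetical stand-in for CapabilityWorker that only records calls."""

    def __init__(self):
        self.spoken = []
        self.resumed = 0

    async def speak(self, text):
        self.spoken.append(text)

    def resume_normal_flow(self):
        self.resumed += 1


async def run(capability_worker, city, fail=False):
    try:
        if not city:
            await capability_worker.speak("I didn't catch a city name.")
            return  # Early return: the finally block still runs.
        if fail:
            raise RuntimeError("simulated network failure")
        await capability_worker.speak(f"Weather lookup for {city} succeeded.")
    except Exception:
        await capability_worker.speak("Something went wrong while checking the weather.")
    finally:
        # Runs on success, failure, and early return alike.
        capability_worker.resume_normal_flow()


def resumed_count(city, fail=False):
    """Run one scenario and report how many times control was returned."""
    stub = StubCapabilityWorker()
    asyncio.run(run(stub, city, fail))
    return stub.resumed
```

Each scenario (success, empty input, simulated failure) should return control exactly once.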

See also

Building Abilities
Integrations
