This is the single source of truth for everything available inside an Ability. If a method or property isn’t listed here, it either doesn’t exist or hasn’t been documented yet. Found something missing? Let us know on Discord.
Quick Orientation
Inside any Ability, you have access to two objects:
| Object | What it is | Access via |
|---|---|---|
| `self.capability_worker` | The SDK: all I/O, speech, audio, LLM, files, and flow control | `CapabilityWorker(self.worker)` |
| `self.worker` | The Agent: logging, session management, memory, user connection info | Passed into `call()` |
Table of Contents
- Speaking / TTS
- Listening / User Input
- Combined Speak + Listen
- LLM / Text Generation
- Audio Playback
- Audio Recording
- Audio Streaming
- File Storage (Persistent + Temporary)
- WebSocket Communication
- Flow Control
- Logging
- Session Tasks
- User Connection Info
- Conversation Memory & History
- Music Mode
- Common Patterns
- Appendix: What You CAN’T Do (Yet)
- Appendix: Blocked Imports
1. Speaking / TTS
speak(text)
Converts text to speech using the Personality’s default voice. Streams audio to the user.
- Async: Yes (`await`)
- Voice: Uses whatever voice is configured on the Personality
- Tip: Keep it to 1-2 sentences. This is voice, not text.
text_to_speech(text, voice_id)
Converts text to speech using a specific Voice ID (e.g., from ElevenLabs). Use when your Ability needs its own distinct voice.
- Async: Yes (`await`)
- Voice: Overrides the Personality’s default
- See: Voice ID catalog at the bottom of this doc
2. Listening / User Input
user_response()
Waits for the user’s next spoken or typed input. Returns it as a string.
- Async: Yes (`await`)
- Returns: `str`, the transcribed user input
- Tip: Always check for empty strings (`if not user_input: continue`)
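The empty-string check can be wrapped in a small helper. This is a minimal sketch: `get_non_empty_input` and `max_attempts` are hypothetical names, and `capability_worker` stands in for `self.capability_worker` inside an Ability.

```python
async def get_non_empty_input(capability_worker, max_attempts=3):
    """Return the first non-empty user input, or None after max_attempts tries."""
    for _ in range(max_attempts):
        user_input = await capability_worker.user_response()
        # Skip empty or whitespace-only transcriptions and listen again.
        if not user_input or not user_input.strip():
            continue
        return user_input.strip()
    return None
```

Capping the number of attempts keeps the Ability from listening forever when the microphone only picks up silence.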
wait_for_complete_transcription()
Waits until the user has completely finished speaking before returning. Use when you need the full utterance without premature cutoff.
- Async: Yes (`await`)
- Returns: `str`, the final transcribed input
- When to use: Long-form input like descriptions, stories, or dictation
3. Combined Speak + Listen
run_io_loop(text)
Speaks the text, then waits for the user’s response. Returns the user’s reply. A convenience wrapper around speak() + user_response().
- Async: Yes (`await`)
- Returns: `str`, the user’s reply
- Note: Uses the Personality’s default voice (not a custom voice ID)
run_confirmation_loop(text)
Speaks the text (appends “Please respond with ‘yes’ or ‘no’”), then loops until the user clearly says yes or no.
- Async: Yes (`await`)
- Returns: `bool` (`True` for yes, `False` for no)
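The two helpers combine naturally: ask an open question, then confirm what was heard. A minimal sketch, where `confirm_city` and the prompt wording are hypothetical and `capability_worker` stands in for `self.capability_worker`:

```python
async def confirm_city(capability_worker):
    """Ask for a city, confirm it with a yes/no question, return it or None."""
    # run_io_loop speaks the prompt and returns the user's reply.
    city = await capability_worker.run_io_loop("Which city should I use?")
    # run_confirmation_loop returns True for yes and False for no.
    confirmed = await capability_worker.run_confirmation_loop(
        f"I heard {city}. Is that right?"
    )
    return city if confirmed else None
```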
4. LLM / Text Generation
text_to_text_response(prompt_text, history=[], system_prompt="")
Generates a text response using the configured LLM.
- ⚠️ Unlike most SDK methods, this one is SYNCHRONOUS. Do NOT use `await`.
- Returns: `str`, the LLM’s response
- Parameters:
  - `prompt_text` (str): The current prompt/question
  - `history` (list): Optional conversation history for multi-turn context. Each item: `{"role": "user"|"assistant", "content": "..."}`
  - `system_prompt` (str): Optional system prompt to control LLM behavior
- Tip: LLMs often wrap JSON in markdown fences. Always strip them:
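One way to strip the fences before parsing is sketched below. The helper names `strip_json_fences` and `parse_llm_json` are hypothetical and not part of the SDK; the input would be the string returned by `text_to_text_response`.

```python
import json

FENCE = "`" * 3  # three backticks, built this way to avoid a literal fence here


def strip_json_fences(raw):
    """Return raw with a surrounding markdown code fence removed, if present."""
    text = raw.strip()
    if text.startswith(FENCE):
        # Drop the whole opening fence line, including a language tag like "json".
        text = text.split("\n", 1)[1] if "\n" in text else ""
    text = text.rstrip()
    if text.endswith(FENCE):
        text = text[: -len(FENCE)]
    return text.strip()


def parse_llm_json(raw):
    """Parse an LLM reply as JSON; return None if it is not valid JSON."""
    try:
        return json.loads(strip_json_fences(raw))
    except json.JSONDecodeError:
        return None
```

Returning `None` on a parse failure lets the caller fall back to a spoken error instead of crashing the Ability.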
5. Audio Playback
play_audio(file_content)
Plays audio directly from bytes or a file-like object.
- Async: Yes (`await`)
- Input: `bytes` or file-like object
- Tip: For anything longer than a TTS clip, use Music Mode
play_from_audio_file(file_name)
Plays an audio file stored in the Ability’s directory (same folder as main.py).
- Async: Yes (`await`)
- Input: Filename (string); must be in the same folder as your `main.py`
6. Audio Recording
Record audio from the user’s microphone during a session.
start_audio_recording()
Begins recording audio from the user’s mic.
stop_audio_recording()
Stops the current audio recording.
get_audio_recording()
Returns the recorded audio as a .wav file.
- Returns: `.wav` file data
get_audio_recording_length()
Returns the length/duration of the current recording.
Recording Example
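A minimal sketch of a fixed-length recording. This reference does not say whether the recording methods are async, so the hypothetical `call_sdk` helper awaits a result only when it is awaitable. `record_clip` is also a hypothetical name; `capability_worker` and `worker` stand in for `self.capability_worker` and `self.worker`.

```python
import inspect


async def call_sdk(func, *args, **kwargs):
    """Call an SDK method and await the result only if it is awaitable."""
    result = func(*args, **kwargs)
    if inspect.isawaitable(result):
        result = await result
    return result


async def record_clip(capability_worker, worker, seconds=5):
    """Record for a fixed duration and return (wav_data, length)."""
    await capability_worker.speak("Recording now.")
    await call_sdk(capability_worker.start_audio_recording)
    try:
        # session_tasks.sleep is used in place of asyncio.sleep.
        await call_sdk(worker.session_tasks.sleep, seconds)
    finally:
        # Always stop the recording, even if the wait is cancelled.
        await call_sdk(capability_worker.stop_audio_recording)
    length = await call_sdk(capability_worker.get_audio_recording_length)
    wav_data = await call_sdk(capability_worker.get_audio_recording)
    return wav_data, length
```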
7. Audio Streaming
For streaming audio in chunks rather than loading it all into memory at once.
stream_init()
Initializes an audio streaming session.
send_audio_data_in_stream(file_content, chunk_size=4096)
Streams audio data in chunks. Handles mono conversion and resampling automatically.
- Input: `bytes`, file-like object, or `httpx.Response`
- chunk_size: Bytes per chunk (default: 4096)
stream_end()
Ends the streaming session and cleans up.
Streaming Example
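A minimal sketch of the init, send, end sequence. As this reference does not state whether the streaming methods are async, the hypothetical `call_sdk` helper awaits a result only when it is awaitable. `stream_audio` is a hypothetical name and `capability_worker` stands in for `self.capability_worker`.

```python
import inspect


async def call_sdk(func, *args, **kwargs):
    """Call an SDK method and await the result only if it is awaitable."""
    result = func(*args, **kwargs)
    if inspect.isawaitable(result):
        result = await result
    return result


async def stream_audio(capability_worker, audio_bytes, chunk_size=4096):
    """Stream audio_bytes to the user in chunks of chunk_size bytes."""
    await call_sdk(capability_worker.stream_init)
    try:
        await call_sdk(
            capability_worker.send_audio_data_in_stream,
            audio_bytes,
            chunk_size=chunk_size,
        )
    finally:
        # End the session even if sending fails, so the stream is cleaned up.
        await call_sdk(capability_worker.stream_end)
```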
8. File Storage (Persistent + Temporary)
OpenHome provides a server-side file storage system that allows Abilities to persist data across sessions. This is the primary mechanism for cross-session memory.
How It Works
| Flag | Scope | Persistence | Use Case |
|---|---|---|---|
| `temp=False` | User-level, global | Survives disconnects and new sessions forever | User preferences, saved data, history, onboarding state |
| `temp=True` | Session-level | Deleted when session ends | Scratch data, cached API responses, temp processing |
For example, one Ability can write `user_prefs.json` and a completely separate Smart Hub Ability can read it.
Allowed file types: .txt, .csv, .json, .md, .log, .yaml, .yml
check_if_file_exists(filename, temp)
- Async: Yes (`await`)
- Returns: `bool`
- Always call this before reading; don’t assume a file exists on first run
write_file(filename, content, temp)
- Async: Yes (`await`)
- ⚠️ Behavior: APPENDS to the existing file. Creates the file if it doesn’t exist; if it already exists, content is added to the end.
  - This is fine for `.txt` and `.log` files (just append new lines)
  - For JSON: this WILL corrupt your data. See the JSON pattern below.
read_file(filename, temp)
- Async: Yes (`await`)
- Returns: `str`, the full file contents
delete_file(filename, temp)
- Async: Yes (`await`)
⚠️ The JSON Rule: Always Delete + Write
Because `write_file` appends, writing JSON to an existing file will produce invalid JSON (`{"a":1}{"a":1,"b":2}`). Always delete first, then write the complete object:
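A minimal sketch of the delete-then-write rule, using only the four storage methods documented above. `save_json` and `load_json` are hypothetical helper names, and `capability_worker` stands in for `self.capability_worker`.

```python
import json


async def save_json(capability_worker, filename, data, temp=False):
    """Replace the contents of filename with the JSON encoding of data."""
    # write_file appends, so remove any existing copy first.
    if await capability_worker.check_if_file_exists(filename, temp):
        await capability_worker.delete_file(filename, temp)
    await capability_worker.write_file(filename, json.dumps(data), temp)


async def load_json(capability_worker, filename, default=None, temp=False):
    """Return the parsed JSON in filename, or default if missing or invalid."""
    if not await capability_worker.check_if_file_exists(filename, temp):
        return default
    raw = await capability_worker.read_file(filename, temp)
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return default
```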
When to Use Which Mode
Use `temp=False` (persistent) for:
- User preferences and settings
- Onboarding data (“has this user done setup?”)
- Learned context (name, location, timezone)
- Conversation summaries
- Accumulated data (journals, logs, scores, history)
- Any data that should survive a disconnect
Use `temp=True` (session-only) for:
- Cached API responses
- Intermediate processing data
- Temporary state that doesn’t need to survive a disconnect
Cross-Ability Data Sharing
Since storage is user-level (not per-ability), use consistent file names across abilities to share data.
Complete Example: Persistent User Preferences
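A minimal sketch of a read-modify-write cycle on `user_prefs.json` with `temp=False`. `load_prefs` and `set_pref` are hypothetical helpers, and `capability_worker` stands in for `self.capability_worker`. Any Ability that uses the same file name would see the same preferences.

```python
import json

PREFS_FILE = "user_prefs.json"


async def load_prefs(capability_worker):
    """Return the saved preferences dict, or an empty dict on first run."""
    if not await capability_worker.check_if_file_exists(PREFS_FILE, False):
        return {}
    raw = await capability_worker.read_file(PREFS_FILE, False)
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return {}


async def set_pref(capability_worker, key, value):
    """Update one preference and rewrite the whole file."""
    prefs = await load_prefs(capability_worker)
    prefs[key] = value
    # write_file appends, so delete the old file before writing the new object.
    if await capability_worker.check_if_file_exists(PREFS_FILE, False):
        await capability_worker.delete_file(PREFS_FILE, False)
    await capability_worker.write_file(PREFS_FILE, json.dumps(prefs), False)
    return prefs
```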
Complete Example: First Run Detection
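A minimal sketch that treats the presence of a persistent marker file as "setup already done". The file name `onboarding_done.txt`, the function name `greet`, and the spoken lines are hypothetical; `capability_worker` stands in for `self.capability_worker`.

```python
SETUP_MARKER = "onboarding_done.txt"  # hypothetical file name


async def greet(capability_worker):
    """Greet the user differently on first run; return True if it was the first run."""
    first_run = not await capability_worker.check_if_file_exists(SETUP_MARKER, False)
    if first_run:
        await capability_worker.speak("Welcome! Let's get you set up.")
        # Persist the marker so later sessions skip onboarding.
        await capability_worker.write_file(SETUP_MARKER, "done\n", False)
    else:
        await capability_worker.speak("Welcome back.")
    return first_run
```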
Complete Example: Activity Logging (Append-Friendly)
For `.txt` and `.log` files, appending works perfectly:
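A minimal sketch of an append-only activity log. The file name `activity.log` and the function `log_activity` are hypothetical; `capability_worker` stands in for `self.capability_worker`.

```python
from datetime import datetime, timezone


async def log_activity(capability_worker, message, filename="activity.log"):
    """Append one timestamped line to a persistent log file."""
    timestamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    # write_file appends, so each call adds a new line at the end of the file.
    await capability_worker.write_file(filename, f"{timestamp} {message}\n", False)
```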
Complete Example: Session-Only Cache
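A minimal sketch of a cache that lives only for the current session (`temp=True`). `get_cached_text` is a hypothetical helper, and `fetch` is a hypothetical zero-argument callable that returns the text to cache, for example the body of an API response. `capability_worker` stands in for `self.capability_worker`.

```python
async def get_cached_text(capability_worker, filename, fetch):
    """Return cached text for this session, calling fetch() only on a cache miss."""
    if await capability_worker.check_if_file_exists(filename, True):
        return await capability_worker.read_file(filename, True)
    value = fetch()
    # The file does not exist yet, so appending creates it with exactly this value.
    await capability_worker.write_file(filename, value, True)
    return value
```

The file name should use one of the allowed extensions, such as `weather_cache.txt`.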
9. WebSocket Communication
send_data_over_websocket(data_type, data)
Sends structured data over WebSocket. Used for custom events (music mode, DevKit actions, etc.).
- Async: Yes (`await`)
- Parameters:
  - `data_type` (str): Event type identifier
  - `data` (dict): Payload
send_devkit_action(action)
Sends a hardware action to a connected DevKit device.
- Async: Yes (`await`)
10. Flow Control
resume_normal_flow()
⚠️ CRITICAL: You MUST call this when your Ability is done. It hands control back to the Personality. Without it, the Personality goes silent and the user has to restart the conversation.
- Async: No (synchronous)
- When to call: On EVERY exit path:
  - End of your main logic (happy path)
  - After a `break` in a loop
  - Inside `except` blocks (error fallback)
  - After timeout
  - After user says “exit”/“stop”/“quit”
Checklist:
- Called after the main flow completes?
- Called after every `break` statement?
- Called in every `except` block that ends the ability?
- Called after timeout logic?
- Called after user exit detection?
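One way to cover every exit path with a single call is a `try`/`finally` block. This is a sketch of that design choice, not the only valid structure: `run_ability`, `EXIT_WORDS`, `max_turns`, and the spoken lines are hypothetical, and `capability_worker` stands in for `self.capability_worker`.

```python
EXIT_WORDS = {"exit", "stop", "quit"}


async def run_ability(capability_worker, max_turns=5):
    """Echo user input until an exit word, an error, or max_turns is reached."""
    try:
        await capability_worker.speak("Say something, or say stop to finish.")
        for _ in range(max_turns):
            user_input = await capability_worker.user_response()
            if not user_input:
                continue
            if user_input.strip().lower() in EXIT_WORDS:
                break
            await capability_worker.speak(f"You said: {user_input}")
    except Exception:
        # Speak the error so the user is not left in silence.
        await capability_worker.speak("Sorry, something went wrong.")
    finally:
        # Runs on normal completion, break, turn limit, and errors alike.
        capability_worker.resume_normal_flow()
```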
11. Logging
editor_logging_handler
Always use this. Never use print().
- Tip: Log before and after API calls so you can see what’s happening in the Live Editor:
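A minimal sketch of logging around a call. It assumes `editor_logging_handler` exposes `info()` and `error()` like a standard `logging.Logger`, which this reference does not confirm. `call_with_logging`, `label`, and `fetch` are hypothetical names; `logger` stands in for `self.worker.editor_logging_handler`.

```python
def call_with_logging(logger, label, fetch):
    """Call fetch() and log before the call, after it, and on failure."""
    logger.info(f"[{label}] request starting")
    try:
        result = fetch()
    except Exception as exc:
        logger.error(f"[{label}] request failed: {exc}")
        raise
    logger.info(f"[{label}] request finished")
    return result
```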
12. Session Tasks
OpenHome’s managed task system. Ensures async work gets properly cancelled when sessions end. Raw `asyncio` tasks can outlive a session: if the user hangs up or switches abilities, your task keeps running as a ghost process. `session_tasks` ensures everything gets cleaned up properly.
session_tasks.create(coroutine)
Launches an async task within the agent’s managed lifecycle.
- Use instead of: `asyncio.create_task()` (which can leak tasks)
session_tasks.sleep(seconds)
Pauses execution for the specified duration.
- Use instead of: `asyncio.sleep()` (which can’t be cleanly cancelled)
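A minimal sketch of a delayed reminder built on the two calls above. It assumes `session_tasks.sleep()` is awaitable and that `session_tasks.create()` is called without `await`; neither detail is stated in this reference. `remind_later` and `schedule_reminder` are hypothetical names, and `capability_worker` and `worker` stand in for `self.capability_worker` and `self.worker`.

```python
async def remind_later(capability_worker, worker, delay_seconds, message):
    """Wait delay_seconds, then speak message."""
    # Used in place of asyncio.sleep so the wait is tied to the session.
    await worker.session_tasks.sleep(delay_seconds)
    await capability_worker.speak(message)


def schedule_reminder(capability_worker, worker, delay_seconds, message):
    """Start the reminder as a managed task instead of asyncio.create_task."""
    return worker.session_tasks.create(
        remind_later(capability_worker, worker, delay_seconds, message)
    )
```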
13. User Connection Info
user_socket.client.host
The user’s public IP address at connection time.
- Use case: IP-based geolocation, timezone detection, personalization
- Tip: Cloud/datacenter IPs won’t give you useful location data. Check the ISP name for keywords like “amazon”, “aws”, “google cloud” before using for geolocation.
Example: IP Geolocation
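This reference does not name a geolocation service, so the sketch below covers only the datacenter check from the tip above. `looks_like_datacenter` is a hypothetical helper, and `isp_name` is assumed to come from whichever lookup service you call with `user_socket.client.host`.

```python
DATACENTER_KEYWORDS = ("amazon", "aws", "google cloud")


def looks_like_datacenter(isp_name):
    """Return True if the ISP name suggests a cloud or datacenter IP."""
    name = (isp_name or "").lower()
    return any(keyword in name for keyword in DATACENTER_KEYWORDS)
```

When this returns `True`, asking the user for their city is likely to be more reliable than the IP-based location.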
14. Conversation Memory & History
agent_memory.full_message_history
Access the full conversation message history from the current session.
- Returns: The complete message history for the active session
- Use case: Building context-aware abilities that know what was said before the ability was triggered
Maintaining History in a Looping Ability
The `text_to_text_response` method accepts a `history` parameter. Use it to maintain multi-turn conversation context:
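A minimal sketch of a looping Ability that carries its own history. `chat_loop`, `max_turns`, and the exit words are hypothetical, and `capability_worker` stands in for `self.capability_worker`. The caller is still responsible for calling `resume_normal_flow()` once the loop returns.

```python
async def chat_loop(capability_worker, system_prompt, max_turns=10):
    """Run a multi-turn chat and return the accumulated history."""
    history = []
    for _ in range(max_turns):
        user_input = await capability_worker.user_response()
        if not user_input:
            continue
        if user_input.strip().lower() in {"exit", "stop", "quit"}:
            break
        # text_to_text_response is synchronous, so there is no await here.
        reply = capability_worker.text_to_text_response(
            user_input, history=history, system_prompt=system_prompt
        )
        # Record both sides of the turn so the next call has full context.
        history.append({"role": "user", "content": user_input})
        history.append({"role": "assistant", "content": reply})
        await capability_worker.speak(reply)
    return history
```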
Passing Context Back After resume_normal_flow()
Currently, there is no direct way to inject data into the Personality’s system prompt after an Ability finishes. When resume_normal_flow() fires, the Ability is done and control returns to the Personality.
What you CAN do:
- Save to conversation history: Anything spoken during the Ability (via `speak()`) becomes part of the conversation history, which the Personality’s LLM can see in subsequent turns.
- Use file storage: Write data to persistent files (see File Storage) that other Abilities can read later. The Personality itself won’t read these files directly, but your Abilities can share data through them.
- Memory feature: OpenHome has a new memory feature that can persist user context. (Details TBD as this feature evolves.)
What you CAN’T do (yet):
- Directly update or modify the Personality’s system prompt from within an Ability
- Pass structured data (like user location or preferences) to the Personality’s LLM context after `resume_normal_flow()`
15. Music Mode
When playing audio that’s longer than a TTS utterance (music, sound effects, long recordings), you need to signal the system to stop listening and not interrupt.
Full Pattern
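A sketch of the overall shape only: signal on, play, signal off. The event type and payloads for music mode are not given in this section, so they are passed in as parameters rather than guessed; `play_long_audio` and its parameter names are hypothetical, and `capability_worker` stands in for `self.capability_worker`.

```python
async def play_long_audio(capability_worker, audio_bytes, event_type, on_payload, off_payload):
    """Signal music mode on, play the audio, then always signal it off."""
    # event_type, on_payload, and off_payload are placeholders for the
    # platform's actual music mode event values.
    await capability_worker.send_data_over_websocket(event_type, on_payload)
    try:
        await capability_worker.play_audio(audio_bytes)
    finally:
        # Turn music mode off even if playback fails or is cancelled.
        await capability_worker.send_data_over_websocket(event_type, off_payload)
```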
16. Common Patterns
LLM as Intent Router
Use the LLM to classify user intent and route to different actions.
Error Handling for Voice
Always speak errors to the user and always resume.
Using a Custom Voice
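A minimal sketch of speaking several lines in one custom voice with `text_to_speech(text, voice_id)`. `narrate` and `NARRATOR_VOICE_ID` are hypothetical names; the ID itself is the first entry in the Voice ID Quick Reference table. `capability_worker` stands in for `self.capability_worker`.

```python
NARRATOR_VOICE_ID = "21m00Tcm4TlvDq8ikWAM"  # American, Female, Calm


async def narrate(capability_worker, lines, voice_id=NARRATOR_VOICE_ID):
    """Speak each line in order using the given voice instead of the default."""
    for line in lines:
        await capability_worker.text_to_speech(line, voice_id)
```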
Voice ID Quick Reference
Use with `text_to_speech(text, voice_id)` to give your Ability its own voice.
| Voice ID | Accent | Gender | Tone | Good For |
|---|---|---|---|---|
| 21m00Tcm4TlvDq8ikWAM | American | Female | Calm | Narration |
| EXAVITQu4vr4xnSDxMaL | American | Female | Soft | News |
| XrExE9yKIg1WjnnlVkGX | American | Female | Warm | Audiobook |
| pMsXgVXv3BLzUgSXRplE | American | Female | Pleasant | Interactive |
| ThT5KcBeYPX3keUQqHPh | British | Female | Pleasant | Children |
| ErXwobaYiN019PkySvjV | American | Male | Well-rounded | Narration |
| GBv7mTt0atIp3Br8iCZE | American | Male | Calm | Meditation |
| TxGEqnHWrfWFTfGW9XjX | American | Male | Deep | Narration |
| pNInz6obpgDQGcFmaJgB | American | Male | Deep | Narration |
| onwK4e9ZLuTAKqWW03F9 | British | Male | Deep | News |
| D38z5RcWu1voky8WS1ja | Irish | Male | Sailor | Games |
| IKne3meq5aSn9XLyUdCD | Australian | Male | Casual | Conversation |
Appendix: What You CAN’T Do (Yet)
Being explicit about limitations saves developers hours of guessing:
| You might want to… | Status |
|---|---|
| Update the Personality’s system prompt from an Ability | ❌ Not possible |
| Pass structured data back to the Personality after `resume_normal_flow()` | ❌ Not possible; use conversation history or file storage as workarounds |
| Access other Abilities from within an Ability | ❌ Not supported |
| Run background tasks after `resume_normal_flow()` | ❌ Tasks are cancelled on session end |
| Access a database directly (Redis, SQL, etc.) | ❌ Blocked; use File Storage API instead |
| Use `print()` | ❌ Blocked; use `editor_logging_handler` |
| Use `asyncio.sleep()` or `asyncio.create_task()` | ❌ Blocked; use `session_tasks` |
| Use `open()` for raw file access | ❌ Blocked; use File Storage API |
| Import `redis`, `connection_manager`, `user_config` | ❌ Blocked |
Appendix: Blocked Imports
These will cause your Ability to be rejected by the sandbox:
| Import | Why | Use Instead |
|---|---|---|
| `redis` | Direct datastore coupling | File Storage API |
| `RedisHandler` | Bypasses platform abstractions | File Storage API |
| `connection_manager` | Breaks isolation | CapabilityWorker APIs |
| `user_config` | Can leak global state | File Storage API |
Also avoid: `exec()`, `eval()`, `pickle`, `dill`, `shelve`, `marshal`, hardcoded secrets, MD5, ECB cipher mode.
Last updated: February 2026
Found an undocumented method? Report it on Discord so we can add it here.

