Designing OpenHome Abilities

A manifesto for voice-first ambient intelligence. For technical implementation patterns, SDK usage, and code examples, see What Makes a Good Ability.
Philosophy - Frameworks - Sound Design - Lifecycle - API Integrations - OpenClaw Bridge - 170+ Ability Ideas
OpenHome - 2026

Part 1: Philosophy

1.1 - The Core Premise

  • You are not building an app. You are building a presence in a room.
  • A smart speaker is a microphone, a speaker, and a brain that never sleeps.
  • The best Ability is the one the user forgets is running - until it does something so well-timed they think: “How did it know?”
Key insight: It knew because it was there. Listening. Learning. Waiting.

1.2 - The Three Modes of Operation

Mode | What It Does | Key Principle
Listening | Captures ambient audio, transcribes speech, identifies speakers, detects sounds, extracts meaning | The user may not even be talking to the device
Speaking | Interjects, responds, narrates, coaches, entertains | Voice is expensive. Every word is a second the user cannot skip. Silence is often better.
Logging | Writes to persistent backends, companion apps, dashboards silently | Accumulates intelligence over hours, days, and weeks. The most powerful layer.

1.3 - When Something Should Be an Ability

If the LLM can handle it with an Agent prompt alone, it does not need to be an Ability. Abilities exist for things the LLM cannot do on its own:
  • Call an external API
  • Play or generate audio
  • Control a physical device
  • Persist data across sessions
  • Run multi-step workflows with branching logic
  • Access real-time data (weather, scores, stocks, calendars)
Pro tip: Ask yourself: “Does this require reaching outside the LLM?” If yes, it is an Ability.

Part 2: Design Frameworks

2.1 - The Three Ability Archetypes

Archetype | Behavior | Examples
The Responder | Mostly silent. User initiates. Speaker answers, then exits. | Weather, timer, WiFi password, quick lookup
The Companion | Active participant in ongoing back-and-forth. Has agent and memory. | Debate coach, recipe walkthrough, brainstorm partner, bedtime story
The Observer | Mostly silent. Listens, transcribes, analyzes, logs, surfaces insights later. | Life logger, meeting transcriber, sleep tracker, dream decoder
Pro tip: The Observer archetype is underused and extremely powerful. Silence is the feature.

2.2 - Ten Design Frameworks

  1. The Invisible Worker - Handles tedious labor users would never manually maintain.
  2. The Information Funnel - Compresses many dashboards and apps into one timely spoken sentence.
  3. Surprise Artifact Generation - Builds over time and delivers meaningful outputs unexpectedly.
  4. The Emotional Radar - Adapts behavior based on how users sound, not only what they say.
  5. The Daily Ritual Anchor - Attaches to existing habits, not net-new behaviors.
  6. The Compound Intelligence Loop - Gets smarter over weeks; value compounds over time.
  7. The Proxy Agent - Acts for the user (send, book, reorder), not only informs.
  8. The Social Multiplier - Designs for rooms with multiple people.
  9. The Context Mesh - Weaves multiple sources into contextual intelligence.
  10. The Graceful Silence Principle - Define silence rules first; speak less than possible.

Part 3: Voice-First Design Rules

3.1 - Keep It Short

  • Keep each speak() to 1-2 sentences.
  • Lead with the headline.
  • Use progressive disclosure.
Example: “You have 3 meetings. Next is at 2 with Sarah. Want the full list?”
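The headline-first pattern above can be sketched as a small string builder. This is a minimal illustration, not SDK code: `headline_summary` and the `(time_label, person)` tuple shape are assumptions made for the example, and the returned string is what an Ability would pass to speak().

```python
def headline_summary(meetings):
    """Build a short spoken summary: count first, next item second, offer last.

    `meetings` is a list of (time_label, person) tuples sorted by start time.
    The tuple shape is an assumption made for this sketch.
    """
    if not meetings:
        return "You have no meetings today."
    count = len(meetings)
    noun = "meeting" if count == 1 else "meetings"
    time_label, person = meetings[0]
    text = f"You have {count} {noun}. Next is at {time_label} with {person}."
    if count > 1:
        # Progressive disclosure: offer the rest instead of reading it all.
        text += " Want the full list?"
    return text
```

The full list is only read if the user says yes, which keeps the first response to a few seconds of audio.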

3.2 - Fill the Silence

  • If an API call takes over 1 second, speak first.
  • Example fillers: “One sec, pulling that up.” “Hang on, checking.” “Let me look into that.”
  • Dead silence feels broken.
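One way to apply the one-second rule is to wrap slow calls in a helper that speaks a filler line first. The sketch below is hedged: `speak` is a plain callable standing in for the Ability's speak(), and `expected_seconds` is an estimate the developer supplies, not something the platform reports.

```python
import itertools

FILLERS = ("One sec, pulling that up.", "Hang on, checking.", "Let me look into that.")

def make_filler_caller(speak, fillers=FILLERS, threshold=1.0):
    """Return a helper that speaks a rotating filler before slow calls.

    `speak` is a hypothetical stand-in for the Ability's speak() method.
    """
    rotation = itertools.cycle(fillers)

    def call_with_filler(slow_call, expected_seconds):
        # Only fill the silence when the wait is expected to be noticeable.
        if expected_seconds > threshold:
            speak(next(rotation))
        return slow_call()

    return call_with_filler
```

Rotating through the fillers avoids repeating the same line on every request.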

3.3 - Confirm Before Acting

  • Destructive or high-stakes actions need confirmation.
  • Example: “Cancel Team Standup? Say yes to confirm.”
  • Low-stakes lookups can run directly.
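A confirmation gate can be sketched as a three-way classifier plus a guard around the action. The phrase lists and function names below are assumptions for illustration; a real Ability might use the LLM to classify the reply instead of keyword matching.

```python
import re

# Assumed phrase lists; tune them for the Ability's own vocabulary.
YES_PHRASES = ("yes", "yeah", "yep", "confirm", "go ahead")
NO_PHRASES = ("no", "nope", "cancel", "never mind", "don't")

def _normalize(reply):
    """Lowercase, drop punctuation, and pad with spaces for whole-word matching."""
    words = re.findall(r"[a-z']+", reply.lower().replace("’", "'"))
    return " " + " ".join(words) + " "

def classify_confirmation(reply):
    """Return 'yes', 'no', or 'unclear' for a transcribed reply."""
    text = _normalize(reply)
    # Check refusals first so "no, don't confirm" is never read as a yes.
    if any(f" {phrase} " in text for phrase in NO_PHRASES):
        return "no"
    if any(f" {phrase} " in text for phrase in YES_PHRASES):
        return "yes"
    return "unclear"

def confirm_then_run(reply, action):
    """Run `action` only on an explicit yes; return the line to speak next."""
    verdict = classify_confirmation(reply)
    if verdict == "yes":
        action()
        return "Done."
    if verdict == "no":
        return "Okay, I'll leave it."
    return "Say yes to confirm, or no to keep it."
```

Treating anything unclear as "not confirmed" means a garbled transcript never triggers a destructive action.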

3.4 - Expect Messy Input

  • Transcription is messy.
  • Use the LLM to extract clean intent.
  • If parsing fails, ask again explicitly.

3.5 - Handle Exits

  • Looping abilities need exit words: done, stop, bye, nothing else, I'm good.
  • One idle cycle: keep going.
  • Two idle cycles: offer to leave.
  • Call resume_normal_flow() on every path.
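The exit rules above reduce to a small decision function. This is a sketch under assumptions: an empty transcript is treated as one idle cycle, and the action names (`keep_going`, `offer_exit`, `exit`, `handle`) are invented labels for the loop to switch on.

```python
import re

EXIT_PHRASES = ("done", "stop", "bye", "nothing else", "i'm good")

def wants_exit(utterance):
    """True when the utterance contains an exit phrase as whole words."""
    words = re.findall(r"[a-z']+", utterance.lower().replace("’", "'"))
    text = " " + " ".join(words) + " "
    return any(f" {phrase} " in text for phrase in EXIT_PHRASES)

def next_step(utterance, idle_cycles):
    """Decide what the loop does next; returns (action, updated_idle_cycles)."""
    if not utterance.strip():
        idle_cycles += 1
        # One idle cycle: keep going. Two idle cycles: offer to leave.
        action = "offer_exit" if idle_cycles >= 2 else "keep_going"
        return action, idle_cycles
    if wants_exit(utterance):
        return "exit", 0
    return "handle", 0
```

Whole-word matching is deliberately simple here; a phrase such as "don't stop the music" would still count as an exit, so an LLM-based check may be worth adding for abilities where that matters.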

3.6 - Spell It Out

  • TTS can mangle emails, URLs, and numbers.
  • Say “at” for @ and “dot” for a period (.).
  • Read phone numbers digit by digit.
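These rules can be applied as a preprocessing step before text reaches speak(). The helpers below are a minimal sketch; the function names are illustrative, and real addresses may need extra handling (underscores, plus signs) that is not covered here.

```python
DIGIT_NAMES = ("zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine")

def speakable_email(address):
    """Replace @ and . with spoken words so TTS does not guess."""
    return address.replace("@", " at ").replace(".", " dot ")

def speakable_phone(number):
    """Read a phone number digit by digit, ignoring dashes and spaces."""
    digits = [ch for ch in number if ch in "0123456789"]
    return " ".join(DIGIT_NAMES[int(ch)] for ch in digits)
```

For example, `speakable_email("sam@example.com")` produces "sam at example dot com".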

3.7 - Silence Is a Feature

  • Do not respond to every moment.
  • Log interesting details silently.
  • Do not read more than three items without asking.

Part 3B: Sound Design - Audio as Interface

Voice abilities are audio experiences, not only speech.

Sound Effect Types

Type | When to Use | Example
Confirmation Tones | Action completes successfully (low stakes) | “Lights off” -> soft click
Transition Sounds | Switching modes or states | Entering ability -> short whoosh
Intro Music/Themes | Companion or game abilities | Trivia -> game show sting
Feedback Beeps | Correct/wrong, milestones, timers | Correct -> bright pip, wrong -> low tone
Ambient Audio | Atmosphere under speech | Focus mode -> low lo-fi; sleep -> rain
Alert/Interrupt | Background interruptions | Timer done -> escalating soft alarm

Sound Design Principles

  • Less is more.
  • Consistency builds trust.
  • Time-of-day awareness is mandatory.
  • Let sounds replace words over repeated usage.
Key insight: Over time, sound can replace spoken confirmations as users learn the audio language.

Sound Anti-Patterns

  • Sound effects on every speak().
  • Long intros that delay useful speech.
  • Loud nighttime alerts.
  • Alarm-like sounds that induce panic.
  • Loops that clash with speech.
  • Inconsistent sounds for the same action.

Part 4: Trigger Word Design

4.1 - Think in Speech, Not Text

Users say: “what’s on my calendar”, “do I have a 3pm”, “am I free Tuesday”.

4.2 - Balance Coverage vs False Positives

Trigger Type | Risk | Strategy
Safe single words (calendar, weather) | Low | Use freely
Dangerous single words (book, free, cancel) | High | Prefer phrases
Phrase triggers (book a time, am I free) | Medium-Low | Strong default
Full sentence triggers | Low | Capture indirect phrasing

4.3 - Trigger Checklist

  • Include plural forms.
  • Include regional variants.
  • Include indirect phrasing.
  • Include natural full sentences.
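To see why phrases are safer than risky single words, here is a sketch of a trigger matcher. It only illustrates the trade-off; how the platform actually evaluates triggers is not described in this guide, and the word and phrase lists are assumptions.

```python
import re

# Safe single words fire on their own; risky words appear only inside phrases.
SAFE_WORDS = {"calendar", "calendars"}
PHRASES = ("am i free", "book a time", "do i have a", "what's on my schedule")

def matches_trigger(utterance):
    """True when the utterance contains a safe word or a whole trigger phrase."""
    words = re.findall(r"[a-z0-9']+", utterance.lower().replace("’", "'"))
    if SAFE_WORDS.intersection(words):
        return True
    text = " " + " ".join(words) + " "
    return any(f" {phrase} " in text for phrase in PHRASES)
```

With this setup "am I free Tuesday" matches, while "I read a good book" and "that sample was free" do not, because "book" and "free" never trigger alone.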

4.4 - Read Trigger Context

Use prior conversation to classify intent and route correctly:
  • “What’s on my calendar today?” -> daily schedule.
  • “Create a meeting with Sarah at 3” -> direct create flow.
Pattern: read history -> classify intent -> route handler.
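The read, classify, route pattern can be sketched with a keyword classifier standing in for the LLM. Everything named here (`classify_turn`, `HANDLERS`, the intent labels) is hypothetical; in a real Ability the classification step would more likely be an LLM call over the conversation history.

```python
def classify_turn(text):
    """Keyword stand-in for an LLM intent classifier (an assumption)."""
    lowered = text.lower()
    if any(key in lowered for key in ("create", "set up", "add a")):
        return "create_event"
    if any(key in lowered for key in ("calendar", "schedule", "am i free")):
        return "daily_schedule"
    return None

def classify_intent(history, lookback=3):
    """Walk back through recent turns until one carries a clear intent."""
    for turn in reversed(history[-lookback:]):
        intent = classify_turn(turn)
        if intent:
            return intent
    return "unknown"

def handle_daily_schedule(history):
    return "daily schedule flow"

def handle_create_event(history):
    return "direct create flow"

def handle_unknown(history):
    return "clarifying question"

HANDLERS = {
    "daily_schedule": handle_daily_schedule,
    "create_event": handle_create_event,
}

def route(history):
    """Read history -> classify intent -> route to a handler."""
    return HANDLERS.get(classify_intent(history), handle_unknown)(history)
```

Looking back over earlier turns lets a follow-up such as "And tomorrow?" inherit the calendar intent from the previous question.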

Part 5: The Ability Lifecycle

Background daemons (background.py) now run alongside interactive ability flows.

5.1 - Two Runtime Lifecycles

Interactive Skill / Brain Skill path
  1. User or brain routing activates main.py.
  2. Main flow calls call(self, worker).
  3. Ability runs interaction logic.
  4. Ability exits with resume_normal_flow().
Background Daemon path
  1. Session starts.
  2. Platform auto-starts background.py (no hotword).
  3. Main flow calls call(self, worker, background_daemon_mode).
  4. Daemon runs a continuous while True loop for the session lifetime.
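For the interactive path, the requirement that every exit calls resume_normal_flow() maps naturally onto try/finally. The sketch below uses a fake session object so it runs on its own; `FakeSession` is a hypothetical stand-in, and the real SDK methods may be asynchronous or live on a different object.

```python
class FakeSession:
    """Hypothetical stand-in for the SDK object that exposes speak() and
    resume_normal_flow(); it only records calls."""

    def __init__(self):
        self.spoken = []
        self.resume_calls = 0

    def speak(self, text):
        self.spoken.append(text)

    def resume_normal_flow(self):
        self.resume_calls += 1

def run_lookup(session, lookup):
    """Interaction logic with a single guaranteed exit point."""
    try:
        answer = lookup()
        if answer is None:
            session.speak("I couldn't find that.")
            return  # early return still passes through finally
        session.speak(answer)
    except Exception:
        session.speak("Something went wrong. Try again later.")
    finally:
        # Runs on success, early return, and error alike.
        session.resume_normal_flow()
```

Putting the call in `finally` means new branches added later cannot forget it.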

5.2 - Ability Categories

Category | Behavior
Skill | Standard user-triggered ability. Hotword -> flow -> resume_normal_flow().
Brain Skills | Triggered by the Agent brain to fill knowledge gaps or delegate actions.
Background Daemon | Starts automatically at session start and runs continuously, including during sleep mode.
Local | Device-side Python package model for Raspberry Pi. Under development (not yet released).
Note: Brain Skills templates are still being finalized.

5.3 - Ability File Structures

Pattern | Files | Behavior
Standard Interactive | main.py | Triggered by user or brain routing, then exits to main flow.
Standalone Background Daemon | background.py | Runs continuously for monitoring/logging/scheduling.
Interactive Combined | main.py + background.py | Foreground user flow plus background daemon coordination.
The daemon file must be named exactly background.py, or the platform will not detect it.

5.4 - Critical main.py vs background.py Differences

Aspect | main.py | background.py
call() signature | call(self, worker) | call(self, worker, background_daemon_mode)
Trigger | User hotword or brain routing | Automatic on session start
Lifecycle | Run once, then exit | Continuous loop
resume_normal_flow() | Required on exit paths | Not used in daemon loop
Sleep mode | Not active while asleep | Keeps running while Agent sleeps

5.5 - Combined Pattern (main.py + background.py)

  1. User says “set an alarm for 3 PM Thursday.”
  2. main.py parses intent and writes schedule data to alarms.json.
  3. main.py confirms and exits via resume_normal_flow().
  4. background.py polls alarms.json on an interval.
  5. At trigger time, background.py calls send_interrupt_signal(), then plays or speaks the alert.
  6. background.py updates the alarm status to triggered.
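The daemon's side of this handoff comes down to one question per poll: which pending alarms are now due? A minimal sketch follows, assuming each entry in alarms.json has `label`, `time` (ISO 8601), and `status` keys. That schema is an assumption; the guide does not define one.

```python
from datetime import datetime

def pop_due_alarms(alarms, now):
    """Split alarms into those due at `now` and the full updated list.

    Returns (due, updated); due alarms are marked "triggered" in `updated`
    so the next poll does not fire them again.
    """
    due, updated = [], []
    for alarm in alarms:
        is_due = (alarm["status"] == "pending"
                  and datetime.fromisoformat(alarm["time"]) <= now)
        if is_due:
            due.append(alarm)
            updated.append({**alarm, "status": "triggered"})
        else:
            updated.append(alarm)
    return due, updated
```

main.py would append entries with status "pending"; background.py would call this on each poll, alert for everything in `due`, and write `updated` back to the file.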

5.6 - Background Daemon Best Practices

  1. Use session_tasks.sleep() instead of asyncio.sleep().
  2. Keep poll intervals reasonable (typically 10-30 seconds).
  3. Handle missing files gracefully (check_if_file_exists() first).
  4. For JSON updates, delete then write full content.
  5. Log heavily with editor_logging_handler.
  6. Call send_interrupt_signal() before daemon speak() or play_audio().
  7. Keep a never-ending while True loop for sleep-mode continuity.
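Several of these practices fit into a single poll cycle. The sketch below injects every platform call as a plain callable, because the exact SDK signatures are not given here: `storage`, `interrupt`, `speak`, `log`, and `sleep` are hypothetical stand-ins for the file helpers, send_interrupt_signal(), speak(), editor_logging_handler, and session_tasks.sleep(), and the real calls may need `await`.

```python
import json

class MemoryStorage:
    """In-memory stand-in for the platform's file helpers (an assumption)."""

    def __init__(self):
        self.files = {}

    def exists(self, name):
        return name in self.files

    def read(self, name):
        return self.files[name]

    def write(self, name, content):
        self.files[name] = content

    def delete(self, name):
        self.files.pop(name, None)

def run_daemon_cycle(storage, is_due, interrupt, speak, log, filename="alarms.json"):
    """Run one poll cycle and return how many alarms fired."""
    if not storage.exists(filename):  # handle a missing file gracefully
        log(f"{filename} not found; nothing to do")
        return 0
    alarms = json.loads(storage.read(filename))
    fired = 0
    for alarm in alarms:
        if is_due(alarm):
            interrupt()  # signal first, then speak
            speak(f"Reminder: {alarm['label']}")
            alarm["status"] = "triggered"
            fired += 1
    if fired:
        storage.delete(filename)  # delete, then write the full content
        storage.write(filename, json.dumps(alarms))
    log(f"cycle complete, fired={fired}")
    return fired

def daemon_loop(run_cycle, sleep, interval_seconds=15):
    """Never returns: the daemon lives for the whole session."""
    while True:
        run_cycle()
        sleep(interval_seconds)  # stand-in for session_tasks.sleep()
```

Keeping the cycle separate from the loop makes the logic testable without waiting on a real timer.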

5.7 - The ability.md Pattern

Each ability should include an ability.md file with YAML frontmatter and a markdown body. Critical: the description field is the primary trigger field for system routing.
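As a rough illustration of that layout, the sketch below embeds a sample ability.md and splits it into frontmatter and body. Only `description` is named in this guide; the `name` field, the sample text, and the parser are assumptions, and the parser handles flat `key: value` lines only, not full YAML.

```python
# Hypothetical ability.md content; only `description` comes from this guide.
SAMPLE_ABILITY_MD = """\
---
name: package-tracker
description: Tracks deliveries. Use when the user asks where a package is.
---
# Package Tracker

Looks up carrier status and reads back the next delivery.
"""

def split_frontmatter(text):
    """Return (metadata, body) for a document with ----delimited frontmatter."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = lines.index("---", 1)  # closing delimiter
    except ValueError:
        return {}, text
    meta = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body
```

Because the description drives routing, it is worth writing it the way a user would phrase the request, then reading it back from the parsed metadata during testing.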

Part 6: Ability Ideas by Location

6.1 - Nightstand (Bedroom)

  • Morning Manifest
  • Lights Out Debrief
  • Tomorrow’s Weather Whisper
  • Bedtime Story Engine
  • Midnight Worry Jar
  • Gratitude Fade-Out
  • Morning Body Check
  • Dream Catcher
  • Sleep Debt Tracker
  • Power Nap Coach

6.2 - Living Room (Couple)

  • Settle It
  • Movie Matchmaker
  • Dinner Decider
  • Couple’s Trivia
  • The Argument Cooldown
  • Weekend Planner
  • Guest Mode
  • Anniversary Vault
  • Background Narrator

6.3 - Kitchen

  • Recipe Walkthrough
  • Grocery List Builder
  • Cooking Timer Orchestrator
  • Kitchen Radio DJ
  • Sous Chef Advisor

6.4 - Conference Room

  • Decision Logger
  • Action Item Extractor
  • Meeting Recap
  • Who Talked Most
  • Pre-Meeting Briefer
  • Follow-Up Drafter
  • Agenda Enforcer
  • Cross-Meeting Intelligence

6.5 - College Dorm

  • Study Pomodoro Coach
  • Exam Countdown
  • Cram Session Quiz Master
  • Budget Buddy
  • Wake Up Enforcer
  • Roommate Mediator

6.6 - Home Office

  • Focus Guardian
  • Standup Generator
  • Meeting Prep Briefer
  • End-of-Day Wrap

6.7 - Car / Commute

  • Commute Debrief
  • Hands-Free Messenger
  • Traffic Aware ETA
  • Errand Optimizer

Part 7: Ability Ideas by User

7.1 - Kids (Ages 5-12)

  • Homework Helper
  • Would You Rather
  • Animal Expert
  • Story Builder
  • Spelling Bee Coach
  • Mystery Detective

7.2 - Kids Games (Ages 8-10)

  • Boss Battle Trivia
  • Monster Collector
  • Speed Round
  • Dungeon Crawler
  • Conspiracy Board

7.3 - Parents

  • Baby Sleep Tracker
  • Toddler Vocabulary Tracker
  • Family Calendar Sync
  • Bedtime Routine Manager

7.4 - Elderly Users

  • Medication Reminder
  • Cognitive Wellness Check
  • Family Connection
  • Daily Companion

7.5 - Professionals

  • Executive Brief
  • Sales Call Scorer
  • Client Meeting Debrief

Part 8: Ability Ideas by Use Case

8.1 - Health and Wellness

  • Mood Logger
  • Guided Meditation Selector
  • Breathing Exercise Coach
  • Symptom Tracker
  • Voice Health Scanner

8.2 - Productivity

  • Inbox Zero Coach
  • Voice-to-Task
  • Weekly Review
  • Voice Notes to Structured Docs

8.3 - Finance

  • Portfolio Pulse
  • Spending Tracker
  • Trending Stocks
  • Bank Balance Reality Check

8.4 - Entertainment

  • Song of the Day
  • Movie/Show Recommender
  • Live Sports Companion
  • Spotify Time Machine

8.5 - Shopping and Logistics

  • Price Watcher
  • Grocery Auto-Order
  • Package Tracker
  • Gift Idea Collector

8.6 - Smart Home and IoT

  • Scene Controller
  • Morning Routine
  • Security Check

Part 9: 3rd-Party API Integration

Category | APIs | What They Enable
Music and Audio | Suno, ElevenLabs, Spotify, Podcast APIs | Song generation, voice, playback, discovery
Finance | Plaid, Alpha Vantage, Polygon.io, CoinGecko | Banking, stock prices, portfolio, crypto alerts
Calendar and Productivity | Google Calendar, Todoist, Notion, Gmail | Events, tasks, notes, triage
Communication | Twilio, Slack, Telegram, SendGrid | SMS, chat, email delivery
Media and Content | TMDB, YouTube, NewsAPI, Goodreads | Media discovery and summaries
Location and Travel | FlightAware, Google Places, Uber/Lyft, Ticketmaster | Flight, local, rides, events
Smart Home | Philips Hue, Nest, SmartThings, IFTTT | Device control and scenes
Health | Apple Health, Nutritionix, Headspace, Fitbit | Steps, calories, meditation, sleep
AI and Generation | OpenAI, DALL-E, Whisper, ElevenLabs SFX | LLM tasks, images, transcription, SFX
Niche | Astrology APIs, Spoonacular, SportRadar, GitHub | Domain-specific utilities
Pro tip: The highest-value abilities often combine 2-3 APIs into one synthesized output.

Part 10: The OpenClaw Bridge

10.1 - What OpenClaw Is

  • Locally running desktop AI agent with a large skill ecosystem.
  • Can read files, run CLIs, and access local network resources.
  • Has registry-based skills for smart home, finance, communication, media, code, logistics, and health.

10.2 - Why It Matters

OpenHome is sandboxed. OpenClaw can operate on-device. A bridge unlocks desktop-level agency through voice.

10.3 - What the Bridge Unlocks

Category | Skills Available | Voice Bridge Example
Smart Home | Hue, IKEA, Nest, Tesla, Govee, Roborock | “Turn off the living room lights”
Communication | WhatsApp, Slack, Telegram, email | “Tell Mom I’ll be there at 6”
Media | Spotify, Plex, Jellyfin | “Play Discover Weekly on living room speaker”
Productivity | Google Workspace, GitHub, Notion | “What PRs need my review?”
Shopping | Amazon, grocery, price tracking | “Order everything on my grocery list”

10.4 - Flagship Bridge Abilities

  • Smart Home Scene Controller
  • Send Message by Voice
  • Voice-Triggered Email Triage
  • Tesla Voice Control
  • GitHub Standup
  • Voice Clone Creator
  • Meeting Notes to Vault
  • Document Generator

10.5 - Security

  • Use permission tiers: read-only, write-confirmed, and explicit confirmation for financial or messaging actions.
  • Send structured requests, not raw code.
  • Vet registry skills carefully before installation.
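The first two rules can be combined in a small gatekeeper: requests are structured data, and each known action maps to a tier that decides how much confirmation it needs. This is a sketch under assumptions. The skill and action names, the tier mapping, and the "yes" versus "explicit" confirmation levels are all invented for illustration and are not part of any OpenClaw API.

```python
from dataclasses import dataclass, field
from enum import Enum

class Tier(Enum):
    READ_ONLY = 1
    WRITE_CONFIRMED = 2
    EXPLICIT = 3  # financial or messaging actions

@dataclass(frozen=True)
class BridgeRequest:
    """A structured request: named skill, named action, data-only params."""
    skill: str
    action: str
    params: dict = field(default_factory=dict)

# Hypothetical allow-list; anything not listed is denied.
ACTION_TIERS = {
    ("hue", "get_state"): Tier.READ_ONLY,
    ("hue", "set_scene"): Tier.WRITE_CONFIRMED,
    ("whatsapp", "send_message"): Tier.EXPLICIT,
}

def authorize(request, confirmation="none"):
    """Return 'allow', 'needs_confirmation', or 'deny'.

    `confirmation` is "none", "yes" (a plain yes), or "explicit"
    (the user confirmed after hearing the full action read back).
    """
    tier = ACTION_TIERS.get((request.skill, request.action))
    if tier is None:
        return "deny"  # unknown actions are rejected, never guessed
    if tier is Tier.READ_ONLY:
        return "allow"
    if tier is Tier.WRITE_CONFIRMED and confirmation in ("yes", "explicit"):
        return "allow"
    if tier is Tier.EXPLICIT and confirmation == "explicit":
        return "allow"
    return "needs_confirmation"
```

Because the request carries only a skill name, an action name, and parameters, there is no path for raw code to reach the desktop agent.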

Part 11: Combining Frameworks

Combination | What It Creates | Example
Observer + Surprise Artifact | Passive intelligence producing unexpected documents | Anniversary Vault, Dream Dictionary
Proxy Agent + OpenClaw Bridge | Voice actions in the real world | Send WhatsApp, book rides, order groceries
Daily Ritual + Compound Loop | Habits that improve daily | Personalized morning briefing
Social Multiplier + Emotional Radar | Group experiences that adapt to room energy | Adaptive party trivia
Information Funnel + Context Mesh | One sentence from many data streams | Calendar + weather + traffic + mood
Invisible Worker + Graceful Silence | Background intelligence with selective speaking | Flight delay watcher
Pro tip: Start with one primary framework, then add one secondary framework to 10x value.

Part 12: The Sci-Fi Frontier

These ideas are technically feasible today with ambient audio, diarization, extraction, and longitudinal logging.
  • Relationship Autopsy - detect communication pattern shifts before conscious recognition.
  • Voice Health Scanner - detect illness signatures from vocal micro-changes.
  • Cognitive Decline Watchdog - monitor repetition and word-finding over months.
  • Emotional Forecast - predict daily trajectory from morning voice plus context.
  • Agent Drift Monitor - track long-term language and interest changes.
  • Argument Predictor - identify precursors and intervene before escalation.
  • Dream Decoder Network - connect sleep-talk themes with daytime context.
Insight: Voice is a biomarker; longitudinal speech can reveal patterns users do not explicitly report.

Part 13: Quality Checklist

Run this before shipping any Ability:
  • For main.py skills: resume_normal_flow() called on every exit path.
  • No print(); use editor_logging_handler.
  • No raw asyncio; use session_tasks.
  • API calls wrapped in try/except with spoken fallback.
  • All network calls include timeout (e.g., timeout=10).
  • Exit word detection in looping abilities.
  • speak() strings are short and natural aloud.
  • Destructive actions require confirmation loop.
  • Multi-turn flows support cancel (never mind, cancel).
  • Filler speech before API calls over 1 second.
  • API keys are placeholders, never hardcoded secrets.
  • No blocked imports (redis, user_config, open).
  • File names are ability-namespaced.
  • Read all speak() strings out loud during testing.
  • For background.py daemons: no resume_normal_flow() inside the daemon loop.
  • For background.py daemons: call send_interrupt_signal() before speaking or playing audio.
  • For background.py daemons: use a continuous while True loop with session_tasks.sleep().
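Two of the checklist items, a timeout on every network call and a spoken fallback on failure, can be sketched together. The example uses the standard library's `urllib.request.urlopen` with `timeout=10`; whether that module is permitted in the Ability sandbox is not stated in this guide, and `speak`, `log`, the URL, and the `temp` response field are hypothetical.

```python
import json
import urllib.request

def fetch_json(url):
    """GET a URL with a 10-second timeout and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))

def speak_weather(speak, log, url, fetch=fetch_json):
    """Speak the result, or a short fallback line if anything fails."""
    try:
        data = fetch(url)
        speak(f"It's {data['temp']} degrees.")
        return True
    except (OSError, ValueError, KeyError) as exc:
        # OSError covers URLError and timeouts; ValueError covers bad JSON;
        # KeyError covers a response missing the expected field.
        log(f"weather lookup failed: {exc!r}")
        speak("I couldn't reach the weather service. Try again in a bit.")
        return False
```

The user always hears something, and the log line keeps the failure detail out of the spoken response.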

Part 14: The Brainstorm Catalog (170+ Ideas)

Format: Ability Name - Speaker Location - Example User - Description

14.1 - Daily Life and Routines

  • Daily Song Generator - Living Room - 20s Woman - Suno-generated hype song summarizing the day.
  • Morning Motivation - Nightstand - Entrepreneur - Reads goals and asks for one priority.
  • Outfit Advisor - Bedroom - Professional - Weather plus calendar formality suggestion.
  • Commute Launcher - Entryway - Office Worker - Traffic, ETA, and podcast queue.
  • Arrival Debrief - Living Room - Parent - Welcome recap after returning home.
  • Evening Wind-Down - Living Room - Couple - Lights, ambient music, reflective prompt.
  • Weekend Kickoff - Living Room - Family - Friday activity suggestions from weather and preferences.
  • Bedtime Closer - Nightstand - Anyone - Lock doors, set alarm, preview tomorrow.
  • Caffeine Tracker - Kitchen - Coffee Addict - Tracks intake and sleep impact.
  • Habit Streak - Any Room - Self-Improver - Daily check-ins and streak announcements.
  • Dog Walk Tracker - Entryway - Pet Owner - Tracks walk cadence and weather-aware nudges.

14.2 - Work and Productivity

  • Standup Bot - Home Office - Developer - Reads git plus calendar and drafts standup update.
  • Email Sniper - Home Office - Executive - Voice triage on top email subjects.
  • Focus Lock - Home Office - Writer - Blocks interruptions with optional white noise.
  • Decision Journal - Home Office - Founder - Logs decisions and 30-day outcome reviews.
  • Client Prep - Home Office - Salesperson - CRM context before calls.
  • Idea Capture - Any Room - Creative - Timestamped idea logging by project.
  • Pitch Practice - Living Room - Startup Founder - Timing and clarity feedback.
  • Code Review Reader - Home Office - Developer - Reads PR comments aloud.
  • Sprint Closer - Home Office - PM - Sprint summary and retro point generation.

14.3 - Finance and Money

  • Spending Alarm - Kitchen - Overspender - Alerts when spend exceeds daily budget.
  • Bill Countdown - Living Room - Budgeter - Weekly due bills summary.
  • Impulse Blocker - Living Room - Shopper - Defers purchases and rechecks next day.
  • Side Hustle Tracker - Home Office - Gig Worker - Logs earnings and monthly P&L.
  • Subscription Audit - Living Room - Anyone - Monthly recurring subscription breakdown.
  • Savings Goal - Living Room - Saver - Goal progress nudges.
  • Crypto Morning Brief - Home Office - Trader - Overnight movers and activity summary.

14.4 - Health and Wellness

  • Stretch Break - Home Office - Desk Worker - Two-minute mobility prompts every 90 minutes.
  • Breathing Coach - Bedroom - Anxious Person - Tone-guided breathing pacing.
  • Calorie Estimator - Kitchen - Dieter - Meal estimation via nutrition API.
  • Symptom Log - Bedroom - Chronic Illness - Daily symptom tracking and weekly report.
  • Allergy Alert - Kitchen - Allergy Sufferer - Pollen-aware outdoor warnings.
  • Mental Health Check - Bedroom - Anyone - Weekly check-in with monthly patterns.

14.5 - Relationships and Social

  • Date Night Planner - Living Room - Couple - Budget-aware restaurant and activity suggestions.
  • Love Language Tracker - Living Room - Couple - Tracks expression balance over time.
  • Friend Tracker - Living Room - Social Person - Nudges for neglected relationships.
  • Party DJ - Living Room - Host - Guest requests and playlist control.
  • Gift Brain - Any Room - Thoughtful Person - Year-round gift idea capture.
  • Anniversary Countdown - Bedroom - Partner - Contextual reminders from past activities.

14.6 - Kids and Family

  • Chore Quest - Living Room - Family - Gamified chores with XP and leaderboards.
  • Vocabulary Builder - Kid’s Room - Student (8) - Word of the day with reinforcement.
  • Math Duel - Living Room - Siblings - Competitive adaptive mental math.
  • Joke of the Day - Kitchen - Family - Daily joke plus weekly best-of.
  • Talent Show Host - Living Room - Family - MC flow with applause and scoring.

14.7 - Entertainment and Games

  • Murder Mystery - Living Room - Dinner Party - Role assignment and clue progression.
  • Rap Battle Coach - Bedroom - Teen - Freestyle prompts and judging.
  • Sports Bar Mode - Living Room - Sports Fan - Live score narratives and alerts.
  • DnD Dungeon Master - Living Room - Gamers - Campaign narration and NPC voices.
  • Escape Room - Living Room - Couple - Voice puzzle scenarios with timer and hints.
  • Debate Tournament - Living Room - Friends - Timed topics and AI judging.

14.8 - Smart Home and Environment

  • Room Mood Setter - Living Room - Anyone - Scene orchestration with lights and climate.
  • Leaving House Check - Entryway - Forgetful - Lock, lights, thermostat verification.
  • Energy Coach - Living Room - Homeowner - Efficiency nudges from usage context.
  • Guest Welcome - Entryway - Host - Door-aware welcome and privacy mode.
  • Thermostat Negotiator - Living Room - Couple - Fair compromise between preferences.

14.9 - Creative and Maker

  • Song of the Day - Living Room - 20s Woman - Personalized Suno track.
  • Beat Maker - Bedroom - Teen - Vibe-to-beat generation.
  • Sound Effect Studio - Any Room - Creator - On-demand SFX generation.
  • Writing Prompt - Home Office - Writer - Genre-aware prompt creation.
  • Remix My Day - Bedroom - Producer - Ambient track from day transcript.
  • Mood Playlist - Living Room - Anyone - Mood-aware playlist generation.

14.10 - Background / Always-On

  • Life Logger - Any Room - Reflective - Always-on ambient capture with daily summaries.
  • Baby Monitor Plus - Nursery - New Parent - Cry detection and breathing/silence alerts.
  • Meeting Scribe - Conference - Team - Auto-start with 3+ voices and real-time notes.
  • Daily To-Do Compiler - Any Room - Busy Person - Captures “I need to” moments.
  • Gratitude Harvester - Any Room - Anyone - Collects positive statements for weekly review.
  • Dream Recorder - Bedroom - Dreamer - Sleep-talking capture into journal.
  • Profanity Jar - Living Room - Family - Running tally with playful fines.

14.11 - Niche and Weird

  • Wine Pairing - Kitchen - Foodie - Meal-to-wine recommendation.
  • Dad Joke Engine - Kitchen - Dad - Endless joke generator with groan scoring.
  • Plant Care - Any Room - Plant Parent - Species-specific care reminders.
  • Hot Take Generator - Living Room - Friends - Debate-fueling spicy prompts.
  • Life Narrator - Any Room - Anyone - Stylized narration mode.
  • Compliment Machine - Bathroom - Anyone - Daily compliment on detection.
  • Random Fact Cannon - Kitchen - Family - Timed obscure fact drops.

Closing Thought

The best Ability does something the LLM cannot, at a moment the user did not expect, using context accumulated over time, delivered in fewer words than the user would use. Build for the room. Build for the moment. Build for the silence between words. Then let the speaker do what it does best: be there.