# Intent Model

## Purpose

This document defines the future canonical model for transforming voice, text, and UI interaction into structured runtime intent.
It is the bridge between raw interaction and orchestrated system behavior.

Related docs:

- runtime-orchestration.md
- documentation-hub-runtime-role.md
- ../workflows/agent-orchestrated-execution.md
## Core principle

At the orchestration layer:
- voice input
- text input
- UI action
should all normalize into the same conceptual contract.
Recommended equivalence:

- raw modality-specific input -> `InteractionEvent`
- interpreted objective -> `StructuredIntent`
This allows the runtime system to reason consistently regardless of whether a user typed, spoke, or clicked.
## Canonical interaction objects

### InteractionEvent

The normalized envelope for any inbound interaction.
Recommended fields:

- `interactionId`
- `actorId`
- `orgId`
- `workspaceId`
- `sourceType` (`voice` | `text` | `ui`)
- `channelType`
- `rawInput`
- `normalizedText`
- `uiAction`
- `timestamp`
- `sessionId`
- `conversationId`
- `contextSnapshot`

Notes:

- `normalizedText` should exist for voice and text flows
- UI actions may also carry semantic labels that can be normalized into text-like intent hints
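The field list above could be captured in a TypeScript contract along these lines. This is a sketch, not a normative schema: the field names come from this document, while the concrete types (and the ISO-8601 timestamp choice) are assumptions.

```typescript
// Sketch of the InteractionEvent envelope described above. Field names
// follow this document; the types are illustrative assumptions.
type SourceType = "voice" | "text" | "ui";

interface InteractionEvent {
  interactionId: string;
  actorId: string;
  orgId: string;
  workspaceId: string;
  sourceType: SourceType;
  channelType: string;
  rawInput: unknown;                        // modality-specific payload
  normalizedText?: string;                  // expected for voice and text flows
  uiAction?: string;                        // semantic label for UI clicks
  timestamp: string;                        // ISO-8601 assumed
  sessionId: string;
  conversationId?: string;
  contextSnapshot?: Record<string, unknown>;
}

// A typed text interaction, shown for illustration only.
const exampleEvent: InteractionEvent = {
  interactionId: "evt-001",
  actorId: "user-42",
  orgId: "org-1",
  workspaceId: "ws-1",
  sourceType: "text",
  channelType: "chat",
  rawInput: "rename the onboarding channel",
  normalizedText: "rename the onboarding channel",
  timestamp: "2025-01-01T00:00:00Z",
  sessionId: "sess-9",
};
```

Optional fields (`normalizedText`, `uiAction`, `conversationId`, `contextSnapshot`) reflect the note that not every modality populates every field.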
### StructuredIntent

The actionable representation of what the user is trying to achieve.

Recommended fields:

- `intentId`
- `goal`
- `actionType`
- `subject`
- `entities`
- `constraints`
- `requestedOutcome`
- `confidence`
- `ambiguityLevel`
- `riskClass`
- `approvalClass`
- `recommendedWorkflow`
- `requiredCapabilities`
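A corresponding TypeScript sketch of `StructuredIntent`, again with field names taken from this document. The union types reuse the confidence bands, ambiguity bands, and approval classes defined later in this page; everything else (optionality, value types) is an illustrative assumption.

```typescript
// Sketch of StructuredIntent. The unions mirror the bands and classes
// defined elsewhere in this document; types are assumptions.
type Confidence = "high" | "medium" | "low";
type AmbiguityLevel = "none" | "resolvable" | "blocking";
type ApprovalClass =
  | "informational"
  | "recommendation"
  | "draft_only"
  | "state_change"
  | "sensitive_state_change"
  | "compliance_sensitive";

interface StructuredIntent {
  intentId: string;
  goal: string;
  actionType: string;
  subject?: string;
  entities: Record<string, string>;
  constraints: string[];
  requestedOutcome?: string;
  confidence: Confidence;
  ambiguityLevel: AmbiguityLevel;
  riskClass?: string;
  approvalClass: ApprovalClass;
  recommendedWorkflow?: string;
  requiredCapabilities: string[];
}

// Illustrative intent for a workspace-setting change.
const exampleIntent: StructuredIntent = {
  intentId: "int-001",
  goal: "rename the onboarding channel",
  actionType: "update",
  entities: { channel: "onboarding" },
  constraints: [],
  confidence: "high",
  ambiguityLevel: "none",
  approvalClass: "state_change",
  requiredCapabilities: ["channel.update"],
};
```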
## Intent lifecycle

```mermaid
flowchart TD
  interactionEvent[InteractionEvent]
  normalization[Normalization]
  intentInference[IntentInference]
  ambiguityCheck[AmbiguityCheck]
  policyCheck[PolicyPrecheck]
  structuredIntent[StructuredIntent]

  interactionEvent --> normalization
  normalization --> intentInference
  intentInference --> ambiguityCheck
  ambiguityCheck --> policyCheck
  policyCheck --> structuredIntent
```

### Stage 1: Normalization

Convert modality-specific input into a consistent structure.
Examples:
- voice -> transcript + metadata
- text -> cleaned text + metadata
- UI click -> semantic action + local context + optional generated intent hint
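The three normalization examples above can be sketched as a single function over a discriminated union. The `ModalityInput` shape and the trimming behavior are illustrative assumptions, not part of this spec.

```typescript
// Hypothetical normalization step: collapse modality-specific input into
// a sourceType plus a text-like hint, as described above.
type ModalityInput =
  | { kind: "voice"; transcript: string }
  | { kind: "text"; text: string }
  | { kind: "ui"; action: string; label?: string };

function normalize(input: ModalityInput): { sourceType: string; normalizedText: string } {
  switch (input.kind) {
    case "voice":
      // voice -> transcript (plus metadata, omitted in this sketch)
      return { sourceType: "voice", normalizedText: input.transcript.trim() };
    case "text":
      // text -> cleaned text
      return { sourceType: "text", normalizedText: input.text.trim() };
    case "ui":
      // UI click -> semantic action, with an optional generated intent hint
      return { sourceType: "ui", normalizedText: input.label ?? input.action };
  }
}
```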
### Stage 2: Intent inference

Infer:
- what the user wants
- which entities are involved
- whether the system likely can fulfill it
### Stage 3: Ambiguity check

Determine whether the runtime can proceed safely or must clarify.
Common ambiguity sources:
- missing entity
- missing target workspace
- conflicting instructions
- low confidence
- policy-sensitive action
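The ambiguity sources above could feed a check like the following. The signal names mirror the list; the specific decision rules (e.g. treating conflicting instructions as blocking) are illustrative assumptions.

```typescript
// Sketch of the ambiguity check. Signal names follow the list above;
// the thresholds and combinations are assumptions.
type AmbiguityLevel = "none" | "resolvable" | "blocking";

interface AmbiguitySignals {
  missingEntity: boolean;
  missingWorkspace: boolean;
  conflictingInstructions: boolean;
  lowConfidence: boolean;
  policySensitive: boolean;
}

function checkAmbiguity(s: AmbiguitySignals): AmbiguityLevel {
  // Conflicting instructions, or a policy-sensitive action understood with
  // low confidence, should block autonomous execution outright.
  if (s.conflictingInstructions || (s.policySensitive && s.lowConfidence)) {
    return "blocking";
  }
  // Missing context can usually be resolved with one clarifying question.
  if (s.missingEntity || s.missingWorkspace || s.lowConfidence) {
    return "resolvable";
  }
  return "none";
}
```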
### Stage 4: Policy precheck

Before orchestration proceeds, intent must be classified for:
- access scope
- sensitivity
- approval requirements
- tool eligibility
- tenant boundary risk
## Confidence and ambiguity model

### Confidence

Confidence represents how strongly the system believes it understands the user goal.

Suggested bands:

- `high`
- `medium`
- `low`
### Ambiguity

Ambiguity represents how much clarification is still required before safe execution.

Suggested bands:

- `none`
- `resolvable`
- `blocking`
Rules:
- low confidence does not always require a stop, but it should increase validation pressure
- blocking ambiguity should prevent autonomous execution
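The two rules above can be expressed as a small execution gate: blocking ambiguity stops autonomous execution, while low confidence lets execution proceed under increased validation pressure. The `Gate` shape and the clarification flag for resolvable ambiguity are assumptions.

```typescript
// Illustrative gate implementing the rules above. Blocking ambiguity
// prevents autonomous execution; low confidence raises validation
// pressure (modeled here as an extraValidation flag) without stopping.
type Confidence = "high" | "medium" | "low";
type AmbiguityLevel = "none" | "resolvable" | "blocking";

interface Gate {
  proceed: boolean;
  requireClarification: boolean;
  extraValidation: boolean;
}

function gateExecution(confidence: Confidence, ambiguity: AmbiguityLevel): Gate {
  if (ambiguity === "blocking") {
    return { proceed: false, requireClarification: true, extraValidation: false };
  }
  return {
    proceed: true,
    requireClarification: ambiguity === "resolvable",
    extraValidation: confidence === "low",
  };
}
```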
## Approval and action classes

Every structured intent should carry an execution sensitivity classification.

Suggested classes:

- `informational`
- `recommendation`
- `draft_only`
- `state_change`
- `sensitive_state_change`
- `compliance_sensitive`
Examples:

- asking a question -> `informational`
- suggesting the next setup step -> `recommendation`
- drafting a prompt revision -> `draft_only`
- changing a workspace setting -> `state_change`
- deleting a channel or modifying permissions -> `sensitive_state_change`
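One way to act on these classes is a simple approval guard. The rule that only the two most sensitive classes require explicit human approval is an assumption consistent with the policy precheck stage, not something this document mandates.

```typescript
// Hypothetical guard over the approval classes listed above. Which
// classes require human approval is an assumed policy choice.
type ApprovalClass =
  | "informational"
  | "recommendation"
  | "draft_only"
  | "state_change"
  | "sensitive_state_change"
  | "compliance_sensitive";

function requiresHumanApproval(cls: ApprovalClass): boolean {
  return cls === "sensitive_state_change" || cls === "compliance_sensitive";
}
```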
## Design constraints

- Intent models must remain workspace-aware and tenant-safe.
- UI interactions should not bypass the intent and policy model for sensitive actions.
- Voice inputs must preserve modality-specific metadata, even when normalized to text.
- The intent model should support clarification before execution.
- Intent contracts must remain stable enough to support analytics, recommendations, and workflow replay.
## MVP-compatible interpretation

The MVP-safe path is:

- start with text and UI interactions
- model them as `InteractionEvent`
- generate `StructuredIntent` for a small number of high-value workflows
- add voice later as another input mode mapped into the same contract
Voice should be an extension of the intent model, not the first dependency of the architecture.