
Pipecat voice AI framework: composable Python pipelines for real-time agents

Pipecat is an open-source Python toolkit for stitching together streaming speech, language models, and playback behind a single real-time loop—so product teams can focus on dialog design instead of rewiring transports for every vendor swap.

%%{init: {"theme": "base", "themeVariables": {"background": "transparent", "lineColor": "#000000"}}}%%
flowchart LR
  A["Caller audio"] --> B["Transport WebRTC or WebSocket"]
  B --> C["Optional VAD"]
  C --> D["Speech to text"]
  D --> E["LLM or speech to speech"]
  E --> F["Text to speech or native audio"]
  F --> G["Return stream"]

  classDef agent fill:#8B0000,color:#fff
  classDef hook fill:#189AB4,color:#fff
  classDef decision fill:#444,color:#fff
  class A,G agent
  class B,C,D,E,F hook

Why teams reach for Pipecat

| Property | What it buys you |
| --- | --- |
| Voice-first layout | Processors for streaming STT, LLM calls, and TTS sit on one timeline with explicit frame flow control. |
| Pluggable services | Swap Deepgram for AssemblyAI, Cartesia for ElevenLabs, or route LLM traffic across OpenAI, Anthropic, Gemini, Groq, Ollama, and others without rewriting orchestration glue. |
| Composable pipelines | Small units snap together for branching dialogs, tool calls, and guardrails—similar mental model to media pipelines. |
| Real-time transports | First-class adapters for WebSockets, WebRTC providers such as Daily or LiveKit, telephony serializers, and local debug paths. |

Coverage snapshot

| Plane | Examples supported upstream |
| --- | --- |
| Speech-to-text | Deepgram, AssemblyAI, Whisper-class routes via OpenAI or Groq, cloud STT from major hyperscalers, plus specialist vendors in the docs matrix. |
| LLMs | OpenAI, Anthropic Claude, Gemini, Grok, Mistral, Cerebras, SambaNova, Together, OpenRouter, Ollama for local inference, and more. |
| Text-to-speech | OpenAI, ElevenLabs, Cartesia, Deepgram, Azure, AWS, Google, Fish, Rime, Piper, and additional engines listed in the service catalogue. |
| Speech-to-speech | OpenAI Realtime, Gemini Multimodal Live, AWS Nova Sonic, Ultravox, Grok Voice Agent—useful when you want audio-native reasoning without discrete TTS. |
| Observability | OpenTelemetry hooks and Sentry integration ship for production tracing. |

Quick start (from upstream docs)

Python 3.11 is the minimum supported release; maintainers recommend 3.12+. Core install stays lean—pull optional extras only for the providers you wire in.

# Scaffold a project (requires uv per README)
pipecat init quickstart

# Or add to an existing uv project
uv add pipecat-ai

# Provider-specific wheels (example pattern)
uv add "pipecat-ai[deepgram,cartesia,openai]"

Adjacent ecosystem

  • Pipecat Flows for structured state machines when a single linear pipeline is not enough.
  • Subagents repository for multi-agent buses when conversations need specialist hand-offs.
  • Voice UI Kit for client-side React and native SDKs that pair with the same transports.
  • CLI and Pipecat Cloud helpers for packaging agents once they leave a laptop.

At a glance

| Signal | Takeaway |
| --- | --- |
| Community traction | The core repository is past 12k GitHub stars with active CI badges and Discord support—expect rapid surface-area growth. |
| Operational stance | Treat Pipecat as orchestration: you still own latency budgets, key management, and content moderation for customer-facing voice. |
| Learning path | Start with the quickstart, then clone the examples tree for incremental complexity before jumping to full demo apps. |


Hermes Telegram loop: Codex build goals and Claude Code review handoffs

When two different coding agents share the same goal primitive, a messaging-first orchestrator can route long-running implementation to one stack and structured critique to another—without you babysitting shells from a desk.

%%{init: {"theme": "base", "themeVariables": {"background": "transparent", "lineColor": "#000000"}}}%%
flowchart TD
  U["User message from chat"] --> G["Gateway such as Telegram"]
  G --> H["Hermes agent session"]
  H --> P["Pick next worker for goal primitive"]
  P --> B["Codex build pass"]
  B --> S["Work completes"]
  S --> R["Claude Code review pass"]
  R --> K["Kanban or board state refresh"]

  classDef agent fill:#8B0000,color:#fff
  classDef hook fill:#189AB4,color:#fff
  classDef decision fill:#444,color:#fff
  class U,G,K agent
  class H,P,B,S,R hook

What each layer contributes

| Layer | Role in the loop | Where to read more |
| --- | --- | --- |
| Messaging gateway | Accepts short natural-language goals while you are away from the IDE; pairs with Hermes security guidance on DM allowlists. | Hermes messaging docs cover Telegram, Discord, Slack, and related controls. |
| Hermes Agent | MIT-licensed gateway and tool loop from Nous Research: cron, skills, MCP, and multi-terminal backends so work can live on a small VPS instead of a laptop. | Upstream README and docs site. |
| Codex /goal | OpenAI Codex can hold a durable objective across many tool turns—useful for autonomous implementation when checkpoints are clear. | OpenAI “Follow a goal” use case guide. |
| Claude Code /goal | Anthropic Claude Code runs successive turns until a stated completion test passes, with a fast gate between turns. | Claude Code goal documentation. |

Why split builder and reviewer stacks

| Idea | Engineering angle |
| --- | --- |
| Decorrelated blind spots | One model family may repeat the same mistaken assumption on both generation and critique; handing review to another toolchain pressures the plan with different priors. |
| Operational clarity | Kanban-style cards map cleanly to “building” versus “awaiting review”, which keeps async agents from colliding silently. |
| Closer analogue inside Claude Code | OpenAI’s codex-plugin-cc already wraps local Codex reviews inside Claude Code for users who want that separation without leaving the editor. |

Controls operators still need

  • Budget caps: background goals on two providers can burn tokens quickly—set spend alerts and pause hooks before enabling always-on queues.
  • Deterministic gates: community feedback on the original thread highlights lint, type-check, and CI scripts as non-negotiable backstops once models disagree; a minimal gate sketch follows this list.
  • Secrets hygiene: remote chat triggers should stay behind paired accounts, command allowlists, and isolated working directories as described in Hermes security guidance.
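
Such a gate can be a plain script the orchestrator runs between the Codex build pass and the Claude Code review pass. A minimal sketch; the check commands are placeholders for your own lint, type-check, and test entry points:

# gate.py: deterministic checks between build and review passes.
# The commands below are placeholders; substitute your project's own
# lint, type-check, and test entry points.
import subprocess
import sys

CHECKS = [
    ["ruff", "check", "."],  # lint
    ["mypy", "."],           # type-check
    ["pytest", "-q"],        # tests
]

def main() -> int:
    for cmd in CHECKS:
        if subprocess.run(cmd).returncode != 0:
            print(f"gate failed: {' '.join(cmd)}")
            return 1  # block the review hand-off
    print("gate passed: card can move to awaiting review")
    return 0

if __name__ == "__main__":
    sys.exit(main())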

At a glance

| Signal | Takeaway |
| --- | --- |
| Primitive parity | Matching /goal-style autonomy on Codex and Claude Code lets an orchestrator swap workers without rewriting the playbook. |
| Human value | You still define stopping conditions, tests, and review criteria—agents accelerate execution, they do not replace product judgement. |
| Deployment shape | Keep the gateway on infrastructure you control (for example a small VPS) so mobile triggers never expose a laptop sleep/wake race. |


Floci AWS emulator: lightweight Docker local stack and LocalStack alternative

Floci is an MIT-licensed, AWS-shaped local emulator that runs alongside your local stack: point the normal SDK or CLI at localhost:4566 and exercise a broad slice of control-plane behaviour without touching a live cloud account.

%%{init: {"theme": "base", "themeVariables": {"background": "transparent", "lineColor": "#000000"}}}%%
flowchart LR
  A["App or test suite"] --> B["AWS SDK or CLI"]
  B --> C["Endpoint override localhost:4566"]
  C --> D["Floci router"]
  D --> E["In-process services"]
  D --> F["Docker-backed runtimes"]
  F --> G["Docker Engine on host"]
  E --> H["In-memory or disk-backed state"]

  classDef agent fill:#8B0000,color:#fff
  classDef hook fill:#189AB4,color:#fff
  classDef decision fill:#444,color:#fff
  class A,B agent
  class C,D,E,F,G hook
  class H decision

What changed in the landscape

| Signal | Practical meaning |
| --- | --- |
| LocalStack Community policy | Upstream blog notes auth tokens from March 2026 and frozen security updates for the community path—teams are re-evaluating local emulators. |
| Floci positioning | Marketed as a no-token drop-in on port 4566, with environment translation from common LocalStack variables. |
| Social summary vs README | Viral posts emphasise a tiny footprint; the project README quotes about 13 MiB idle RAM and a ~90 MB image—still far lighter than pulling a multi-GB stack for every developer machine. |

Architecture at a glance

| Layer | Role | Notes from upstream |
| --- | --- | --- |
| HTTP front door | JAX-RS on Vert.x | Single port 4566 mirrors the LocalStack-style workflow. |
| In-process plane | S3, DynamoDB, IAM, STS, KMS, Cognito, Step Functions, EventBridge, API Gateway, and dozens more | Stateful modes include memory, hybrid, WAL, and persistent storage profiles. |
| Docker plane | Lambda, RDS, ElastiCache, MSK, ECS, EC2, EKS, CodeBuild, OpenSearch | Uses real engine images (for example public Lambda ECR bases) so SigV4 and IAM flows stay close to production. |

README benchmarks (idle laptop profile)

| Metric | Floci (quoted) | LocalStack Community (quoted) |
| --- | --- | --- |
| Startup | ~24 ms | ~3.3 s |
| Idle memory | ~13 MiB | ~143 MiB |
| Image size | ~90 MB | ~1.0 GB |
| Service count | 47 AWS-shaped services | |

Minimal bring-up

# docker-compose.yml (from upstream quick start)
services:
  floci:
    image: floci/floci:latest
    ports:
      - "4566:4566"
    volumes:
      - ./data:/app/data
      - /var/run/docker.sock:/var/run/docker.sock

# Then:
docker compose up

export AWS_ENDPOINT_URL=http://localhost:4566
export AWS_DEFAULT_REGION=us-east-1
export AWS_ACCESS_KEY_ID=test
export AWS_SECRET_ACCESS_KEY=test
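
With those variables exported, recent AWS SDKs pick up the endpoint automatically; you can also pass it explicitly. A minimal boto3 smoke test against the emulator (the bucket name is illustrative):

# Point boto3 at the local emulator instead of AWS. AWS_ENDPOINT_URL is
# honoured by recent SDKs; passing endpoint_url explicitly also works on
# older boto3 releases.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:4566",
    region_name="us-east-1",
    aws_access_key_id="test",      # emulator accepts dummy credentials
    aws_secret_access_key="test",
)

s3.create_bucket(Bucket="demo-bucket")  # hits Floci, not AWS
print(s3.list_buckets()["Buckets"])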

When to trust the emulator

  • Treat Floci as a fast feedback loop for integration tests and demos—not a contractual guarantee of full AWS parity.
  • Community threads already flag API conformance gaps on complex services such as DynamoDB; pair emulator runs with selective tests against real sandboxes when behaviour must be exact.
  • Anything that shells out to Docker inherits host resource limits, socket permissions, and image pull policies—budget CI time accordingly.

Takeaways

| Topic | Decision hint |
| --- | --- |
| Developer experience | Reuse existing boto3, AWS CLI v2, or SDK v3 clients with endpoint_url or AWS_ENDPOINT_URL—no rewrites. |
| Migration | Swap the container image from localstack/localstack to floci/floci:latest; upstream documents env var translation. |
| Risk | Validate critical paths against production-shaped data because emulators inevitably miss edge cases. |


Interactive science apps: GPT Image UI plus Gemini 3.1 Pro code

A practical split for interactive science demos—especially 3D biology explorers—is to let a GPT Image-class model shape the interface while a strong coding model such as Gemini 3.1 Pro owns orchestration, scene logic, and tool calls.

%%{init: {"theme": "base", "themeVariables": {"background": "transparent", "lineColor": "#000000"}}}%%
flowchart LR
  B[Science brief] --> U["GPT Image style pass"]
  U --> M[UI frames and assets]
  B --> C["Gemini 3.1 Pro coding"]
  C --> A[App logic and 3D wiring]
  M --> P[Interactive web app]
  A --> P

  classDef agent fill:#8B0000,color:#fff
  classDef hook fill:#189AB4,color:#fff
  class B,P agent
  class U,M,C,A hook

Why split UI and code models

| Track | Job | Typical outputs |
| --- | --- | --- |
| Image generation stack | Rapid layout exploration, iconography, marketing-quality panels | PNG or SVG references you hand to the front-end |
| Reasoning coding model | State management, WebGL or canvas pipelines, data fetch, accessibility | TypeScript or Python services with tests |

Implementation notes

  • Keep scientific claims reviewable: wire 3D assets to curated datasets instead of letting the model invent structures.
  • Stream partial UI updates from the image side while the coding model focuses on deterministic simulation parameters.
  • Reuse the same session transcripts as regression prompts so UI refreshes do not break physics constraints.

At a glance

| Topic | Takeaway |
| --- | --- |
| Spark | Independent builders are pairing GPT Image tooling with Gemini 3.1 Pro for immersive science UX |
| Architecture | Parallel art and engineering tracks merge in the browser shell |
| Risk control | Human review remains essential for biomedical accuracy |


Remotion Agent Skills: install remotion-dev/skills for Claude Code and Cursor

Remotion ships an official Agent Skills bundle so coding agents such as Claude Code learn Remotion timing, media, and composition rules—installable with one CLI line alongside new `bun create video` scaffolding.

%%{init: {"theme": "base", "themeVariables": {"background": "transparent", "lineColor": "#000000"}}}%%
flowchart LR
  P[Natural language brief] --> I["npx skills add remotion-dev/skills"]
  I --> A[Agent with Remotion skills]
  A --> R[React Remotion project]
  R --> V[Programmatic video output]

  classDef agent fill:#8B0000,color:#fff
  classDef hook fill:#189AB4,color:#fff
  class P,V agent
  class I,A,R hook

Install path

| Command | Purpose |
| --- | --- |
| `npx skills add remotion-dev/skills` | Pull the maintained Remotion Agent Skills package into your workspace |
| `bun create video` | New Remotion project flow that can add the same skills during bootstrap |

What the skills are for

They encode best practices for Remotion’s React-based, programmatic video model so agents avoid common timing, asset, and composition mistakes. Remotion links the catalogue to the open Agent Skills standard and hosts the source beside the main repository.

At a glance

| Topic | Takeaway |
| --- | --- |
| Goal | Let Claude Code, Codex, or Cursor ship Remotion code that matches framework conventions |
| Install | `npx skills add remotion-dev/skills` per official docs |
| Source tree | Skills live under packages/skills in the Remotion monorepo |


Gemini 3.1 Flash-Lite GA: low-cost tier for agents, translation, and scale

Gemini 3.1 Flash-Lite is now positioned as Google’s fastest, lowest-cost Gemini 3 tier for high-volume work—agentic pipelines, translation, and light extraction—while keeping a stable API model id for production calls.

%%{init: {"theme": "base", "themeVariables": {"background": "transparent", "lineColor": "#000000"}}}%%
flowchart LR
  V[High volume traffic] --> M["Gemini 3.1 Flash-Lite"]
  M --> T[Text and structured outputs]
  M --> A[Tool calling and routing]

  classDef agent fill:#8B0000,color:#fff
  classDef hook fill:#189AB4,color:#fff
  class V agent
  class M,T,A hook

What shipped

| Surface | Status |
| --- | --- |
| Gemini API / AI Studio | Stable model code gemini-3.1-flash-lite with multimodal inputs and text output |
| Google Cloud | Generally available on Gemini Enterprise Agent Platform as of 7 May 2026 |

Why teams pick Flash-Lite

| Theme | Detail |
| --- | --- |
| Cost and latency | Built for ultra-low latency and large batch spend; Google cites strong price/performance versus prior Flash tiers on external speed benchmarks |
| Agentic fit | Function calling, structured outputs, caching, batch, and “thinking” controls for depth when needed |
| Typical workloads | Translation, moderation-scale text, transcription-style audio-to-text, PDF triage, lightweight routers classifying traffic to heavier models |

API snapshot

| Item | Value |
| --- | --- |
| Model string | gemini-3.1-flash-lite |
| Context window | Up to 1,048,576 input tokens and 65,536 output tokens (per Gemini API model card) |
| Public pricing (API blog) | About $0.25 per million input tokens and $1.50 per million output tokens at the announced preview price point |
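
Calling the stable model id from Python, as a minimal sketch assuming the google-genai SDK (the client reads GEMINI_API_KEY from the environment by default):

# Minimal call against the stable Flash-Lite model id from the table above.
# Assumes the google-genai SDK; swap in your own prompt and error handling.
from google import genai

client = genai.Client()
response = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    contents="Translate to French: The meeting moved to Thursday.",
)
print(response.text)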

At a glance

| Topic | Takeaway |
| --- | --- |
| Positioning | Most cost-efficient Gemini 3 model for scale-heavy, latency-sensitive jobs |
| When to use | Agents, translation, simple extraction, routing—reserve heavier tiers for rare complex steps |


Cursor /orchestrate skill: recursive SDK agents for large tasks

Cursor announced a /orchestrate skill aimed at the Cursor SDK that recursively spawns agents for large jobs—backed by internal figures on token savings and cold-start latency.

%%{init: {"theme": "base", "themeVariables": {"background": "transparent", "lineColor": "#000000"}}}%%
flowchart TD
  U[User goal] --> O["/orchestrate skill"]
  O --> A1[Sub-agent A]
  O --> A2[Sub-agent B]
  O --> A3[Sub-agent N]
  A1 --> M[Merge results]
  A2 --> M
  A3 --> M
  M --> R[SDK outcome]

  classDef agent fill:#8B0000,color:#fff
  classDef hook fill:#189AB4,color:#fff
  class U,R agent
  class O,A1,A2,A3,M hook

What Cursor said

| Item | Detail |
| --- | --- |
| Skill | /orchestrate — recursive agent spawning for ambitious tasks via the Cursor SDK |
| Internal example | Autoresearch of internal skills: about 20% lower token use with better evals |
| Internal example | Internal backend cold start: about 80% reduction (as reported by Cursor) |

How it fits the wider Cursor 3.3 drop

The same-day Cursor 3.3 notes emphasise parallel work with async subagents—Build in Parallel on plans, editor /multitask, and pinned skills as quick actions—so teams can mix UI-driven parallelism with skill-driven orchestration patterns.

Skills mechanics (context)

Cursor loads project and user skill folders (for example .cursor/skills/), exposes them to Agent, and supports explicit /skill-name invocation; skills can bundle scripts and references for progressive loading—useful when an orchestration skill coordinates long-running SDK sessions.
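
As a concrete picture, a project-level skill following the open Agent Skills convention is just a folder holding a SKILL.md plus any bundled helpers. The layout and names below are illustrative, not Cursor's shipped /orchestrate skill:

# Illustrative skill folder (names are hypothetical)
.cursor/skills/fanout-research/
  SKILL.md           # frontmatter (name, description) plus instructions
  scripts/fanout.py  # helper script the instructions can reference

# SKILL.md frontmatter, per the Agent Skills convention
---
name: fanout-research
description: Split a large task across parallel sub-agents, then merge results.
---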


At a glance

| Topic | Takeaway |
| --- | --- |
| Intent | Recursive multi-agent runs for large SDK tasks |
| Evidence | Cursor on X (7 May 2026) plus same-day changelog and skills documentation |
| Try next | Invoke /orchestrate from Agent chat once the skill is available in your workspace |


GPT-Realtime-2 API: voice agents with GPT-5-class reasoning and new audio stack

OpenAI shipped GPT-Realtime-2 in the Realtime API so voice agents can keep a live conversation moving while applying GPT-5-class reasoning, parallel tools, and larger session context—alongside new streaming translation and transcription models.

%%{init: {"theme": "base", "themeVariables": {"background": "transparent", "lineColor": "#000000"}}}%%
flowchart LR
  U[User audio] --> R[GPT-Realtime-2 session]
  R --> T[Tools and retrieval]
  T --> R
  R --> O[Spoken reply]

  classDef agent fill:#8B0000,color:#fff
  classDef hook fill:#189AB4,color:#fff
  class U,O agent
  class R,T hook

Three models in one drop

| Model | Role |
| --- | --- |
| GPT-Realtime-2 | Speech-to-speech with configurable reasoning, stronger tool use, longer sessions |
| GPT-Realtime-Translate | Live speech translation (70+ input languages to 13 output languages) |
| GPT-Realtime-Whisper | Streaming speech-to-text as the user talks |

What GPT-Realtime-2 adds for builders

| Capability | Detail |
| --- | --- |
| Context | 128K context window (up from 32K) for longer agent flows |
| Reasoning | Adjustable effort from minimal through xhigh (low default) |
| Tools | Parallel tool calls with short spoken preambles that mask tool latency |
| Recovery | More explicit spoken fallbacks instead of silent failure |
| Delivery | Better tone control for calm, empathetic, or upbeat responses |
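
Session setup from Python, as a hedged sketch: the realtime connection helper follows the current openai-python surface, but the model string is a guess at the id rather than a documented value; check the Realtime docs for the shipped names.

# Hedged Realtime session sketch. The connect() helper matches the current
# openai-python surface; the model string is an assumption, and only generic
# session fields are shown (see the Realtime docs for reasoning-effort knobs).
import asyncio
from openai import AsyncOpenAI

async def main():
    client = AsyncOpenAI()
    async with client.beta.realtime.connect(model="gpt-realtime-2") as conn:
        await conn.session.update(session={
            "modalities": ["audio", "text"],
            "instructions": "You are a concise travel-booking agent.",
        })
        # ...stream microphone audio in, play response audio out...

asyncio.run(main())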

Pricing snapshot (API)

| Model | Billing basis |
| --- | --- |
| GPT-Realtime-2 | About $32 per million audio input tokens ($0.40 cached) and $64 per million audio output tokens |
| GPT-Realtime-Translate | About $0.034 per minute |
| GPT-Realtime-Whisper | About $0.017 per minute |

All three run through the Realtime API; the announcement pairs them with customer examples (travel, telecom, property search) where low-latency speech, translation, or live captions must stay aligned with changing user intent.


At a glance

| Topic | Takeaway |
| --- | --- |
| Positioning | First Realtime voice model with GPT-5-class reasoning in the API |
| Companion surfaces | Live translation plus streaming transcription in the same release wave |
| Where to try | Realtime Playground and Realtime docs for session setup |


SpaceX compute deal: what changed for Claude Code and the Claude API

Anthropic’s SpaceX compute agreement brings Colossus 1 capacity online, and Claude Code and Claude API limits rose the same day. Here is what changed for subscribers and API callers.

%%{init: {"theme": "base", "themeVariables": {"background": "transparent", "lineColor": "#000000"}}}%%
flowchart LR
  S[SpaceX Colossus 1 capacity] --> A[Anthropic training and inference]
  A --> C[Claude Code higher ceilings]
  A --> P[Claude API Opus limits]

  classDef agent fill:#8B0000,color:#fff
  classDef hook fill:#189AB4,color:#fff
  class S,A agent
  class C,P hook

What went live on 6 May 2026

| Area | Change |
| --- | --- |
| Claude Code (Pro, Max, Team, seat-based Enterprise) | Five-hour rate limits doubled |
| Claude Code (Pro and Max) | Removed peak-hours reduction on those plans |
| Claude API | Higher rate limits for Claude Opus models (see vendor rate-limit docs for numbers) |

What the SpaceX slice adds

| Item | Stated scope |
| --- | --- |
| Facility | All compute capacity at SpaceX Colossus 1 |
| Power | More than 300 megawatts of new capacity |
| Accelerators | Over 220,000 NVIDIA GPUs (within the month) |
| Downstream focus | Capacity called out for Claude Pro and Claude Max subscribers |

The same announcement frames wider infrastructure work (other hyperscaler and infrastructure partners) and notes interest in future orbital compute at gigawatt scale; the day-one user impact is the limit increases above for Claude Code and Opus API traffic.


At a glance

| Topic | Takeaway |
| --- | --- |
| Trigger | New SpaceX Colossus 1 supply plus other recent compute deals |
| Product impact | Higher Claude Code ceilings and raised Opus API limits effective immediately |
| Evidence | Anthropic news post dated 6 May 2026 |


Dreaming, outcomes, and webhooks: Claude Managed Agents update (May 2026)

Claude Managed Agents now pair always-on work sessions with an explicit success rubric, asynchronous memory curation, and HTTPS webhooks so long-running agent jobs can finish, self-correct, and notify your stack without constant polling.

%%{init: {"theme": "base", "themeVariables": {"background": "transparent", "lineColor": "#000000"}}}%%
graph TD
  A[Managed agent session] --> B[Outcome rubric and iterations]
  B --> C[Separate grader context]
  C -->|needs revision| B
  C -->|satisfied| D[Idle with deliverables]
  A --> E[(Memory store)]
  E --> F[Dream job reviews transcripts]
  F --> E
  A --> G[HTTPS webhooks on milestones]

  classDef agent fill:#8B0000,color:#fff
  classDef hook fill:#189AB4,color:#fff
  classDef decision fill:#444,color:#fff

  class A agent
  class E,F,G hook
  class C decision

What changed on 6 May 2026

The live Code with Claude stream introduced dreaming as a research preview inside Managed Agents, while outcomes, multi-agent orchestration, and webhooks moved to public beta alongside the existing memory features.

| Surface area | Availability | What it gives you |
| --- | --- | --- |
| Dreaming | Research preview (access request) | Scheduled consolidation of memory stores plus optional mining of up to 100 past sessions |
| Outcomes | Public beta | Rubric-backed iterations with an isolated grader and webhook completion signals |
| Multi-agent orchestration | Public beta | Coordinator agents that delegate to specialists with isolated threads on a shared filesystem |
| Webhooks | Public beta | Small signed HTTPS callbacks instead of polling for session, thread, outcome, and vault events |

Dreaming: curate memory without mutating the source store

Dreaming runs as an asynchronous job that reads a memory store and up to one hundred session transcripts, then writes a brand-new store with facts merged, contradictions removed, and fresh patterns surfaced. The input store stays read-only until you adopt the output, which keeps experiments reversible.

| Item | Detail |
| --- | --- |
| Beta headers | managed-agents-2026-04-01 plus dreaming-2026-04-21 on dream calls |
| Models supported today | claude-opus-4-7 and claude-sonnet-4-6 |
| Instruction budget | 4,096 characters of extra guidance per dream |
| Runtime | Typically minutes to tens of minutes depending on transcript volume |
| Billing | Standard token metering on the selected dream model |

curl -s https://api.anthropic.com/v1/dreams \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "anthropic-beta: managed-agents-2026-04-01,dreaming-2026-04-21" \
  -H "content-type: application/json" \
  -d '{"inputs":[{"type":"memory_store","memory_store_id":"'"$STORE"'"},
              {"type":"sessions","session_ids":["'"$SESSION_A"'","'"$SESSION_B"'"]}],
           "model":"claude-opus-4-7",
           "instructions":"Focus on durable coding preferences; drop one-off debugging notes."}'

Outcomes: rubrics, graders, and iteration budgets

When you emit user.define_outcome, the harness spins up a grader that scores each criterion independently in its own context window, then returns either a pass or a precise gap list that feeds the next agent revision. You can supply the rubric inline or via the Files API, cap the loop with max_iterations (default three, hard maximum twenty), and subscribe to session.outcome_evaluation_ended webhooks when grading rounds finish.
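
What defining an outcome might look like on the wire. Only user.define_outcome and max_iterations appear in the text above; the other field names are placeholders for illustration, so check the outcomes guide for the shipped schema:

# Illustrative outcome definition event. "user.define_outcome" and
# "max_iterations" come from the docs above; "rubric" is a placeholder
# field name, so verify against the outcomes guide.
outcome_event = {
    "type": "user.define_outcome",
    "rubric": (
        "1. All new endpoints have passing integration tests.\n"
        "2. README documents the new configuration flags."
    ),
    "max_iterations": 5,  # default 3, hard maximum 20
}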

| Grader result | What happens next |
| --- | --- |
| satisfied | Session returns to idle with deliverables under /mnt/session/outputs/ |
| needs_revision | Agent takes another pass using the supplied critique |
| max_iterations_reached | Loop halts after the configured ceiling |
| failed | Rubric and task description were incompatible |
| interrupted | Operator paused the outcome via user.interrupt |

Multi-agent orchestration: coordinators, threads, and limits

Coordinators declare a roster of delegate agents (maximum twenty unique IDs, single hop only) and spawn isolated session threads that keep their own transcripts while sharing the container filesystem. Up to twenty-five threads may run concurrently; the primary session stream stays a condensed feed while per-thread streams expose full tool traces when you need forensic detail.

Webhooks: verify signatures, then hydrate objects yourself

Endpoints must be public HTTPS on port 443. Each registered endpoint gets a signing secret (whsec_), and each delivery carries an X-Webhook-Signature header; Anthropic’s SDK unwrap() helper validates the payload and rejects anything older than five minutes, which also gives you a safe retry discriminator because duplicate retries reuse the same event.id.
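
If you cannot use the SDK helper, the check reduces to an HMAC comparison plus a freshness window. A sketch assuming an HMAC-SHA256 scheme over the raw body; the exact signed-payload format is not specified here, so prefer unwrap() in production:

# Hedged webhook check: HMAC-SHA256 over the raw body with the whsec_ secret
# and a five-minute freshness window. The signed-payload format is an
# assumption; the SDK's unwrap() helper is the supported path.
import base64
import hashlib
import hmac
import time

def verify(raw_body: bytes, signature: str, secret: str, sent_at: float) -> bool:
    if time.time() - sent_at > 300:  # reject deliveries older than five minutes
        return False
    key = base64.b64decode(secret.removeprefix("whsec_"))
    expected = base64.b64encode(hmac.new(key, raw_body, hashlib.sha256).digest())
    return hmac.compare_digest(expected.decode(), signature)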

| Event family | Examples |
| --- | --- |
| Session lifecycle | session.status_run_started, session.status_idled, session.status_rescheduled, session.status_terminated |
| Multi-agent threads | session.thread_created, session.thread_idled, session.thread_terminated |
| Outcomes | session.outcome_evaluation_ended |
| Vault credentials | vault.created, vault_credential.refresh_failed, and related archival events |


Where to go next

  • Read the Managed Agents product announcement for customer vignettes and headline benchmark figures.
  • Deep-dive the platform guides for dreams, outcomes, multi-agent sessions, and webhooks.
  • Request dreaming access via the Managed Agents intake form if you want the research preview enabled for your organisation.


At a glance

| Dimension | Detail |
| --- | --- |
| Primary objective | Let agents finish complex work, grade themselves, learn across sessions, and signal upstream systems reliably |
| Key APIs | /v1/dreams, user.define_outcome events, coordinator multiagent blocks, Console webhook registrations |
| Operational guardrails | Thread and iteration caps, signed webhook payloads, immutable dream inputs until you promote outputs |
| Launch posture | Dreaming in research preview; outcomes, multi-agent orchestration, and webhooks in public beta as of 6 May 2026 |
