Breakthroughs · Wednesday, April 22, 2026 · 2 min read

WebSockets and Caching Turbocharge Codex Agent Workflows

Source: OpenAI Blog

TL;DR

OpenAI's deep dive into the Codex agent loop shows how using WebSockets and connection-scoped caching in the Responses API reduces API overhead and improves model latency. The result: faster, more efficient agentic workflows that make real-time and production deployments noticeably snappier for developers.

Key Takeaways

  • WebSockets cut per-request HTTP overhead for agent loops, enabling persistent low-latency connections.
  • Connection-scoped caching preserves relevant model state across an agent session, reducing redundant requests.
  • Together, these changes reduce API overhead and improve model response latency for Codex-based agents.
  • Developers can build more responsive, cost-efficient agentic workflows and scale production deployments more easily.

Faster, smoother agentic workflows with WebSockets

OpenAI's technical deep dive into the Codex agent loop highlights a practical engineering win: by adopting WebSockets and implementing connection-scoped caching in the Responses API, teams can significantly reduce API overhead and improve model latency. These changes target the common bottlenecks in agentic workflows—repeated HTTP handshakes and redundant requests—so agents spend more time reasoning and less time waiting on infrastructure.
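The persistent-connection pattern described above can be sketched with plain sockets. This is an illustration of the general idea, not OpenAI's actual implementation: one long-lived connection (here a stdlib `socketpair` standing in for a WebSocket) carries many request/response turns, so connection setup is paid once rather than once per turn.

```python
# Sketch: many agent-loop turns over one persistent connection.
# socketpair() stands in for a WebSocket; the echo server stands in
# for the remote API. None of this is the real Responses API.
import socket
import threading

def echo_server(conn):
    # Serve an unbounded number of turns on a single connection.
    while True:
        msg = conn.recv(1024)
        if not msg:          # peer closed the connection
            break
        conn.sendall(b"ack:" + msg)
    conn.close()

client, server = socket.socketpair()   # one persistent channel
threading.Thread(target=echo_server, args=(server,), daemon=True).start()

replies = []
for step in [b"plan", b"edit", b"test"]:   # a multi-step agent loop
    client.sendall(step)                   # no new handshake per step
    replies.append(client.recv(1024))
client.close()

print(replies)  # [b'ack:plan', b'ack:edit', b'ack:test']
```

Contrast this with a request-per-turn HTTP model, where each of the three steps would repeat connection setup before any useful work happens.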

WebSockets keep a persistent bidirectional channel open between client and API, eliminating the cost of repeated connection setup and enabling streaming interactions and faster round trips. At the same time, connection-scoped caching stores session-specific artifacts and state, so the agent doesn't repeatedly request the same data or reinitialize context unnecessarily. The combination results in noticeably snappier model responses in multi-step agent loops.

The practical benefits are immediate for developers: lower end-to-end latency, fewer redundant API calls, and reduced operational overhead when running agents in production. These improvements make agentic applications—from interactive assistants to automated workflows—more responsive and cost-efficient.

For teams building with Codex and the Responses API, OpenAI’s guidance provides concrete patterns to implement WebSockets and scoped caching. The upgrade path is straightforward and yields tangible runtime gains, helping developers deliver smoother user experiences and scale agent-driven systems with confidence.

Try it today: adopt WebSocket connections for persistent sessions and apply connection-scoped caching where appropriate to unlock faster, more efficient agentic workflows.
