Breakthroughs · Tuesday, May 12, 2026 · 2 min read

Mira Murati’s Thinking Machines unveils real-time “interaction models” for natural AI collaboration

Source: The Verge AI

TL;DR

Thinking Machines, founded by former OpenAI CTO Mira Murati, introduced the idea of "interaction models" — continuous, multimodal AIs that perceive audio, video, and text in real time to collaborate like a human partner. This approach promises more natural, responsive assistants for creators, teams, accessibility tools, and everyday users.

Key Takeaways

  • Interaction models continuously ingest audio, video, and text so AI can perceive and respond in real time rather than waiting for a finished prompt.
  • The design aims to let people collaborate with AI the way they collaborate with other humans — with ongoing, context-aware back-and-forth.
  • Potential applications include more natural virtual assistants, live collaboration aids for creators and teams, and improved accessibility tools for people with impairments.
  • The announcement is an early-stage vision from Thinking Machines but represents a notable shift toward persistent, multimodal AI interaction.

Thinking Machines proposes a new way for people to work with AI

Thinking Machines, the startup led by former OpenAI CTO Mira Murati, has introduced the concept of "interaction models": AI systems that continuously take in audio, video, and text and act like an ongoing collaborator. Unlike today's models, which typically wait for a complete prompt, interaction models are designed to perceive a user's actions as they happen and respond in real time.

This continuous, multimodal approach aims to make AI feel more like a teammate. By keeping shared context alive — monitoring tone, gestures, screen activity, and spoken words — interaction models could join live creative sessions, offer timely suggestions, co-edit content, or help people navigate complex tasks without interrupting their flow.

Why this matters:

  • Real-time perception can reduce friction: users no longer need to stop and reformulate prompts to keep the AI up to date.
  • Multimodal understanding expands usefulness: combining sight, sound, and text enables richer, more relevant responses in dynamic situations.
  • Broad opportunities: from accessibility assistants that follow a user's environment to collaboration tools that help remote teams work together more naturally.

Thinking Machines’ announcement is an early but exciting step toward more natural human-AI partnerships. While technical and safety challenges remain — such as privacy, continuous context management, and robust multimodal perception — the idea signals a meaningful shift in how AI could be integrated into everyday workflows, creative processes, and assistive technologies.
