OpenAI has launched GPT-5.3-Codex-Spark, its very first AI model designed for real-time coding, as a research preview. It is a smaller version of GPT-5.3-Codex and marks a major shift in the company's AI development strategy toward coding models rather than general-purpose ones.
The model is optimized for ultra-low latency and, through OpenAI's partnership with Cerebras, delivers around 1,000 tokens per second while remaining capable of real-world coding tasks. Developers can write, edit, and refine code instantly. According to OpenAI, Codex-Spark is "the very first model designed for working with Codex in real time, making targeted edits, reshaping logic, or refining and seeing results immediately."
Performance and Technical Capabilities
Codex-Spark currently supports a 128K-token context window, is text-only, and is the first in a planned family of ultra-fast models. OpenAI notes that while the model delivers faster outputs, complex engineering tasks may still need the full GPT-5.3-Codex model.
Note that usage during the preview comes with separate rate limits and may be restricted during peak times. In addition, limited API access has been granted to select design partners to evaluate real-world integration.
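For partners with that access, a request to the preview model would presumably look like any other call through the OpenAI API. The sketch below uses the official OpenAI Python SDK; the model identifier gpt-5.3-codex-spark is an assumption based on the product name, and actual naming and availability may differ during the preview.

```python
# Minimal sketch of calling the preview model through the OpenAI Python SDK.
# The model identifier "gpt-5.3-codex-spark" is assumed from the product name;
# during the research preview, API access is limited to select design partners.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-5.3-codex-spark",  # assumed identifier
    messages=[
        {"role": "system", "content": "You are a fast, precise coding assistant."},
        {
            "role": "user",
            "content": "Rename the parameter `tmp` to `buffer` in this function:\n\n"
                       "def read(tmp):\n    return tmp.read()",
        },
    ],
)

print(response.choices[0].message.content)
```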
The model is a significant outcome of OpenAI's partnership with Cerebras: it runs on the Cerebras Wafer-Scale Engine 3 accelerator, which is what enables its fast inference speeds. OpenAI has also implemented end-to-end latency improvements across all its models, reducing client-server round-trip overhead by 80%, per-token overhead by 30%, and time to first token by 50%.
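Those latency figures are easiest to observe on streamed responses, where time to first token determines how quickly output starts appearing. Below is a rough sketch of measuring time to first token and streaming throughput with the OpenAI Python SDK; the model identifier is again an assumption, and anything measured this way includes your own network round trip on top of the model's inference speed.

```python
# Rough measurement of time-to-first-token (TTFT) and streaming throughput.
# The model identifier is assumed; measured numbers include local network latency.
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

start = time.perf_counter()
first_token_at = None
chunk_count = 0

stream = client.chat.completions.create(
    model="gpt-5.3-codex-spark",  # assumed identifier
    messages=[{"role": "user", "content": "Write a Python function that reverses a linked list."}],
    stream=True,
)

for chunk in stream:
    # Skip keep-alive or empty chunks that carry no content delta.
    if not chunk.choices or chunk.choices[0].delta.content is None:
        continue
    if first_token_at is None:
        first_token_at = time.perf_counter()
    chunk_count += 1

end = time.perf_counter()
if first_token_at is not None:
    ttft = first_token_at - start
    generation_time = end - first_token_at
    print(f"Time to first token: {ttft:.2f} s")
    if generation_time > 0:
        # Each streamed chunk is roughly one token, so this approximates tokens/sec.
        print(f"~{chunk_count / generation_time:.0f} chunks/s after the first token")
```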
How About Accessibility and Availability?
Codex-Spark is currently available as a research preview to ChatGPT Pro users via the Codex app, the command-line interface, and integrated development environment extensions.
What Comes Next?
OpenAI describes Codex-Spark as the first step toward a dual-mode coding approach that pairs real-time collaboration with the execution of autonomous workflows. Its fast inference reduces interaction delays, making AI-driven coding feel more natural, faster, and more efficient for developers.