Audio Demo

Audio Interview Demo

A one-off OpenAI Realtime WebRTC prototype for spoken moderator prompts and spoken respondent answers, with both sides converted into live text.

Prototype status

This demo now uses OpenAI Realtime over WebRTC. It is meant to get us closer to the natural back-and-forth feel of the ChatGPT voice experience while we shape the interview UX.

Realtime Voice Session

The moderator listens and replies in natural spoken turns over the Realtime WebRTC session, rather than through browser-generated text-to-speech.

Ready

gpt-realtime-1.5

Voice: Marin

English - Australia

Turn handling: server VAD

Voice and language

Voice: Smooth, polished, premium

Language / dialect: Applies to the next session you start.

Voice quality: OpenAI recommends `marin` and `cedar` for best quality. You can now switch voices and language profiles before starting a session.

Dialect note: the earlier Spanish behavior was not hard-coded to Argentina. It was the model choosing a Spanish delivery on its own. These presets make that choice intentional.

Latest session learnings

End a voice session or click `Save and analyze` to generate a session-level summary of what the test surfaced.

Potential improvements

Each saved voice test adds deduped product ideas here so we can spot recurring UX friction over time.
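One way the deduping could work is sketched below. The page does not say how suggestions are matched, so this assumes a simple normalised-text key (case, punctuation, and spacing ignored). The names `normalizeSuggestion`, `mergeSuggestions`, and the `Suggestion` shape are hypothetical.

```typescript
// Hypothetical shape for a stored product idea and how often it has recurred.
interface Suggestion {
  text: string;
  count: number;
}

// Assumption: two suggestions are "the same" when they match after
// lowercasing, stripping punctuation, and collapsing whitespace.
function normalizeSuggestion(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Merge suggestions from a newly saved session into the running list.
// Repeats bump the count instead of adding a new entry.
function mergeSuggestions(existing: Suggestion[], incoming: string[]): Suggestion[] {
  const byKey = new Map<string, Suggestion>();
  for (const s of existing) {
    byKey.set(normalizeSuggestion(s.text), { text: s.text, count: s.count });
  }
  for (const text of incoming) {
    const key = normalizeSuggestion(text);
    if (key === "") continue; // skip empty or punctuation-only entries
    const found = byKey.get(key);
    if (found) {
      found.count += 1;
    } else {
      byKey.set(key, { text: text.trim(), count: 1 });
    }
  }
  return Array.from(byKey.values());
}
```

Keeping a count per idea is what makes recurring friction visible: an entry seen in several sessions rises above one-off remarks. Exact-text matching will miss paraphrases, so a real version might need fuzzier matching.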

0 suggestions

0 saved sessions

No saved improvements yet. Run a voice session and end it to save the transcript for analysis.

This browser does not appear to support the WebRTC + microphone flow needed for the realtime demo. Chrome and Edge are the safest starting point while we refine this.
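A support check like this one can be sketched as plain feature detection. This is an assumption about how the warning might be triggered, not the demo's actual logic. `BrowserLike` and `supportsRealtimeDemo` are hypothetical names. The environment is passed in as an argument so the check can be exercised outside a browser.

```typescript
// Minimal view of the browser globals the check needs (hypothetical type).
interface BrowserLike {
  RTCPeerConnection?: unknown;
  navigator?: {
    mediaDevices?: {
      getUserMedia?: unknown;
    };
  };
}

// Returns true only when both WebRTC and microphone capture look available.
// Note: browsers expose navigator.mediaDevices only in secure contexts
// (HTTPS or localhost), so this also fails on plain-HTTP pages.
function supportsRealtimeDemo(env: BrowserLike): boolean {
  const hasPeerConnection = typeof env.RTCPeerConnection === "function";
  const hasMicrophone =
    typeof env.navigator?.mediaDevices?.getUserMedia === "function";
  return hasPeerConnection && hasMicrophone;
}
```

In a page this would be called as `supportsRealtimeDemo(window)`. A passing check says the APIs exist; it does not guarantee the user will grant microphone permission.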

What this demo is proving

Natural turn-taking, live microphone capture, spoken moderator replies, and realtime text transcripts for both sides. From here, we can start folding in guide-led flow and eventually respondent video capture.

Pause tolerance: this version gives the respondent a much longer reflective pause before the moderator checks back in.
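If the longer pause is implemented through server VAD, it would most likely be the `silence_duration_ms` setting, which controls how much silence ends a turn. The page does not confirm this, and the values below are placeholders rather than the demo's real settings. `buildTurnDetection` and `ServerVadConfig` are hypothetical names. Where this object sits inside the session configuration depends on the API version in use.

```typescript
// Server VAD fields as we understand them; values are illustrative only.
interface ServerVadConfig {
  type: "server_vad";
  threshold: number;           // activation sensitivity, 0 to 1
  prefix_padding_ms: number;   // audio kept from before speech was detected
  silence_duration_ms: number; // silence required before the turn ends
}

// Build a turn-detection block with a caller-chosen pause tolerance.
function buildTurnDetection(pauseToleranceMs: number): ServerVadConfig {
  if (!Number.isFinite(pauseToleranceMs) || pauseToleranceMs < 0) {
    throw new RangeError("pauseToleranceMs must be a non-negative number");
  }
  return {
    type: "server_vad",
    threshold: 0.5,         // placeholder value
    prefix_padding_ms: 300, // placeholder value
    silence_duration_ms: Math.round(pauseToleranceMs),
  };
}
```

A longer tolerance gives respondents room to think, at the cost of a slower moderator reply after short answers.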

Live Transcript

This transcript is driven by Realtime server events over the WebRTC data channel.
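As a rough sketch, the transcript can be treated as a reducer over incoming events: moderator text streams in as deltas, and respondent text arrives once transcription of a turn completes. The event type strings below are our understanding of the Realtime event names and should be checked against the API reference for the model version in use (earlier beta releases used different names). `TranscriptLine`, `RealtimeEvent`, `upsert`, and `applyEvent` are hypothetical.

```typescript
type Speaker = "moderator" | "respondent";

interface TranscriptLine {
  id: string;
  speaker: Speaker;
  text: string;
  final: boolean;
}

// Only the fields this sketch reads; real events carry more.
interface RealtimeEvent {
  type: string;
  item_id?: string;
  delta?: string;
  transcript?: string;
}

// Assumed event names; verify before relying on them.
const MODERATOR_DELTA = "response.output_audio_transcript.delta";
const MODERATOR_DONE = "response.output_audio_transcript.done";
const RESPONDENT_DONE = "conversation.item.input_audio_transcription.completed";

// Insert a new line or update the existing one for this item id.
function upsert(
  lines: TranscriptLine[],
  id: string,
  speaker: Speaker,
  update: (prev: string) => string,
  final: boolean
): TranscriptLine[] {
  const index = lines.findIndex((l) => l.id === id);
  if (index === -1) {
    return [...lines, { id, speaker, text: update(""), final }];
  }
  const next = lines.slice();
  next[index] = { ...lines[index], text: update(lines[index].text), final };
  return next;
}

// Pure reducer: returns a new transcript; unknown events leave it unchanged.
function applyEvent(lines: TranscriptLine[], event: RealtimeEvent): TranscriptLine[] {
  const id = event.item_id;
  if (!id) return lines;
  switch (event.type) {
    case MODERATOR_DELTA:
      return upsert(lines, id, "moderator", (prev) => prev + (event.delta ?? ""), false);
    case MODERATOR_DONE:
      return upsert(lines, id, "moderator", (prev) => event.transcript ?? prev, true);
    case RESPONDENT_DONE:
      return upsert(lines, id, "respondent", (prev) => event.transcript ?? prev, true);
    default:
      return lines;
  }
}
```

In the page, each `message` event on the `RTCDataChannel` would be parsed with `JSON.parse` and passed through `applyEvent`. Keying lines by item id keeps streamed deltas attached to the right turn even when events for both speakers interleave.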

Start the voice demo to open a live WebRTC session. The moderator should greet you first, then both sides of the conversation will appear here as it unfolds.

Live mic

Listening

Moderator

Live microphone

Ready

Start a session to open the live microphone and transcript flow.

Ambient floor: 1.0% mic energy
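The page does not define "mic energy", so the sketch below assumes it is the root-mean-square (RMS) level of a frame of time-domain samples in the range -1 to 1, shown as a percentage, with the ambient floor taken as the quietest recent reading. `rmsPercent` and `ambientFloor` are hypothetical names.

```typescript
// RMS level of one audio frame, expressed as a percentage of full scale.
function rmsPercent(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length) * 100;
}

// Assumption: the ambient floor is the lowest reading in a recent window,
// which approximates room noise when nobody is speaking.
function ambientFloor(recentPercents: number[]): number {
  if (recentPercents.length === 0) return 0;
  let floor = recentPercents[0];
  for (const value of recentPercents) {
    if (value < floor) floor = value;
  }
  return floor;
}
```

In a browser, the samples could come from an `AnalyserNode` via `getFloatTimeDomainData`. A frame that alternates between 0.01 and -0.01 has an RMS of about 1% of full scale, which matches the kind of reading shown above.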

This is the main live feedback area. It stays stable while the conversation updates, so the mic state is readable without extra floating indicators.