Audio Interview Demo
A one-off OpenAI Realtime WebRTC prototype for spoken moderator prompts and spoken respondent answers, with both sides converted into live text.
The goal is to get closer to the natural back-and-forth feel of the ChatGPT voice experience while we shape the interview UX.
Realtime Voice Session
This prototype uses OpenAI Realtime over WebRTC so the moderator can listen and reply in natural spoken turns instead of browser-generated text-to-speech.
Voice and language
Voice quality: OpenAI recommends `marin` and `cedar` for best quality. You can now switch voices and language profiles before starting a session.
Dialect note: the earlier Spanish behavior was not hard-coded to Argentina; the model chose that delivery on its own. The presets make the dialect an intentional choice.
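One way a preset could make the dialect explicit is to turn it into a session instruction next to the chosen voice. The sketch below is illustrative only: `LanguagePreset`, `PRESETS`, `buildSessionConfig`, and the `{ voice, instructions }` payload shape are assumptions, not the demo's actual code, and the exact session field layout depends on the Realtime API version in use.

```typescript
// Hypothetical preset shape; the demo's real presets are not shown here.
interface LanguagePreset {
  id: string;
  label: string;
  // Plain-language instruction that pins the spoken dialect.
  dialectInstruction: string;
}

// Illustrative presets only.
const PRESETS: LanguagePreset[] = [
  { id: "en-US", label: "English (US)", dialectInstruction: "Speak in US English." },
  { id: "es-AR", label: "Spanish (Argentina)", dialectInstruction: "Speak in Rioplatense Spanish as spoken in Argentina." },
  { id: "es-MX", label: "Spanish (Mexico)", dialectInstruction: "Speak in Mexican Spanish." },
];

// Assumed payload shape: a voice name plus free-text instructions.
interface SessionConfig {
  voice: string;
  instructions: string;
}

function buildSessionConfig(voice: string, presetId: string): SessionConfig {
  const preset = PRESETS.find((p) => p.id === presetId);
  if (!preset) {
    // Fail loudly instead of letting the model pick a dialect by itself.
    throw new Error(`Unknown language preset: ${presetId}`);
  }
  return {
    voice,
    instructions: `${preset.dialectInstruction} Do not switch dialects unless asked.`,
  };
}
```

Calling `buildSessionConfig("marin", "es-MX")` would yield a config whose instructions name Mexican Spanish explicitly, so the dialect no longer depends on the model's default.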
Latest session learnings
Potential improvements
Each saved voice test adds deduped product ideas here so we can spot recurring UX friction over time.
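How the demo dedupes ideas is not described here. A minimal sketch, assuming ideas are plain strings and that two ideas count as duplicates when they match after lowercasing, stripping common punctuation, and collapsing whitespace (`normalizeIdea` and `mergeIdeas` are hypothetical names):

```typescript
// Normalize so trivially different phrasings of the same idea collapse together.
function normalizeIdea(idea: string): string {
  return idea
    .toLowerCase()
    .replace(/[.,;:!?'"()]/g, "") // strip common punctuation
    .replace(/\s+/g, " ")         // collapse runs of whitespace
    .trim();
}

// Append only the incoming ideas that are not already present, keeping first-seen order.
function mergeIdeas(existing: string[], incoming: string[]): string[] {
  const seen = new Set<string>(existing.map(normalizeIdea));
  const merged = [...existing];
  for (const idea of incoming) {
    const key = normalizeIdea(idea);
    if (key.length === 0 || seen.has(key)) continue; // skip blanks and repeats
    seen.add(key);
    merged.push(idea.trim());
  }
  return merged;
}
```

Exact-match normalization like this will miss paraphrases; it only catches repeats that differ in case, spacing, or punctuation.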
What this demo is proving
Natural turn-taking, live microphone capture, spoken moderator replies, and realtime text transcripts for both sides. From here, we can start folding in guide-led flow and eventually respondent video capture.
Pause tolerance: this version lets the respondent pause to reflect for much longer before the moderator checks back in.
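A longer pause tolerance is commonly expressed as a larger silence window in server-side voice activity detection. The sketch below assumes a `server_vad`-style turn detection block with a `silence_duration_ms` field; the helper name `buildTurnDetection`, the clamp limits, and the other numeric values are illustrative assumptions, and the actual pause length this demo uses is not stated here.

```typescript
// Assumed shape of a server-VAD turn detection block.
interface TurnDetection {
  type: "server_vad";
  threshold: number;           // speech activation threshold, 0..1
  prefix_padding_ms: number;   // audio kept from before speech was detected
  silence_duration_ms: number; // silence required before the turn is considered over
}

// Illustrative bounds, not taken from the demo.
const MIN_SILENCE_MS = 200;
const MAX_SILENCE_MS = 10000;

function buildTurnDetection(pauseToleranceMs: number): TurnDetection {
  // Clamp so a bad UI value cannot produce an instant or effectively endless turn.
  const clamped = Math.min(
    MAX_SILENCE_MS,
    Math.max(MIN_SILENCE_MS, Math.round(pauseToleranceMs))
  );
  return {
    type: "server_vad",
    threshold: 0.5,
    prefix_padding_ms: 300,
    silence_duration_ms: clamped,
  };
}
```

A larger silence window trades responsiveness for patience: the moderator interrupts less often but also takes longer to reply after a short answer.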
Live Transcript
This transcript is driven by Realtime server events over the WebRTC data channel.
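A transcript like this can be modeled as a pure function that folds each server event into a list of entries. The sketch below is an assumption about how that might look, not the demo's implementation: `TranscriptEntry`, `applyEvent`, and `upsertEntry` are hypothetical names, and the event type strings vary between Realtime API versions, so treat them as placeholders to check against the version in use.

```typescript
type Speaker = "moderator" | "respondent";

interface TranscriptEntry {
  itemId: string;
  speaker: Speaker;
  text: string;
  final: boolean;
}

// Minimal assumed event shape; real server events carry more fields.
interface RealtimeEvent {
  type: string;
  item_id?: string;
  delta?: string;
  transcript?: string;
}

// Replace the entry for itemId if it exists, otherwise append a new one.
function upsertEntry(
  entries: TranscriptEntry[],
  itemId: string,
  speaker: Speaker,
  update: (previous: string) => string,
  final: boolean
): TranscriptEntry[] {
  const index = entries.findIndex((e) => e.itemId === itemId);
  const previous = index >= 0 ? entries[index].text : "";
  const next: TranscriptEntry = { itemId, speaker, text: update(previous), final };
  return index >= 0
    ? entries.map((e, i) => (i === index ? next : e))
    : [...entries, next];
}

function applyEvent(entries: TranscriptEntry[], event: RealtimeEvent): TranscriptEntry[] {
  const itemId = event.item_id;
  if (itemId === undefined) return entries;
  switch (event.type) {
    // Moderator speech arrives incrementally as deltas.
    case "response.audio_transcript.delta":
    case "response.output_audio_transcript.delta":
      return upsertEntry(entries, itemId, "moderator", (p) => p + (event.delta ?? ""), false);
    // The done event carries the full text and finalizes the entry.
    case "response.audio_transcript.done":
    case "response.output_audio_transcript.done":
      return upsertEntry(entries, itemId, "moderator", (p) => event.transcript ?? p, true);
    // Respondent speech is transcribed after the turn completes.
    case "conversation.item.input_audio_transcription.completed":
      return upsertEntry(entries, itemId, "respondent", (p) => event.transcript ?? p, true);
    default:
      return entries; // ignore events that do not affect the transcript
  }
}
```

In the browser, this would typically be wired to the data channel's `message` event by parsing `event.data` as JSON and passing the result to `applyEvent`. Keeping the reducer pure makes it testable without a live session.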
Live microphone
Ready
Start a session to open the live microphone and transcript flow.
Ambient floor: 1.0% mic energy
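How the demo derives the ambient floor is not specified here. One plausible approach, sketched under that assumption, is to take the RMS energy of each audio frame as a percentage of full scale and track a floor that drops immediately to quieter frames but rises only slowly. The names `rmsPercent`, `updateAmbientFloor`, and `formatFloor`, and the default `riseRate`, are hypothetical.

```typescript
// RMS energy of one frame of samples in [-1, 1], as a percentage of full scale.
function rmsPercent(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length) * 100;
}

// Track the noise floor: follow quieter frames at once, drift up slowly on louder ones,
// so speech does not get mistaken for background noise.
function updateAmbientFloor(floor: number, energy: number, riseRate: number = 0.05): number {
  if (energy <= floor) return energy;
  return floor + (energy - floor) * riseRate;
}

// Format the floor the way the status line displays it.
function formatFloor(floorPercent: number): string {
  return `Ambient floor: ${floorPercent.toFixed(1)}% mic energy`;
}
```

In a browser, the samples would typically come from a Web Audio analyser node reading the microphone stream, with `updateAmbientFloor` called once per frame.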
This is the main live feedback area. It stays stable while the conversation updates, so the mic state is readable without extra floating indicators.
