Native audio support is convenient — and opaque
Gemini's native audio path is a black box: you upload a file, Gemini transcribes it internally, and you never see the transcript. Speaker attribution is approximate, timestamps aren't exposed, and re-running on the same file can produce slightly different summaries. For ad-hoc questions that's fine; for any analysis you want to reproduce, cite, or share, an explicit Markdown transcript is the artefact you can control.
Convert the recording once with Audio to Markdown, hand-correct any speaker labels that need fixing, and feed the corrected .md file to Gemini. The 1M-token window can hold several hours of structured transcript alongside related documents: meeting notes from the last quarter, the latest customer interview, and the relevant product spec PDF, all in one prompt.
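Before stacking several documents into one prompt, it can help to sanity-check that they fit. The sketch below is a rough estimate only: the 4-characters-per-token ratio is a common rule of thumb for English text rather than Gemini's actual tokenizer, and the `reserve` allowance and function names are hypothetical choices made for illustration.

```python
import math

CONTEXT_WINDOW_TOKENS = 1_000_000  # the 1M-token window discussed above
CHARS_PER_TOKEN = 4  # assumption: rough heuristic, not Gemini's real tokenizer


def estimate_tokens(text: str) -> int:
    """Approximate the token count of a string from its character count."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def budget_report(docs: dict[str, str], reserve: int = 50_000) -> dict:
    """Estimate tokens per document and check the total against the window.

    `docs` maps a file name to its Markdown text. `reserve` is a
    hypothetical allowance for the question and the model's answer.
    """
    per_file = {name: estimate_tokens(text) for name, text in docs.items()}
    total = sum(per_file.values())
    return {
        "per_file": per_file,
        "total": total,
        "fits": total + reserve <= CONTEXT_WINDOW_TOKENS,
    }
```

To use it, read each corrected .md file into a string (for example with `pathlib.Path(name).read_text(encoding="utf-8")`), pass the resulting mapping to `budget_report`, and check `fits` before uploading. For an exact figure, rely on the token count that AI Studio or the API reports instead of this estimate.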
Cross-source analysis in AI Studio and Vertex
Both AI Studio and Vertex accept multiple .md attachments per conversation. A typical pattern: convert the audio and attach the transcript, then attach the agenda PDF (converted with PDF to Markdown for Gemini) and the project brief's web page (converted with URL to Markdown for Gemini). Ask Gemini to cross-reference what the brief promised against what was committed in the meeting. That finally gives the 1M-token window something coherent to do with all its capacity.
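If you assemble the prompt in code rather than through attachments, one way to keep the sources distinguishable is to wrap each document in a labelled block and put the question last. This is a minimal sketch: the `<source name="...">` wrapper is an illustrative convention, not a format Gemini requires, and `build_prompt` and the file names in the usage note are hypothetical.

```python
def build_prompt(sources: dict[str, str], question: str) -> str:
    """Join labelled Markdown sources and a question into one prompt string.

    `sources` maps a label (for example a file name) to Markdown text.
    The <source> wrapper is an assumed convention for this sketch only.
    """
    parts = []
    for label, markdown in sources.items():
        # Label each document so the answer can say which source it cites.
        parts.append(f'<source name="{label}">\n{markdown.strip()}\n</source>')
    # The question goes last, after all of the context.
    parts.append(question.strip())
    return "\n\n".join(parts)
```

A call might pass the texts of `meeting.md`, `agenda.md`, and `brief.md` along with a question such as "List each promise in the brief and say whether the meeting confirmed, changed, or dropped it, citing the source name." The returned string can then be sent through whichever Gemini client you already use.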