Uploading a Recording

How to hand a meeting recording to your AI employee: by direct upload, by URL, or by pointing at a file that already lives on the server.

Last updated: April 22, 2026

Tags: upload, recording, audio, video, zoom, meet, teams, youtube

Supported Recording Sources

Sarudo accepts recordings from anywhere yt-dlp can reach: Zoom, Google Meet, Microsoft Teams, YouTube, Loom, direct HTTPS links, and most public video hosts. All common audio and video formats are supported (MP3, M4A, WAV, OGG, Opus, WebM, MP4, and anything else yt-dlp can extract). Transcription runs on the audio track, so video files have their audio extracted first. If a recording is behind authentication (for example, a private Zoom cloud recording), download it yourself and send the file rather than the URL, because yt-dlp does not log into your accounts.

Audio downloads time out after 5 minutes. If a remote recording is unusually large or the host is slow, download it to your own machine and upload the file directly instead.
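If you fetch a recording yourself, the same idea can be sketched in a few lines of Python. This is a minimal illustration, not Sarudo's internal implementation: it assumes yt-dlp and ffmpeg are installed and on your PATH, and the function names, the output template, and the choice of M4A are all illustrative assumptions.

```python
import subprocess

DOWNLOAD_TIMEOUT_SECONDS = 5 * 60  # mirrors the 5-minute cap described above


def build_ytdlp_command(url: str, out_dir: str) -> list[str]:
    """Build a yt-dlp command that keeps only the audio track."""
    return [
        "yt-dlp",
        "--extract-audio",         # keep audio only (needs ffmpeg)
        "--audio-format", "m4a",   # illustrative choice of output format
        "--no-playlist",           # fetch a single recording, not a playlist
        "--output", f"{out_dir}/%(id)s.%(ext)s",
        url,
    ]


def download_audio(url: str, out_dir: str) -> bool:
    """Return True on success, False on failure, timeout, or missing yt-dlp."""
    try:
        result = subprocess.run(
            build_ytdlp_command(url, out_dir),
            timeout=DOWNLOAD_TIMEOUT_SECONDS,
            check=False,
        )
    except (subprocess.TimeoutExpired, FileNotFoundError):
        return False
    return result.returncode == 0
```

If `download_audio` returns False for a large or slow recording, that is the cue to download the file another way and upload it directly.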

How to Hand It Over

The easiest way is to forward a recording link in Telegram and tell your AI employee what it is, for example: "Here is the kickoff with ClientCo. Please process this: https://zoom.us/rec/share/…". The AI downloads the audio, transcribes it, extracts action items, and stores the result. You can also attach an audio or video file directly to the Telegram message, or reference a file that is already saved on your server (useful if your AI employee recorded or downloaded it in an earlier step).

Processing a Zoom recording

Hand a recording URL to Sarudo and get back a full analysis.

You say:

Process this Zoom recording and log attendees to the ClientCo and Mark contact records. https://zoom.us/rec/share/abc123

Sarudo responds:

Downloaded and transcribing the recording now — this typically takes 15-30 seconds per minute of audio. I will link the meeting to the ClientCo contact (#42) and Mark (#17) as attendees. Back with the summary, action items, and decisions in a moment.
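The turnaround quoted in that reply (15-30 seconds of processing per minute of audio) can be turned into a rough estimate for a recording of any length. The helper below is a hypothetical illustration of the arithmetic only; actual times depend on your server.

```python
def estimate_processing_seconds(audio_minutes: float) -> tuple[float, float]:
    """Return the (low, high) estimated transcription time in seconds."""
    low_rate, high_rate = 15, 30  # seconds of processing per minute of audio
    return audio_minutes * low_rate, audio_minutes * high_rate


# A one-hour meeting: 60 * 15 = 900 s and 60 * 30 = 1800 s
low, high = estimate_processing_seconds(60)
print(f"{low / 60:.0f} to {high / 60:.0f} minutes")  # prints "15 to 30 minutes"
```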

Titling and Linking

When you hand over a recording, you can provide a meeting title and link it to a calendar event or CRM contacts in the same message. Titles make the meeting easier to find later ("show me the ClientCo kickoff transcript"). Calendar event IDs tie the meeting record back to the original scheduled event, which is useful when you want to see the meeting output alongside the invite. Contact IDs determine who gets a "Meeting" activity logged in their CRM record, so the transcript and action items show up in that person's timeline.

If you do not provide contact IDs, no CRM activities are created — the meeting is stored in the meetings table but is not linked to anyone. You can always go back later and link attendees manually.
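The relationship between a recording and its optional links can be pictured as a small record. The sketch below is a hypothetical model for illustration: the class, field, and function names are assumptions, not Sarudo's actual schema. It shows the rule described above, where each contact ID yields one "Meeting" activity and an empty list yields none.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MeetingUpload:
    """Hypothetical shape of the details supplied with a recording."""
    source: str                               # URL or path to the recording
    title: Optional[str] = None               # makes the meeting easy to find
    calendar_event_id: Optional[str] = None   # ties back to the scheduled event
    contact_ids: list[int] = field(default_factory=list)


def crm_activities(meeting: MeetingUpload) -> list[dict]:
    """One "Meeting" activity per linked contact; empty if none were given."""
    return [
        {"contact_id": cid, "type": "Meeting", "title": meeting.title}
        for cid in meeting.contact_ids
    ]


kickoff = MeetingUpload(
    source="https://zoom.us/rec/share/abc123",
    title="ClientCo kickoff",
    contact_ids=[42, 17],
)
unlinked = MeetingUpload(source="https://zoom.us/rec/share/abc123")
```

Here `crm_activities(kickoff)` produces two activities, while `crm_activities(unlinked)` produces an empty list: the meeting is still stored, just not attached to anyone's timeline.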

Language and Speaker Detection

Language is auto-detected by default, so you rarely need to specify it. If the recording is short or the audio is noisy, detection can pick the wrong language. In that case, tell the AI explicitly ("this recording is in Spanish").

Speaker detection is on by default and works heuristically: long silence gaps and question-answer patterns are used to label up to eight speakers (Speaker 1, Speaker 2, etc.). The heuristic is not biometric, so two people with similar speaking patterns or short back-and-forth turns may get merged. You can disable speaker detection for solo recordings (a personal voice memo, a monologue, a webinar) by saying so when you upload.
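To make the two signals concrete, here is a toy sketch of turn splitting based on silence gaps and questions. It is an assumption-laden illustration, not Sarudo's actual heuristic: the `Segment` type, the 1.5-second threshold, and the question-mark test are all invented for the example, and it only finds where turns change, without deciding which of the (up to eight) speakers owns each turn.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording
    text: str


def split_into_turns(
    segments: list[Segment], min_gap: float = 1.5
) -> list[list[Segment]]:
    """Group consecutive transcript segments into turns.

    A new turn starts when the silence before a segment lasts at least
    `min_gap` seconds, or when the previous segment ended with a question.
    """
    turns: list[list[Segment]] = []
    for seg in segments:
        if turns:
            prev = turns[-1][-1]
            long_gap = seg.start - prev.end >= min_gap
            after_question = prev.text.rstrip().endswith("?")
            if not (long_gap or after_question):
                turns[-1].append(seg)  # same turn continues
                continue
        turns.append([seg])            # first segment, or a turn change
    return turns
```

A sketch like this also shows why rapid back-and-forth is hard: replies that follow within a fraction of a second and contain no question leave neither signal to work with, so they end up in the same turn.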

What Meetings Can Do

An overview of Sarudo's meeting pipeline — transcribe recordings, extract action items and decisions, and track follow-ups.

Automatic Transcription

How transcripts are generated — faster-whisper running locally, typical turnaround, privacy guarantees, and quality tuning.

Action Items & Attendees

How action items, decisions, key topics, and CRM attendees are extracted from every transcript — and how to review and edit them.