Rivya AI Audio Workflow Guide
Choose Rivya audio workflows for voice, text to speech, dialogue, sound effects, cleanup, music drafts, credits, and Studio iteration.
Last reviewed on 2026/04/28
Use this AI audio workflow guide before you choose between voice, text to speech, dialogue, sound effects, cleanup, music drafts, or lyric-first work in Rivya.
The easiest way to get audio wrong in Rivya is to think “audio” is one workflow.
It is not.
The current audio category really covers several different kinds of work side by side.
This page is the workflow reference for the audio area. If you want the more decision-oriented guide about how to start the first real voice or sound task, How to Start Your First AI Audio Workflow in Rivya is the better paired read.
Right now, the part most users will touch first is still the voice-side cluster: voice, multilingual readout, dialogue, sound effects, and cleanup. The catalog also already includes a live music branch built around Suno Music, Suno Sounds, and Suno Lyrics, so the category is broader than "TTS plus audio cleanup."
Start With the Job Shape
Before you choose an audio model, decide which of these problems you are actually solving:
- single-speaker voice or narration
- multilingual spoken output
- multi-speaker dialogue
- generated sound effects
- cleanup of an uploaded recording
- a full song draft or instrumental-first track
- lyric ideation before audio generation
Those are different workflows, not one workflow with slightly different settings.
What the Current Audio Catalog Actually Covers
The current audio catalog spans two different clusters today.
Voice, dialogue, sound effects, and cleanup
- ElevenLabs Turbo 2.5
- ElevenLabs Multilingual V2
- ElevenLabs Dialogue V3
- ElevenLabs Sound Effect V2
- ElevenLabs Audio Isolation
Music and music-adjacent work
- Suno Music
- Suno Sounds
- Suno Lyrics
The important point is not that these models happen to sit under the same category. It is that they have different form shapes and different cost patterns.
Spoken Voice and Narration
If the task is a single voice reading one script, ElevenLabs Turbo 2.5 is still the clean default.
That is the best place to start for:
- narration
- voice-over
- quick TTS drafts
- simple spoken tracks
If the spoken delivery has to work across languages, ElevenLabs Multilingual V2 is the better fit.
If the script already has two or more speakers, ElevenLabs Dialogue V3 is the better path because dialogue is structurally different from one-person readout.
If you already know the job is narrower than the whole voice area, the paired decision pages are Best Text to Speech Generator in 2026 for plain readout, AI Narration Generator for one-speaker explainers, and AI Dubbing Generator for localized or replaced spoken tracks.
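The voice-side rules in this section can be sketched as a small selection function. This is only an illustration of the guide's reasoning, not a Rivya API: the function name and parameters are hypothetical, and the choice to prefer dialogue when a script is both multi-speaker and multilingual is an assumption, since the guide does not cover that case.

```python
def pick_voice_model(speaker_count: int, multilingual: bool = False) -> str:
    """Return the catalog name this guide suggests for a spoken-audio task.

    Hypothetical helper: the parameter names are illustrative only.
    """
    if speaker_count < 1:
        raise ValueError("speaker_count must be at least 1")
    if speaker_count >= 2:
        # Two or more speakers: dialogue is structurally different
        # from one-person readout. (Assumption: this takes priority
        # over the multilingual flag.)
        return "ElevenLabs Dialogue V3"
    if multilingual:
        # One voice, but the delivery has to work across languages.
        return "ElevenLabs Multilingual V2"
    # One voice reading one script: the default.
    return "ElevenLabs Turbo 2.5"
```

For example, pick_voice_model(1) selects the single-speaker default, while pick_voice_model(2) routes to the dialogue model.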
Sound Design and Cleanup
If the task is "generate a sound," ElevenLabs Sound Effect V2 is the relevant path.
If the task is "fix this recording I already have," ElevenLabs Audio Isolation is the right one.
That distinction matters because the first is prompt-first generation, while the second is upload-first cleanup.
The Live Music Branch
The music side of the audio catalog is already live, but it is intentionally narrower than a full music-production suite.
If the goal is song structure, lyric-led ideation, or music-style output, it helps to start from the music side of the audio catalog instead of from the voice guides.
Suno Music is for first track drafts
Suno Music is the better path when you need a playable track draft with or without vocals.
That makes it the clearest start for:
- first song drafts
- instrumental-first concept tracks
- rough music for videos, demos, or podcasts
Successful results can continue through Extend Music, and the current result-based follow-ups also include WAV conversion and vocal separation.
Suno Sounds is for short sound sketches
Suno Sounds is a better fit when the real job is a shorter sonic sketch, ambience bed, loop idea, or background texture rather than a complete song structure.
It is the more useful place to start when BPM, key, or looping matter more than verses and choruses do.
Successful results can continue into a Vocal Separation action.
Suno Lyrics is for words before audio
Suno Lyrics is the words-first path.
It is useful when the hook, title, chorus direction, or verse shape matters before you spend on track generation. The important boundary is that it returns text results, not playable audio.
If you want the music branch broken out in more detail, read Music Workflows in Rivya.
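One way to keep the three music tools apart is to write down what each one returns and which result-based follow-ups this guide names for it. The sketch below is a plain lookup table built from the descriptions above. The class and field names are hypothetical, and an empty follow-up tuple only means this guide does not name any, not that none exist.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class MusicTool:
    """Hypothetical record describing one music-branch tool."""
    name: str
    returns: str                 # "audio" or "text"
    follow_ups: Tuple[str, ...]  # follow-ups named in this guide


MUSIC_BRANCH = {
    "Suno Music": MusicTool(
        "Suno Music", "audio",
        ("Extend Music", "WAV conversion", "Vocal Separation"),
    ),
    "Suno Sounds": MusicTool("Suno Sounds", "audio", ("Vocal Separation",)),
    # Lyrics returns text results, not playable audio.
    "Suno Lyrics": MusicTool("Suno Lyrics", "text", ()),
}


def playable_tools() -> List[str]:
    """List the tools whose results are playable audio."""
    return [t.name for t in MUSIC_BRANCH.values() if t.returns == "audio"]
```

Filtering on the returns field makes the main boundary explicit: only Suno Music and Suno Sounds produce something you can listen to.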
Why the Forms Change So Much
The audio surface is intentionally model-shaped.
The forms differ because the jobs differ:
- voice models ask for text
- dialogue models ask for turns and speaker assignment
- sound effects ask for cue-like generation input
- cleanup models expect uploaded audio
- music models introduce their own prompt patterns and follow-up actions
- lyric-first tools can return structured text instead of media files
That is not inconsistency. It is Rivya exposing the real shape of each workflow instead of pretending everything works the same way under one form.
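To make the "different forms for different jobs" idea concrete, the sketch below models four of the form shapes as separate types with their own checks. It is not Rivya's actual form schema: every class name, field name, and validation message here is an assumption chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class VoiceForm:
    text: str                     # script to read aloud


@dataclass
class DialogueForm:
    turns: List[Tuple[str, str]]  # (speaker, line) pairs


@dataclass
class SoundEffectForm:
    cue: str                      # short cue-like description


@dataclass
class CleanupForm:
    audio_path: str               # uploaded recording to repair


AudioForm = Union[VoiceForm, DialogueForm, SoundEffectForm, CleanupForm]


def validate(form: AudioForm) -> List[str]:
    """Return a list of problems; an empty list means the form fits its job."""
    problems: List[str] = []
    if isinstance(form, VoiceForm):
        if not form.text.strip():
            problems.append("voice form needs script text")
    elif isinstance(form, DialogueForm):
        speakers = {speaker for speaker, _ in form.turns}
        if len(speakers) < 2:
            problems.append("dialogue form needs at least two speakers")
    elif isinstance(form, SoundEffectForm):
        if not form.cue.strip():
            problems.append("sound effect form needs a cue")
    elif isinstance(form, CleanupForm):
        if not form.audio_path:
            problems.append("cleanup form needs an uploaded audio file")
    return problems
```

Each type carries only the input its job requires, which is why a single shared form would either demand fields that do not apply or leave out ones that do.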
What the Music Branch Is Not
The right way to describe the current music branch is "live and useful, but intentionally narrow."
It is not:
- a full DAW
- a deep mastering or multi-stem editing suite
- the entire Suno family exposed at once
- a reason to treat all audio work as music work
That boundary matters because Rivya's current strength is still the broader multimodal workflow, not a music-only specialist stack.
Why Audio Costs Feel Different
Audio work in Rivya does not always behave like fixed-cost image generation.
Cost can depend much more directly on variables such as:
- script length
- output duration
- uploaded audio duration
- result-based follow-up actions on music tasks
Some audio entries, especially on the live music branch, are documented with fixed per-run pricing. Others are priced by duration or by text length.
That is why the credits hint is especially worth reading on audio models. In many cases it is describing a cost pattern, not promising one flat number.
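The difference between a flat number and a cost pattern can be sketched as two small estimators. Every rate in the usage note is a made-up placeholder, not a Rivya price, and the class and function names are hypothetical. Always read the actual credits hint on the model.

```python
from dataclasses import dataclass
from typing import Iterable, Union


@dataclass(frozen=True)
class FixedPerRun:
    """Flat pattern: the same credits for every run."""
    credits_per_run: float

    def estimate(self, units: float = 0.0) -> float:
        # Units (seconds, characters) do not affect a flat price.
        return self.credits_per_run


@dataclass(frozen=True)
class ScaledByUnits:
    """Scaled pattern: credits grow with duration or text length."""
    credits_per_unit: float

    def estimate(self, units: float = 0.0) -> float:
        if units < 0:
            raise ValueError("units must not be negative")
        return self.credits_per_unit * units


CostPattern = Union[FixedPerRun, ScaledByUnits]


def estimate_total(
    pattern: CostPattern,
    units: float = 0.0,
    follow_up_credits: Iterable[float] = (),
) -> float:
    """Base estimate plus any result-based follow-up actions."""
    return pattern.estimate(units) + sum(follow_up_credits)
```

With placeholder rates, estimate_total(FixedPerRun(10), follow_up_credits=[2, 3]) adds two follow-up actions to a flat run, while estimate_total(ScaledByUnits(0.5), units=60) grows with the length of the input. The point is that follow-ups and duration both belong in the estimate.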
The Most Common Audio Mistakes
The most common wrong turns are:
- choosing voice when the real task is cleanup
- treating dialogue like single-speaker narration
- choosing sound effects when the real task is to repair an existing recording
- starting with Suno Sounds when the real need is a full song draft
- starting with Suno Lyrics when the real need is a playable result
- ignoring duration or follow-up actions as part of the cost picture
Most of those mistakes disappear once you sort by workflow shape first.
A Fast Way to Choose
If you want the shortest reliable decision path:
- decide whether the input is text, structured dialogue, uploaded audio, a music brief, or a lyric brief
- decide whether the output is voice, multilingual voice, dialogue, sound design, cleanup, a full track, a short sound sketch, or lyric text
- choose the matching model
- only then tune the parameters or result-based follow-up actions
That sequence prevents most bad fits before you spend time or credits.
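The four-step path above can be sketched as a lookup that checks the input kind against the output kind before naming a model. The table is assembled from this guide's descriptions. The function name and the exact kind labels are hypothetical, and pairing sound design with a text input is an inference from the guide calling it prompt-first.

```python
# Maps output kind -> (expected input kind, catalog model name).
ROUTES = {
    "voice": ("text", "ElevenLabs Turbo 2.5"),
    "multilingual voice": ("text", "ElevenLabs Multilingual V2"),
    "dialogue": ("structured dialogue", "ElevenLabs Dialogue V3"),
    "sound design": ("text", "ElevenLabs Sound Effect V2"),
    "cleanup": ("uploaded audio", "ElevenLabs Audio Isolation"),
    "full track": ("music brief", "Suno Music"),
    "short sound sketch": ("music brief", "Suno Sounds"),
    "lyric text": ("lyric brief", "Suno Lyrics"),
}


def choose_model(input_kind: str, output_kind: str) -> str:
    """Return the matching model, or raise if input and output do not fit."""
    if output_kind not in ROUTES:
        raise ValueError(f"unknown output kind: {output_kind!r}")
    expected_input, model = ROUTES[output_kind]
    if input_kind != expected_input:
        # Catches wrong turns such as bringing only a text prompt
        # to a cleanup job, which needs an uploaded recording.
        raise ValueError(
            f"{output_kind!r} expects {expected_input!r}, got {input_kind!r}"
        )
    return model
```

Parameters and follow-up actions are deliberately absent here: in the sequence above they come only after the model is chosen.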
Public Audio Pages vs Studio
Use the public audio pages when you want a first run, a quick comparison, or a search landing page that gets you to the right branch.
Use Studio when you want repeated iteration, saved continuity, fuller account context, or a steadier place to keep pushing the same audio task forward.
If you want the most useful companions next, go to Music Workflows in Rivya, How to Create AI Music with Rivya, How to Start Your First AI Audio Workflow in Rivya, AI Narration Generator, AI Voiceover for Videos, AI Dubbing Generator, or Studio.
Audio Workflow Checklist
Start here when the input or output is sound:
- Decide whether the job is voice, dialogue, sound effect, cleanup, music, or lyrics.
- Separate generating new audio from repairing uploaded audio.
- Check voice, language, and speaker count, and complete any commercial-use review, before delivery.
- Use shorter drafts before spending on longer or higher-risk audio tasks.
- Keep scripts and pronunciation notes separate from general creative direction.
Recheck When Audio Changes Shape
Recheck when a voiceover becomes dubbing, a music idea becomes lyrics-first writing, or cleanup becomes re-recording. Audio tasks drift quickly if the job shape is not named.