Text to Audio

Text-to-audio

ElevenLabs provides one of the most realistic text-to-speech and voice cloning models. The node is available in the Multimodal section.

To get it running, two nodes are needed: an input node where the text is written, and the “Text to Audio” node that generates the audio. The audio can then be either played in the Stack AI platform or downloaded.

Whisper model

Default voices are available when using Stack AI’s API key (the default configuration). The list is as follows:

['Rachel', 'Domi', 'Bella', 'Antoni', 'Elli', 'Josh', 'Arnold', 'Adam', 'Sam']
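
If a voice name comes from user input, it can help to check it against the default list above before using it. The following is a minimal sketch only: the helper name `resolve_voice` and the fallback to 'Rachel' are assumptions made for illustration and are not part of Stack AI's or ElevenLabs' API.

```python
# Default voices listed on this page (available with Stack AI's API key).
DEFAULT_VOICES = ['Rachel', 'Domi', 'Bella', 'Antoni', 'Elli', 'Josh', 'Arnold', 'Adam', 'Sam']


def resolve_voice(name: str, fallback: str = "Rachel") -> str:
    """Return the matching default voice, ignoring case and surrounding spaces.

    Hypothetical helper: returns `fallback` when the name is not a default voice.
    """
    lookup = {voice.lower(): voice for voice in DEFAULT_VOICES}
    return lookup.get(name.strip().lower(), fallback)


print(resolve_voice(" josh "))
print(resolve_voice("Unknown"))
```

Falling back to a known voice is just one possible choice; raising an error instead may be preferable when a custom API key with additional voices is configured.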

A custom API key can be set up in the settings section of the “Text to Audio” node. The voice then appears as a parameter that can be provided in the deployment.
