Google Cloud - Text to Speech

Published August 2020 • Updated December 2025

Plugin details

Google Cloud - Text to Speech is a service that turns text into lifelike speech, allowing you to create applications that talk, and build entirely new categories of speech-enabled products. Google Cloud's Text-to-Speech (TTS) service uses advanced deep learning technologies to synthesize natural sounding human speech.

With dozens of lifelike voices across a broad set of languages, you can build speech-enabled applications that work in many different countries.

The following element is provided:
- GOOGLE CLOUD - TEXT TO SPEECH (FRONT-END)

The following actions are provided:
- GET LIST OF VOICES
- SYNTHESIZE SPEECH (BACK-END)
- SYNTHESIZE SPEECH (FRONT-END)

The plugin returns a list of available voices and an MP3 file for playback with the chosen audio profile (such as wearable, handset, or car speakers).

Input is limited to 5000 characters in total, whether plain text or SSML.
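Longer text has to be split before it is sent. The following is a minimal Python sketch of one way to do that, assuming the limit counts characters as stated above and that breaking at sentence boundaries is acceptable. `split_for_tts` is a hypothetical helper, not part of the plugin, and it only suits plain text: splitting SSML would also require keeping the tags balanced.

```python
import re

MAX_CHARS = 5000  # limit stated in the plugin description


def split_for_tts(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split plain text into chunks of at most `limit` characters.

    Prefers sentence boundaries; a single sentence longer than `limit`
    is cut hard at the limit.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Hard-cut any sentence that cannot fit in one chunk on its own.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            # The sentence does not fit: close the chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be passed to a separate synthesis call and the resulting audio played in sequence.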

Demo Link: https://gcptexttospeechdemo.bubbleapps.io/version-test

Editor Link: https://bubble.io/page?name=index&id=gcptexttospeechdemo-editor&tab=tabs-1

💡 𝗦𝘂𝗯𝘀𝗰𝗿𝗶𝗽𝘁𝗶𝗼𝗻𝘀 𝗮𝗿𝗲 𝗽𝗿𝗼𝗿𝗮𝘁𝗲𝗱. 𝗜𝗳 𝘆𝗼𝘂 𝗶𝗻𝘀𝘁𝗮𝗹𝗹 𝗮𝗻𝗱 𝘂𝗻𝘀𝘂𝗯𝘀𝗰𝗿𝗶𝗯𝗲 𝘁𝗵𝗶𝘀 𝗽𝗹𝘂𝗴𝗶𝗻 𝗶𝗻 𝗼𝗻𝗲 𝗱𝗮𝘆 𝘁𝗼 𝘁𝗲𝘀𝘁 𝗶𝘁 𝗼𝘂𝘁, 𝘆𝗼𝘂'𝗹𝗹 𝗼𝗻𝗹𝘆 𝗯𝗲 𝗰𝗵𝗮𝗿𝗴𝗲𝗱 𝟭/𝟯𝟬𝘁𝗵 𝗼𝗳 𝘁𝗵𝗲 𝗺𝗼𝗻𝘁𝗵𝗹𝘆 𝘀𝘂𝗯𝘀𝗰𝗿𝗶𝗽𝘁𝗶𝗼𝗻 𝗳𝗲𝗲.

📖 𝗦𝘁𝗲𝗽-𝗯𝘆-𝗦𝘁𝗲𝗽 𝗶𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻𝘀 𝗮𝗿𝗲 𝗶𝗻 𝘁𝗵𝗲 "𝗜𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻𝘀" 𝘀𝗲𝗰𝘁𝗶𝗼𝗻, 𝗮𝗻𝗱 𝘁𝗵𝗲 𝗗𝗲𝗺𝗼 𝗘𝗱𝗶𝘁𝗼𝗿 𝗶𝘀 𝗶𝗻 𝘁𝗵𝗲 "𝗟𝗶𝗻𝗸𝘀" 𝘀𝗲𝗰𝘁𝗶𝗼𝗻 𝗼𝗳 𝘁𝗵𝗲 𝗣𝗹𝘂𝗴𝗶𝗻 𝗣𝗮𝗴𝗲.

Contact us at [email protected] with feature requests or support questions.

$5

Per month

4.5 stars   •   4 ratings
125 installs  
This plugin does not collect or track your personal data.

Platform

Web & Native mobile

Contributor details

wise:able
Joined 2020   •   122 Plugins

Instructions

1️⃣: GET LIST OF VOICES
====================

📋 ACTION DESCRIPTION
--------------------------------
GET LIST OF VOICES returns the available voices, optionally filtered by a language identification tag (BCP-47 code).

🔧 STEP-BY-STEP SETUP
--------------------------------
ℹ️ Steps 0) and 1) can be performed automatically: log in to your Google Cloud Console, open the Cloud Shell (top right corner of the page), paste the following command and press Enter:

 wget -q https://storage.googleapis.com/bubblegcpdemo/demo-assets/wiseable-gcp-texttospeech.py && python3 wiseable-gcp-texttospeech.py

0) Set up a project from Google Cloud Console: https://cloud.google.com/text-to-speech/docs/libraries#setting_up_authentication
 - Create or select a project
 - Enable the TEXT-TO-SPEECH API for that project
 - Create a service account
 - Download a private key as JSON.

1) Open the private key JSON file with a text editor, copy/paste the following parameters from your file to the Plugin settings:
 - CLIENT_EMAIL
 - PROJECT_ID
 - PRIVATE_KEY, including the -----BEGIN PRIVATE KEY-----\n prefix and \n-----END PRIVATE KEY-----\n suffix.
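The three values in step 1 can also be pulled out of the key file programmatically. This is a minimal Python sketch, assuming the standard Google service-account key layout (`client_email`, `project_id`, `private_key`) and assuming the plugin expects PRIVATE_KEY exactly as it appears in the file, with literal `\n` sequences. `extract_plugin_settings` is a hypothetical helper, not something the plugin provides.

```python
import json

# Field names used in Google service-account key files.
REQUIRED_FIELDS = ("client_email", "project_id", "private_key")


def extract_plugin_settings(key_json: str) -> dict[str, str]:
    """Return the three values to paste into the plugin settings."""
    key = json.loads(key_json)
    missing = [name for name in REQUIRED_FIELDS if not key.get(name)]
    if missing:
        raise ValueError(f"key file is missing: {', '.join(missing)}")

    private_key = key["private_key"]
    # The PEM markers must be part of the pasted value.
    if not private_key.startswith("-----BEGIN PRIVATE KEY-----"):
        raise ValueError("private_key lacks the BEGIN PRIVATE KEY prefix")
    if not private_key.rstrip().endswith("-----END PRIVATE KEY-----"):
        raise ValueError("private_key lacks the END PRIVATE KEY suffix")

    return {
        "CLIENT_EMAIL": key["client_email"],
        "PROJECT_ID": key["project_id"],
        # Assumption: re-escape newlines so the value matches the raw file text.
        "PRIVATE_KEY": json.dumps(private_key)[1:-1],
    }
```

Typical usage is to read the downloaded file and print the result, for example `extract_plugin_settings(open("key.json").read())`, where `key.json` is a placeholder for your own file name.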

2) Set up the action "GET LIST OF VOICES" in the workflow.
  Input Fields:
    - LANGUAGE CODE: The language identification tag (BCP-47 code) for filtering the list of voices returned. If you don't specify this optional parameter, all available voices are returned.
    - RESULT DATA TYPE: Returned type, must always be set to "RESULT (TEXT TO SPEECH)".
  Output Fields:
    - RESULTS: Returns a list of voices with their properties, such as gender and a voice name that includes the engine.

3) Set up a visual element supporting a list to allow the user to select the required voice property. Please refer to the demo for this specific implementation.
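For orientation, the sketch below shows in Python how a voice listing can be requested and summarized with Google's public REST API (`GET /v1/voices` with an optional `languageCode` query parameter). How the plugin calls the service internally is not documented here, so treat this as an illustration only. The helper names are hypothetical, and the engine is derived from an assumed naming convention of the form `<language>-<engine>-<variant>`.

```python
from typing import Optional
from urllib.parse import urlencode

VOICES_ENDPOINT = "https://texttospeech.googleapis.com/v1/voices"


def voices_url(language_code: Optional[str] = None) -> str:
    """Build the list URL; omitting the code returns every voice."""
    if not language_code:
        return VOICES_ENDPOINT
    return f"{VOICES_ENDPOINT}?{urlencode({'languageCode': language_code})}"


def summarize_voice(voice: dict) -> dict:
    """Reduce one voice entry to the fields a voice picker would display."""
    name = voice["name"]
    language = voice["languageCodes"][0]
    # Assumed convention: <language>-<engine>-<variant>, e.g. en-US-Wavenet-A.
    prefix = language + "-"
    rest = name[len(prefix):] if name.startswith(prefix) else name
    engine, _, _variant = rest.rpartition("-")
    return {
        "name": name,
        "language": language,
        "gender": voice.get("ssmlGender", "SSML_VOICE_GENDER_UNSPECIFIED"),
        "engine": engine or rest,
    }


sample = {"name": "en-US-Wavenet-A", "languageCodes": ["en-US"], "ssmlGender": "FEMALE"}
print(voices_url("en-US"))
print(summarize_voice(sample))
```

A list built this way maps naturally onto a dropdown or repeating group where the user picks a voice by language, gender and engine.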

2️⃣: SYNTHESIZE SPEECH (BACK-END)
====================

📋 ACTION DESCRIPTION
--------------------------------
SYNTHESIZE SPEECH (BACK-END) converts plain text or SSML into synthesized speech in MP3 file format with the chosen audio profile.

🔧 STEP-BY-STEP SETUP
--------------------------------
ℹ️ If not already done, perform steps 0 and 1 from the GET LIST OF VOICES setup to configure your Google Cloud credentials.

1) Set up the action "SYNTHESIZE SPEECH (BACK-END)" in the workflow.
  Input Fields:
    - TEXT: Input text or SSML to synthesize. Maximum of 5000 characters total.
    - NAME: The name of the voice. If not set, the service will choose a voice based on the other parameters such as LANGUAGE CODE and gender.
    - GENDER: The preferred gender of the voice. If not set, the service will choose a voice based on the other parameters such as LANGUAGE CODE and NAME. Note that this is only a preference, not a requirement; if a voice of the requested gender is not available, the synthesizer substitutes a voice with a different gender rather than failing the request.
    - LANGUAGE CODE: The language identification tag (BCP-47 code) for voice synthesising.
    - AUDIO PROFILE: The Audio Profile for the generated audio files. For more information, see: https://cloud.google.com/text-to-speech/docs/audio-profiles#available_audio_profiles
    - RESULT DATA TYPE: Returned type, must always be set to "RESULT (TEXT TO SPEECH)".
  Output Fields:
    - RESULTS: Returns the synthesised speech as an MP3 file in base64 stream format, with the chosen audio profile.

2) Set up an audio player element that accepts a base64 MP3 stream as a URI in its Dynamic Link, such as "Circle Music Player", then set the output of the previous action (the base64 file data) as the input of this element.
  Please refer to the demo for this specific implementation.
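The input fields above correspond closely to the request body of Google's `text:synthesize` REST method, and the base64 output can be turned into a playable data URI. The Python sketch below illustrates both under stated assumptions: the plugin's internal request is not documented here, `build_synthesize_body` and `mp3_data_uri` are hypothetical helpers, SSML is detected simply by a leading `<speak` tag, and RESULTS is assumed to hold the bare base64 string without a `data:` prefix.

```python
import base64
from typing import Optional

MAX_CHARS = 5000  # input limit stated for the TEXT field


def build_synthesize_body(
    text: str,
    language_code: str,
    name: Optional[str] = None,
    gender: Optional[str] = None,
    audio_profile: Optional[str] = None,
) -> dict:
    """Assemble a JSON body for Google's text:synthesize method."""
    if len(text) > MAX_CHARS:
        raise ValueError(f"input exceeds {MAX_CHARS} characters")
    # Assumption: input starting with <speak is SSML, anything else is plain text.
    input_key = "ssml" if text.lstrip().startswith("<speak") else "text"
    voice = {"languageCode": language_code}
    if name:
        voice["name"] = name
    if gender:
        voice["ssmlGender"] = gender  # a preference, not a requirement
    audio_config = {"audioEncoding": "MP3"}
    if audio_profile:
        audio_config["effectsProfileId"] = [audio_profile]
    return {"input": {input_key: text}, "voice": voice, "audioConfig": audio_config}


def mp3_data_uri(audio_base64: str) -> str:
    """Wrap base64 MP3 data in a data URI that an audio player can load."""
    base64.b64decode(audio_base64, validate=True)  # raises on malformed input
    return "data:audio/mpeg;base64," + audio_base64


body = build_synthesize_body(
    "Hello from Bubble", "en-US",
    name="en-US-Wavenet-A", audio_profile="handset-class-device",
)
print(body)
```

If the audio player in your app expects a plain URL rather than a data URI, the base64 data would need to be saved as a file first.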

3️⃣: GOOGLE CLOUD - TEXT TO SPEECH (FRONT-END)
===========================================

📋 ELEMENT DESCRIPTION
--------------------------------
GOOGLE CLOUD - TEXT TO SPEECH (FRONT-END) converts text to speech directly on the client side. This element suits applications where responsiveness matters, such as mobile applications.

🔧 STEP-BY-STEP SETUP
--------------------------------
ℹ️ If not already done, perform steps 0 and 1 from the GET LIST OF VOICES setup to configure your Google Cloud credentials.

0) Register on plugins.wiseable.io. Create a new Credential which associates your BUBBLE APP URL, GOOGLE CLOUD PROJECT ID and SERVICE ACCOUNT CREDENTIALS.
  The registration service will generate your PUBLIC ACCESS KEY. This key serves as a secure proxy for your real API key.

1) In the PLUGIN SETTINGS, enter your PUBLIC ACCESS KEY (used for this element only), PROJECT_ID, and the other required credentials.

2) Add the GOOGLE CLOUD - TEXT TO SPEECH (FRONT-END) element to the page on which the text-to-speech feature must be integrated. Select the RESULT DATA TYPE as "RESULT (TEXT TO SPEECH)".

3) Integrate the logic into your application using the following element's states and actions:

  FIELDS:
  - RESULT DATA TYPE: Returned type, must always be set to "RESULT (TEXT TO SPEECH)".

  EVENTS:
  - SUCCESS: Event triggered upon success
  - ERROR: Event triggered upon error

  EXPOSED STATES:
  Use any element able to show/process the data of interest stored within the result of the following states:
  - RESULTS: Populated upon SUCCESS event. Returns the synthesised speech as an MP3 file in base64 stream format.
  - ERROR MESSAGE: Populated upon ERROR event.
  - IS PROCESSING: Set to true when processing is in progress, false otherwise.

  ELEMENT ACTIONS - TRIGGERED IN WORKFLOW:
  - SYNTHESIZE SPEECH (FRONT-END): Convert text or SSML to speech. Populates RESULTS state upon completion.

     Input Fields:
     - TEXT: Input text or SSML to synthesize. Maximum of 5000 characters total.
     - VOICE NAME: The name of the voice. If not set, the service will choose a voice based on the other parameters.
     - GENDER: The preferred gender of the voice. If not set, the service will choose a voice based on the other parameters.
     - LANGUAGE CODE: The language identification tag (BCP-47 code) for voice synthesising.
     - AUDIO PROFILE: Select the Audio Profile for the generated audio files.

🔍 IMPLEMENTATION EXAMPLE
======================
Feel free to browse the app editor in the Service URL for an implementation example.

ℹ️ ADDITIONAL INFORMATION
======================
> SSML Reference: https://cloud.google.com/text-to-speech/docs/ssml
> Supported Audio Profiles: https://cloud.google.com/text-to-speech/docs/audio-profiles
> Supported Phonemes: https://cloud.google.com/text-to-speech/docs/phonemes
> Supported Voices & Languages: https://cloud.google.com/text-to-speech/docs/voices

> GOOGLE TEXT-TO-SPEECH service limits: https://cloud.google.com/text-to-speech/quotas

⚠️ TROUBLESHOOTING
================
Any plugin-related error will be posted to the Logs tab, "Server logs" section of your App Editor.
Make sure that "Plugin server side output" and "Plugin client side output" are selected in "Show Advanced".

For front-end actions, you can also open your browser's developer console (F12 or Ctrl+Shift+I in most browsers) to view detailed error messages and logs.

Always check the ERROR MESSAGE state of the element and implement error handling using the ERROR event to provide a better user experience.

> Server Logs Details: https://manual.bubble.io/core-resources/bubbles-interface/logs-tab#server-logs

⚡ PERFORMANCE CONSIDERATIONS
===========================

⏱️ BACK-END ACTION START DELAY
-----------------------------------------------
Each time a server-side action is called, Bubble initializes a small virtual machine to execute the action. If the same action is called shortly after, the caching mechanism kicks in, resulting in faster execution on subsequent calls.

A useful workaround is to fire a dummy execution at page load, which pre-warms the Bubble engine for the next few minutes, reducing the impact of cold starts for your users.

⏳ PROCESSING TIME LIMITS
-----------------------------------------------
For back-end actions, the maximum processing duration is capped at 30 seconds as per Bubble.io design. This time limitation does not apply to front-end actions.

FRONT-END VS BACK-END PROCESSING
----------------------------------------------------
The front-end element is designed to support and optimize multiple formats and will automatically handle SSML validation and correction. The back-end action doesn't perform this optimization, so be careful with input format when using it.
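Since the back-end action does not correct SSML, it can help to check the input before sending it. The Python sketch below is one possible pre-check, not the validation the front-end element performs: it only verifies the 5000-character limit, XML well-formedness, and that the root element is `<speak>`. `check_ssml` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

MAX_CHARS = 5000  # input limit stated for the TEXT field


def check_ssml(ssml: str) -> list[str]:
    """Return the problems found; an empty list means the input passed."""
    problems: list[str] = []
    if len(ssml) > MAX_CHARS:
        problems.append(f"input exceeds {MAX_CHARS} characters")
    try:
        root = ET.fromstring(ssml)
    except ET.ParseError as exc:
        # Unclosed tags and unescaped characters such as & end up here.
        problems.append(f"not well-formed XML: {exc}")
        return problems
    # Drop a namespace prefix such as {http://www.w3.org/2001/10/synthesis}.
    tag = root.tag.rsplit("}", 1)[-1]
    if tag != "speak":
        problems.append(f"root element is <{tag}>, expected <speak>")
    return problems


print(check_ssml("<speak>Hello <break time='500ms'/> world</speak>"))
print(check_ssml("<speak>Tom & Jerry</speak>"))
```

A common failure this catches is an unescaped ampersand, which must be written as `&amp;` inside SSML.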

❓ QUESTIONS?
===========
Contact us at [email protected] with feature requests or support questions.

Types

This plugin can be found under the following types:
Api   •   Background Services   •   Element   •   Event   •   Action

Categories

This plugin can be found under the following categories:
Media   •   Internationalization   •   Customer Support   •   AI   •   Visual Elements

Rating and reviews

Average rating (4.5)

Great Plugin
October 25th, 2024
Great plugin with amazing support
Great plugin. Amazing support
July 10th, 2024
Plugin demo page???
August 14th, 2023
https://gcptexttospeechdemo.bubbleapps/
https://gcptexttospeechdemo.bubbleapps.io/version-test ;-) (plugin author)
August 15th, 2023 • wise:able
Works great!!
June 6th, 2023