Microsoft Azure - Speech-to-Text

Published May 2023  •  Updated August 2025

Plugin details

Microsoft Azure Speech Services is an automated data processing system that uses AI to convert audio to text. Use speaker diarisation to determine who said what and when. Get readable transcripts with automatic formatting and punctuation.

This plugin provides:
- A speech recorder visual element to record speech from the user's device.
- Microsoft Azure Speech-to-Text service in asynchronous mode.

The demo application link: https://microsoftazurespeechtotext.bubbleapps.io/version-test

💡 𝗦𝘂𝗯𝘀𝗰𝗿𝗶𝗽𝘁𝗶𝗼𝗻𝘀 𝗮𝗿𝗲 𝗽𝗿𝗼𝗿𝗮𝘁𝗲𝗱. 𝗜𝗳 𝘆𝗼𝘂 𝗶𝗻𝘀𝘁𝗮𝗹𝗹 𝗮𝗻𝗱 𝘂𝗻𝘀𝘂𝗯𝘀𝗰𝗿𝗶𝗯𝗲 𝘁𝗵𝗶𝘀 𝗽𝗹𝘂𝗴𝗶𝗻 𝗶𝗻 𝗼𝗻𝗲 𝗱𝗮𝘆 𝘁𝗼 𝘁𝗲𝘀𝘁 𝗶𝘁 𝗼𝘂𝘁, 𝘆𝗼𝘂'𝗹𝗹 𝗼𝗻𝗹𝘆 𝗯𝗲 𝗰𝗵𝗮𝗿𝗴𝗲𝗱 𝟭/𝟯𝟬𝘁𝗵 𝗼𝗳 𝘁𝗵𝗲 𝗺𝗼𝗻𝘁𝗵𝗹𝘆 𝘀𝘂𝗯𝘀𝗰𝗿𝗶𝗽𝘁𝗶𝗼𝗻 𝗳𝗲𝗲.

📖 𝗦𝘁𝗲𝗽-𝗯𝘆-𝗦𝘁𝗲𝗽 𝗶𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻𝘀 𝗮𝗿𝗲 𝗶𝗻 𝘁𝗵𝗲 "𝗜𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻𝘀" 𝘀𝗲𝗰𝘁𝗶𝗼𝗻 𝗮𝗻𝗱 𝘁𝗵𝗲 𝗗𝗲𝗺𝗼 𝗘𝗱𝗶𝘁𝗼𝗿 𝗶𝘀 𝗶𝗻 𝘁𝗵𝗲 "𝗟𝗶𝗻𝗸𝘀" 𝘀𝗲𝗰𝘁𝗶𝗼𝗻 𝗼𝗳 𝘁𝗵𝗲 𝗣𝗹𝘂𝗴𝗶𝗻 𝗣𝗮𝗴𝗲.

Contact us at [email protected] for any feature request or support question.

$39 one time  •  or $7/mo

20 installs  
This plugin does not collect or track your personal data.

Platform

Web & Native mobile

Contributor details

wise:able
Joined 2020   •   122 Plugins

Instructions

0 : SPEECH RECORDER ELEMENT
=======================================

ELEMENT DESCRIPTION
--------------------------------
 SPEECH RECORDER is a visual element that records voice in WAV, OGG, MP3, WEBM or PCM format on all desktop devices and browsers (except on iOS, where browser policy restrictions limit it to the Safari browser). After recording, the element stores the file in the app's storage and returns the file URL.

STEP-BY-STEP SETUP
--------------------------------
 1) Drag and drop the visual element SPEECH RECORDER into your app.

 2) Select the SPEECH RECORDER element, in APPEARANCE section, configure the following fields :
 
 FIELDS :
 - ENABLE AUTO-BINDING PARENT ELEMENT'S THING : If selected, SPEECH RECORDER updates the parent element's thing, evaluating to a FILE, once the recording is ready.
 - MAX FILE SIZE : Limits the file size of the recording (Megabytes).
 - FILE UPLOAD ENABLED : Must be set to yes.
 - CHANNELS : Select the number of channels to record.
 - FORMAT : Output format of the recording. Valid values are WAV | OGG | PCM | WEBM | MP3.
 - BACKGROUND WHEN OFF : Recorder background color when recording is off.
 - BACKGROUND WHEN ON : Recorder background color when recording is on.
 - RECORDER WHEN OFF : Recorder color when recording is off.
 - RECORDER WHEN ON : Recorder color when recording is on.

 3) Integrate the logic into your application using the following SPEECH RECORDER events, states and actions:

 EVENTS :
 - RECORD CAPTURED : Triggered when the record has been captured.
 - RECORD ENCOUNTERED ERROR : Triggered when the record has encountered an error. The "ERROR MESSAGE" is then exposed as element STATE.
 
 EXPOSED STATES:
 Use any element able to show/process the data of interest (such as a Group with a Text field) stored within the result of the following states of the SPEECH RECORDER element :
 - DURATION : Duration of the recording.
 - RECORDING : Returns yes while recording.
 - FILE SIZE : Size of the recording in bytes.
 - SAVING : Returns yes while recording is being saved to the app's storage.
 - PAUSED : Returns yes while paused.
 - RECORDING FILE : URL of the recording file, saved to the app's storage.
 - ERROR MESSAGE : Contains the error message upon the "RECORD ENCOUNTERED ERROR" event.

 ELEMENT ACTIONS - TRIGGERED IN WORKFLOW:
   - START - STOP
   - PAUSE - RESUME
   - CANCEL RECORDING

 4) Then, implement the following action to trigger the speech transcription.

1 : START & GET SPEECH-TO-TEXT JOB (ASYNC)
=======================================

ACTION DESCRIPTION
--------------------------------
START & GET SPEECH-TO-TEXT JOB (ASYNC) from an audio file returns the transcribed speech along with diarized information.

STEP-BY-STEP SETUP
--------------------------------
 0) Sign up for MICROSOFT AZURE - COGNITIVE SERVICES by following this link: https://azure.microsoft.com/free/cognitive-services/

 1) Create a SPEECH SERVICE INSTANCE with a STANDARD PRICING TIER (S0) by following this link https://portal.azure.com/#create/Microsoft.CognitiveServicesSpeechServices

 2) In the KEYS & REGION section of the created SPEECH INSTANCE, note the KEY and REGION, and enter them in the PLUGIN SETTINGS.

 3)  Set up the "MICROSOFT AZURE - START SPEECH-TO-TEXT JOB" action in the workflow.

   Inputs Fields :
     - URL : Protocol-relative URL (//server/path/file.ext) from Bubble Uploader or Bubble Storage. The file must be publicly readable.
     - LOCALE : The locale of the batch transcription. This should match the expected locale of the audio data to transcribe.
       Supported locales: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/language-support?tabs=stt#tabpanel_1_stt
     - PROFANITY FILTER : Specifies how to handle profanity in recognition results. Accepted values are None to disable profanity filtering, Masked to replace profanity with asterisks, Removed to remove all profanity from the result, or Tags to add profanity tags. The default value is Masked.
     - DIARIZATION : Specifies that diarization analysis should be carried out on the input, which is expected to be a mono channel that contains two voices. The default value is false. When this property is selected, the source audio length can't exceed 240 minutes per file.

   Output Fields:
     - SELF : Returns the OPERATION-LOCATION value in URL format, containing the JOB ID used in subsequent actions.
     - STATUS : Job STATUS at request.
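
   For orientation, the input fields above map naturally onto a batch transcription request body. The sketch below is a minimal illustration, not the plugin's actual implementation: it assumes the field names of the public Azure batch transcription REST API (v3.1), and the "bubble-recording" display name and both helper function names are hypothetical.

```python
# Values listed for the PROFANITY FILTER field above.
PROFANITY_MODES = {"None", "Masked", "Removed", "Tags"}


def to_absolute_url(url):
    """Turn a protocol-relative URL (//server/path/file.ext) into an https URL."""
    return "https:" + url if url.startswith("//") else url


def build_transcription_request(url, locale, profanity_filter="Masked", diarization=False):
    """Build a request body shaped like the Azure batch transcription API.

    Assumption: field names follow the public v3.1 REST API; the request the
    plugin really sends is not documented on this page.
    """
    if profanity_filter not in PROFANITY_MODES:
        raise ValueError(f"Unsupported profanity filter: {profanity_filter!r}")
    return {
        "contentUrls": [to_absolute_url(url)],
        "locale": locale,
        "displayName": "bubble-recording",  # hypothetical label
        "properties": {
            "profanityFilterMode": profanity_filter,
            "diarizationEnabled": diarization,
        },
    }
```

   Calling build_transcription_request("//server/path/file.ext", "en-US", diarization=True) yields a body whose contentUrls entry is "https://server/path/file.ext" and whose profanity mode falls back to the documented default, Masked.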

 4) Set up a recurring workflow ('Do every x seconds'), to poll the job completion status on a regular basis using GET SPEECH-TO-TEXT JOB STATUS action.

   Inputs Fields :
     - ID : ID of the Job Operation, extracted from the URL in MICROSOFT AZURE - START SPEECH-TO-TEXT JOB action output.

   Output Fields :
     - STATUS :  Job STATUS.
     - PROPERTIES ERROR CODE : Returns the ERROR CODE, when applicable.
     - PROPERTIES ERROR MESSAGE : Returns the ERROR MESSAGE, when applicable.
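
   The ID extraction and the polling described in steps 3 and 4 can be sketched outside Bubble as follows. This is an illustration under assumptions: the job ID is taken to be the last path segment of the SELF URL, "Failed" is assumed to be the other final status next to Succeeded, and get_status is a hypothetical caller-supplied function standing in for the GET SPEECH-TO-TEXT JOB STATUS action.

```python
import time
from urllib.parse import urlparse


def extract_job_id(self_url):
    """Return the last path segment of the SELF URL (assumed to be the job ID)."""
    path = urlparse(self_url).path.rstrip("/")
    job_id = path.rsplit("/", 1)[-1]
    if not job_id:
        raise ValueError(f"No job ID found in {self_url!r}")
    return job_id


def wait_for_job(get_status, job_id, interval_s=5.0, max_polls=60, sleep=time.sleep):
    """Poll get_status(job_id) until the job reaches a final status.

    Mirrors a 'Do every x seconds' workflow: check, stop on a final status,
    otherwise wait and check again, giving up after max_polls attempts.
    """
    for _ in range(max_polls):
        status = get_status(job_id)
        if status in ("Succeeded", "Failed"):
            return status
        sleep(interval_s)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")
```

   Capping the number of polls keeps a stuck job from polling forever; in Bubble, the equivalent is to stop the recurring workflow once a final status or a retry limit is reached.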

 5) Using an 'Only When' event condition, configure this recurring workflow to execute the next step only once the job status is Succeeded. That step retrieves the TRANSCRIPT URL using the GET SPEECH-TO-TEXT TRANSCRIPT URL action.

   Inputs Fields :
     - ID : ID of the Job Operation, extracted from the URL in MICROSOFT AZURE - START SPEECH-TO-TEXT JOB action output.

   Output Fields : Returns the URL of the TRANSCRIPTION FILE, to use in GET SPEECH-TO-TEXT JOB TRANSCRIPTION RESULTS action.
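
   As a rough picture of what this step resolves: in the public Azure batch transcription REST API, a finished job exposes a list of files, and the transcript is the entry of kind "Transcription". The sketch below assumes that listing shape ("values", "kind", "links.contentUrl"); the function name is hypothetical and the plugin may obtain the URL differently.

```python
def find_transcript_url(files_listing):
    """Return the contentUrl of the first file whose kind is "Transcription".

    Assumption: files_listing has the shape of the Azure "list transcription
    files" response, which may also contain a "TranscriptionReport" entry.
    """
    for entry in files_listing.get("values", []):
        if entry.get("kind") == "Transcription":
            return entry["links"]["contentUrl"]
    raise LookupError("No transcription file in the listing")
```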

 6) Configure the last step to get the TRANSCRIPTION RESULTS using the GET SPEECH-TO-TEXT JOB TRANSCRIPTION RESULTS action.

   Inputs Fields :
     - URL : Output of the GET SPEECH-TO-TEXT TRANSCRIPT URL action.
   Output Fields : Returns the TRANSCRIBED SPEECH along with diarized information.

 7) As the next step, use the DISPLAY LIST IN REPEATING GROUP action with the ANALYZE RESULT DOCUMENTS of the JOB RESULTS as the DATA SOURCE. The repeating group is then populated once the job has succeeded.
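
 To show what "who said what and when" can look like once the results are retrieved, here is a minimal sketch that turns a transcription result into one line per phrase. It assumes the result file follows the Azure batch transcription format ("recognizedPhrases" entries with "speaker", "offsetInTicks" and an "nBest" list carrying a "display" text); the function name is hypothetical and the fields the plugin exposes in Bubble may be named differently.

```python
def speaker_turns(result):
    """Return 'Speaker N: text' lines ordered by the start of each phrase."""
    phrases = sorted(
        result.get("recognizedPhrases", []),
        key=lambda phrase: phrase.get("offsetInTicks", 0),
    )
    lines = []
    for phrase in phrases:
        candidates = phrase.get("nBest") or []
        if not candidates:
            continue  # nothing was recognized for this phrase
        speaker = phrase.get("speaker")
        label = f"Speaker {speaker}" if speaker is not None else "Speaker ?"
        # The first nBest entry is taken as the preferred transcription.
        lines.append(f"{label}: {candidates[0]['display']}")
    return lines
```

 Without diarization the "speaker" field is absent, so every line falls back to the "Speaker ?" label.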

IMPLEMENTATION EXAMPLE
======================
Feel free to browse the app editor in the Service URL for an implementation example.

ADDITIONAL INFORMATION
======================

> Speech Service Limits : https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-services-quotas-and-limits

> Supported Languages : https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/language-support?tabs=stt

TROUBLESHOOTING
================
Any plugin-related error will be posted to the Logs tab, "Server logs" section of your App Editor.
 Make sure that "Plugin server side output" is selected in "Show Advanced".

> Server Logs Details: https://manual.bubble.io/core-resources/bubbles-interface/logs-tab#server-logs

PERFORMANCE CONSIDERATIONS
===========================
 N/A

QUESTIONS ?
===========
Contact us at [email protected] for any feature request or support question.

Types

This plugin can be found under the following types:
Api   •   Action   •   Element   •   Event

Categories

This plugin can be found under the following categories:
Analytics   •   Productivity   •   Compliance   •   AI   •   Customer Support   •   Input Forms
