Google Video AI - Transcribe Video

Published January 2021  •  Updated October 2025

Plugin details

Transcribes spoken audio from multiple speakers in a video into text, and returns a block of text for each portion of the transcribed audio along with its speaker. The input can be a .MOV, .MPEG4, .MP4, .AVI, or any other ffmpeg-decodable video file.
The supported languages are listed here: https://cloud.google.com/speech-to-text/docs/languages

Use cases range from automated captioning, simple archiving, categorising, and enhanced search of your video portfolio to SEO improvement.

The plugin provides:
- a first Workflow Action to trigger the analysis.
- a second Workflow Action to return the analysis progress rate, the completion status and, once completed, a list of transcriptions. Each transcription includes a list of words with their timestamps, the confidence rate, and the speaker(s).

Also, a script is provided to automatically configure your Google Cloud settings.

The demo application link: https://gcpvideointelligencetranscribedemo.bubbleapps.io/version-test

💡 𝗦𝘂𝗯𝘀𝗰𝗿𝗶𝗽𝘁𝗶𝗼𝗻𝘀 𝗮𝗿𝗲 𝗽𝗿𝗼𝗿𝗮𝘁𝗲𝗱. 𝗜𝗳 𝘆𝗼𝘂 𝗶𝗻𝘀𝘁𝗮𝗹𝗹 𝗮𝗻𝗱 𝘂𝗻𝘀𝘂𝗯𝘀𝗰𝗿𝗶𝗯𝗲 𝘁𝗵𝗶𝘀 𝗽𝗹𝘂𝗴𝗶𝗻 𝗶𝗻 𝗼𝗻𝗲 𝗱𝗮𝘆 𝘁𝗼 𝘁𝗲𝘀𝘁 𝗶𝘁 𝗼𝘂𝘁, 𝘆𝗼𝘂'𝗹𝗹 𝗼𝗻𝗹𝘆 𝗯𝗲 𝗰𝗵𝗮𝗿𝗴𝗲𝗱 𝟭/𝟯𝟬𝘁𝗵 𝗼𝗳 𝘁𝗵𝗲 𝗺𝗼𝗻𝘁𝗵𝗹𝘆 𝘀𝘂𝗯𝘀𝗰𝗿𝗶𝗽𝘁𝗶𝗼𝗻 𝗳𝗲𝗲.

📖 𝗦𝘁𝗲𝗽-𝗯𝘆-𝗦𝘁𝗲𝗽 𝗶𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻𝘀 𝗮𝗿𝗲 𝗶𝗻 𝘁𝗵𝗲 "𝗜𝗻𝘀𝘁𝗿𝘂𝗰𝘁𝗶𝗼𝗻𝘀" 𝘀𝗲𝗰𝘁𝗶𝗼𝗻 𝗮𝗻𝗱 𝘁𝗵𝗲 𝗗𝗲𝗺𝗼 𝗘𝗱𝗶𝘁𝗼𝗿 𝗶𝘀 𝗶𝗻 𝘁𝗵𝗲 "𝗟𝗶𝗻𝗸𝘀" 𝘀𝗲𝗰𝘁𝗶𝗼𝗻 𝗼𝗳 𝘁𝗵𝗲 𝗣𝗹𝘂𝗴𝗶𝗻 𝗣𝗮𝗴𝗲.

Contact us at [email protected] with any support question or additional feature request.

$49

One time  •  Or  $5/mo

0 ratings
26 installs  
This plugin does not collect or track your personal data.

Platform

Web & Native mobile

Contributor details

wise:able
Joined 2020   •   122 Plugins

Instructions

1 : START & GET TRANSCRIBE VIDEO ================================

ACTION DESCRIPTION
--------------------------------
 Transcribes the speech in a video file and returns a list of transcriptions. Each transcription includes the transcribed text, a list of words with their timestamps, the confidence rate, and the audio channels (if applicable).
 Runs in asynchronous request mode, which suits large files and time-insensitive applications.

STEP-BY-STEP SETUP
--------------------------------
If you intend to store your files in Google Cloud Storage, first follow the instructions of the "GOOGLE STORAGE DROPZONE & UTILITIES" plugin (https://bubble.io/plugin/google-storage-dropzone--utilities-1616855011494x235332313714262000) to set up your bucket. Then follow the instructions below.

Steps 0) and 1) can be performed automatically: log in to your Google Cloud Console, open the Cloud Shell (top right corner of the page), paste the following command, and press Enter:

 wget -q https://storage.googleapis.com/bubblegcpdemo/demo-assets/wiseable-gcp-video.py && python3 wiseable-gcp-video.py

Otherwise, follow these manual steps:

 0) Set up a project in the Google Cloud Console: https://cloud.google.com/video-intelligence/docs/common/auth#enabling_the_api
 - Create or select a project
 - Enable the CLOUD VIDEO INTELLIGENCE API for that project
 - Create a service account
 - Download a private key as JSON.

 1) Open the private key JSON file with a text editor and copy the following parameters from the file into the Plugin settings:
 - CLIENT_EMAIL
 - PROJECT_ID
 - PRIVATE_KEY, including the -----BEGIN PRIVATE KEY-----\n prefix and \n-----END PRIVATE KEY-----\n suffix.
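 As a minimal sketch of step 1), the function below pulls the three values out of the text of a downloaded service-account key. The function name is hypothetical; the field names client_email, project_id and private_key are the ones found in Google service-account key files. It assumes the PRIVATE_KEY setting expects the value exactly as written in the file, i.e. with the \n sequences left escaped.

```python
import json

REQUIRED_FIELDS = ("client_email", "project_id", "private_key")


def extract_plugin_settings(key_json_text):
    """Return the three values to paste into the Plugin settings."""
    key = json.loads(key_json_text)

    missing = [name for name in REQUIRED_FIELDS if name not in key]
    if missing:
        raise ValueError("key file is missing: " + ", ".join(missing))

    private_key = key["private_key"]
    if not private_key.startswith("-----BEGIN PRIVATE KEY-----"):
        raise ValueError("private_key lacks the BEGIN PRIVATE KEY prefix")
    if not private_key.rstrip("\n").endswith("-----END PRIVATE KEY-----"):
        raise ValueError("private_key lacks the END PRIVATE KEY suffix")

    # json.loads turned the \n escapes into real newlines; re-escape them so
    # the value matches what a text editor shows in the file (assumption:
    # this is the form the Plugin settings field expects).
    escaped_key = json.dumps(private_key)[1:-1]

    return {
        "CLIENT_EMAIL": key["client_email"],
        "PROJECT_ID": key["project_id"],
        "PRIVATE_KEY": escaped_key,
    }
```

 For example, extract_plugin_settings(open("key.json").read()) returns a dictionary with the three settings, where key.json stands for whatever name the downloaded file has.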

 2) Set up the action "START TRANSCRIBE VIDEO OPERATION" in the workflow.
   Input Fields :
     - VIDEO FILE : .MOV, .MPEG4, .MP4, .AVI, or any ffmpeg-decodable video file from the Bubble.io uploader, a protocol-relative URL (//server/video.mov), an HTTPS video URL (https://server/video.mov), or a Google Storage URL (gs://bucket/video.mov). See the PERFORMANCE CONSIDERATIONS section.
     - LANGUAGE CODE : The language identification tag (BCP-47 code) of the media to analyse. The supported languages are listed here: https://cloud.google.com/speech-to-text/docs/languages.
           Example : en-US
     - SPEAKER DIARIZATION : If checked, will enable speaker diarization.
     - SPEAKER COUNT : If set, specifies the estimated number of speakers in the conversation. If not set, defaults to 2.
     - PROFANITY FILTER : If set to true, will attempt to filter out profanities, replacing all but the initial character in each filtered word with asterisks, e.g. "f***". If set to false or omitted, profanities won't be filtered out.
   Output Fields :
     - OPERATION NAME : ID of the operation, to be reused in the "GET TRANSCRIBE VIDEO RESULT" action.
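
 For orientation, the input fields above correspond closely to the speech transcription options of the Cloud Video Intelligence REST API (videos:annotate). The sketch below builds such a request body. How the plugin builds its own request is not documented here, so treat the mapping as an assumption; the function name and arguments are hypothetical, while the JSON field names are those of the public API.

```python
import base64


def build_annotate_request(video, language_code, diarization=False,
                           speaker_count=None, profanity_filter=False):
    """Build a videos:annotate JSON body for speech transcription.

    `video` is either a gs:// URI (str) or the raw bytes of the video file.
    """
    config = {
        "languageCode": language_code,            # LANGUAGE CODE, e.g. en-US
        "filterProfanity": bool(profanity_filter),  # PROFANITY FILTER
    }
    if diarization:                               # SPEAKER DIARIZATION
        config["enableSpeakerDiarization"] = True
        if speaker_count is not None:             # SPEAKER COUNT (else default 2)
            config["diarizationSpeakerCount"] = int(speaker_count)

    body = {
        "features": ["SPEECH_TRANSCRIPTION"],
        "videoContext": {"speechTranscriptionConfig": config},
    }

    if isinstance(video, str):
        if not video.startswith("gs://"):
            raise ValueError("string input must be a gs:// URI")
        body["inputUri"] = video                  # passed by reference
    else:
        # Non-gs:// sources are sent inline as base64-encoded file data.
        body["inputContent"] = base64.b64encode(video).decode("ascii")
    return body
```

 Calling build_annotate_request("gs://bucket/video.mov", "en-US", diarization=True, speaker_count=3) yields a body that references the file in place, whereas passing bytes embeds the whole file in the request.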

 3) Set up the action "GET TRANSCRIBE VIDEO RESULT" in a recurring workflow ('Do every x seconds') to poll the operation's completion status at regular intervals.
   Use an 'Only When' event condition so that the results are retrieved once the operation's DONE status is 'YES'.
   Input Fields :
     - OPERATION NAME : ID of the operation to poll, returned by "START TRANSCRIBE VIDEO OPERATION" action.
     - OUTPUT TYPE : Returned type, must always be set to "RESULT (VIDEO)".
   Output Fields :
     - RESULTS: Returns the operation progress rate, done status and the list of transcriptions. For each, it returns the transcription, list of words, with related timestamps, confidence rate, and the speaker (if applicable).
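
 The polling pattern of step 3) can be sketched outside Bubble as follows. This is an illustration, not the plugin's code: get_operation is a hypothetical callable that returns the operation as a dictionary, and the field names (done, response, annotationResults, speechTranscriptions, alternatives, words, speakerTag) follow the long-running operation response of the Cloud Video Intelligence REST API.

```python
import time


def poll_until_done(get_operation, operation_name, interval_s=5.0,
                    max_polls=120, sleep=time.sleep):
    """Call get_operation(operation_name) until the operation reports done."""
    for _ in range(max_polls):
        operation = get_operation(operation_name)
        if operation.get("done"):
            if "error" in operation:
                raise RuntimeError("operation failed: %r" % (operation["error"],))
            return operation
        sleep(interval_s)  # same role as 'Do every x seconds'
    raise TimeoutError("operation %s did not finish in time" % operation_name)


def flatten_transcriptions(operation):
    """Return one row per transcription, using its first alternative."""
    rows = []
    results = operation.get("response", {}).get("annotationResults", [])
    for result in results:
        for transcription in result.get("speechTranscriptions", []):
            for alternative in transcription.get("alternatives", [])[:1]:
                words = [
                    (w.get("word"), w.get("startTime"), w.get("endTime"),
                     w.get("speakerTag"))
                    for w in alternative.get("words", [])
                ]
                rows.append({
                    "transcript": alternative.get("transcript", ""),
                    "confidence": alternative.get("confidence"),
                    "words": words,
                })
    return rows
```

 The sleep argument is injectable so the loop can be exercised without waiting; in Bubble, the recurring workflow and its 'Only When' condition play the role of this loop.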

IMPLEMENTATION EXAMPLE
======================
Feel free to browse the app editor in the Service URL for an implementation example.

ADDITIONAL INFORMATION
======================
> Supported video formats : https://cloud.google.com/video-intelligence/docs/supported-formats
> Supported Languages : https://cloud.google.com/speech-to-text/docs/languages

> GOOGLE SPEECH-TO-TEXT service limits : https://cloud.google.com/speech-to-text/quotas

TROUBLESHOOTING
================
Any plugin-related error is posted to the Logs tab, "Server logs" section, of your App Editor.
Make sure that "Plugin server side output" is selected in "Show Advanced".

> Server Logs Details: https://manual.bubble.io/core-resources/bubbles-interface/logs-tab#server-logs

PERFORMANCE CONSIDERATIONS
===========================

GENERAL
-------------
 For non-Google Storage URLs (i.e. anything other than gs://), this implementation posts the file data to GOOGLE VIDEO INTELLIGENCE.
 The maximum allowable file size therefore depends on the bandwidth between Bubble.io and GOOGLE VIDEO INTELLIGENCE, and is ultimately capped by the maximum execution time that Bubble.io allows a Workflow Action to complete this transfer.
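
 As a rough worked example of this constraint, the sketch below estimates the largest file that can be transferred within a given time budget. All numbers and names are hypothetical: neither the actual bandwidth nor Bubble.io's execution time limit is stated here, so measure or look up both before relying on the result.

```python
def max_uploadable_bytes(bandwidth_bytes_per_s, max_action_seconds,
                         overhead_seconds=0.0):
    """Upper bound on file size: bandwidth multiplied by usable time."""
    usable_seconds = max(0.0, max_action_seconds - overhead_seconds)
    return int(bandwidth_bytes_per_s * usable_seconds)


def fits_in_action(file_size_bytes, bandwidth_bytes_per_s, max_action_seconds,
                   overhead_seconds=0.0):
    """True if the transfer is expected to finish within the time budget."""
    limit = max_uploadable_bytes(bandwidth_bytes_per_s, max_action_seconds,
                                 overhead_seconds)
    return file_size_bytes <= limit
```

 With an assumed 2 MB/s of bandwidth and an assumed 30-second budget, the bound is 60 MB, so a 100 MB file would not fit. Files referenced by a gs:// URL are not posted by the plugin and are therefore not subject to this transfer limit.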

QUESTIONS ?
===========
Contact us at [email protected] with any support question or additional feature request.

Types

This plugin can be found under the following types:
Api   •   Background Services   •   Action

Categories

This plugin can be found under the following categories:
Media   •   Productivity   •   Video   •   AI

Rating and reviews

This plugin has not received any reviews.