OPENAI - REALTIME SPEECH TRANSLATION (FRONT-END DESKTOP & NATIVE MOBILE)
============================================
OPENAI - REALTIME SPEECH TRANSLATION (FRONT-END DESKTOP & NATIVE MOBILE) - ELEMENT DESCRIPTION
------------------------------------------------------------------------------
OPENAI - REALTIME SPEECH TRANSLATION (FRONT-END DESKTOP & NATIVE MOBILE) provides GPT realtime translation capabilities.
STEP-BY-STEP SETUP
--------------------------------
0) Register on OpenAI and get your OPENAI API KEY. If your users are behind strict firewalls or restrictive networks, optionally register on METERED.CA to obtain a TURN SERVER SECRET KEY.
1) Test on https://platform.openai.com/playground/realtime to confirm that your account and key can use the REALTIME API.
2) Implement the OPENAI - REALTIME SPEECH TRANSLATION (FRONT-END DESKTOP & NATIVE MOBILE) ERROR workflow (see demo) in order to raise any OpenAI errors in your application.
3) Register on plugins.wiseable.io. Create a new Credential which associates your BUBBLE APP URL, your OPENAI API KEY, and optionally your METERED.CA TURN SERVER SECRET KEY.
The registration service will generate your PUBLIC ACCESS KEY. This key serves as a secure proxy for your real API key. It allows your application to communicate with the service without exposing your real API key. Since this PUBLIC ACCESS KEY is explicitly tied to your registered BUBBLE APP URL, it can only be used from that domain, ensuring that even if the key is publicly visible, it remains safe and cannot be misused by unauthorized sources.
4) In the Plugin Settings, enter your PUBLIC ACCESS KEY generated at the previous step.
5) Add the OPENAI - REALTIME SPEECH TRANSLATION (FRONT-END DESKTOP & NATIVE MOBILE) ELEMENT to the page on which the chat must be integrated. Select the RESULT DATA TYPE as CONVERSATION (OPENAI REALTIME SPEECH TRANSLATION).
6) Integrate the logic into your application using the following OPENAI - REALTIME SPEECH TRANSLATION element's states and actions:
FIELDS:
- RESULT DATA TYPE: Must always be selected as CONVERSATION (OPENAI REALTIME SPEECH TRANSLATION).
- MICROPHONE : Name of the Microphone to use as a source. It must be one of the items of the INPUT MICROPHONES state.
PROMPT (MANUAL)
- MODEL: Name of the GPT model. See https://developers.openai.com/api/docs/models/gpt-realtime-translate
AUDIO SETTINGS :
- ECHO CANCELLATION : Echo cancellation is a feature which attempts to prevent echo effects on a two-way audio connection by attempting to reduce or eliminate crosstalk between the user's output device and their input device.
- NOISE SUPPRESSION : Noise suppression automatically filters the audio to remove background noise, hum caused by equipment, and the like from the sound before delivering it to your code.
- AUTO GAIN CONTROL : Automatic gain control is a feature in which a sound source automatically manages changes in the volume of its source media to maintain a steady overall volume level.
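The three audio settings above share their names with standard browser audio capture constraints (echoCancellation, noiseSuppression, autoGainControl). The following is a minimal sketch of how such flags could be turned into a capture request; it is not the plugin's actual code. The AudioSettings interface and the buildAudioConstraints function are hypothetical names introduced for illustration only.

```typescript
// Hypothetical container for the three AUDIO SETTINGS fields of the element.
interface AudioSettings {
  echoCancellation: boolean;
  noiseSuppression: boolean;
  autoGainControl: boolean;
}

// Local subset of the browser's audio track constraints, declared here so the
// sketch compiles without DOM typings.
interface AudioTrackConstraints {
  echoCancellation: boolean;
  noiseSuppression: boolean;
  autoGainControl: boolean;
}

// Builds an audio-only capture request from the element's settings.
function buildAudioConstraints(
  settings: AudioSettings
): { audio: AudioTrackConstraints; video: false } {
  return {
    audio: {
      echoCancellation: settings.echoCancellation,
      noiseSuppression: settings.noiseSuppression,
      autoGainControl: settings.autoGainControl,
    },
    video: false,
  };
}

// In a browser, the result could be passed to
// navigator.mediaDevices.getUserMedia(...) to open the microphone.
const exampleConstraints = buildAudioConstraints({
  echoCancellation: true,
  noiseSuppression: true,
  autoGainControl: false,
});
console.log(JSON.stringify(exampleConstraints));
```

Browsers treat these flags as requests rather than guarantees, so the processing actually applied may differ by device and browser.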
APP DATA RETRIEVAL SETTINGS (FRONT-END DESKTOP ONLY) :
METERED.CA NETWORK SETTINGS :
- TURN SERVER ROUTING : If true, the connection will be routed through a Metered.ca TURN server. Your Public Access Key must be configured with a valid Metered.ca Secret Key.
- CUSTOM DOMAIN : Metered.ca custom domain associated with your Metered.ca Secret Key. Mandatory if TURN Server Routing is true. Example: customdomain.metered.live
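The dependency between the two Metered.ca fields above can be checked before starting a conversation. The following is a minimal sketch of such a check; TurnSettings and validateTurnSettings are hypothetical names, and the "bare host name" rule is an assumption inferred from the example value customdomain.metered.live, not a documented requirement.

```typescript
// Hypothetical container for the METERED.CA NETWORK SETTINGS fields.
interface TurnSettings {
  turnServerRouting: boolean;
  customDomain?: string;
}

// Returns an error message when the settings are inconsistent, or null when
// they look usable.
function validateTurnSettings(settings: TurnSettings): string | null {
  // Nothing to check when TURN routing is disabled.
  if (!settings.turnServerRouting) {
    return null;
  }
  const domain = (settings.customDomain ?? "").trim();
  if (domain === "") {
    return "CUSTOM DOMAIN is mandatory when TURN SERVER ROUTING is true.";
  }
  // Assumption: the field expects a bare host name, without scheme or path.
  if (domain.includes("://") || domain.includes("/")) {
    return "CUSTOM DOMAIN should be a bare host name such as customdomain.metered.live.";
  }
  return null;
}

console.log(validateTurnSettings({ turnServerRouting: true, customDomain: "customdomain.metered.live" }));
console.log(validateTurnSettings({ turnServerRouting: true }));
```

Running a check like this in the workflow that precedes START CONVERSATION lets the application show a configuration error before any connection attempt is made.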
EVENTS:
- ERROR: Event triggered when an error occurs.
- CONVERSATION TO SAVE: Event triggered when any part of the conversation has changed.
- CONVERSATION STARTED: Event triggered when the connection to the realtime session is established.
- CONVERSATION STOPPED: Event triggered when the connection to the realtime session is stopped.
- TOKEN USAGE UPDATE : Event triggered when a token usage update is reported.
- RATE LIMITS UPDATE : Event triggered when a rate limit update is reported.
- AUDIO RECORDING SAVED (FRONT-END DESKTOP ONLY) : Event triggered when an audio recording has been successfully saved. Populates the LATEST AUDIO RECORDING URL state.
EXPOSED STATES:
Use any element able to show/process the data of interest (such as a Group with a Text field) stored within the result of the following states of the OPENAI - REALTIME SPEECH TRANSLATION (FRONT-END DESKTOP & NATIVE MOBILE) ELEMENT:
- ERROR: Error message upon Error event trigger.
- IS LISTENING: Returns true when listening is in progress.
- IS AI SPEAKING: Returns true when AI speaking is in progress.
- IS RECORDING (FRONT-END DESKTOP ONLY): Returns true when audio recording is in progress.
- LATEST AUDIO RECORDING URL (FRONT-END DESKTOP ONLY): Returns the latest audio recording URL. Populated upon the AUDIO RECORDING SAVED event.
- IS CONNECTED: Returns true when the conversation has been started and the WebRTC connection is active. Becomes false when the conversation is stopped or the connection is lost (e.g. disconnected, failed, closed).
- IS MICROPHONE MUTED: Returns true when microphone is muted.
- INPUT MICROPHONES: List of detected microphones, populated after DETECT DEVICES action.
- CURRENT CONVERSATION: List of role and message content.
- CONVERSATION (RAW DATA): String containing conversation in JSON format. You may use this string to load conversation in "LOAD CONVERSATION" action.
- LATEST INPUT AUDIO TOKEN USAGE: Latest input audio token usage of the AI engine. Updated upon the TOKEN USAGE UPDATE event.
- LATEST INPUT CACHED TOKEN USAGE: Latest input cached token usage of the AI engine. Updated upon the TOKEN USAGE UPDATE event.
- LATEST INPUT TEXT TOKEN USAGE: Latest input text token usage of the AI engine. Updated upon the TOKEN USAGE UPDATE event.
- LATEST OUTPUT AUDIO TOKEN USAGE: Latest output audio token usage of the AI engine. Updated upon the TOKEN USAGE UPDATE event.
- LATEST OUTPUT TEXT TOKEN USAGE: Latest output text token usage of the AI engine. Updated upon the TOKEN USAGE UPDATE event.
- LATEST REQUESTS RATE LIMIT REMAINING COUNT: Latest count of requests remaining before the requests rate limit is reached. Updated upon the RATE LIMITS UPDATE event.
- LATEST REQUESTS RATE LIMIT RESET SECONDS: Latest number of seconds until the requests rate limit resets. Updated upon the RATE LIMITS UPDATE event.
- LATEST TOKENS RATE LIMIT REMAINING COUNT: Latest count of tokens remaining before the tokens rate limit is reached. Updated upon the RATE LIMITS UPDATE event.
- LATEST TOKENS RATE LIMIT RESET SECONDS: Latest number of seconds until the tokens rate limit resets. Updated upon the RATE LIMITS UPDATE event.
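The token usage and rate limit states above can be combined into simple application-side figures. The following is a minimal sketch, not part of the plugin: TokenUsage, RateLimit, summarizeUsage and secondsToWait are hypothetical names. It assumes that cached input tokens are already counted within the audio and text input figures, so they are reported separately rather than added to the total; verify this against your own usage data.

```typescript
// Hypothetical snapshot of the five LATEST ... TOKEN USAGE states.
interface TokenUsage {
  inputAudio: number;
  inputCached: number;
  inputText: number;
  outputAudio: number;
  outputText: number;
}

// Hypothetical snapshot of one pair of RATE LIMIT states (requests or tokens).
interface RateLimit {
  remaining: number;
  resetSeconds: number;
}

// Sums input and output tokens. Assumption: cached tokens are a subset of the
// input tokens, so they are not added again.
function summarizeUsage(usage: TokenUsage): { input: number; output: number; total: number; cached: number } {
  const input = usage.inputAudio + usage.inputText;
  const output = usage.outputAudio + usage.outputText;
  return { input, output, total: input + output, cached: usage.inputCached };
}

// Returns how many seconds to wait before the next request: zero while at
// least minRemaining units are left, otherwise the time until the reset.
function secondsToWait(limit: RateLimit, minRemaining: number = 1): number {
  return limit.remaining < minRemaining ? Math.max(0, limit.resetSeconds) : 0;
}

const summary = summarizeUsage({ inputAudio: 120, inputCached: 64, inputText: 30, outputAudio: 200, outputText: 50 });
console.log(summary.total);
console.log(secondsToWait({ remaining: 0, resetSeconds: 12.5 }));
```

In a Bubble workflow, the same arithmetic would typically run on the TOKEN USAGE UPDATE and RATE LIMITS UPDATE events, reading the corresponding states at that moment.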
ELEMENT ACTIONS - TRIGGERED IN WORKFLOW:
- DETECT DEVICES: Detect input devices based on the INPUT DEVICES TYPE input field. Populates the INPUT MICROPHONES state.
- MUTE CURRENT MICROPHONE: Mute the current microphone.
- UNMUTE CURRENT MICROPHONE: Unmute the current microphone.
- START CONVERSATION: Start Voice Conversation.
Input Fields:
- VOICE: The voice to use when generating the audio modality. Valid values: alloy | echo | shimmer | ash | ballad | coral | sage | verse.
- STOP CONVERSATION: Stop Voice Conversation.
Input Fields:
- INTERRUPT PLAYBACK : If true, immediately interrupt playback. If false, buffered audio will continue playing until completed.
- LOAD CONVERSATION : Load the conversation.
Input Fields :
- CONVERSATION (RAW DATA) : String containing the conversation in JSON format.
- START RECORDING (FRONT-END DESKTOP ONLY): Start recording audio.
- STOP RECORDING (FRONT-END DESKTOP ONLY): Stop recording audio.
- SAVE RECORDING (FRONT-END DESKTOP ONLY): Save the latest audio recording to your app. The audio format will be automatically selected based on browser's capabilities.
Input Fields:
- FILENAME: Filename without extension. The extension will automatically be set based on the file type
- ATTACH TO RESULT: Optional thing to privately attach the file to.
- MUTE MICROPHONE: Mute the microphone.
- UNMUTE MICROPHONE: Unmute the microphone.
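The CONVERSATION (RAW DATA) string saved from the element's state can later be passed back to the LOAD CONVERSATION action. The following is a minimal sketch of validating such a string before storing or reloading it. The exact JSON layout is not documented here, so the sketch assumes an array of objects with string "role" and "content" properties, based on the description of CURRENT CONVERSATION as a list of role and message content. ConversationMessage and parseConversation are hypothetical names.

```typescript
// Assumed shape of one conversation entry (not confirmed by the plugin docs).
interface ConversationMessage {
  role: string;
  content: string;
}

// Parses a raw conversation string and checks that every entry has a string
// role and a string content. Throws an Error when the input does not match.
function parseConversation(raw: string): ConversationMessage[] {
  const data: unknown = JSON.parse(raw);
  if (!Array.isArray(data)) {
    throw new Error("Expected a JSON array of messages");
  }
  return data.map((item: unknown, index: number): ConversationMessage => {
    if (typeof item !== "object" || item === null) {
      throw new Error(`Message ${index} is not an object`);
    }
    const record = item as Record<string, unknown>;
    const role = record["role"];
    const content = record["content"];
    if (typeof role !== "string" || typeof content !== "string") {
      throw new Error(`Message ${index} needs string "role" and "content"`);
    }
    return { role, content };
  });
}

// Example with made-up data in the assumed layout.
const rawExample = JSON.stringify([
  { role: "user", content: "Bonjour" },
  { role: "assistant", content: "Hello" },
]);
const messages = parseConversation(rawExample);
console.log(messages.length);
```

If the actual raw data uses a different layout, only the field checks inside parseConversation need to change; the save and reload flow stays the same.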
IMPLEMENTATION EXAMPLE
======================
Feel free to browse the app editor in the Service URL for an implementation example.
TROUBLESHOOTING
================
Any plugin-related error will be posted to the Logs tab, "Server logs" section of your App Editor.
Make sure that "Plugin server side output" and "Plugin client side output" are selected in "Show Advanced".
> Server Logs Details:
https://manual.bubble.io/core-resources/bubbles-interface/logs-tab#server-logs
PERFORMANCE CONSIDERATIONS
===========================
N/A
QUESTIONS?
===========
Contact us at [email protected] for support questions or to request additional features.