AI OCR & Translation (Text Extract)

Published October 2025
   •    Updated today

Plugin details

Extract, translate, and digitize text — even from handwritten documents — using OpenAI’s advanced AI models. This plugin transforms PDFs and images into structured HTML and translated PDFs, preserving the original document’s layout as closely as possible.

Powered entirely by OpenAI’s OCR and translation capabilities, it recognizes printed text, handwriting, and multilingual content with high accuracy.
No third-party licenses or additional external APIs are needed. Simply connect your own OpenAI API key from a paid account. All processing runs securely within your Bubble workflows. If you do not have an OpenAI API key, you can get one at https://platform.openai.com/.

The output recreates the structure and flow of the original document, though perfect visual accuracy cannot be guaranteed since formatting is interpreted by the AI model.

Key Features

🧠 AI-Powered OCR (by OpenAI): Extracts text from printed and handwritten documents

🌍 Automatic Translation: Translates extracted text into 100+ languages

🧾 Preserve Layout: Generates HTML resembling the original layout

📄 Dual Output: Produces both HTML and translated PDF versions

✍️ Handwriting Recognition: Detects and interprets legible handwritten text, ideal for notes, forms, or letters

⚙️ Simple Setup: No external dependencies — only your OpenAI API key

🔄 Seamless Integration: Works directly within Bubble workflows or backend workflows

Example Use Cases

Translating scanned documents, handwritten notes, or printed reports

Digitizing handwritten forms, meeting notes, or service records

Converting multilingual invoices, letters, or manuals into editable text

Creating searchable archives from handwritten or printed documents

Automating document translation and digitization pipelines inside Bubble

Notes

The output layout closely matches the original, though perfect accuracy cannot be guaranteed since formatting is AI-generated.

When translating into languages with non-Latin scripts (e.g., Chinese, Japanese, Arabic), the PDF output may render some characters incorrectly because of limitations in the PDF library used.

The HTML output correctly displays all languages, and Latin-based languages render properly in both HTML and PDF.

If a document is unclear, low-quality, or very large, processing a single page may take more than 30 seconds. In that case, that particular page is not converted.

✅ Risk-free Trial:
The most risk-free way to try out this plugin is to subscribe to it. If you unsubscribe a few days later, you will be charged on a pro-rata basis. For example, if the plugin's monthly price is $5, you’d pay only about 17¢ per day ($5 / 30 days).
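
The arithmetic in the example above can be checked with a short sketch. It assumes a flat 30-day month, as the example does; how Bubble actually rounds or bills partial periods is not stated here, and the function name is purely illustrative.

```python
def prorated_charge_cents(monthly_price_cents: int, days_used: int,
                          days_in_month: int = 30) -> float:
    """Return the pro-rata charge in cents (assumes a flat 30-day month)."""
    return monthly_price_cents * days_used / days_in_month


per_day = prorated_charge_cents(500, 1)     # $5 plan, one day
three_days = prorated_charge_cents(500, 3)  # $5 plan, three days

print(f"{per_day:.1f} cents per day")        # 16.7 cents per day
print(f"{three_days:.0f} cents for 3 days")  # 50 cents for 3 days
```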

🔗 Link to test app editor - https://bubble.io/page?id=test-38043&tab=Design&name=scanned_pdf_and_image_ocr_with_translation&type=page

🔗 Link to demo page - https://ai-ocr-with-translation.bubbleapps.io/

The demo app includes an OpenAI key, allowing the plugin to be tested without requiring a paid OpenAI account.

🔗 Link to a scanned PDF document that can be used for testing - https://c360fcabb48dbb0154fda5c32a47d619.cdn.bubble.io/f1761225289077x337014401665316900/Random%20Enid%20Blyton%20pg%20scan%20%281%29.pdf

🔗 Link to an image that can be used for testing - https://c360fcabb48dbb0154fda5c32a47d619.cdn.bubble.io/f1761225644610x824424530069811700/WhatsApp%20Image%202025-07-10%20at%2022.19.42.jpeg


$20

One time  •  Or  $5/mo

0 ratings
6 installs  
This plugin does not collect or track your personal data.

Platform

Web & Native mobile

Contributor details

JagTech
Joined 2021   •   3 Plugins

Instructions

🧭 How the Plugin Works
This plugin completes the document transformation in several sequential steps:

The source file (PDF or image) is converted into optimized images — one image per page (for multi-page PDFs).

Each image is converted into HTML using an OpenAI model.

If translation is required, the generated HTML is translated using an OpenAI model.

The complete HTML is then converted into a PDF document.
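
The four steps above can be sketched as a simple pipeline. This only illustrates the data flow and is not the plugin's code: every function name is hypothetical, and the stubs stand in for the OpenAI calls and PDF rendering that the plugin performs.

```python
from typing import Callable, List, Optional


def run_pipeline(
    pages: List[bytes],
    image_to_html: Callable[[bytes], str],
    translate_html: Callable[[str, str], str],
    html_to_pdf: Callable[[str], bytes],
    target_language: Optional[str] = None,
) -> bytes:
    """Convert page images to HTML, optionally translate, then build one PDF."""
    html_parts = []
    for page in pages:                  # step 1 output: one image per page
        html = image_to_html(page)      # step 2: turn the page image into HTML
        if target_language:             # step 3: translate only when requested
            html = translate_html(html, target_language)
        html_parts.append(html)
    return html_to_pdf("\n".join(html_parts))  # step 4: PDF from all pages


# Hypothetical stubs so the sketch runs without any API calls.
def fake_ocr(image: bytes) -> str:
    return f"<div>{image.decode()}</div>"


def fake_translate(html: str, language: str) -> str:
    return html.replace("<div>", f'<div lang="{language}">')


def fake_pdf(html: str) -> bytes:
    return html.encode()


pdf = run_pipeline([b"page 1", b"page 2"], fake_ocr, fake_translate,
                   fake_pdf, target_language="French")
print(pdf.decode())
```

Passing the step functions in as arguments keeps each stage replaceable, which mirrors how the plugin exposes each step as a separate action.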

⚙️ Plugin Components
Element: JT PDF Converter
Fields

Source File URL – The URL of the file to transform (PDF or image).

Exposed States

Status – Current status of the file transformation.

Error Message – Details of any error encountered.

Images – The set of optimized images (one per page) produced by Convert File to Images, ready to be converted into HTML.

Source File Name – Name of the uploaded source file.

Source File MIME Type – MIME type of the source file.

Events

Source File URL Is Updated – Triggered when the Source File URL input is updated.

Convert File to Images Action Completed – Triggered when the action Convert File to Images finishes.

Recursive – Triggered internally by the Trigger Recursive Event action to support recursive operations if required.

Element Actions

Convert File to Images – Converts the source file into image data and stores the results in the exposed state named 'Images'.

Trigger Recursive Event – Triggers the Recursive event to enable iterative workflows.

🔧 Plugin Actions
Convert Image to HTML (Server-Side Action)

Fields

OpenAI API Key – Provide your OpenAI API key if it’s not already set in the plugin settings.

AI Request Timeout (seconds) – Maximum wait time for the AI model’s response before terminating the request.
Must be less than Bubble’s workflow timeout limit. For longer processes, use a backend workflow. (Default: 27 seconds)

Image – The image’s binary data to convert into HTML.

Return Values

Is Successful – true if the conversion succeeds; false if an error occurs.

Error Message – Error details (if any).

HTML DIV – Extracted text as HTML content.

AI Request Did Time Out – Indicates if the AI request exceeded the timeout period.
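
One way to reason about these return values is to model them as a record and decide a follow-up per page. The sketch below is an assumption about how a workflow might branch, not documented plugin behavior; the Python names and the three outcome labels are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class ConvertImageResult:
    """Mirrors the action's return values; field names are illustrative."""
    is_successful: bool
    error_message: str
    html_div: str
    ai_request_did_time_out: bool


def next_step(result: ConvertImageResult) -> str:
    """Pick a follow-up for one page based on the returned flags."""
    if result.is_successful:
        return "use_html"
    if result.ai_request_did_time_out:
        # A timed-out page could be retried from a backend workflow,
        # where a longer AI Request Timeout can be used.
        return "retry_in_backend_workflow"
    return "report_error"


ok = ConvertImageResult(True, "", "<div>Hello</div>", False)
slow = ConvertImageResult(False, "Request timed out", "", True)
bad = ConvertImageResult(False, "Invalid API key", "", False)
print(next_step(ok), next_step(slow), next_step(bad))
```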

Translate HTML Content (Server-Side Action)

Fields

OpenAI API Key – Provide your OpenAI API key if it’s not already set in the plugin settings.

Target Language – The language to translate into (use the full name in English, e.g., French, Spanish, German).

HTML – The HTML content to translate.


Return Values

Is Successful – true if translation succeeds; false otherwise.

Error Message – Error details (if any).

AI Request Did Time Out – Indicates if the AI request exceeded the timeout period.

Can Translate – Indicates whether the AI model supports translation for the specified language.



Convert HTML to PDF (Server-Side Action)

Fields

HTML – The HTML content to convert into a PDF.

HTML Data Thing (Optional) – The database thing that holds HTML content.

HTML Data Thing Records – The list of thing records containing the HTML content.

HTML Data Thing Field Name – The field name in the thing that stores the HTML content.

Target File Name – Desired name for the generated PDF file (without extension).

PDF Upload URL – The upload destination. If left blank, the plugin will attempt to detect Bubble’s file manager upload URL automatically.

PDF Upload Private – If checked and the upload URL is for Bubble’s file storage, the PDF will be uploaded privately.

PDF Upload Attach To ID – If the file is private, specify the thing ID to associate it with.

Upload Authorization Header – Optional authorization header for secure upload endpoints.

Return Values

Is Successful – true if the PDF generation/upload succeeds; false otherwise.

Error Message – Error details (if any).

PDF URL – The final URL of the uploaded PDF file.
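
Because several of these fields depend on each other, it can help to check the combination before calling the action. The following is a minimal sketch under the assumptions stated in the field descriptions above (HTML comes either directly or from records, and a private upload needs a thing ID); the function and parameter names are hypothetical and not part of the plugin.

```python
from typing import List, Optional


def check_pdf_inputs(
    html: Optional[str] = None,
    html_records: Optional[List[dict]] = None,
    html_field_name: Optional[str] = None,
    upload_private: bool = False,
    attach_to_id: Optional[str] = None,
) -> List[str]:
    """Return a list of problems with the field combination (empty if none)."""
    problems = []
    if not html and not html_records:
        problems.append("Provide HTML directly or a list of records holding it.")
    if html_records and not html_field_name:
        problems.append("Set the field name that stores HTML in each record.")
    if upload_private and not attach_to_id:
        problems.append("A private upload needs a thing ID to attach the file to.")
    return problems


print(check_pdf_inputs(html="<div>Hi</div>"))
print(check_pdf_inputs(html_records=[{"page_html": "<div>1</div>"}],
                       upload_private=True))
```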

🚀 Implementation Guide

Add the JT PDF Converter element to the page where OCR functionality is required.

Set the Source File URL field in the element to the URL of the PDF or image file.

When the URL is set, the element triggers the event Source File URL Is Updated.
After this event, run the action Convert File to Images.

When Convert File to Images completes, the event Convert File to Images Action Completed will trigger, and the Images exposed state will contain the optimized images.

Call the Convert Image to HTML server-side action once for each image in the Images list, stepping through the list recursively.

If using front-end workflows, you can use the element’s Trigger Recursive Event action to mimic recursion.

If translation is needed, pass the returned HTML from each image into the Translate HTML Content action to translate it.

Once all pages are processed, combine the HTML content and call Convert HTML to PDF to generate and upload the final PDF.
You can provide HTML in two ways:

Option 1: Directly in the HTML field.

Option 2: From the database by setting:

HTML Data Thing – The thing containing HTML content.

HTML Data Thing Records – The list of records containing HTML.

HTML Data Thing Field Name – The field that stores HTML content.
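
The per-page recursion described in this guide can be pictured as follows. This is a sketch of the pattern only: the inner function stands in for a workflow that runs on the Recursive event, the self-call stands in for Trigger Recursive Event, and the `convert` callback is a hypothetical placeholder for Convert Image to HTML.

```python
from typing import Callable, List


def process_pages(images: List[bytes], convert: Callable[[bytes], str]) -> str:
    """Handle one image per 'Recursive' event, then combine the HTML."""
    html_parts: List[str] = []

    def on_recursive_event(index: int) -> None:
        if index >= len(images):       # all pages done: stop recursing
            return
        html_parts.append(convert(images[index]))
        on_recursive_event(index + 1)  # stands in for Trigger Recursive Event

    on_recursive_event(0)
    return "\n".join(html_parts)       # combined HTML for Convert HTML to PDF


combined = process_pages([b"A", b"B", b"C"],
                         lambda img: f"<div>{img.decode()}</div>")
print(combined)
```

Processing one page per event keeps each server-side call short, which matters because each AI request has to finish within the workflow timeout.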




Types

This plugin can be found under the following types:
Background Services   •   Element   •   Event   •   Action

Categories

This plugin can be found under the following categories:
PDF   •   AI   •   Productivity   •   Small Business   •   Image   •   Visual Elements
