This plugin extracts text from various file formats, including PDFs, DOCX files, and images. It dynamically detects the file type and applies the appropriate text extraction method. OCR is also supported.
MAIN FEATURES:
➡️ Multi-Format Text Extraction: Extracts text from PDFs, DOCX files, and images (JPG, PNG, TIFF) efficiently.
➡️ OCR Support: Uses Optical Character Recognition (OCR) to recognize and extract text from non-editable documents and images.
➡️ Automatic File Type Detection: Identifies the file format and applies the appropriate text extraction method.
➡️ Auto-Extract Option: Runs text extraction automatically when the auto_extract option is enabled.
➡️ Output State Publication: Publishes the extracted text in the plugin's state, allowing integration with other application components or workflows.
This plugin is ideal for applications requiring flexible document processing and OCR capabilities, ensuring efficient text extraction across different formats.
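To illustrate the idea behind automatic file type detection, here is a minimal Python sketch that inspects a file's leading bytes and picks an extraction route. It is not the plugin's actual implementation: the function names (detect_file_type, choose_method) and the method labels are hypothetical, and only the byte signatures themselves are standard.

```python
from typing import Optional

# Well-known leading "magic bytes" for the formats listed above.
SIGNATURES = [
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "docx"),        # DOCX is a ZIP container; other ZIP-based files share this prefix
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpg"),
    (b"II*\x00", "tiff"),           # little-endian TIFF
    (b"MM\x00*", "tiff"),           # big-endian TIFF
]

# Hypothetical method labels, chosen for illustration only.
METHODS = {
    "pdf": "pdf_text_layer",
    "docx": "docx_xml",
    "png": "ocr",
    "jpg": "ocr",
    "tiff": "ocr",
}


def detect_file_type(head: bytes) -> Optional[str]:
    """Return a format label based on the first bytes of a file, or None."""
    for signature, file_type in SIGNATURES:
        if head.startswith(signature):
            return file_type
    return None


def choose_method(head: bytes) -> str:
    """Map the detected format to an extraction route (labels are assumptions)."""
    file_type = detect_file_type(head)
    if file_type is None:
        raise ValueError("unsupported or unrecognized file type")
    return METHODS[file_type]
```

In this sketch, images always go to OCR, while PDFs and DOCX files go to a text-based route first. A scanned PDF with no text layer would need an OCR fallback, which is left out here.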
*Note: If you have a feature or plugin request, feel free to contact us.*
1. Search for text_extractor_pro in the elements panel and place the element on the page. (Do not hide this element.)
2. Upload the document in the provided field, or supply the file URL dynamically.
3. Use the plugin's state to read the "text output" from the file once the "Extract Text" action has been called. The extraction can be triggered by either:
--> Plugin Action "Extract Text" (triggered in a workflow, under element actions)
--> Server Side Action "text_extractor" (triggered in a workflow, under plugin actions)
Types
This plugin can be found under the following types: