Tiktoken logo

Tiktoken

Published August 2025
   •    Updated February 2026

Plugin details

tiktoken is a BPE (Byte-pair encoding) tokeniser for use with OpenAI's models, forked from the original tiktoken library to provide JS/WASM bindings for NodeJS and other JS runtimes.

Free

For everyone

stars   •   0 ratings
8 installs  
This plugin does not collect or track your personal data.

Other actions

Platform

Web

Contributor details

Esteban Ardila logo
Esteban Ardila
Joined 2021   •   2 Plugins
View contributor profile

Instructions

Add the element Tiktoken
Fields
🔹Text to Chunk: The input document or section of text you want to chunk.
🔹Chunk Size: Maximum number of tokens per chunk. Recommended: 300–800 tokens for embeddings.
🔹Chunk Overlap: Number of overlapping tokens between chunks. Ensures context continuity. Recommended: 30–100.

Exposed states
🔹Chunks: List of chunked text pieces.
🔹Token Count
🔹Token IDs:


✅ Best Practices
✔️ Use 500–800 tokens per chunk for embeddings (keeps semantic coherence).
✔️ Keep overlap between 30–100 tokens to avoid losing context across chunks.
✔️ Pre-chunk large documents before uploading to your Vector DB.
✔️ For PDFs, first extract clean text (avoid page headers/footers before chunking).

Types

This plugin can be found under the following types:

Categories

This plugin can be found under the following categories:
AI   •   Technical   •   Input Forms

Resources

Support contact
Documentation
Tutorial

Rating and reviews

No reviews yet

This plugin has not received any reviews.
Bubble