Project Icon

Tokenizer

Improve tokenization with byte pair encoding for Node.js and .NET

Product DescriptionDiscover the effective Typescript and C# implementations of byte pair encoding tokenizer optimized for OpenAI LLMs, building on OpenAI's tiktoken for smooth prompt tokenization in both Node.js and .NET platforms. Critical for developers moving from Microsoft.DeepDev.TokenizerLib to the enhanced Microsoft.ML.Tokenizers, scheduled for release with .NET 9.0. Contributions are welcome, following Microsoft's trademark and branding guidelines.
Project Details