Introduction to Tanuki
Tanuki is a tool that lets developers integrate large language models (LLMs) into Python applications by using an LLM in place of a traditional function body, while preserving the robustness and reliability expected of manually implemented functions. With Tanuki, developers create functions with typed parameters and outputs that behave predictably, just like conventional code. A notable feature is that Tanuki functions become cheaper and faster with use: automatic model distillation can reduce cost and latency by up to 9-10x over time.
Features
- Easy Integration: Tanuki provides a hassle-free way to add LLM-powered functions to existing workflows. By simply decorating a function stub with `@tanuki.patch`, developers can guide execution with type hints and docstrings.
- Type Safety: Ensures outputs adhere to the declared type constraints, protecting against the unpredictability associated with LLMs.
- Aligned Outputs: By writing simple `assert` statements under `@tanuki.align`, developers can ensure functions behave as expected, even when backed by LLMs.
- Cost and Latency Reduction: Over time and with increased usage, Tanuki functions become more cost-effective and faster, lowering costs by up to 90% and latency by up to 80%.
- Model Support: Offers support for popular model services, including OpenAI and Amazon Bedrock.
- RAG Support: Smoothly integrates retrieval-augmented generation techniques, facilitating efficient document retrieval.
- Out-of-the-Box Functionality: Functions operate without remote dependencies except for OpenAI.
Installation and Getting Started
Installation of Tanuki is straightforward and can be done via pip:

```bash
pip install tanuki.py
```

Or with Poetry:

```bash
poetry add tanuki.py
```
Additionally, you need to set your OpenAI API key for model usage:

```bash
export OPENAI_API_KEY=sk-...
```
To begin using Tanuki, create a Python function decorated with `@tanuki.patch`, optionally adding type hints and a docstring to guide execution. For added reliability, decorate a second function with `@tanuki.align` and include `assert` statements that define the expected behavior.
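A minimal sketch of this workflow (the sentiment-classification function and its alignment cases are illustrative examples, not part of the Tanuki API):

```python
from typing import Literal, Optional

import tanuki


@tanuki.patch
def classify_sentiment(msg: str) -> Optional[Literal["Good", "Bad"]]:
    """Classify a message as Good or Bad, or None if neutral."""


@tanuki.align
def align_classify_sentiment():
    # Each assert pins down the expected input/output behavior.
    assert classify_sentiment("I love you") == "Good"
    assert classify_sentiment("I hate you") == "Bad"
    assert not classify_sentiment("People from Phoenix are called Phoenicians")
```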
How It Works
Tanuki functions are called like ordinary functions within an application. When invoked, a function uses an LLM in an n-shot configuration to generate a typed response, with the align statements supplying the examples. The response is post-processed to guarantee type conformity before it is returned to the application. Tanuki also saves the inputs and outputs of each call for future model training, on the premise that more data allows progressively smaller, more efficient models to be distilled over time.
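For example, continuing the illustrative `classify_sentiment` function from above, a call looks like any other Python call:

```python
# Register the expected behavior by running the align function once
# (e.g., at startup or in a test suite).
align_classify_sentiment()

# Invoke the patched function; the LLM generates the response and Tanuki
# post-processes it into the declared return type.
result = classify_sentiment("The new release fixed every issue I had!")
print(result)  # expected: "Good" (typed as Optional[Literal["Good", "Bad"]])
```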
Typed Outputs
Tanuki emphasizes well-typed parameters and outputs, which establish rules for what the function may return. This mitigates the verbosity and inconsistency that often arise with raw LLM output. Developers can express complex output types through Pydantic models and Literals, ensuring strict conformity to the expected results and reducing the need for custom validation code.
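As a sketch of this pattern, a structured output type might look like the following (the `ActionItem` model, its fields, and the `create_action_item` function are hypothetical, chosen only to show the idea):

```python
from typing import Literal

from pydantic import BaseModel

import tanuki


class ActionItem(BaseModel):
    # Hypothetical fields for a structured to-do entry.
    goal: str
    priority: Literal["low", "medium", "high"]
    deadline_days: int


@tanuki.patch
def create_action_item(task_description: str) -> ActionItem:
    """Extract a structured action item from a free-form task description."""
```

Because the return annotation is a Pydantic model, the response is constrained to parse into `ActionItem`, so no hand-written validation layer is needed.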
Test-Driven Alignment
Test-Driven Alignment (TDA) extends the principles of test-driven development: tests are used not just to verify expected behavior, but to drive the alignment of LLM function behavior with those expectations. By decorating a function with `@tanuki.align` and writing `assert` statements, developers create a binding contract for function behavior, capturing necessary nuances and refining functionality iteratively.
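For instance, alignment asserts can capture nuances, such as negation, that a docstring alone might not pin down. This sketch reuses the illustrative `classify_sentiment` function from the getting-started example:

```python
@tanuki.align
def align_classify_sentiment_nuances():
    # Negated phrasing should not be treated as negative sentiment.
    assert classify_sentiment("I do not hate you") != "Bad"
    # Purely neutral statements should produce no label.
    assert not classify_sentiment("The weather is a thing that exists")
```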
Scaling and Finetuning
As Tanuki functions are used more frequently, successful outputs are saved to a training dataset. This data helps in distilling smaller, task-specific models, lowering operational costs and enhancing efficiency. Tanuki handles model distillation, so developers benefit from cost and latency improvements without additional efforts in machine learning operations.
Frequently Asked Questions
- What is Tanuki? Tanuki is a way to make LLM outputs in Python more predictable and consistent, while also reducing cost and latency over time.
- How does it compare to other frameworks? Tanuki ensures dependable LLM execution and reduces cost and latency through automatic finetuning, unlike frameworks that do not provide the same consistency or cost advantages.
- Why are typed responses important? Typed responses ensure predictability and allow programs to handle LLM outputs reliably.
- Can this be used with other languages or models? Currently, Tanuki only supports Python, and OpenAI models work out of the box; there are plans to expand compatibility.
- What is not suitable for this tool? Tasks that rely heavily on context, complex natural-language outputs, or time-series data may not see significant benefits from Tanuki.
Simple ToDo List App Example
For a practical application of Tanuki, check out the Simple ToDo List App example, which uses Tanuki's functionality to illustrate a simple yet effective use case.