Introducing Instructor-JS
Instructor-JS is a TypeScript library for structured data extraction. It builds on OpenAI's function-calling API and on Zod, a TypeScript-first schema validation library. The toolkit is designed for simplicity, transparency, and user control, and is aimed at experienced developers and newcomers alike.
Installation
Install Instructor-JS together with zod and openai using whichever of the following package managers you prefer:
- Bun: bun add @instructor-ai/instructor zod openai
- Npm: npm i @instructor-ai/instructor zod openai
- Pnpm: pnpm add @instructor-ai/instructor zod openai
Basic Usage
Using Instructor-JS involves creating an Instructor client that interfaces with OpenAI and applies Zod schemas to validate data. Below is a simple example demonstrating the setup:
import Instructor from "@instructor-ai/instructor";
import OpenAI from "openai";
import { z } from "zod";

const oai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY ?? undefined,
  organization: process.env.OPENAI_ORG_ID ?? undefined
});

const client = Instructor({
  client: oai,
  mode: "TOOLS"
});

const UserSchema = z.object({
  age: z.number().describe("The age of the user"),
  name: z.string()
});

const user = await client.chat.completions.create({
  messages: [{ role: "user", content: "Jason Liu is 30 years old" }],
  model: "gpt-3.5-turbo",
  response_model: {
    schema: UserSchema,
    name: "User"
  }
});

console.log(user); // Outputs { age: 30, name: "Jason Liu" }
API Reference and Modes
Instructor-JS supports various modes to shape responses from language models, leveraging the zod-stream package. These modes include:
- TOOLS: Utilizes OpenAI's tool specification.
- JSON: Responds using a JSON schema to guide output.
- MD_JSON: Outputs JSON within a Markdown code block.
- JSON_SCHEMA: Ensures responses conform to a provided JSON schema.
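To make the MD_JSON mode more concrete, the sketch below shows what a consumer has to do with such a reply: find the Markdown code block and parse the JSON inside it. This is an illustration only, not how Instructor-JS or zod-stream implement it; the helper name extractJsonFromMarkdown and the sample reply are hypothetical.

```typescript
// Hypothetical helper (not part of Instructor-JS): pulls the first JSON
// payload out of a Markdown code block in a model reply.
const FENCE = "`".repeat(3); // built at runtime so no literal fence appears here

function extractJsonFromMarkdown(reply: string): unknown {
  const start = reply.indexOf(FENCE);
  if (start === -1) {
    // No code block found: assume the whole reply is JSON.
    return JSON.parse(reply);
  }
  // Skip the opening fence line, including an optional language tag ("json").
  const bodyStart = reply.indexOf("\n", start);
  if (bodyStart === -1) {
    throw new Error("Code block has no body");
  }
  const end = reply.indexOf(FENCE, bodyStart);
  if (end === -1) {
    throw new Error("Unterminated code block in model reply");
  }
  return JSON.parse(reply.slice(bodyStart + 1, end));
}

// Sample reply in the shape MD_JSON mode asks the model to produce.
const sampleReply = [
  "Here is the extracted user:",
  FENCE + "json",
  '{ "age": 30, "name": "Jason Liu" }',
  FENCE
].join("\n");

console.log(extractJsonFromMarkdown(sampleReply));
```

The parsed value is typed as unknown on purpose: in practice it would still need to be validated against a Zod schema before use.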
Examples and Streaming
Instructor-JS supports partial streaming: the structured result is delivered incrementally as the model generates it, so data can be extracted in real time. This can benefit interactive experiences or the handling of large datasets. The library also supports switching between different provider configurations, such as Anyscale or Together, and allows integration with non-OpenAI providers through the llm-polyglot library.
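The consuming side of partial streaming can be sketched without calling any API. The example below assumes that a streaming extraction yields an async iterable of progressively more complete objects. The mockPartialStream generator is a hypothetical stand-in for that call, and the order in which fields arrive is arbitrary here.

```typescript
// Shape of the final object; while streaming, any field may still be missing.
type StreamedUser = { age: number; name: string };

// Hypothetical stand-in for a streaming extraction call: yields
// progressively more complete snapshots of the same object.
async function* mockPartialStream(): AsyncGenerator<Partial<StreamedUser>> {
  yield {};
  yield { name: "Jason Liu" };
  yield { name: "Jason Liu", age: 30 };
}

// Consumes the stream, reports every snapshot, and returns the last one.
async function consumePartials(
  stream: AsyncIterable<Partial<StreamedUser>>,
  onUpdate: (snapshot: Partial<StreamedUser>) => void
): Promise<Partial<StreamedUser>> {
  let latest: Partial<StreamedUser> = {};
  for await (const snapshot of stream) {
    latest = snapshot;
    onUpdate(snapshot); // e.g. re-render a form or a progress view
  }
  return latest;
}

consumePartials(mockPartialStream(), (snapshot) => {
  console.log("partial:", snapshot);
}).then((finalUser) => {
  console.log("final:", finalUser);
});
```

Because every snapshot is a Partial, the callback must tolerate missing fields. Only the last snapshot should be treated as complete, and it is still worth validating against the schema.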
Built on Island AI
Instructor-JS is part of the Island AI toolkit, comprising essential packages for handling and streaming structured data with large language models. Notably, it utilizes:
- zod-stream: For interfacing with LLM streams.
- schema-stream: For JSON streaming and model-updating.
- llm-polyglot: A unified interface for multi-provider language models.
Why Choose Instructor-JS?
Instructor-JS stays closely aligned with the OpenAI SDK, is customizable through Zod schemas, and is backed by community support. Whether you're validating data or extracting structured information, Instructor-JS provides a robust solution.
Contributing and Support
The project welcomes contributions, whether they're code improvements or new examples. Developers interested in contributing can find issues marked as good-first-issue open for collaboration.
Instructor-JS is licensed under the MIT License and encourages those interested in porting the tool to other languages to reach out for assistance.
For more comprehensive setup instructions or additional examples, refer to the documentation or visit the repository on GitHub.