Introduction to node-llama-cpp
node-llama-cpp is a solution for enthusiasts and developers who want to run AI models locally on their own machines. The project combines convenience with power, enabling users to harness large language models (LLMs) without relying on cloud infrastructure.
Key Features
- Local AI Model Execution: Users can run LLMs directly on their own machines, which keeps data under their control and cuts network latency, both important for privacy-sensitive and real-time applications.
- GPU Support: Metal, CUDA, and Vulkan are all supported, and the best available backend is detected automatically, so users do not need to configure GPU settings by hand (see the first sketch after this list).
- Pre-Built Binaries: The project ships pre-compiled binaries for macOS, Linux, and Windows, making installation straightforward. If no binary is available for a specific setup, node-llama-cpp builds from source using cmake, sidestepping the need for node-gyp or Python.
- CLI Usage: For those preferring simplicity, the CLI allows interacting with models without writing any code. With a single command, users can chat with AI models in their terminal.
- Model Output Control: Users can force models to generate output in specific formats, such as JSON, and can even constrain a model to a predefined JSON schema, which makes model output far easier to consume reliably in applications (second sketch below).
- Function Invocation: Models can be given functions they may invoke to perform actions or retrieve information as needed, opening new avenues for dynamic interactions (third sketch below).
- Comprehensive Developer Support: The package offers full TypeScript support for a smooth developer experience, along with complete documentation and a getting started guide for easy onboarding.
- Embedding: Generating embeddings is supported, a building block for tasks such as semantic search and text classification (fourth sketch below).
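As a minimal sketch of backend selection: by default the best backend is auto-detected, but one can also be requested explicitly. The gpu option and the llama.gpu property follow the project's documented API; verify the exact names against the documentation for your installed version.
import {getLlama} from "node-llama-cpp";
// Request a specific compute backend instead of relying on auto-detection
const llama = await getLlama({
    gpu: "vulkan" // or "metal", "cuda", or false for CPU-only
});
console.log(llama.gpu); // reports the backend actually in use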
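Here is a sketch of schema-constrained output. The createGrammarForJsonSchema and grammar.parse calls follow the project's documented API; the model path is a placeholder you would point at a real GGUF file.
import {getLlama, LlamaChatSession} from "node-llama-cpp";
const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: "path/to/model.gguf" // placeholder
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});
// Build a grammar from a JSON schema; generation is then constrained to match it
const grammar = await llama.createGrammarForJsonSchema({
    type: "object",
    properties: {
        title: {type: "string"},
        rating: {type: "number"}
    }
});
const response = await session.prompt("Suggest a movie and rate it from 1 to 10", {
    grammar
});
const result = grammar.parse(response); // a typed object matching the schema
console.log(result.title, result.rating);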
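Next, a sketch of function invocation using the exported defineChatSessionFunction helper (again, treat the exact API as documented behavior to verify for your version); the time-lookup function itself is just an illustration.
import {getLlama, LlamaChatSession, defineChatSessionFunction} from "node-llama-cpp";
const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: "path/to/model.gguf" // placeholder
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});
const functions = {
    getCurrentTime: defineChatSessionFunction({
        description: "Gets the current time",
        handler() {
            // Called automatically when the model decides it needs the time
            return new Date().toISOString();
        }
    })
};
const answer = await session.prompt("What time is it right now?", {functions});
console.log(answer);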
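Finally, a sketch of generating embeddings through a dedicated embedding context (createEmbeddingContext and getEmbeddingFor per the documented API; the model path is again a placeholder, and should point at a model suited for embeddings).
import {getLlama} from "node-llama-cpp";
const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: "path/to/embedding-model.gguf" // placeholder
});
const embeddingContext = await model.createEmbeddingContext();
const embedding = await embeddingContext.getEmbeddingFor("Hello world");
console.log(embedding.vector.length); // the dimensionality of the produced vector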
Documentation and Resources
A variety of resources are available for users:
- A comprehensive getting started guide to help new users.
- In-depth API reference and CLI help sections.
- Updates and insights through the blog.
- A changelog for tracking project progress and a roadmap outlining future plans.
Try It Without Installing
You can try node-llama-cpp without installing anything: run the following command in your terminal to chat with an AI model right away.
npx -y node-llama-cpp chat
Installation and Use
To install the package, use:
npm install node-llama-cpp
Here's a simple usage example in TypeScript:
import {fileURLToPath} from "url";
import path from "path";
import {getLlama, LlamaChatSession} from "node-llama-cpp";
const __dirname = path.dirname(fileURLToPath(import.meta.url));
// getLlama() loads the prebuilt bindings and picks the best available backend (Metal, CUDA, Vulkan, or CPU)
const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: path.join(__dirname, "models", "Meta-Llama-3.1-8B-Instruct.Q4_K_M.gguf")
});
const context = await model.createContext();
// A chat session wraps a context sequence and manages the chat template and history
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});
const q1 = "Hi there, how are you?";
console.log("User: " + q1);
const a1 = await session.prompt(q1);
console.log("AI: " + a1);
// The session keeps the chat history, so a follow-up prompt can refer to earlier turns
const q2 = "Summarize what you said";
console.log("User: " + q2);
const a2 = await session.prompt(q2);
console.log("AI: " + a2);
For a detailed walkthrough, refer to the getting started guide.
Contributing
Contributions to node-llama-cpp are welcome. For more details, please review the contribution guide.
Acknowledgements
node-llama-cpp is built upon the solid foundation of llama.cpp, which provides the underlying AI capabilities.
By inviting users both to explore and to contribute, node-llama-cpp serves not only as a powerful tool for AI aficionados but also as a community-driven project aimed at broadening the reach and understanding of local AI solutions.