LlamaGPTJ-chat Project Overview
Introduction
LlamaGPTJ-chat is a straightforward command-line chat application written in C++ that runs GPT-J, LLaMA, and MPT models. It is built on llama.cpp and uses the gpt4all-backend for full compatibility with these model families.
Project Status
Note that this project is in an early stage of development, so you may encounter bugs as development progresses.
Table of Contents
The documentation within the project is organized into several key sections, including:
- Installation
- Usage
- Supported Models (GPT-J, LLaMA, MPT)
- Detailed Command List
- Useful Features
- License Information
Installation
LlamaGPTJ-chat is compatible with multiple platforms, including Linux, macOS, and Windows. Ready-made binaries are available for download in the Releases section. AVX2 support is enabled by default for faster performance on modern processors, but it can be disabled if you have an older processor.
Steps:
- Clone the repository:
    git clone --recurse-submodules https://github.com/kuvaus/LlamaGPTJ-chat
    cd LlamaGPTJ-chat
- Download a model file; see the Supported Models section for download links.
- Build the project:
    mkdir build
    cd build
    cmake ..
    cmake --build . --parallel
Specific CMake flags let you adjust the build for older processors or for particular operating systems such as macOS, as sketched below.
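For instance, a build for an older processor without AVX2 might look like the following; the AVX2 option name is an assumption to verify against the project's CMakeLists.txt:

# Disable AVX2 for older CPUs (option name assumed; check CMakeLists.txt)
cmake -DAVX2=OFF ..
cmake --build . --parallel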
Usage
Once compiled, the chat binary is located at build/bin/chat, and you can move it to any directory you like. To start a chat:
./chat -m "/path/to/modelfile/ggml-vicuna-13b-1.1-q4_2.bin" -t 4
Replace the path and model name as needed.
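As a further sketch, you can combine the model path and thread count with an initial prompt; the -p flag appears in the project's help output, but verify it with ./chat -h:

# Start a chat with an initial prompt (flag name assumed; see ./chat -h)
./chat -m "/path/to/modelfile/ggml-vicuna-13b-1.1-q4_2.bin" -t 4 -p "Hello, who are you?"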
Supported Models
The application supports several advanced models:
GPT-J Model
Models such as ggml-gpt4all-j can be downloaded directly. They are around 3.8 GB, so you will need adequate RAM, because the model is held in memory while the program runs.
LLaMA Model
These models are available for research purposes and, like GPT-J, require substantial memory. The Vicuna models, as LLaMA derivatives, are also available in several sizes depending on the number of model parameters.
MPT Model
MPT models also require significant memory and come in different variants like the chat and instruct models.
Detailed Command List
To explore extensive functionalities, you can use:
./chat -h
This command prints a comprehensive list of available options with descriptions, including parameters for prompting, threading, sampling, and context handling.
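As an illustrative sketch, a run that tunes threading and sampling might look like this; the flag names (--temp, --top_k, --top_p, -n) are assumed from the help output of recent versions, so confirm them with ./chat -h:

# Hypothetical tuning run; confirm flag names with ./chat -h
./chat -m model.bin -t 8 --temp 0.7 --top_k 40 --top_p 0.9 -n 512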
Useful Features
The application has several key features; illustrative examples follow the list.
- Logging: Save and load chat logs to maintain a history of interactions.
- Non-interactive Mode: Run the application with a given prompt to receive a response without entering interactive chat.
- AI Personalities: Modify templates to give the AI distinct personalities.
- Reset Context: Easily reset the chat context to start fresh.
- Load from JSON: Load parameters from a JSON file to customize settings. This facilitates easier management of various configurations for different models.
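A minimal sketch of logging and non-interactive use, assuming the --save_log and --no-interactive flags listed in the project's help output (verify with ./chat -h):

# Save the conversation to a file for later review (flag name assumed)
./chat -m model.bin --save_log chat.log

# Answer one prompt and print the response without entering interactive chat
./chat -m model.bin -p "Summarize what MPT models are." --no-interactive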
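AI personalities come from the prompt template. The sketch below assumes a --load_template flag and a template file in which %1 marks where the user's input is substituted; both details should be checked against the project's documentation:

# Write a hypothetical personality template; %1 is replaced by the user's prompt
cat > template.txt <<'EOF'
You are a cheerful pirate. Answer the prompt below in pirate speak.
### Prompt:
%1
### Response:
EOF
./chat -m model.bin --load_template template.txt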
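Similarly, parameters can be kept in a JSON file and loaded with --load_json; the keys are assumed to mirror the command-line option names, so confirm them with ./chat -h:

# Store per-model settings in a JSON file (key names assumed)
cat > params.json <<'EOF'
{"top_p": 0.9, "top_k": 40, "temp": 0.9, "n_batch": 20}
EOF
./chat -m model.bin --load_json params.json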
License
The project is licensed under the MIT License, aiming to encourage open collaboration and sharing of knowledge.
In summary, LlamaGPTJ-chat is a versatile and evolving tool, offering a range of configurations and functionalities for users exploring command-line interactions with advanced language models. With support for major platforms and advanced models, it serves as a robust interface for engaging AI-based chat services.