LlamaGPTJ-chat Project Overview
Introduction
LlamaGPTJ-chat is a straightforward command-line chat application written in C++ that runs GPT-J, LLaMA, and MPT models. It is built on llama.cpp and uses the gpt4all-backend for full compatibility with these model families.
Project Status
Note that this project is in an early stage of development, so you may encounter bugs as development progresses.
Table of Contents
The documentation within the project is organized into several key sections, including:
- Installation
- Usage
- Supported Models (GPT-J, LLaMA, MPT)
- Detailed Command List
- Useful Features
- License Information
Installation
LlamaGPTJ-chat is compatible with multiple platforms, including Linux, macOS, and Windows. Ready-made binaries are available for download in the Releases section. AVX2 support is enabled by default for faster performance on modern processors, but it can be disabled if you have an older processor.
Steps:
- Clone the repository:
    git clone --recurse-submodules https://github.com/kuvaus/LlamaGPTJ-chat
    cd LlamaGPTJ-chat
- Download a model file; see the Supported Models section for download links.
- Build the project:
    mkdir build
    cd build
    cmake ..
    cmake --build . --parallel
Specific CMake flags let you adjust the build for older processors or for particular operating systems such as macOS, as sketched below.
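For instance, a build for an older processor without AVX2 might look like the following; the AVX2 option name is an assumption to verify against the project's CMakeLists.txt:

# Disable AVX2 for older CPUs (option name assumed; check CMakeLists.txt)
cmake -DAVX2=OFF ..
cmake --build . --parallel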
Usage
Once compiled, the chat binary is located at build/bin/chat, and you can move it to any directory you like. To start a chat:
./chat -m "/path/to/modelfile/ggml-vicuna-13b-1.1-q4_2.bin" -t 4
Replace the path and model name as needed.
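As a further sketch, you can combine the model path and thread count with an initial prompt; the -p flag appears in the project's help output, but verify it with ./chat -h:

# Start a chat with an initial prompt (flag name assumed; see ./chat -h)
./chat -m "/path/to/modelfile/ggml-vicuna-13b-1.1-q4_2.bin" -t 4 -p "Hello, who are you?"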
Supported Models
The application supports several advanced models:
GPT-J Model
Models such as ggml-gpt4all-j can be downloaded directly. They are around 3.8 GB, so you will need adequate RAM, because the model is held in memory while the program runs.
LLaMA Model
These models are available for research purposes and, like GPT-J, require substantial memory. The Vicuna models, as LLaMA derivatives, are also available in several sizes depending on the number of model parameters.
MPT Model
MPT models also require significant memory and come in different variants like the chat and instruct models.
Detailed Command List
To explore extensive functionalities, you can use:
./chat -h
This command prints a comprehensive list of available options with descriptions, including parameters for prompting, threading, sampling, and context handling.
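As an illustrative sketch, a run that tunes threading and sampling might look like this; the flag names (--temp, --top_k, --top_p, -n) are assumed from the help output of recent versions, so confirm them with ./chat -h:

# Hypothetical tuning run; confirm flag names with ./chat -h
./chat -m model.bin -t 8 --temp 0.7 --top_k 40 --top_p 0.9 -n 512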
Useful Features
The application has several key features; illustrative examples follow the list.
- Logging: Save and load chat logs to maintain a history of interactions.
- Non-interactive Mode: Run the application with a given prompt to receive a response without entering interactive chat.
- AI Personalities: Modify templates to give the AI distinct personalities.
- Reset Context: Easily reset the chat context to start fresh.
- Load from JSON: Load parameters from a JSON file to customize settings. This facilitates easier management of various configurations for different models.
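A minimal sketch of logging and non-interactive use, assuming the --save_log and --no-interactive flags listed in the project's help output (verify with ./chat -h):

# Save the conversation to a file for later review (flag name assumed)
./chat -m model.bin --save_log chat.log

# Answer one prompt and print the response without entering interactive chat
./chat -m model.bin -p "Summarize what MPT models are." --no-interactive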
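AI personalities come from the prompt template. The sketch below assumes a --load_template flag and a template file in which %1 marks where the user's input is substituted; both details should be checked against the project's documentation:

# Write a hypothetical personality template; %1 is replaced by the user's prompt
cat > template.txt <<'EOF'
You are a cheerful pirate. Answer the prompt below in pirate speak.
### Prompt:
%1
### Response:
EOF
./chat -m model.bin --load_template template.txt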
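Similarly, parameters can be kept in a JSON file and loaded with --load_json; the keys are assumed to mirror the command-line option names, so confirm them with ./chat -h:

# Store per-model settings in a JSON file (key names assumed)
cat > params.json <<'EOF'
{"top_p": 0.9, "top_k": 40, "temp": 0.9, "n_batch": 20}
EOF
./chat -m model.bin --load_json params.json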
License
The project is licensed under the MIT License, aiming to encourage open collaboration and sharing of knowledge.
In summary, LlamaGPTJ-chat is a versatile and evolving tool, offering a range of configurations and functionalities for users exploring command-line interactions with advanced language models. With support for major platforms and advanced models, it serves as a robust interface for engaging AI-based chat services.