Introduction to the Fltr Project
Fltr is a command-line tool that works like the well-known "grep" command, but instead of matching a pattern it filters lines of text by asking a natural language question about each one. It uses a language model, either Mistral 7B or Mixtral 8x7B, to answer the question.
Performance Overview
Fltr runs on both GPU and CPU hardware. Reported throughput for two setups:
- Nvidia RTX 3070 with 8GB memory:
  - Using Mistral 7B, it processes approximately 52 tokens per second.
  - With Mixtral 8x7B, it manages around 28 tokens per second.
- Intel i5-6500 with 8GB memory:
  - For Mistral 7B, it processes about 5 tokens per second.
  - While using Mixtral 8x7B, it processes approximately 2 tokens per second.
Throughput is measured in tokens per second and depends heavily on hardware: in these figures the GPU is roughly ten times faster than the CPU with Mistral 7B and roughly fourteen times faster with Mixtral 8x7B.
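To put the rates in perspective, the following minimal sketch estimates how long a workload would take at each reported rate. The helper name estimate_seconds and the 10000-token workload are assumptions made for illustration; real run times also depend on factors such as prompt length and batch size, which these figures do not capture.

```shell
# Hypothetical helper: seconds needed for a token count at a given rate, rounded up.
estimate_seconds() {
  tokens="$1"
  rate="$2"
  echo $(( (tokens + rate - 1) / rate ))
}

# Rates are the reported figures; the 10000-token workload is an assumption.
estimate_seconds 10000 52   # RTX 3070, Mistral 7B    -> 193
estimate_seconds 10000 28   # RTX 3070, Mixtral 8x7B  -> 358
estimate_seconds 10000 5    # i5-6500, Mistral 7B     -> 2000
estimate_seconds 10000 2    # i5-6500, Mixtral 8x7B   -> 5000
```

Under these assumptions, a job that takes about three minutes on the GPU with Mistral 7B would take over half an hour on the CPU.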
Installation
Fltr supports Linux (x86_64) and macOS (x86_64 and arm64). If your system has an NVIDIA driver compatible with CUDA 12.1, the installer sets up the CUDA version for better performance; otherwise, it installs the CPU version.
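Before installing, it may help to see whether an NVIDIA driver is present at all. The sketch below is not the installer's own detection logic, which is not described here; it is a hypothetical helper (nvidia_driver_status) that only checks whether the nvidia-smi utility shipped with the NVIDIA driver is available and responds. It does not verify CUDA 12.1 compatibility.

```shell
# Hypothetical helper, not part of Fltr: reports whether an NVIDIA driver responds.
nvidia_driver_status() {
  if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-driver-found"
  else
    echo "no-nvidia-driver"
  fi
}

nvidia_driver_status
```

If a driver is found, running nvidia-smi on its own prints a header that includes the driver version and the highest CUDA version that driver supports, which can be compared against 12.1 by eye.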
The installation command for the Mistral 7B version (smaller size, approximately 7GB) is as follows:
curl https://raw.githubusercontent.com/moritztng/fltr/main/install.sh -o install.sh && bash install.sh small && export PATH=$PATH:~/Fltr
To install the larger Mixtral 8x7B version, which is about 48GB, replace "small" with "large" in the command.
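The substitution can be sketched as a small helper that prints the install command for a chosen size without running it. The function name fltr_install_command is hypothetical; the command it prints is the one given above with only the size argument changed.

```shell
# Hypothetical helper: prints (does not run) the install command for a model size.
fltr_install_command() {
  size="$1"
  case "$size" in
    small|large) ;;
    *) echo "usage: fltr_install_command small|large" >&2; return 1 ;;
  esac
  # \$PATH and ~ stay literal so they are expanded only when the command is run.
  printf '%s\n' "curl https://raw.githubusercontent.com/moritztng/fltr/main/install.sh -o install.sh && bash install.sh $size && export PATH=\$PATH:~/Fltr"
}

fltr_install_command large
```

Printing the command first makes it easy to review before pasting it into a terminal, which is worthwhile given the roughly 48GB download.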
Quickstart
Using fltr is straightforward. Here's a quick guide on how to get started:
To analyze a text file, such as emails.txt, and determine which emails are spam with the Mistral 7B model, use:
fltr --file emails.txt --prompt "Is the following email spam? Email:" --batch-size 32
If you prefer to use Mixtral 8x7B for this process, add --large to the command. The output includes the lines from the file for which the response to the prompt is affirmative.
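As an end-to-end illustration, the sketch below creates a small sample emails.txt and runs the Mixtral 8x7B variant of the command above. The sample emails and the spam.txt output file are made up for this example. Placing one email per line follows from the line-based output described above, and redirecting the result to a file assumes that fltr writes matching lines to standard output.

```shell
# Hypothetical sample input: one email per line, since fltr reports matching lines.
cat > emails.txt <<'EOF'
Congratulations, you won a free cruise! Click here to claim your prize.
Hi Anna, the meeting has moved to 3pm on Thursday.
URGENT: verify your bank account now or it will be suspended.
EOF

# Run the Mixtral 8x7B variant only if fltr is installed and on PATH.
# Redirecting to spam.txt assumes matching lines are written to standard output.
if command -v fltr >/dev/null 2>&1; then
  fltr --file emails.txt --prompt "Is the following email spam? Email:" --batch-size 32 --large > spam.txt
else
  echo "fltr not found on PATH; install it first" >&2
fi
```

Dropping --large from the command runs the same filter with Mistral 7B instead.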
Fltr is therefore suited to sifting through large volumes of text with a natural language question where a fixed pattern, as used by grep, would not be enough.