Project Introduction: Aquila2
The recently open-sourced Aquila2 project offers a robust suite of base language and chat models for a wide range of AI-driven tasks. The series spans several model sizes, most notably Aquila2-7B, Aquila2-34B, and Aquila2-70B-Expr, along with chat-oriented counterparts such as AquilaChat2-7B and AquilaChat2-34B. The models can be downloaded from platforms such as Hugging Face and BAAI ModelHub.
Key Features and Models
The Aquila2 series is methodically structured into base language models and chat models, with each category optimized for different applications:
- Base Language Models: Aquila2-7B, Aquila2-34B, and the experimental Aquila2-70B-Expr are designed for tasks that require strong general language processing capabilities.
- Chat Models: AquilaChat2-7B, AquilaChat2-34B, and AquilaChat2-70B-Expr are tuned for dialogue and interaction, with specialized long-context variants (such as AquilaChat2-7B-16K and AquilaChat2-34B-16K) capable of processing text up to 16K tokens.
Recent Developments
The development of the Aquila2 series is dynamic, with frequent updates enhancing performance and capabilities:
- An experimental version of the 70B models (Aquila2-70B-Expr and AquilaChat2-70B-Expr) was released on November 30, 2023, with improved interaction capabilities.
- A major update on October 25, 2023 announced version 1.2 of the Aquila2-34B models, reflecting significant improvements in both objective and subjective evaluations across benchmark datasets.
Performance Evaluation
Base Model
The Aquila2 base models outperform other open-source models of similar size on many benchmark datasets. However, a data leakage issue was identified in one of the test datasets during evaluation, so its results were excluded from the final comparison.
Long-Text Models
For tasks requiring long-context understanding, models such as AquilaChat2-34B-16K perform strongly, approaching GPT-3.5-16K on comprehensive long-text tasks.
Reasoning Abilities
On reasoning tasks, the AquilaChat2-34B models, particularly when combined with supervised fine-tuning (SFT) and chain-of-thought (CoT) prompting, perform well across abductive, deductive, and inductive reasoning, surpassing several contemporary models.
Getting Started With Aquila2
Requirements
To begin using Aquila2 models, ensure your system meets the following prerequisites (a quick version check is sketched after the list):
- Python 3.10+
- PyTorch 1.12+, though version 2.0 is recommended
- Transformers 4.32+
- CUDA 11.4+ for GPU acceleration, especially if using optional tools like flash-attention
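The version floors above can be confirmed directly in Python. The sketch below is a minimal check of the environment described in this section and assumes PyTorch and Transformers are already installed:

```python
# Minimal sketch: verify the version floors listed above.
import sys

import torch
import transformers

assert sys.version_info >= (3, 10), "Python 3.10+ is required"
print("PyTorch:", torch.__version__)               # 1.12+ works; 2.0 is recommended
print("Transformers:", transformers.__version__)   # 4.32+ is required
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    # CUDA 11.4+ is recommended for GPU users, e.g. when using flash-attention.
    print("CUDA runtime:", torch.version.cuda)
```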
Quickstart Guide
For a quick setup, install the required packages with pip, or run the models in a Docker environment if that better suits your infrastructure. Example scripts are provided for running the models for inference in different computing environments.
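As a concrete starting point, the sketch below loads one of the smaller chat models from Hugging Face and runs a single generation with the generic Transformers API. The model ID BAAI/AquilaChat2-7B and the plain prompt format are assumptions here; the repository's own scripts may wrap generation in a chat-specific prompt template.

```python
# Sketch: single-turn inference with AquilaChat2-7B via Hugging Face Transformers.
# Assumes a CUDA GPU with enough memory for the 7B model in bfloat16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BAAI/AquilaChat2-7B"  # assumed Hugging Face model ID

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).cuda().eval()

prompt = "Give me three reasons to visit Beijing."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=200, do_sample=False)

# Decode only the newly generated tokens.
new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```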
Modelhub and Transformers Integration
Through BAAI ModelHub or the Hugging Face Transformers library, users can load and test the various Aquila2 models. The same scripts work across model scales, from the compact 7B to the expansive 70B-Expr.
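To illustrate how the same Transformers code scales across model sizes, the helper below only varies the model name and uses device_map="auto" so that the larger checkpoints can be sharded across available GPUs. The function name and the mapping of Hugging Face model IDs are illustrative assumptions, not part of the repository's own scripts.

```python
# Sketch: one loader reused across Aquila2 chat model scales (assumed HF model IDs).
# Requires the accelerate package for device_map="auto".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

AQUILA2_CHAT_MODELS = {
    "7b": "BAAI/AquilaChat2-7B",
    "34b": "BAAI/AquilaChat2-34B",
    "70b-expr": "BAAI/AquilaChat2-70B-Expr",
}

def load_aquila_chat(scale: str = "7b"):
    """Load an Aquila2 chat model; larger scales are sharded automatically across GPUs."""
    model_id = AQUILA2_CHAT_MODELS[scale]
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",       # shards 34B / 70B-Expr over multiple GPUs if needed
        trust_remote_code=True,
    ).eval()
    return tokenizer, model

# Example: swap "7b" for "34b" or "70b-expr" without touching the rest of the script.
tokenizer, model = load_aquila_chat("7b")
```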
Experimentation and Quantization
Quantization is supported to reduce memory usage while keeping inference performance practical. With BitsAndBytes installed, the models can be loaded in lower precision (for example 8-bit or 4-bit) for efficient inference.
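As one hedged example of quantized loading, the sketch below uses the standard Transformers BitsAndBytesConfig to load a chat model in 4-bit NF4 precision. The exact quantization settings used by the Aquila2 repository may differ, so treat these parameters as reasonable defaults rather than the project's official configuration.

```python
# Sketch: 4-bit quantized inference with bitsandbytes (pip install bitsandbytes accelerate).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "BAAI/AquilaChat2-7B"  # assumed Hugging Face model ID

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
).eval()

prompt = "Summarize the Aquila2 model family in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0], skip_special_tokens=True))
```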
The Aquila2 project thus offers a comprehensive toolkit for AI development, spanning base language models, long-context chat models, and reasoning capabilities, supported by active development and a community that encourages open participation and feedback.