Introduction to the Yi-1.5 Project
Yi-1.5 is an upgraded version of its predecessor, Yi, built by continuing pre-training on 500 billion tokens of high-quality data and fine-tuning on 3 million diverse samples. The result is a highly capable model that excels across programming, mathematics, reasoning, and instruction following.
Compared to the original Yi, Yi-1.5 shows substantial improvements in language understanding, commonsense reasoning, and reading comprehension, offering more robust performance on complex tasks. To suit different compute budgets, it is available in three model sizes: 34B, 9B, and 6B.
Recent Developments
On May 13, 2024, the Yi-1.5 series was open-sourced, with further improvements in programming, mathematics, reasoning, and instruction following, making the models accessible for broader development and application.
Getting Started
To begin using Yi-1.5, ensure you have Python 3.10 or a later version installed. Then, set up your environment and install the necessary packages using:
pip install -r requirements.txt
Next, download the models from platforms like Hugging Face, ModelScope, or WiseModel.
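For example, a checkpoint can be fetched programmatically with the huggingface_hub library; the repository name below assumes the official 01-ai organization on Hugging Face and should be swapped for the size and variant you need.

from huggingface_hub import snapshot_download

# Download the chosen checkpoint into a local directory.
snapshot_download(repo_id="01-ai/Yi-1.5-6B-Chat", local_dir="Yi-1.5-6B-Chat")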
Running Locally
For example, to run Yi-1.5-34B-Chat locally, use the following Python code to initialize the model and test its chat capabilities:
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = '<your-model-path>'

# Load the tokenizer and model; device_map="auto" spreads weights across available GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype='auto'
).eval()

# Build a chat prompt; add_generation_prompt=True appends the assistant turn marker.
messages = [{"role": "user", "content": "hi"}]
input_ids = tokenizer.apply_chat_template(conversation=messages, tokenize=True, add_generation_prompt=True, return_tensors='pt')
output_ids = model.generate(input_ids.to('cuda'), eos_token_id=tokenizer.eos_token_id)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)
print(response)
Using Ollama
Yi-1.5 models can also be run with Ollama, a tool for local deployment. Install Ollama, start its service, and then run the model:
ollama serve
ollama run yi:v1.5
For API usage, Ollama supports an OpenAI-compatible API to chat with Yi-1.5 models.
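As a minimal sketch, assuming the Ollama service is listening on its default port (11434), the standard openai Python client can talk to it; the model tag must match the one run above.

from openai import OpenAI

# Ollama serves an OpenAI-compatible endpoint under /v1; the api_key is unused
# but the client requires a value.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
reply = client.chat.completions.create(
    model="yi:v1.5",
    messages=[{"role": "user", "content": "hi"}],
)
print(reply.choices[0].message.content)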
Deployment Options
Before deploying Yi-1.5, ensure your software and hardware meet the requirements. One option is vLLM, a high-throughput inference and serving engine for large language models: you can start an OpenAI-compatible server with a chat model and interact with it via HTTP or a Python client, or run inference directly from Python.
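As a minimal sketch of the direct Python route (the model path is a placeholder; the served route uses the same OpenAI-style client shown for Ollama above, pointed at the vLLM server's address):

from vllm import LLM, SamplingParams

# Load the model once; vLLM batches and schedules requests internally.
llm = LLM(model='<your-model-path>')
params = SamplingParams(temperature=0.6, max_tokens=256)

outputs = llm.generate(["hi"], params)
print(outputs[0].outputs[0].text)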
Demonstrations and Fine-tuning
Yi-1.5's capabilities can be explored through web demos available on platforms like Hugging Face, or you can create a local instance. For tailored applications, several frameworks including LLaMA-Factory and Swift allow for fine-tuning of the Yi models.
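Those frameworks wrap a standard supervised fine-tuning loop behind their own configs. As an illustrative sketch of what such a run looks like at the library level, here is LoRA fine-tuning with Hugging Face transformers and peft rather than either framework; the model path, dataset file, and hyperparameters are all placeholders, not values from the Yi project.

from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

model_path = '<your-model-path>'  # e.g. a local Yi-1.5-6B checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype='auto', device_map='auto')

# Attach low-rank adapters; only these small matrices are trained.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# Placeholder instruction data: one JSON object per line with a "text" field.
dataset = load_dataset('json', data_files='<your-data>.jsonl')['train']
dataset = dataset.map(lambda ex: tokenizer(ex['text'], truncation=True, max_length=1024),
                      remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir='yi-lora-out', per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=1,
                           learning_rate=1e-4, logging_steps=10),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()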
API Access
OpenAI-compatible Yi APIs are available via the Yi Platform, providing free tokens and competitive pay-as-you-go options. APIs are also hosted on Replicate and OpenRouter.
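Since these endpoints follow the OpenAI API, the same client pattern applies; the base URL, key, and model name below are placeholders to be taken from the respective platform's documentation.

from openai import OpenAI

# Placeholders: consult the Yi Platform docs for the real base URL and model names.
client = OpenAI(base_url="<yi-platform-base-url>", api_key="<your-api-key>")
reply = client.chat.completions.create(
    model="<yi-model-name>",
    messages=[{"role": "user", "content": "hi"}],
)
print(reply.choices[0].message.content)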
License Information
The Yi-1.5 models are distributed under the Apache 2.0 license. If you create derivative works based on these models, include an attribution stating that the work is a derivative of the Yi series models by 01.AI, used under the Apache 2.0 license.