Introduction to the Yi-1.5 Project
Yi-1.5 is an upgraded version of its predecessor, Yi, built by continuing pre-training on 500 billion tokens of high-quality data and fine-tuning on 3 million diverse samples. The result is a highly capable model that excels across programming, mathematics, reasoning, and instruction following.
Compared to the original Yi, Yi-1.5 shows substantial improvements in language understanding, commonsense reasoning, and reading comprehension, offering more robust performance on complex tasks. To suit different compute budgets, it is available in three model sizes: 34B, 9B, and 6B.
Recent Developments
On May 13, 2024, the Yi-1.5 series was open-sourced, with further improvements in programming, mathematics, reasoning, and instruction following, making the models accessible for broader development and application.
Getting Started
To begin using Yi-1.5, ensure you have Python 3.10 or a later version installed. Then, set up your environment and install the necessary packages using:
pip install -r requirements.txt
Next, download the models from platforms like Hugging Face, ModelScope, or WiseModel.
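For example, a checkpoint can be fetched programmatically with the huggingface_hub library; the repository name below assumes the official 01-ai organization on Hugging Face and should be swapped for the size and variant you need.

from huggingface_hub import snapshot_download

# Download the chosen checkpoint into a local directory.
snapshot_download(repo_id="01-ai/Yi-1.5-6B-Chat", local_dir="Yi-1.5-6B-Chat")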
Running Locally
For example, to run Yi-1.5-34B-Chat locally, use the following Python code to initialize the model and test its chat capabilities:
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = '<your-model-path>'

# Load the tokenizer and model; device_map="auto" spreads weights across available GPUs.
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    torch_dtype='auto'
).eval()

# Build a chat prompt; add_generation_prompt=True appends the assistant turn marker.
messages = [{"role": "user", "content": "hi"}]
input_ids = tokenizer.apply_chat_template(conversation=messages, tokenize=True, add_generation_prompt=True, return_tensors='pt')
output_ids = model.generate(input_ids.to('cuda'), eos_token_id=tokenizer.eos_token_id)

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(output_ids[0][input_ids.shape[1]:], skip_special_tokens=True)
print(response)
Using Ollama
Yi-1.5 models can also be run with Ollama, a tool for local deployment. Install Ollama, start its service, and then run the model:
ollama serve
ollama run yi:v1.5
For API usage, Ollama supports an OpenAI-compatible API to chat with Yi-1.5 models.
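As a minimal sketch, assuming the Ollama service is listening on its default port (11434), the standard openai Python client can talk to it; the model tag must match the one run above.

from openai import OpenAI

# Ollama serves an OpenAI-compatible endpoint under /v1; the api_key is unused
# but the client requires a value.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
reply = client.chat.completions.create(
    model="yi:v1.5",
    messages=[{"role": "user", "content": "hi"}],
)
print(reply.choices[0].message.content)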
Deployment Options
Before deploying Yi-1.5, ensure your software and hardware meet the requirements. One option is vLLM, a high-throughput inference and serving engine for large language models: you can start an OpenAI-compatible server with a chat model and interact with it via HTTP or a Python client, or run inference directly from Python.
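As a minimal sketch of the direct Python route (the model path is a placeholder; the served route uses the same OpenAI-style client shown for Ollama above, pointed at the vLLM server's address):

from vllm import LLM, SamplingParams

# Load the model once; vLLM batches and schedules requests internally.
llm = LLM(model='<your-model-path>')
params = SamplingParams(temperature=0.6, max_tokens=256)

outputs = llm.generate(["hi"], params)
print(outputs[0].outputs[0].text)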
Demonstrations and Fine-tuning
Yi-1.5's capabilities can be explored through web demos available on platforms like Hugging Face, or you can create a local instance. For tailored applications, several frameworks including LLaMA-Factory and Swift allow for fine-tuning of the Yi models.
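Those frameworks wrap a standard supervised fine-tuning loop behind their own configs. As an illustrative sketch of what such a run looks like at the library level, here is LoRA fine-tuning with Hugging Face transformers and peft rather than either framework; the model path, dataset file, and hyperparameters are all placeholders, not values from the Yi project.

from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

model_path = '<your-model-path>'  # e.g. a local Yi-1.5-6B checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=False)
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype='auto', device_map='auto')

# Attach low-rank adapters; only these small matrices are trained.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"))

# Placeholder instruction data: one JSON object per line with a "text" field.
dataset = load_dataset('json', data_files='<your-data>.jsonl')['train']
dataset = dataset.map(lambda ex: tokenizer(ex['text'], truncation=True, max_length=1024),
                      remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir='yi-lora-out', per_device_train_batch_size=1,
                           gradient_accumulation_steps=8, num_train_epochs=1,
                           learning_rate=1e-4, logging_steps=10),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()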
API Access
OpenAI-compatible Yi APIs are available via the Yi Platform, providing free tokens and competitive pay-as-you-go options. APIs are also hosted on Replicate and OpenRouter.
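Since these endpoints follow the OpenAI API, the same client pattern applies; the base URL, key, and model name below are placeholders to be taken from the respective platform's documentation.

from openai import OpenAI

# Placeholders: consult the Yi Platform docs for the real base URL and model names.
client = OpenAI(base_url="<yi-platform-base-url>", api_key="<your-api-key>")
reply = client.chat.completions.create(
    model="<yi-model-name>",
    messages=[{"role": "user", "content": "hi"}],
)
print(reply.choices[0].message.content)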
License Information
The Yi-1.5 models are distributed under the Apache 2.0 license. If you create derivative works based on these models, include an attribution stating that the work is a derivative of the Yi series models by 01.AI, used under the Apache 2.0 license.