Introduction to Ollama Python Library
The Ollama Python library is designed for developers who want to integrate their Python 3.8+ projects with Ollama. Ollama is a platform for running large language models locally, and this library acts as a thin interface to it, making the interaction smooth and efficient.
Installation
Getting started with the Ollama Python library is straightforward. Install it with the Python package manager pip by running the following command in a terminal:
pip install ollama
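Once the package is installed, and assuming an Ollama server is already running locally (for example via ollama serve, on its default address http://localhost:11434), a quick sanity check is to list the models the server knows about:

import ollama

# Contacts the local Ollama server and prints whatever models
# have already been pulled.
models = ollama.list()
print(models)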
Basic Usage
The library provides a straightforward interface for chatting with models such as llama3.1. Typical usage involves importing the library, specifying a model, and passing a list of messages. For example, to ask why the sky is blue:
import ollama

response = ollama.chat(model='llama3.1', messages=[
    {
        'role': 'user',
        'content': 'Why is the sky blue?',
    },
])
print(response['message']['content'])
This code snippet sends a message to the model and prints out the response, making it easy to interact with advanced language models.
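In recent versions of the library the response object also supports attribute access, which some developers find more readable. A minimal variant of the snippet above, assuming a reasonably current library version:

import ollama

response = ollama.chat(model='llama3.1', messages=[
    {'role': 'user', 'content': 'Why is the sky blue?'},
])

# On recent library versions this is equivalent to
# response['message']['content']
print(response.message.content)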
Streaming Responses
The library supports streaming responses, which is useful for long outputs because the reply arrives in chunks as it is generated rather than all at once. By setting stream=True and iterating over the returned generator, developers can handle output in real time:
import ollama

stream = ollama.chat(
    model='llama3.1',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)

for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)
API Features
The Ollama Python library exposes functions that mirror the Ollama REST API. Key methods include:
- Chat: Engage in conversations with a model.
- Generate: Produce text outputs from a prompt.
- List: View available models.
- Show: Display model details.
- Create: Define new models with custom setups.
- Copy: Duplicate existing models.
- Delete: Remove unwanted models.
- Pull/Push: Download models from, or upload models to, a model registry.
- Embed: Generate embeddings from text inputs.
- Ps: List the models currently loaded in memory.
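As a rough sketch of how a few of these methods fit together (assuming a local server with llama3.1 already pulled; exact response fields can vary between library versions):

import ollama

# Generate: one-off text completion from a prompt, no chat history
result = ollama.generate(model='llama3.1', prompt='Name three primary colors.')
print(result['response'])

# Show: inspect details of a local model
info = ollama.show('llama3.1')
print(info)

# Embed: turn text into an embedding vector
emb = ollama.embed(model='llama3.1', input='The sky is blue.')
print(len(emb['embeddings'][0]))

# Ps: models currently loaded in memory
print(ollama.ps())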
Custom Client Configuration
Users can configure a custom client to connect to a specific Ollama instance, giving finer control over settings such as the host address and request timeout:
from ollama import Client

client = Client(host='http://localhost:11434')
response = client.chat(model='llama3.1', messages=[
    {
        'role': 'user',
        'content': 'Why is the sky blue?',
    },
])
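On the timeout side, the client appears to forward extra keyword arguments to the underlying HTTP client (httpx), so a request timeout can be configured roughly as sketched below; treat the exact parameter handling as an assumption to verify against the library version in use:

from ollama import Client

# Assumption: extra kwargs such as timeout are passed through to httpx
client = Client(host='http://localhost:11434', timeout=30.0)
response = client.chat(model='llama3.1', messages=[
    {'role': 'user', 'content': 'Why is the sky blue?'},
])
print(response['message']['content'])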
Asynchronous Capabilities
The library also supports asynchronous operation through AsyncClient, so requests do not block other work running in the event loop. This is useful for scalable applications that handle many concurrent requests:
import asyncio
from ollama import AsyncClient


async def chat():
    message = {'role': 'user', 'content': 'Why is the sky blue?'}
    async for part in await AsyncClient().chat(model='llama3.1', messages=[message], stream=True):
        print(part['message']['content'], end='', flush=True)

asyncio.run(chat())
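Because AsyncClient requests are coroutines, several of them can also run concurrently. A small sketch using asyncio.gather (the prompts here are just placeholders):

import asyncio
from ollama import AsyncClient


async def ask(client, question):
    response = await client.chat(
        model='llama3.1',
        messages=[{'role': 'user', 'content': question}],
    )
    return response['message']['content']


async def main():
    client = AsyncClient()
    # Both requests are awaited concurrently instead of one after the other.
    answers = await asyncio.gather(
        ask(client, 'Why is the sky blue?'),
        ask(client, 'Why is grass green?'),
    )
    for answer in answers:
        print(answer)

asyncio.run(main())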
Error Handling
The library provides mechanisms to handle errors gracefully, so developers can anticipate and manage exceptions such as a missing model or a failed request:
import ollama

model = 'does-not-yet-exist'
try:
    ollama.chat(model)
except ollama.ResponseError as e:
    print('Error:', e.error)
    if e.status_code == 404:
        ollama.pull(model)
Overall, the Ollama Python library offers a versatile and efficient way to integrate natural language processing capabilities into Python projects, enabling developers to harness advanced models with minimal hassle.