chronos-forecasting - Enhance Time Series Forecasting Using Chronos Language Model Techniques

Chronos: Learning the Language of Time Series

Introduction to Chronos

Chronos is a family of pretrained time series forecasting models based on language model architectures. A time series is converted into a sequence of tokens through scaling and quantization, and a language model is trained on these tokens with cross-entropy loss. Once trained, the model produces probabilistic forecasts by sampling multiple future trajectories conditioned on the historical context. The models were trained on public time series datasets and on synthetic data generated with Gaussian processes.
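The scaling and quantization step can be illustrated with a minimal sketch. It assumes mean scaling (dividing by the mean absolute value) and uniformly spaced bins; the bin count, the bin range, and the function names `tokenize` and `detokenize` are hypothetical choices made for this illustration and are not taken from the Chronos code base.

```python
import numpy as np

# Assumed illustration values, not taken from the source text.
NUM_BINS = 4094          # hypothetical: leaves room for special tokens in a 4096-token vocabulary
LOW, HIGH = -15.0, 15.0  # hypothetical range of scaled values covered by the bins


def tokenize(series, num_bins=NUM_BINS, low=LOW, high=HIGH):
    """Mean-scale a 1D series and map each value to an integer bin id."""
    series = np.asarray(series, dtype=float)
    scale = np.mean(np.abs(series))
    if scale == 0:
        scale = 1.0  # avoid dividing by zero for an all-zero series
    scaled = series / scale
    # num_bins uniform bins are separated by num_bins - 1 interior edges
    edges = np.linspace(low, high, num_bins + 1)[1:-1]
    tokens = np.digitize(scaled, edges)  # ids in [0, num_bins - 1]
    return tokens, scale


def detokenize(tokens, scale, num_bins=NUM_BINS, low=LOW, high=HIGH):
    """Map bin ids back to bin centers and undo the scaling."""
    width = (high - low) / num_bins
    centers = low + (np.asarray(tokens) + 0.5) * width
    return centers * scale


series = [112, 118, 132, 129, 121]
tokens, scale = tokenize(series)
reconstructed = detokenize(tokens, scale)
print(tokens)
print(np.round(reconstructed, 2))
```

The round trip is lossy: each value is recovered only up to half a bin width (in scaled units). In a setup like this, a forecast would be obtained by sampling token ids from the model's predicted distribution and passing them through a function like `detokenize`.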

Architecture

Chronos models are based on the T5 architecture, with one notable change: the vocabulary has 4096 tokens instead of the 32128 used by the original T5 models. The smaller vocabulary results in fewer parameters. The available models include:

chronos-t5-tiny: 8M parameters
chronos-t5-mini: 20M parameters
chronos-t5-small: 46M parameters
chronos-t5-base: 200M parameters
chronos-t5-large: 710M parameters
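To see why the smaller vocabulary reduces the parameter count, the following back-of-the-envelope sketch estimates how many embedding parameters are removed. It assumes the hidden sizes of the standard T5 small, base, and large checkpoints (512, 768, 1024) carry over to the Chronos variants and that a single embedding table is shared between input and output; both are assumptions for illustration, and `embedding_params_saved` is a hypothetical helper.

```python
T5_VOCAB = 32128
CHRONOS_VOCAB = 4096

# Hidden sizes of the standard T5 checkpoints (assumed to apply to Chronos-T5 as well)
D_MODEL = {"small": 512, "base": 768, "large": 1024}


def embedding_params_saved(d_model, old_vocab=T5_VOCAB, new_vocab=CHRONOS_VOCAB):
    """Parameters removed from one shared (vocab x d_model) embedding table."""
    return (old_vocab - new_vocab) * d_model


savings = {name: embedding_params_saved(d) for name, d in D_MODEL.items()}
for name, saved in savings.items():
    print(f"{name}: about {saved / 1e6:.1f}M fewer embedding parameters")
```

Under these assumptions the small model saves roughly 14M parameters, which is in line with the gap between the roughly 60M parameters usually quoted for T5-small and the 46M listed above.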

Zero-Shot Results

Chronos is notable for its zero-shot performance, meaning it can forecast on datasets that were not seen during training. In a comparison against local models and other pretrained models on 27 such datasets, Chronos performed strongly. For the evaluation setup and detailed results, refer to the original research paper.

How to Use Chronos

To make predictions with Chronos models:

Installation: You can install Chronos using:
```
pip install git+https://github.com/amazon-science/chronos-forecasting.git
```
For production use, it is recommended to use Chronos through AutoGluon, which supports model ensembling and deployment.

Forecasting Example: Here's a snippet on how to perform forecasting:

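```python
import pandas as pd
import torch
from chronos import ChronosPipeline

# Load a pretrained model; use device_map="cpu" if no GPU is available
pipeline = ChronosPipeline.from_pretrained(
    "amazon/chronos-t5-small",
    device_map="cuda",
    torch_dtype=torch.bfloat16,
)

# Monthly airline passenger counts, used here as the historical context
df = pd.read_csv("https://raw.githubusercontent.com/AileenNielsen/TimeSeriesAnalysisWithPython/master/data/AirPassengers.csv")

# Draw 20 sample paths for the next 12 time steps
forecast = pipeline.predict(
    context=torch.tensor(df["#Passengers"]),
    prediction_length=12,
    num_samples=20,
)
# forecast shape: [num_series, num_samples, prediction_length]
```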

Visualization: You can visualize the forecast data using matplotlib:

```python
import matplotlib.pyplot as plt
import numpy as np

# The forecast starts right after the last historical observation
forecast_index = range(len(df), len(df) + 12)
# Summarize the sample paths of the first (only) series into quantiles
low, median, high = np.quantile(forecast[0].numpy(), [0.1, 0.5, 0.9], axis=0)

plt.figure(figsize=(8, 4))
plt.plot(df["#Passengers"], color="royalblue", label="historical data")
plt.plot(forecast_index, median, color="tomato", label="median forecast")
# The band between the 10% and 90% quantiles is the 80% prediction interval
plt.fill_between(forecast_index, low, high, color="tomato", alpha=0.3, label="80% prediction interval")
plt.legend()
plt.grid()
plt.show()
```

Datasets

The datasets used to train and evaluate Chronos, covering both in-domain and zero-shot evaluation, are available on Hugging Face.

Security and License

Chronos is licensed under Apache-2.0, so it is open for community use and contribution. The project also provides guidelines for reporting security issues.

Conclusion

Chronos applies principles borrowed from language modeling to time series forecasting. Its ability to perform well on unseen data, together with its integration with tools such as AutoGluon, makes it a versatile option for predictive modeling and analytics.