Chronos: Learning the Language of Time Series¶
Overview¶
Chronos is a novel approach to time-series forecasting that treats time series as a language. It quantizes series values into discrete tokens and models the resulting sequences with transformer architectures similar to large language models (LLMs), which enables zero-shot forecasting across diverse domains.
Paper¶
Chronos: Learning the Language of Time Series
Key Features¶
- ✅ Language model architecture for time series
- ✅ Tokenization-based approach
- ✅ Strong zero-shot capabilities
- ✅ Multiple model sizes
- ✅ Probabilistic forecasting
Model Variants¶
Chronos comes in several sizes:
| Model | Parameters | Use Case |
|---|---|---|
| Chronos-T5-tiny | ~8M | Fast inference, resource-constrained |
| Chronos-T5-mini | ~20M | Balanced performance |
| Chronos-T5-small | ~46M | Good accuracy |
| Chronos-T5-base | ~200M | High accuracy |
| Chronos-T5-large | ~710M | Best performance |
Quick Start¶
from samay.model import ChronosModel
from samay.dataset import ChronosDataset
# Model configuration
config = {
"model_size": "small", # tiny, mini, small, base, large
"context_length": 512,
"prediction_length": 64,
"num_samples": 20,
"temperature": 1.0,
"top_k": 50,
"top_p": 1.0,
}
# Load model
model = ChronosModel(config)
# Load dataset
train_dataset = ChronosDataset(
name="ett",
datetime_col="date",
path="./data/ETTh1.csv",
mode="train",
config=config,
)
# Evaluate (zero-shot)
test_dataset = ChronosDataset(
name="ett",
datetime_col="date",
path="./data/ETTh1.csv",
mode="test",
config=config,
)
avg_loss, trues, preds, histories = model.evaluate(test_dataset)
print(f"Average Loss: {avg_loss}")
Configuration Parameters¶
Model Configuration¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| model_size | str | "small" | Model size: "tiny", "mini", "small", "base", "large" |
| context_length | int | 512 | Length of input context |
| prediction_length | int | 64 | Forecast horizon |
| num_samples | int | 20 | Number of samples for probabilistic forecasting |
| temperature | float | 1.0 | Sampling temperature |
| top_k | int | 50 | Top-k sampling parameter |
| top_p | float | 1.0 | Nucleus sampling parameter |
| tokenizer_class | str | "MeanScaleUniformBins" | Tokenizer type |
| tokenizer_kwargs | dict | {"low_limit": -15.0, "high_limit": 15.0} | Tokenizer parameters |
Example Configurations¶
Fast Inference (Tiny Model)¶
config = {
"model_size": "tiny",
"context_length": 256,
"prediction_length": 32,
"num_samples": 10,
}
Balanced Performance (Small Model)¶
config = {
"model_size": "small",
"context_length": 512,
"prediction_length": 64,
"num_samples": 20,
"temperature": 1.0,
}
High Accuracy (Base Model)¶
config = {
"model_size": "base",
"context_length": 512,
"prediction_length": 96,
"num_samples": 50,
"temperature": 0.8,
}
Dataset¶
ChronosDataset Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
| name | str | None | Dataset name |
| datetime_col | str | "ds" | Name of datetime column |
| path | str | Required | Path to CSV file |
| mode | str | None | "train" or "test" |
| batch_size | int | 16 | Batch size |
| boundaries | list | [0, 0, 0] | Custom split boundaries |
| stride | int | 10 | Stride for sliding window |
| config | dict | None | Model configuration (used for context/prediction length) |
Data Format¶
CSV file with datetime and value columns:
date,HUFL,HULL,MUFL,MULL,LUFL,LULL,OT
2016-07-01 00:00:00,5.827,2.009,1.599,0.462,5.677,2.009,6.082
2016-07-01 01:00:00,5.693,2.076,1.492,0.426,5.485,1.942,5.947
...
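Before handing a file to ChronosDataset, it can help to sanity-check the layout with pandas (a standalone check, independent of the samay library; the rows below mirror the ETTh1 sample above):

```python
import io
import pandas as pd

# A few rows in the expected layout: one datetime column plus one column per channel
csv_text = """date,HUFL,HULL,MUFL,MULL,LUFL,LULL,OT
2016-07-01 00:00:00,5.827,2.009,1.599,0.462,5.677,2.009,6.082
2016-07-01 01:00:00,5.693,2.076,1.492,0.426,5.485,1.942,5.947
"""

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
print(df.shape)           # (2, 8): 2 rows, 1 datetime column + 7 value channels
print(df["date"].dtype)   # datetime64[ns]
```

If the datetime column parses cleanly and every other column is numeric, the file is in the shape ChronosDataset expects.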
Zero-Shot Forecasting¶
Chronos excels at zero-shot forecasting without any fine-tuning:
from samay.model import ChronosModel
from samay.dataset import ChronosDataset
# Load model
config = {
"model_size": "small",
"context_length": 512,
"prediction_length": 96,
"num_samples": 20,
}
model = ChronosModel(config)
# Load test data directly
test_dataset = ChronosDataset(
name="ett",
datetime_col="date",
path="./data/ETTh1.csv",
mode="test",
config=config,
)
# Zero-shot evaluation (no training!)
avg_loss, trues, preds, histories = model.evaluate(test_dataset)
print(f"Zero-shot Loss: {avg_loss}")
Probabilistic Forecasting¶
Chronos provides probabilistic forecasts through multiple samples:
config = {
"model_size": "small",
"context_length": 512,
"prediction_length": 96,
"num_samples": 100, # Generate 100 samples
"temperature": 1.0,
}
model = ChronosModel(config)
# The model will generate multiple forecast samples
avg_loss, trues, preds, histories = model.evaluate(test_dataset)
# preds shape: (num_windows, num_channels, prediction_length, num_samples)
Analyzing Prediction Uncertainty¶
import numpy as np
import matplotlib.pyplot as plt
# Get all samples
avg_loss, trues, preds, histories = model.evaluate(test_dataset)
# Calculate statistics
mean_pred = np.mean(preds, axis=-1) # Mean across samples
std_pred = np.std(preds, axis=-1) # Std across samples
lower_bound = np.percentile(preds, 10, axis=-1)
upper_bound = np.percentile(preds, 90, axis=-1)
# Plot with uncertainty bands
sample_idx = 0
channel_idx = 0
history = histories[sample_idx, channel_idx, :]
true = trues[sample_idx, channel_idx, :]
pred_mean = mean_pred[sample_idx, channel_idx, :]
pred_lower = lower_bound[sample_idx, channel_idx, :]
pred_upper = upper_bound[sample_idx, channel_idx, :]
plt.figure(figsize=(14, 5))
plt.plot(range(len(history)), history, label="History", linewidth=2)
plt.plot(
range(len(history), len(history) + len(true)),
true,
label="Ground Truth",
linestyle="--",
linewidth=2
)
plt.plot(
range(len(history), len(history) + len(pred_mean)),
pred_mean,
label="Mean Prediction",
linewidth=2
)
plt.fill_between(
range(len(history), len(history) + len(pred_mean)),
pred_lower,
pred_upper,
alpha=0.3,
label="80% Prediction Interval"
)
plt.legend()
plt.title("Chronos Probabilistic Forecasting")
plt.grid(alpha=0.3)
plt.show()
Fine-Tuning¶
While Chronos is designed for zero-shot forecasting, you can fine-tune it on your data:
# Load model
config = {
"model_size": "small",
"context_length": 512,
"prediction_length": 96,
}
model = ChronosModel(config)
# Load training data
train_dataset = ChronosDataset(
name="ett",
datetime_col="date",
path="./data/ETTh1.csv",
mode="train",
config=config,
batch_size=16,
)
# Fine-tune
finetuned_model = model.finetune(
train_dataset,
epochs=5,
learning_rate=1e-5, # Small learning rate
)
# Evaluate
test_dataset = ChronosDataset(
name="ett",
datetime_col="date",
path="./data/ETTh1.csv",
mode="test",
config=config,
)
avg_loss, trues, preds, histories = finetuned_model.evaluate(test_dataset)
Sampling Strategies¶
Temperature Sampling¶
Control the randomness of predictions:
# Lower temperature = more conservative predictions
config = {
"temperature": 0.5, # More deterministic
# ...
}
# Higher temperature = more diverse predictions
config = {
"temperature": 1.5, # More exploratory
# ...
}
Top-K Sampling¶
Restrict sampling to the k most likely tokens at each step:
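For example (the value below is illustrative, not a recommendation):

```python
# Lower top_k = sample only from the most probable tokens
config = {
    "top_k": 20,  # Consider only the 20 highest-probability tokens
    # ...
}
```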
Nucleus (Top-P) Sampling¶
Sample from the smallest set of tokens whose cumulative probability exceeds p:
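For example (again, an illustrative value):

```python
# Lower top_p = sample from a smaller, higher-probability nucleus of tokens
config = {
    "top_p": 0.9,  # Keep tokens until cumulative probability exceeds 0.9
    # ...
}
```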
Evaluation¶
Basic Evaluation¶
test_dataset = ChronosDataset(
datetime_col="date",
path="./data/ETTh1.csv",
mode="test",
config=config,
)
avg_loss, trues, preds, histories = model.evaluate(test_dataset)
With Custom Metrics¶
from samay.metric import mse, mae, mape
import numpy as np
avg_loss, trues, preds, histories = model.evaluate(test_dataset)
# Use mean of samples for metrics
trues = np.array(trues)
preds = np.mean(np.array(preds), axis=-1) # Average across samples
print(f"MSE: {mse(trues, preds):.4f}")
print(f"MAE: {mae(trues, preds):.4f}")
print(f"MAPE: {mape(trues, preds):.4f}")
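For reference, these metrics reduce to one-liners in NumPy. This is an illustrative sketch; samay.metric's implementations may differ in details such as the MAPE scaling or reduction axes:

```python
import numpy as np

def mse(y_true, y_pred):
    # Mean squared error
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    # Mean absolute error
    return float(np.mean(np.abs(y_true - y_pred)))

def mape(y_true, y_pred, eps=1e-8):
    # Mean absolute percentage error; eps guards against division by zero
    return float(np.mean(np.abs((y_true - y_pred) / (np.abs(y_true) + eps))) * 100)

y_true = np.array([1.0, 2.0, 4.0])
y_pred = np.array([1.0, 2.5, 3.0])
print(mse(y_true, y_pred))  # ≈ 0.4167
```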
Advanced Usage¶
Custom Context Lengths¶
# Short context for simpler patterns
config = {
"context_length": 128,
"prediction_length": 32,
# ...
}
# Long context for complex patterns
config = {
"context_length": 1024,
"prediction_length": 128,
# ...
}
Multivariate Forecasting¶
Chronos handles multivariate data by forecasting each channel independently:
# Your CSV with multiple columns
dataset = ChronosDataset(
datetime_col="date",
path="./data/multivariate.csv",
mode="test",
config=config,
)
# Model will forecast all channels
avg_loss, trues, preds, histories = model.evaluate(dataset)
Tokenization¶
Chronos converts real-valued observations into discrete tokens, much as NLP models map words onto a fixed vocabulary:
Mean-Scale Uniform Bins¶
The default tokenizer normalizes values and bins them:
config = {
"tokenizer_class": "MeanScaleUniformBins",
"tokenizer_kwargs": {
"low_limit": -15.0, # Lower bound for binning
"high_limit": 15.0, # Upper bound for binning
},
# ...
}
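Conceptually, mean-scale uniform binning can be sketched in NumPy as follows. This is an illustrative approximation, not the library's actual tokenizer (which also handles special tokens and edge cases):

```python
import numpy as np

def tokenize(series, low_limit=-15.0, high_limit=15.0, n_tokens=4096):
    # 1) Scale the series by its mean absolute value
    scale = np.mean(np.abs(series))
    scaled = series / scale
    # 2) Quantize scaled values into uniformly spaced bins between the limits;
    #    values outside the limits are clipped to the boundary bins
    edges = np.linspace(low_limit, high_limit, n_tokens - 1)
    tokens = np.digitize(np.clip(scaled, low_limit, high_limit), edges)
    return tokens, scale

tokens, scale = tokenize(np.array([5.8, 2.0, 1.6, 0.5, 5.7, 2.0, 6.1]))
```

The model predicts token IDs; multiplying the de-binned values back by the stored scale recovers forecasts in the original units.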
Custom Tokenizer¶
You can customize the tokenization:
config = {
"tokenizer_class": "MeanScaleUniformBins",
"tokenizer_kwargs": {
"low_limit": -10.0,
"high_limit": 10.0,
"n_tokens": 4096, # Vocabulary size
},
}
Visualization¶
Multiple Forecast Samples¶
import matplotlib.pyplot as plt
import numpy as np
avg_loss, trues, preds, histories = model.evaluate(test_dataset)
# Plot multiple samples
sample_idx = 0
channel_idx = 0
num_samples_to_plot = 10
history = histories[sample_idx, channel_idx, :]
true = trues[sample_idx, channel_idx, :]
plt.figure(figsize=(14, 5))
plt.plot(range(len(history)), history, label="History", linewidth=2, color='blue')
plt.plot(
range(len(history), len(history) + len(true)),
true,
label="Ground Truth",
linestyle="--",
linewidth=2,
color='green'
)
# Plot multiple forecast samples
for i in range(num_samples_to_plot):
pred_sample = preds[sample_idx, channel_idx, :, i]
plt.plot(
range(len(history), len(history) + len(pred_sample)),
pred_sample,
alpha=0.3,
color='red'
)
# Plot mean prediction
pred_mean = np.mean(preds[sample_idx, channel_idx, :, :], axis=-1)
plt.plot(
range(len(history), len(history) + len(pred_mean)),
pred_mean,
label="Mean Prediction",
linewidth=2,
color='red'
)
plt.legend()
plt.title("Chronos: Multiple Forecast Samples")
plt.grid(alpha=0.3)
plt.show()
Tips and Best Practices¶
1. Model Selection¶
- Use tiny/mini for fast inference and experimentation
- Use small/base for production applications
- Use large for highest accuracy (if resources allow)
2. Context Length¶
- Longer context captures more patterns but is slower
- Start with 512, adjust based on your data
3. Number of Samples¶
- More samples = better uncertainty estimation
- 20-50 samples is usually sufficient
- Use fewer samples for faster inference
4. Zero-Shot vs Fine-Tuning¶
- Try zero-shot first (Chronos is designed for this)
- Fine-tune only if zero-shot performance is insufficient
- Use very small learning rates when fine-tuning
Common Issues¶
CUDA Out of Memory¶
# Use smaller model
config = {
"model_size": "tiny", # Instead of "base"
# ...
}
# Reduce batch size
dataset = ChronosDataset(
batch_size=4, # Instead of 16
# ...
)
# Reduce context length
config = {
"context_length": 256, # Instead of 512
# ...
}
Slow Inference¶
# Use smaller model
config = {
"model_size": "mini",
# ...
}
# Reduce number of samples
config = {
"num_samples": 10, # Instead of 50
# ...
}
Poor Predictions¶
# Try larger model
config = {
"model_size": "base", # Instead of "small"
# ...
}
# Increase context length
config = {
"context_length": 1024, # More context
# ...
}
# Adjust temperature
config = {
"temperature": 0.8, # Less randomness
# ...
}
API Reference¶
For detailed API documentation, see:
Examples¶
See the Examples page for complete working examples.