TimesFM: Time Series Foundation Model¶
    A Decoder-Only Foundation Model for Time-Series Forecasting by Google Research
Overview¶
TimesFM (Time Series Foundation Model) is a decoder-only architecture developed by Google Research for time-series forecasting. It's designed for efficient zero-shot forecasting across diverse domains.
Paper¶
A decoder-only foundation model for time-series forecasting
Key Features¶
- ✅ Decoder-only transformer architecture
- ✅ Efficient zero-shot forecasting
- ✅ Patch-based input processing (sketched below)
- ✅ Multiple quantile predictions
- ✅ Fast inference on GPU
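Conceptually, patch-based input processing means the context window is tokenized into fixed-length chunks rather than individual points. A minimal NumPy sketch of that reshaping, purely illustrative and not the model's internal code:
import numpy as np
# Illustrative only: how a 512-point context window becomes
# 16 non-overlapping patch "tokens" of length 32.
context_len, input_patch_len = 512, 32
series = np.random.randn(context_len)          # one univariate context window
patches = series.reshape(-1, input_patch_len)  # shape (16, 32)
print(patches.shape)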
Quick Start¶
from samay.model import TimesfmModel
from samay.dataset import TimesfmDataset
# Model configuration
repo = "google/timesfm-1.0-200m-pytorch"
config = {
    "context_len": 512,
    "horizon_len": 192,
    "backend": "gpu",
    "per_core_batch_size": 32,
    "input_patch_len": 32,
    "output_patch_len": 128,
    "num_layers": 20,
    "model_dims": 1280,
    "quantiles": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
}
# Load model
tfm = TimesfmModel(config=config, repo=repo)
# Load dataset
train_dataset = TimesfmDataset(
    name="ett",
    datetime_col='date',
    path='data/ETTh1.csv',
    mode='train',
    context_len=config["context_len"],
    horizon_len=config["horizon_len"]
)
# Evaluate (zero-shot)
avg_loss, trues, preds, histories = tfm.evaluate(train_dataset)
print(f"Average Loss: {avg_loss}")
Model Variants¶
TimesFM comes in multiple sizes:
| Model | Parameters | Repository | 
|---|---|---|
| TimesFM 1.0 (200M) | 200M | google/timesfm-1.0-200m-pytorch | 
| TimesFM 2.0 (500M) | 500M | google/timesfm-2.0-500m-pytorch | 
Choosing a Model¶
# Smaller, faster model
repo = "google/timesfm-1.0-200m-pytorch"
# Larger, more accurate model
repo = "google/timesfm-2.0-500m-pytorch"
Configuration Parameters¶
Model Configuration¶
| Parameter | Type | Default | Description | 
|---|---|---|---|
| context_len | int | 512 | Length of historical context | 
| horizon_len | int | 192 | Forecast horizon | 
| backend | str | "gpu" | Backend: "gpu" or "cpu" | 
| per_core_batch_size | int | 32 | Batch size per core | 
| input_patch_len | int | 32 | Length of input patches | 
| output_patch_len | int | 128 | Length of output patches | 
| num_layers | int | 20 | Number of transformer layers | 
| model_dims | int | 1280 | Model dimension | 
| quantiles | list | [0.1, ..., 0.9] | Quantiles for prediction intervals | 
Example Configurations¶
Standard Configuration (200M Model)¶
config = {
    "context_len": 512,
    "horizon_len": 192,
    "backend": "gpu",
    "per_core_batch_size": 32,
    "input_patch_len": 32,
    "output_patch_len": 128,
    "num_layers": 20,
    "model_dims": 1280,
    "quantiles": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
}
Larger Model (500M)¶
config = {
    "context_len": 512,
    "horizon_len": 192,
    "backend": "gpu",
    "per_core_batch_size": 32,
    "input_patch_len": 32,
    "output_patch_len": 128,
    "num_layers": 50,  # More layers
    "model_dims": 1280,
    "quantiles": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
}
CPU Inference¶
config = {
    "context_len": 512,
    "horizon_len": 96,
    "backend": "cpu",  # Use CPU
    "per_core_batch_size": 8,  # Smaller batch
    # ... other configs
}
Dataset¶
TimesfmDataset Parameters¶
| Parameter | Type | Default | Description | 
|---|---|---|---|
| name | str | None | Dataset name | 
| datetime_col | str | "ds" | Name of datetime column | 
| path | str | Required | Path to CSV file | 
| mode | str | "train" | "train" or "test" | 
| context_len | int | 128 | Length of input context | 
| horizon_len | int | 32 | Forecast horizon | 
| freq | str | "h" | Frequency: "h", "d", "w", etc. | 
| normalize | bool | False | Whether to normalize data | 
| stride | int | 10 | Stride for sliding window | 
| batchsize | int | 4 | Batch size | 
Data Format¶
A CSV file with a datetime column and one or more value columns:
date,HUFL,HULL,MUFL,MULL,LUFL,LULL,OT
2016-07-01 00:00:00,5.827,2.009,1.599,0.462,5.677,2.009,6.082
2016-07-01 01:00:00,5.693,2.076,1.492,0.426,5.485,1.942,5.947
...
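Before building a TimesfmDataset, it can help to sanity-check the file with pandas (a sketch, assuming the ETTh1 path used throughout this page):
import pandas as pd
# Quick sanity check of the CSV before building a TimesfmDataset.
df = pd.read_csv("data/ETTh1.csv", parse_dates=["date"])
print(df.dtypes)                           # 'date' should be datetime64, the rest numeric
print(df["date"].is_monotonic_increasing)  # timestamps should be sorted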
Zero-Shot Forecasting¶
TimesFM excels at zero-shot forecasting:
from samay.model import TimesfmModel
from samay.dataset import TimesfmDataset
# Load model
repo = "google/timesfm-1.0-200m-pytorch"
config = {
    "context_len": 512,
    "horizon_len": 192,
    "backend": "gpu",
    "per_core_batch_size": 32,
    "input_patch_len": 32,
    "output_patch_len": 128,
    "num_layers": 20,
    "model_dims": 1280,
    "quantiles": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
}
tfm = TimesfmModel(config=config, repo=repo)
# Load test data
test_dataset = TimesfmDataset(
    name="ett",
    datetime_col='date',
    path='data/ETTh1.csv',
    mode='test',
    context_len=config["context_len"],
    horizon_len=config["horizon_len"]
)
# Zero-shot evaluation (no training!)
avg_loss, trues, preds, histories = tfm.evaluate(test_dataset)
print(f"Zero-shot Loss: {avg_loss}")
Evaluation¶
Basic Evaluation¶
test_dataset = TimesfmDataset(
    name="ett",
    datetime_col='date',
    path='data/ETTh1.csv',
    mode='test',
    context_len=512,
    horizon_len=192
)
avg_loss, trues, preds, histories = tfm.evaluate(test_dataset)
With Custom Metrics¶
from samay.metric import mse, mae, mape
import numpy as np
avg_loss, trues, preds, histories = tfm.evaluate(test_dataset)
trues = np.array(trues)
preds = np.array(preds)
print(f"MSE: {mse(trues, preds):.4f}")
print(f"MAE: {mae(trues, preds):.4f}")
print(f"MAPE: {mape(trues, preds):.4f}")
Quantile Predictions¶
TimesFM provides prediction intervals via quantiles:
config = {
    # ... other configs
    "quantiles": [0.1, 0.25, 0.5, 0.75, 0.9],  # 10%, 25%, median, 75%, 90%
}
tfm = TimesfmModel(config=config, repo=repo)
# The model will output predictions for each quantile
avg_loss, trues, preds, histories = tfm.evaluate(test_dataset)
# preds shape: (num_samples, num_channels, horizon_len, num_quantiles)
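A quick sanity check on the quantile outputs is empirical coverage: the fraction of ground-truth points falling inside a given interval. A sketch, assuming preds has the shape shown in the comment above and the five quantiles [0.1, 0.25, 0.5, 0.75, 0.9]:
import numpy as np
trues = np.array(trues)
preds = np.array(preds)
lower = preds[..., 0]  # 0.1 quantile
upper = preds[..., 4]  # 0.9 quantile
# A well-calibrated 10-90% interval should cover roughly 80% of points.
coverage = np.mean((trues >= lower) & (trues <= upper))
print(f"Empirical 10-90% coverage: {coverage:.2%}")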
Visualizing Prediction Intervals¶
import matplotlib.pyplot as plt
import numpy as np
# Convert outputs to arrays and index the quantile dimension:
trues = np.array(trues)
preds = np.array(preds)
histories = np.array(histories)
# preds has shape (num_samples, num_channels, horizon_len, num_quantiles).
# With quantiles [0.1, 0.25, 0.5, 0.75, 0.9]:
median_idx = 2  # index of the 0.5 quantile
lower_idx = 0   # index of the 0.1 quantile
upper_idx = 4   # index of the 0.9 quantile
sample_idx = 0
channel_idx = 0
history = histories[sample_idx, channel_idx, :]
true = trues[sample_idx, channel_idx, :]
pred_median = preds[sample_idx, channel_idx, :, median_idx]
pred_lower = preds[sample_idx, channel_idx, :, lower_idx]
pred_upper = preds[sample_idx, channel_idx, :, upper_idx]
forecast_range = range(len(history), len(history) + len(pred_median))
plt.figure(figsize=(14, 5))
plt.plot(range(len(history)), history, label="History", linewidth=2)
plt.plot(
    forecast_range,
    true,
    label="Ground Truth",
    linestyle="--",
    linewidth=2
)
plt.plot(
    forecast_range,
    pred_median,
    label="Prediction (Median)",
    linewidth=2
)
plt.fill_between(
    forecast_range,
    pred_lower,
    pred_upper,
    alpha=0.2,
    label="10-90% Interval"
)
plt.legend()
plt.title("TimesFM Forecasting with Prediction Intervals")
plt.grid(alpha=0.3)
plt.show()
Handling Different Frequencies¶
TimesFM supports various time frequencies:
# Hourly data
dataset = TimesfmDataset(
    datetime_col='date',
    path='data/hourly.csv',
    freq='h',
    # ...
)
# Daily data
dataset = TimesfmDataset(
    datetime_col='date',
    path='data/daily.csv',
    freq='d',
    # ...
)
# Weekly data
dataset = TimesfmDataset(
    datetime_col='date',
    path='data/weekly.csv',
    freq='w',
    # ...
)
# Monthly data
dataset = TimesfmDataset(
    datetime_col='date',
    path='data/monthly.csv',
    freq='m',
    # ...
)
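If you are unsure which freq string matches your data, pandas can usually infer it from the timestamps (a sketch; pd.infer_freq returns None when the spacing is irregular):
import pandas as pd
# Let pandas infer the sampling frequency from the datetime column.
df = pd.read_csv("data/hourly.csv", parse_dates=["date"])
print(pd.infer_freq(df["date"]))  # e.g. "h" or "H" depending on pandas version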
Normalization¶
The TimesfmDataset can optionally normalize the data it serves:
# With normalization
train_dataset = TimesfmDataset(
    name="ett",
    datetime_col='date',
    path='data/ETTh1.csv',
    mode='train',
    context_len=512,
    horizon_len=192,
    normalize=True,  # Enable normalization
)
# Denormalize predictions
avg_loss, trues, preds, histories = tfm.evaluate(train_dataset)
denormalized_preds = train_dataset._denormalize_data(preds)
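Conceptually, normalization here is a per-series standardization with an inverse transform. A minimal z-score round-trip sketch, as an illustration of the idea rather than samay's actual implementation:
import numpy as np
# A per-series z-score and its inverse -- an assumption about the
# general scheme, not samay's internal code.
def normalize(x):
    mean, std = x.mean(), x.std()
    return (x - mean) / (std + 1e-8), (mean, std)
def denormalize(z, stats):
    mean, std = stats
    return z * (std + 1e-8) + mean
x = np.random.randn(512) * 10.0 + 50.0
z, stats = normalize(x)
assert np.allclose(denormalize(z, stats), x)  # round trip recovers the series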
Advanced Usage¶
Custom Context Lengths¶
# Short context for fast inference
config = {
    "context_len": 128,
    "horizon_len": 64,
    # ...
}
# Long context for better accuracy
config = {
    "context_len": 1024,
    "horizon_len": 256,
    # ...
}
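To measure the speed/accuracy trade-off on your own hardware, you can time an evaluation at each setting. A sketch reusing config, repo, and the ETTh1 file from the Quick Start (note that the maximum usable context length depends on the checkpoint):
import time
for context_len in (128, 512, 1024):
    # context_len is a model config, so rebuild the model per setting.
    cfg = {**config, "context_len": context_len}
    model = TimesfmModel(config=cfg, repo=repo)
    ds = TimesfmDataset(
        name="ett",
        datetime_col="date",
        path="data/ETTh1.csv",
        mode="test",
        context_len=context_len,
        horizon_len=cfg["horizon_len"],
    )
    start = time.perf_counter()
    avg_loss, *_ = model.evaluate(ds)
    print(f"context_len={context_len}: loss={avg_loss:.4f}, "
          f"time={time.perf_counter() - start:.1f}s")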
Batch Processing¶
# Larger batches for throughput
config = {
    "per_core_batch_size": 64,
    # ...
}
# Smaller batches for memory efficiency
config = {
    "per_core_batch_size": 8,
    # ...
}
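If you are unsure what fits in GPU memory, one option is to back off the batch size on OOM. A sketch, assuming the PyTorch backend so that CUDA exhaustion raises torch.cuda.OutOfMemoryError:
import torch
# Try progressively smaller batch sizes until evaluation fits in memory.
for batch_size in (64, 32, 16, 8, 4):
    try:
        cfg = {**config, "per_core_batch_size": batch_size}
        model = TimesfmModel(config=cfg, repo=repo)
        avg_loss, *_ = model.evaluate(test_dataset)
        print(f"Succeeded with per_core_batch_size={batch_size}")
        break
    except torch.cuda.OutOfMemoryError:
        torch.cuda.empty_cache()  # release cached blocks before retrying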
Visualization¶
import matplotlib.pyplot as plt
import numpy as np
avg_loss, trues, preds, histories = tfm.evaluate(test_dataset)
trues = np.array(trues)
preds = np.array(preds)
histories = np.array(histories)
# Plot multiple channels
fig, axes = plt.subplots(2, 2, figsize=(16, 10))
axes = axes.flatten()
for i in range(4):
    ax = axes[i]
    history = histories[0, i, :]
    true = trues[0, i, :]
    pred = preds[0, i, :]
    ax.plot(range(len(history)), history, label="History", alpha=0.7)
    ax.plot(
        range(len(history), len(history) + len(true)),
        true,
        label="Ground Truth",
        linestyle="--"
    )
    ax.plot(
        range(len(history), len(history) + len(pred)),
        pred,
        label="Prediction"
    )
    ax.set_title(f"Channel {i}")
    ax.legend()
    ax.grid(alpha=0.3)
plt.tight_layout()
plt.show()
Tips and Best Practices¶
1. Model Selection¶
- Use 200M model for faster inference
- Use 500M model for higher accuracy
2. Context Length¶
- Longer context (512-1024) for complex patterns
- Shorter context (128-256) for simpler patterns and speed
3. Zero-Shot vs Fine-Tuning¶
- TimesFM is designed for zero-shot forecasting
- Fine-tuning is not typically required
4. GPU Memory¶
- Reduce per_core_batch_size if you hit OOM errors
- Use CPU backend for very limited memory
Common Issues¶
CUDA Out of Memory¶
# Reduce batch size
config = {
    "per_core_batch_size": 8,  # Lower value
    # ...
}
# Or use CPU
config = {
    "backend": "cpu",
    # ...
}
Slow Inference¶
# Use smaller model
repo = "google/timesfm-1.0-200m-pytorch"
# Reduce context length
config = {
    "context_len": 256,  # Instead of 512
    # ...
}
API Reference¶
For detailed API documentation of TimesfmModel and TimesfmDataset, see the API reference pages.
Examples¶
See the Examples page for complete working examples.