TinyTimeMixer: Fast Pre-trained Models for Time Series

Lightweight and Efficient Time-Series Foundation Model

Overview

TinyTimeMixer (TTM) is a compact time-series forecasting model designed for fast inference and a low memory footprint. Its mixer-based architecture balances forecasting accuracy against computational cost, which makes it well suited to resource-constrained environments.

Paper

TinyTimeMixer: Fast Pre-trained Models for Time Series

Key Features

  • ✅ Lightweight architecture (compact model size)
  • ✅ Fast inference speed
  • ✅ Low memory footprint
  • ✅ Competitive forecasting accuracy
  • ✅ Efficient training and fine-tuning
  • ✅ Multivariate time-series support

Quick Start

from samay.model import TinyTimeMixerModel
from samay.dataset import TinyTimeMixerDataset

# Model configuration
config = {
    "context_len": 512,
    "horizon_len": 96,
    "model_size": "tiny",
}

# Load model
model = TinyTimeMixerModel(config)

# Load dataset
train_dataset = TinyTimeMixerDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    context_len=config["context_len"],
    horizon_len=config["horizon_len"],
)

# Fine-tune
finetuned_model = model.finetune(train_dataset, epochs=10)

# Evaluate
test_dataset = TinyTimeMixerDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="test",
    context_len=config["context_len"],
    horizon_len=config["horizon_len"],
)

avg_loss, trues, preds, histories = model.evaluate(test_dataset)
print(f"Average Loss: {avg_loss}")

Model Variants

TinyTimeMixer comes in different sizes:

| Variant | Parameters | Speed    | Accuracy |
|---------|------------|----------|----------|
| Tiny    | ~1M        | Fastest  | Good     |
| Small   | ~5M        | Fast     | Better   |
| Base    | ~15M       | Moderate | Best     |
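
As a rough guide to what these sizes mean in memory, the weights alone take about (number of parameters) × (bytes per parameter). The sketch below uses the approximate counts from the table. The helper `weight_memory_mb` is illustrative and not part of samay, and the estimate ignores activations and optimizer state.

```python
# Approximate parameter counts, taken from the variants table.
VARIANT_PARAMS = {"tiny": 1_000_000, "small": 5_000_000, "base": 15_000_000}


def weight_memory_mb(n_params, bytes_per_param=4):
    """Memory for the weights only, in MB (1 MB = 1e6 bytes).

    bytes_per_param=4 corresponds to float32; use 2 for float16, 1 for int8.
    """
    return n_params * bytes_per_param / 1e6


for name, n_params in VARIANT_PARAMS.items():
    print(f"{name:>5}: ~{weight_memory_mb(n_params):.0f} MB in float32")
```

Actual usage during fine-tuning is higher, since gradients and optimizer state add several multiples of the weight size.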

Configuration Parameters

Model Configuration

| Parameter   | Type  | Default | Description                          |
|-------------|-------|---------|--------------------------------------|
| context_len | int   | 512     | Length of input context              |
| horizon_len | int   | 96      | Forecast horizon                     |
| model_size  | str   | "tiny"  | Model size: "tiny", "small", "base"  |
| d_model     | int   | 64      | Model dimension                      |
| n_heads     | int   | 4       | Number of attention heads            |
| n_layers    | int   | 4       | Number of mixer layers               |
| dropout     | float | 0.1     | Dropout rate                         |

Example Configurations

Tiny Model (Fast Inference)

config = {
    "context_len": 512,
    "horizon_len": 96,
    "model_size": "tiny",
    "d_model": 64,
    "n_layers": 4,
}

Small Model (Balanced)

config = {
    "context_len": 512,
    "horizon_len": 96,
    "model_size": "small",
    "d_model": 128,
    "n_layers": 6,
}

Base Model (High Accuracy)

config = {
    "context_len": 512,
    "horizon_len": 192,
    "model_size": "base",
    "d_model": 256,
    "n_layers": 8,
}

Dataset

TinyTimeMixerDataset Parameters

| Parameter    | Type | Default   | Description               |
|--------------|------|-----------|---------------------------|
| name         | str  | None      | Dataset name              |
| datetime_col | str  | "ds"      | Name of datetime column   |
| path         | str  | Required  | Path to CSV file          |
| mode         | str  | None      | "train" or "test"         |
| context_len  | int  | 512       | Length of input context   |
| horizon_len  | int  | 64        | Forecast horizon          |
| batch_size   | int  | 128       | Batch size                |
| boundaries   | list | [0, 0, 0] | Custom split boundaries   |
| stride       | int  | 10        | Stride for sliding window |

Data Format

CSV file with datetime and value columns:

date,HUFL,HULL,MUFL,MULL,LUFL,LULL,OT
2016-07-01 00:00:00,5.827,2.009,1.599,0.462,5.677,2.009,6.082
2016-07-01 01:00:00,5.693,2.076,1.492,0.426,5.485,1.942,5.947
...
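
Before handing a file to `TinyTimeMixerDataset`, it can help to check that it matches this layout. The following is a minimal sketch using only the standard library. The function `check_csv` is hypothetical (not part of samay), and the timestamp format is assumed from the sample rows above.

```python
import csv
import io
from datetime import datetime

# Two sample rows in the layout shown above.
SAMPLE = """date,HUFL,HULL,MUFL,MULL,LUFL,LULL,OT
2016-07-01 00:00:00,5.827,2.009,1.599,0.462,5.677,2.009,6.082
2016-07-01 01:00:00,5.693,2.076,1.492,0.426,5.485,1.942,5.947
"""


def check_csv(text, datetime_col="date"):
    """Return the value-column names, raising ValueError on a malformed file."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames is None or datetime_col not in reader.fieldnames:
        raise ValueError(f"missing datetime column {datetime_col!r}")
    value_cols = [c for c in reader.fieldnames if c != datetime_col]
    previous = None
    for row in reader:
        # Assumed format; adjust if your timestamps differ.
        stamp = datetime.strptime(row[datetime_col], "%Y-%m-%d %H:%M:%S")
        if previous is not None and stamp <= previous:
            raise ValueError(f"timestamps not strictly increasing at {stamp}")
        previous = stamp
        for col in value_cols:
            float(row[col])  # raises ValueError if the cell is not numeric
    return value_cols


print(check_csv(SAMPLE))
```

For a file on disk, pass `open(path).read()` as `text`, and set `datetime_col` to the same name you give the dataset.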

Training

Basic Training

from samay.model import TinyTimeMixerModel
from samay.dataset import TinyTimeMixerDataset

# Configure model
config = {
    "context_len": 512,
    "horizon_len": 96,
    "model_size": "tiny",
}

model = TinyTimeMixerModel(config)

# Load training data
train_dataset = TinyTimeMixerDataset(
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    context_len=512,
    horizon_len=96,
    batch_size=128,
)

# Fine-tune
finetuned_model = model.finetune(
    train_dataset,
    epochs=20,
    learning_rate=1e-3,
)

Training with Validation

# Training dataset
train_dataset = TinyTimeMixerDataset(
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    context_len=512,
    horizon_len=96,
    boundaries=[0, 10000, 15000],  # Custom split
)

# Validation dataset
val_dataset = TinyTimeMixerDataset(
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="val",
    context_len=512,
    horizon_len=96,
    boundaries=[0, 10000, 15000],
)

# Fine-tune with validation
finetuned_model = model.finetune(
    train_dataset,
    val_dataset=val_dataset,
    epochs=20,
    learning_rate=1e-3,
)

Evaluation

Basic Evaluation

test_dataset = TinyTimeMixerDataset(
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="test",
    context_len=512,
    horizon_len=96,
)

avg_loss, trues, preds, histories = model.evaluate(test_dataset)
print(f"Average Test Loss: {avg_loss}")

With Custom Metrics

from samay.metric import mse, mae, mape, rmse
import numpy as np

avg_loss, trues, preds, histories = model.evaluate(test_dataset)

trues = np.array(trues)
preds = np.array(preds)

print(f"MSE:  {mse(trues, preds):.4f}")
print(f"MAE:  {mae(trues, preds):.4f}")
print(f"RMSE: {rmse(trues, preds):.4f}")
print(f"MAPE: {mape(trues, preds):.4f}%")
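
For reference, these four metrics have standard textbook definitions, sketched below in plain Python on flat lists. The `*_ref` functions are illustrative and assume `samay.metric` follows the usual formulas. In particular, this sketch returns MAPE as a percentage, so it is worth confirming that `samay.metric.mape` does the same before appending a `%` sign.

```python
import math


def mse_ref(trues, preds):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(trues, preds)) / len(trues)


def mae_ref(trues, preds):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(trues, preds)) / len(trues)


def rmse_ref(trues, preds):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(mse_ref(trues, preds))


def mape_ref(trues, preds):
    """Mean absolute percentage error, in percent; undefined if a true value is 0."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(trues, preds)) / len(trues)


y_true = [2.0, 4.0, 8.0]
y_pred = [1.0, 5.0, 6.0]
print(f"MSE:  {mse_ref(y_true, y_pred):.4f}")
print(f"MAE:  {mae_ref(y_true, y_pred):.4f}")
print(f"RMSE: {rmse_ref(y_true, y_pred):.4f}")
print(f"MAPE: {mape_ref(y_true, y_pred):.4f}%")
```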

Zero-Shot Forecasting

TinyTimeMixer supports zero-shot forecasting, in which the pre-trained model is evaluated on a new dataset without any fine-tuning:

# Load pre-trained model
config = {
    "context_len": 512,
    "horizon_len": 96,
    "model_size": "tiny",
}

model = TinyTimeMixerModel(config)

# Test on new data without training
test_dataset = TinyTimeMixerDataset(
    datetime_col="date",
    path="./data/new_domain.csv",
    mode="test",
    context_len=512,
    horizon_len=96,
)

# Zero-shot evaluation
avg_loss, trues, preds, histories = model.evaluate(test_dataset)

Multivariate Forecasting

TinyTimeMixer handles multivariate data efficiently:

import numpy as np

# Your CSV with multiple value columns
dataset = TinyTimeMixerDataset(
    datetime_col="date",
    path="./data/multivariate.csv",  # Multiple columns
    mode="train",
    context_len=512,
    horizon_len=96,
)

# Model forecasts all channels simultaneously
avg_loss, trues, preds, histories = model.evaluate(dataset)

# Convert to an array before inspecting the shape
# Expected shape: (num_windows, num_channels, horizon_len)
preds = np.array(preds)
print(f"Predictions shape: {preds.shape}")
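
Because every channel is forecast at once, a single average loss can hide one badly predicted channel. The sketch below computes the mean absolute error per channel, assuming the `(num_windows, num_channels, horizon_len)` layout stated above. The function `per_channel_mae` is illustrative and works on plain nested lists.

```python
def per_channel_mae(trues, preds):
    """MAE per channel for nested lists shaped (num_windows, num_channels, horizon_len)."""
    n_channels = len(trues[0])
    totals = [0.0] * n_channels
    counts = [0] * n_channels
    for true_win, pred_win in zip(trues, preds):
        for ch, (true_ch, pred_ch) in enumerate(zip(true_win, pred_win)):
            for t, p in zip(true_ch, pred_ch):
                totals[ch] += abs(t - p)
                counts[ch] += 1
    return [total / count for total, count in zip(totals, counts)]


# Toy data: 2 windows, 2 channels, horizon of 3.
toy_trues = [[[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]],
             [[4.0, 5.0, 6.0], [10.0, 10.0, 10.0]]]
toy_preds = [[[1.0, 2.0, 4.0], [12.0, 12.0, 12.0]],
             [[4.0, 5.0, 6.0], [8.0, 8.0, 8.0]]]

channel_mae = per_channel_mae(toy_trues, toy_preds)
for ch, value in enumerate(channel_mae):
    print(f"channel {ch}: MAE = {value:.4f}")
```

With NumPy arrays of the same shape, the equivalent one-liner is `np.mean(np.abs(trues - preds), axis=(0, 2))`.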

Advanced Usage

Custom Context Lengths

# Short context for simple patterns
config = {
    "context_len": 256,
    "horizon_len": 64,
    "model_size": "tiny",
}

# Long context for complex patterns
config = {
    "context_len": 1024,
    "horizon_len": 192,
    "model_size": "small",
}

Batch Size Tuning

# Large batch for faster training (if memory allows)
dataset = TinyTimeMixerDataset(
    # ...
    batch_size=256,
)

# Small batch for memory efficiency
dataset = TinyTimeMixerDataset(
    # ...
    batch_size=32,
)

Stride Configuration

# Smaller stride for more training samples
dataset = TinyTimeMixerDataset(
    # ...
    stride=1,  # Overlapping windows
)

# Larger stride for faster iteration
dataset = TinyTimeMixerDataset(
    # ...
    stride=96,  # Equal to horizon_len, so forecast targets do not overlap
)
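
To see how stride affects the amount of training data, the sketch below counts sliding windows for a series of a given length. It assumes a new window starts every `stride` rows and that each window needs `context_len + horizon_len` consecutive rows. samay's own windowing may treat the edges differently, and `count_windows` is a hypothetical helper.

```python
def count_windows(n_rows, context_len, horizon_len, stride):
    """Number of (context, horizon) windows when a window starts every `stride` rows."""
    window = context_len + horizon_len
    if n_rows < window:
        return 0
    return (n_rows - window) // stride + 1


n_rows = 10_000  # illustrative length of the training split
for stride in (1, 10, 96):
    n = count_windows(n_rows, context_len=512, horizon_len=96, stride=stride)
    print(f"stride={stride:>2}: {n} windows")
```

Samples from small strides are highly correlated, so ten times as many windows does not mean ten times as much information.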

Visualization

Single Channel Forecast

import matplotlib.pyplot as plt
import numpy as np

avg_loss, trues, preds, histories = model.evaluate(test_dataset)

trues = np.array(trues)
preds = np.array(preds)
histories = np.array(histories)

# Plot first window, first channel
window_idx = 0
channel_idx = 0

history = histories[window_idx, channel_idx, :]
true = trues[window_idx, channel_idx, :]
pred = preds[window_idx, channel_idx, :]

plt.figure(figsize=(14, 5))
plt.plot(range(len(history)), history, label="History (512 steps)", linewidth=2)
plt.plot(
    range(len(history), len(history) + len(true)),
    true,
    label="Ground Truth (96 steps)",
    linestyle="--",
    linewidth=2
)
plt.plot(
    range(len(history), len(history) + len(pred)),
    pred,
    label="TinyTimeMixer Prediction",
    linewidth=2
)
plt.axvline(x=len(history), color='gray', linestyle=':', alpha=0.5)
plt.legend()
plt.title("TinyTimeMixer Time Series Forecasting")
plt.xlabel("Time Step")
plt.ylabel("Value")
plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()

Multiple Channels

fig, axes = plt.subplots(2, 2, figsize=(16, 10))
axes = axes.flatten()

for i in range(min(4, trues.shape[1])):
    ax = axes[i]

    history = histories[0, i, :]
    true = trues[0, i, :]
    pred = preds[0, i, :]

    ax.plot(range(len(history)), history, label="History", alpha=0.7)
    ax.plot(
        range(len(history), len(history) + len(true)),
        true,
        label="Ground Truth",
        linestyle="--"
    )
    ax.plot(
        range(len(history), len(history) + len(pred)),
        pred,
        label="Prediction"
    )
    ax.set_title(f"Channel {i}")
    ax.legend()
    ax.grid(alpha=0.3)

plt.suptitle("TinyTimeMixer Multi-Channel Forecasting")
plt.tight_layout()
plt.show()

Error Distribution

import matplotlib.pyplot as plt
import numpy as np

avg_loss, trues, preds, histories = model.evaluate(test_dataset)

trues = np.array(trues)
preds = np.array(preds)

# Calculate errors
errors = trues - preds

plt.figure(figsize=(12, 5))

# Error distribution
plt.subplot(1, 2, 1)
plt.hist(errors.flatten(), bins=50, alpha=0.7, edgecolor='black')
plt.xlabel("Prediction Error")
plt.ylabel("Frequency")
plt.title("Error Distribution")
plt.grid(alpha=0.3)

# Error over time
plt.subplot(1, 2, 2)
mean_abs_errors = np.mean(np.abs(errors), axis=(0, 1))
plt.plot(mean_abs_errors)
plt.xlabel("Time Step")
plt.ylabel("Mean Absolute Error")
plt.title("Error Over Forecast Horizon")
plt.grid(alpha=0.3)

plt.tight_layout()
plt.show()

Performance Comparison

Speed Benchmark

import time

models = [
    ("Tiny", {"model_size": "tiny", "d_model": 64}),
    ("Small", {"model_size": "small", "d_model": 128}),
    ("Base", {"model_size": "base", "d_model": 256}),
]

for name, model_config in models:
    config = {
        "context_len": 512,
        "horizon_len": 96,
        **model_config
    }

    model = TinyTimeMixerModel(config)

    # Measure evaluation time (includes data loading, not only the forward pass)
    start_time = time.perf_counter()
    avg_loss, trues, preds, histories = model.evaluate(test_dataset)
    elapsed_time = time.perf_counter() - start_time

    print(f"{name} Model:")
    print(f"  Loss: {avg_loss:.4f}")
    print(f"  Time: {elapsed_time:.2f}s")
    print()
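
A single timed run can be skewed by one-off costs such as the first GPU call or cold file caches. A slightly more robust pattern, sketched below with the standard library only, runs an untimed warm-up and reports the median of several repeats. `time_call` is an illustrative helper, not part of samay.

```python
import statistics
import time


def time_call(fn, warmup=1, repeats=5):
    """Median wall-clock seconds of fn() over `repeats` runs, after `warmup` untimed runs."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


# Stand-in workload so the example runs on its own.
elapsed = time_call(lambda: sum(i * i for i in range(100_000)))
print(f"median: {elapsed:.4f}s")
```

In the benchmark loop, the workload would be `lambda: model.evaluate(test_dataset)`.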

Tips and Best Practices

1. Model Selection

  • Use Tiny for edge devices and real-time applications
  • Use Small for balanced performance and speed
  • Use Base when accuracy is more important than speed

2. Context Length

  • Longer context captures more patterns but is slower
  • Match context to your data's seasonal patterns
  • Start with 512 and adjust based on results
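
One way to match context length to seasonality is to require that the context spans a minimum number of full seasonal cycles. The sketch below picks the smallest of the context lengths used on this page that satisfies that. `smallest_context` is a hypothetical helper, and whether a given pre-trained checkpoint accepts each length is an assumption to verify.

```python
def smallest_context(period, min_cycles, candidates=(256, 512, 1024)):
    """Smallest candidate context spanning at least `min_cycles` seasonal periods.

    Returns None if no candidate is long enough.
    """
    needed = period * min_cycles
    for context_len in sorted(candidates):
        if context_len >= needed:
            return context_len
    return None


# Hourly data: a daily cycle is 24 steps, a weekly cycle is 168 steps.
print("one week of daily cycles:", smallest_context(period=24, min_cycles=7))
print("two weekly cycles:       ", smallest_context(period=168, min_cycles=2))
print("four weekly cycles:      ", smallest_context(period=168, min_cycles=4))
```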

3. Batch Size

  • TinyTimeMixer supports large batch sizes (128-256)
  • Larger batches generally mean faster training per epoch
  • Reduce batch size if OOM errors occur

4. Training Duration

  • TinyTimeMixer trains quickly (10-20 epochs often sufficient)
  • Monitor validation loss to avoid overfitting
  • Early stopping is recommended
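
The early-stopping rule mentioned above can be sketched as a small standalone class: stop once the validation loss has failed to improve for `patience` consecutive epochs. This page does not show whether `finetune` exposes a per-epoch hook, so `EarlyStopping` here is a generic illustration rather than a samay feature, and the loss values are made up.

```python
class EarlyStopping:
    """Signal a stop after `patience` epochs without validation improvement."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation losses, one per epoch.
val_losses = [0.90, 0.70, 0.65, 0.66, 0.67, 0.68, 0.60]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(val_losses, start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break

print(f"stopped after epoch {stopped_at}, best validation loss {stopper.best}")
```

Note that the last value in the list is never reached: a small `patience` can stop before a late improvement, which is the usual trade-off when choosing it.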

Common Issues

CUDA Out of Memory

# Use smaller model
config = {
    "model_size": "tiny",
    "d_model": 64,
    # ...
}

# Reduce batch size
dataset = TinyTimeMixerDataset(
    batch_size=32,  # Instead of 128
    # ...
)

# Reduce context/horizon
config = {
    "context_len": 256,  # Instead of 512
    "horizon_len": 48,   # Instead of 96
}
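
Each of these changes reduces memory roughly linearly. As a back-of-the-envelope illustration, the sketch below computes the size of one batch of input and target values in float32. It deliberately ignores activations, gradients, and optimizer state, which usually dominate but scale with the same factors. `batch_input_mb` is a hypothetical helper, and 7 channels matches the ETTh1 sample header.

```python
def batch_input_mb(batch_size, n_channels, context_len, horizon_len, bytes_per_value=4):
    """MB (1e6 bytes) for one batch of context and target values."""
    values = batch_size * n_channels * (context_len + horizon_len)
    return values * bytes_per_value / 1e6


before = batch_input_mb(batch_size=128, n_channels=7, context_len=512, horizon_len=96)
after = batch_input_mb(batch_size=32, n_channels=7, context_len=256, horizon_len=48)

print(f"before: {before:.3f} MB per batch")
print(f"after:  {after:.3f} MB per batch")
print(f"reduction factor: {before / after:.1f}x")
```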

Slow Training

# Increase batch size (if memory allows)
dataset = TinyTimeMixerDataset(
    batch_size=256,  # Larger batch
    # ...
)

# Reduce model size
config = {
    "model_size": "tiny",  # Smaller model
    # ...
}

Poor Accuracy

# Use larger model
config = {
    "model_size": "base",  # Larger model
    "d_model": 256,
    "n_layers": 8,
}

# Increase context length
config = {
    "context_len": 1024,  # More context
    # ...
}

# Train longer
model.finetune(train_dataset, epochs=50)  # More epochs

Efficient Deployment

CPU Inference

TinyTimeMixer is efficient on CPU:

import torch

# Force CPU usage
device = torch.device("cpu")
model = TinyTimeMixerModel(config).to(device)

# Inference is still fast!
avg_loss, trues, preds, histories = model.evaluate(test_dataset)

Model Export

Export for production deployment:

# Save model
model.save("tinytimemixer_model.pt")

# Load model
loaded_model = TinyTimeMixerModel.load("tinytimemixer_model.pt")

Quantization (for even faster inference)

import torch

# Quantize model for faster inference
quantized_model = torch.quantization.quantize_dynamic(
    model,
    {torch.nn.Linear},
    dtype=torch.qint8
)

# Use quantized model
avg_loss, trues, preds, histories = quantized_model.evaluate(test_dataset)
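
To give a sense of what int8 quantization does to weights, the sketch below applies a simplified symmetric per-tensor scheme in plain Python: each weight is divided by a scale, rounded to an integer in [-127, 127], and multiplied back at use time. This is an illustration of the idea only, not PyTorch's exact implementation, and the weight values are made up.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]


weights = [0.52, -1.27, 0.003, 0.9]  # made-up values
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, restored))

print("int8 values:", q)
print(f"scale: {scale:.6f}, max round-trip error: {max_err:.6f}")
```

The round-trip error per weight is at most half the scale, which is why quantization tends to cost little accuracy when weights are not dominated by a few outliers. It is still worth re-running the evaluation after quantizing.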

API Reference

For detailed API documentation, see:


Examples

See the Examples page for complete working examples.


Comparison with Other Models

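| Feature         | TinyTimeMixer | LPTM       | TimesFM    | MOMENT   |
|-----------------|---------------|------------|------------|----------|
| Speed           | ⭐⭐⭐⭐⭐    | ⭐⭐⭐     | ⭐⭐⭐⭐   | ⭐⭐⭐   |
| Memory          | ⭐⭐⭐⭐⭐    | ⭐⭐⭐     | ⭐⭐       | ⭐⭐⭐   |
| Accuracy        | ⭐⭐⭐⭐      | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Edge Deployment | ⭐⭐⭐⭐⭐    | ⭐⭐       | ⭐⭐       | ⭐⭐⭐   |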

Use TinyTimeMixer when:

  • You need fast inference
  • Memory is limited
  • Deploying on edge devices
  • Real-time forecasting is required