
LPTM: Large Pre-trained Time Series Model

Large Pre-trained Time Series Models for Cross-Domain Time Series Analysis Tasks

Overview

LPTM (Large Pre-trained Time Series Model) is a foundation model for general-purpose time-series analysis across domains. It uses a transformer-based architecture with a segmentation module that adaptively identifies patterns in time-series data.

Paper

Large Pre-trained Time Series Models for Cross-Domain Time Series Analysis Tasks

Key Features

  • ✅ Pre-trained on large-scale time-series data
  • ✅ Adaptive segmentation for pattern discovery
  • ✅ Supports forecasting, classification, and anomaly detection
  • ✅ Efficient fine-tuning with frozen encoders
  • ✅ Handles multivariate time series

Quick Start

from samay.model import LPTMModel
from samay.dataset import LPTMDataset

# Configure the model
config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}

# Load model
model = LPTMModel(config)

# Load dataset
train_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=192,
)

# Fine-tune
finetuned_model = model.finetune(train_dataset)

# Evaluate
test_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="test",
    horizon=192,
)

avg_loss, trues, preds, histories = model.evaluate(test_dataset)
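
The evaluate method returns Python lists; converting them to NumPy arrays makes the output shapes easy to inspect. A minimal sketch, assuming the (window, channel, time step) layout used in the plotting example later on this page:

import numpy as np

# Convert the returned lists to arrays for inspection
trues = np.array(trues)
preds = np.array(preds)
histories = np.array(histories)

# Assumed layout: (num_windows, num_channels, num_time_steps)
print(trues.shape, preds.shape, histories.shape)
print(f"Average test loss: {avg_loss:.4f}")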

Configuration Parameters

Model Configuration

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| task_name | str | "forecasting" | Task type: "forecasting", "classification", or "detection" |
| forecast_horizon | int | 192 | Number of time steps to predict |
| freeze_encoder | bool | True | Whether to freeze the transformer encoder |
| freeze_embedder | bool | True | Whether to freeze the patch embedding layer |
| freeze_head | bool | False | Whether to freeze the forecasting head |
| freeze_segment | bool | True | Whether to freeze the segmentation module |
| head_dropout | float | 0.0 | Dropout rate for the forecasting head |
| weight_decay | float | 0.0 | Weight decay for regularization |
| max_patch | int | 16 | Maximum patch size for segmentation |

Example Configurations

Zero-Shot Forecasting

config = {
    "task_name": "forecasting",
    "forecast_horizon": 96,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": True,  # Keep all layers frozen
}

Fine-Tuning for Domain Adaptation

config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,  # Only train the head
    "head_dropout": 0.1,
    "weight_decay": 0.001,
}

Full Fine-Tuning

config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "freeze_encoder": False,
    "freeze_embedder": False,
    "freeze_head": False,
    "head_dropout": 0.1,
}
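
When every component is unfrozen, a lower learning rate than the head-only default is usually safer. A hedged sketch using the finetune arguments shown in the Training section below (the values are illustrative, not tuned):

model = LPTMModel(config)

# Illustrative hyperparameters; tune for your data
finetuned_model = model.finetune(train_dataset, epochs=3, learning_rate=1e-5)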

Dataset

LPTMDataset Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| name | str | None | Dataset name (used for metadata) |
| datetime_col | str | None | Name of the datetime column |
| path | str | required | Path to the CSV file |
| mode | str | "train" | "train" or "test" |
| horizon | int | 0 | Forecast horizon length |
| batchsize | int | 16 | Batch size for training |
| boundaries | list | [0, 0, 0] | Custom train/val/test split indices |
| stride | int | 10 | Stride for the sliding window |
| seq_len | int | 512 | Input sequence length |
| task_name | str | "forecasting" | Task type |
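
For example, a training dataset that overrides the windowing defaults from the table (the specific values are illustrative):

train_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=192,
    batchsize=32,  # larger batches for more stable gradients
    seq_len=512,   # input window length
    stride=10,     # sliding-window step between samples
)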

Data Format

Your CSV file should have:

  • A datetime column (e.g., date)
  • One or more value columns

Example:

date,HUFL,HULL,MUFL,MULL,LUFL,LULL,OT
2016-07-01 00:00:00,5.827,2.009,1.599,0.462,5.677,2.009,6.082
2016-07-01 01:00:00,5.693,2.076,1.492,0.426,5.485,1.942,5.947
...


Training

Fine-Tuning

# Create training dataset
train_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=192,
    batchsize=16,
)

# Fine-tune the model
finetuned_model = model.finetune(
    train_dataset,
    epochs=5,
    learning_rate=1e-4,
)

Custom Training Loop

For more control, you can implement a custom training loop:

from torch.optim import Adam

# Get data loader
train_loader = train_dataset.get_data_loader()

# Set up the optimizer over trainable (unfrozen) parameters only
optimizer = Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4)

# Training loop
for epoch in range(5):
    total_loss = 0
    for batch in train_loader:
        optimizer.zero_grad()

        # Forward pass
        loss = model.compute_loss(batch)

        # Backward pass
        loss.backward()
        optimizer.step()

        total_loss += loss.item()

    avg_loss = total_loss / len(train_loader)
    print(f"Epoch {epoch}: Loss = {avg_loss:.4f}")

Evaluation

Basic Evaluation

test_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="test",
    horizon=192,
)

avg_loss, trues, preds, histories = model.evaluate(test_dataset)
print(f"Average Test Loss: {avg_loss}")

Custom Metrics

from samay.metric import mse, mae, mape

# Get predictions
avg_loss, trues, preds, histories = model.evaluate(test_dataset)

# Calculate custom metrics
import numpy as np
trues = np.array(trues)
preds = np.array(preds)

mse_score = mse(trues, preds)
mae_score = mae(trues, preds)
mape_score = mape(trues, preds)

print(f"MSE: {mse_score:.4f}")
print(f"MAE: {mae_score:.4f}")
print(f"MAPE: {mape_score:.4f}")

Tasks

1. Forecasting

Predict future values:

config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}
model = LPTMModel(config)

2. Anomaly Detection

Detect anomalies in time series:

config = {
    "task_name": "detection",
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}
model = LPTMModel(config)

dataset = LPTMDataset(
    name="ecg",
    datetime_col="date",
    path="./data/ECG5000.csv",
    mode="train",
    task_name="detection",
)
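
Fine-tuning then follows the same pattern as forecasting:

finetuned_model = model.finetune(dataset)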

3. Classification

Classify time series:

config = {
    "task_name": "classification",
    "num_classes": 5,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}
model = LPTMModel(config)

Advanced Usage

Handling Multivariate Time Series

LPTM naturally handles multivariate data:

# Your CSV with multiple columns
# date,sensor1,sensor2,sensor3,...
train_dataset = LPTMDataset(
    datetime_col="date",
    path="./data/multivariate.csv",
    mode="train",
    horizon=192,
)

Custom Data Splits

# Specify exact boundaries
train_dataset = LPTMDataset(
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=192,
    boundaries=[0, 10000, 15000],  # Train: 0-10000, Val: 10000-15000, Test: 15000-end
)
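
If you prefer fractional splits, the boundary indices can be derived from the file length with pandas. A sketch using a 70/10/20 split (adjust the ratios to your needs):

import pandas as pd

df = pd.read_csv("./data/ETTh1.csv")
n = len(df)

# Train: first 70%, Val: next 10%, Test: remaining 20%
boundaries = [0, int(n * 0.7), int(n * 0.8)]

train_dataset = LPTMDataset(
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=192,
    boundaries=boundaries,
)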

Denormalizing Predictions

# Get normalized predictions
avg_loss, trues, preds, histories = model.evaluate(test_dataset)

# Denormalize using the dataset's scaler
denormalized_preds = test_dataset._denormalize_data(preds)
denormalized_trues = test_dataset._denormalize_data(trues)
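
Metrics computed on the denormalized arrays are expressed in the original units of your data, which is often easier to interpret:

import numpy as np
from samay.metric import mae

denorm_trues = np.array(denormalized_trues)
denorm_preds = np.array(denormalized_preds)

# MAE in the original scale of the series
print(f"MAE (original scale): {mae(denorm_trues, denorm_preds):.4f}")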

Visualization

Plotting Forecasts

import matplotlib.pyplot as plt
import numpy as np

# Get predictions
avg_loss, trues, preds, histories = model.evaluate(test_dataset)

trues = np.array(trues)
preds = np.array(preds)
histories = np.array(histories)

# Plot a specific channel and time window
channel_idx = 0
time_index = 0

history = histories[time_index, channel_idx, :]
true = trues[time_index, channel_idx, :]
pred = preds[time_index, channel_idx, :]

plt.figure(figsize=(14, 5))
plt.plot(range(len(history)), history, label="History (512 steps)", linewidth=2)
plt.plot(
    range(len(history), len(history) + len(true)),
    true,
    label="Ground Truth (192 steps)",
    linestyle="--",
    linewidth=2,
)
plt.plot(
    range(len(history), len(history) + len(pred)),
    pred,
    label="Prediction (192 steps)",
    linewidth=2,
)
plt.axvline(x=len(history), color='gray', linestyle=':', alpha=0.5)
plt.legend()
plt.title("LPTM Time Series Forecasting")
plt.xlabel("Time Step")
plt.ylabel("Value")
plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()

Tips and Best Practices

1. Choose Appropriate Forecast Horizons

  • Short-term: 24-96 steps
  • Medium-term: 192-336 steps
  • Long-term: 720+ steps

2. Fine-Tuning Strategy

  • Start with frozen encoder and embedder
  • Only train the forecasting head
  • If results are unsatisfactory, gradually unfreeze layers, as sketched below
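
Sticking to the documented config flags, gradual unfreezing can be approximated by re-running fine-tuning with progressively fewer frozen components. Note that each stage below re-instantiates the model from the pre-trained weights rather than continuing from the previous stage:

# Stage 1: train the head only
config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}
model = LPTMModel(config)
model.finetune(train_dataset, epochs=5, learning_rate=1e-4)

# Stage 2: if results are still unsatisfactory, unfreeze everything
# and fine-tune again with a smaller learning rate
config.update({"freeze_encoder": False, "freeze_embedder": False})
model = LPTMModel(config)
model.finetune(train_dataset, epochs=3, learning_rate=1e-5)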

3. Batch Size

  • Larger batch sizes (32-64) for stable training
  • Smaller batch sizes (8-16) if GPU memory is limited

4. Data Preprocessing

  • LPTM handles normalization internally
  • Ensure datetime column is properly formatted
  • Handle missing values before loading (see the snippet below)
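
A quick pandas pass covers the last two points. A sketch, assuming the ETT column names used above:

import pandas as pd

df = pd.read_csv("./data/ETTh1.csv")

# Make sure the datetime column parses cleanly
df["date"] = pd.to_datetime(df["date"])

# Interpolate missing values in the value columns
value_cols = df.columns.drop("date")
df[value_cols] = df[value_cols].interpolate(method="linear")

df.to_csv("./data/ETTh1_clean.csv", index=False)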

Common Issues

Out of Memory

Reduce batch size or forecast horizon:

config = {
    "forecast_horizon": 96,  # Instead of 192
}

dataset = LPTMDataset(
    batchsize=8,  # Instead of 16
    # ...
)
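
The input window can also be shortened via seq_len (default 512, per the dataset parameters above), which reduces the amount of context processed per sample:

dataset = LPTMDataset(
    seq_len=256,  # instead of the 512 default
    # ...
)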

Poor Performance

Try full fine-tuning:

config = {
    "freeze_encoder": False,
    "freeze_embedder": False,
    "freeze_head": False,
    "head_dropout": 0.1,
    "weight_decay": 0.001,
}


API Reference

For detailed API documentation, see the API Reference section of the Samay documentation.


Examples

See the Examples page for complete working examples.