
LPTM: Large Pre-trained Time Series Model

Large Pre-trained Time Series Models for Cross-Domain Time Series Analysis Tasks

Overview

LPTM (Large Pre-trained Time Series Model) is a foundation model for general-purpose time-series analysis across domains. It uses a transformer-based architecture with a segmentation module that adaptively identifies patterns in time-series data.

Paper

Large Pre-trained Time Series Models for Cross-Domain Time Series Analysis Tasks

Key Features

  • ✅ Pre-trained on large-scale time-series data
  • ✅ Adaptive segmentation for pattern discovery
  • ✅ Supports forecasting, classification, and anomaly detection
  • ✅ Efficient fine-tuning with frozen encoders
  • ✅ Handles multivariate time series

Quick Start

from samay.model import LPTMModel
from samay.dataset import LPTMDataset

# Configure the model
config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}

# Load model
model = LPTMModel(config)

# Load dataset
train_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=192,
)

# Fine-tune
finetuned_model = model.finetune(train_dataset)

# Evaluate
test_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="test",
    horizon=192,
)

avg_loss, trues, preds, histories = model.evaluate(test_dataset)
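
The evaluate method returns Python lists; converting them to NumPy arrays makes the output shapes easy to inspect. A minimal sketch, assuming the (window, channel, time step) layout used in the plotting example later on this page:

import numpy as np

# Convert the returned lists to arrays for inspection
trues = np.array(trues)
preds = np.array(preds)
histories = np.array(histories)

# Assumed layout: (num_windows, num_channels, num_time_steps)
print(trues.shape, preds.shape, histories.shape)
print(f"Average test loss: {avg_loss:.4f}")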

Configuration Parameters

Model Configuration

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| task_name | str | "forecasting" | Task type: "forecasting", "classification", or "detection" |
| forecast_horizon | int | 192 | Number of time steps to predict |
| freeze_encoder | bool | True | Whether to freeze the transformer encoder |
| freeze_embedder | bool | True | Whether to freeze the patch embedding layer |
| freeze_head | bool | False | Whether to freeze the forecasting head |
| freeze_segment | bool | True | Whether to freeze the segmentation module |
| head_dropout | float | 0.0 | Dropout rate for the forecasting head |
| weight_decay | float | 0.0 | Weight decay for regularization |
| max_patch | int | 16 | Maximum patch size for segmentation |

Example Configurations

Zero-Shot Forecasting

config = {
    "task_name": "forecasting",
    "forecast_horizon": 96,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": True,  # Keep all layers frozen
}

Fine-Tuning for Domain Adaptation

config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,  # Only train the head
    "head_dropout": 0.1,
    "weight_decay": 0.001,
}

Full Fine-Tuning

config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "freeze_encoder": False,
    "freeze_embedder": False,
    "freeze_head": False,
    "head_dropout": 0.1,
}
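
When every component is unfrozen, a lower learning rate than the head-only default is usually safer. A hedged sketch using the finetune arguments shown in the Training section below (the values are illustrative, not tuned):

model = LPTMModel(config)

# Illustrative hyperparameters; tune for your data
finetuned_model = model.finetune(train_dataset, epochs=3, learning_rate=1e-5)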

Dataset

LPTMDataset Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| name | str | None | Dataset name (used for metadata) |
| datetime_col | str | None | Name of the datetime column |
| path | str | required | Path to the CSV file |
| mode | str | "train" | "train" or "test" |
| horizon | int | 0 | Forecast horizon length |
| batchsize | int | 16 | Batch size for training |
| boundaries | list | [0, 0, 0] | Custom train/val/test split indices |
| stride | int | 10 | Stride for the sliding window |
| seq_len | int | 512 | Input sequence length |
| task_name | str | "forecasting" | Task type |
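
For example, a training dataset that overrides the windowing defaults from the table (the specific values are illustrative):

train_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=192,
    batchsize=32,  # larger batches for more stable gradients
    seq_len=512,   # input window length
    stride=10,     # sliding-window step between samples
)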

Data Format

Your CSV file should have:

  • A datetime column (e.g., date)
  • One or more value columns

Example:

date,HUFL,HULL,MUFL,MULL,LUFL,LULL,OT
2016-07-01 00:00:00,5.827,2.009,1.599,0.462,5.677,2.009,6.082
2016-07-01 01:00:00,5.693,2.076,1.492,0.426,5.485,1.942,5.947
...


Training

Fine-Tuning

# Create training dataset
train_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=192,
    batchsize=16,
)

# Fine-tune the model
finetuned_model = model.finetune(
    train_dataset,
    epochs=5,
    learning_rate=1e-4,
)

Custom Training Loop

For more control, you can implement a custom training loop:

from torch.optim import Adam

# Get data loader
train_loader = train_dataset.get_data_loader()

# Set up the optimizer over trainable (unfrozen) parameters only
optimizer = Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4)

# Training loop
for epoch in range(5):
    total_loss = 0
    for batch in train_loader:
        optimizer.zero_grad()

        # Forward pass
        loss = model.compute_loss(batch)

        # Backward pass
        loss.backward()
        optimizer.step()

        total_loss += loss.item()

    avg_loss = total_loss / len(train_loader)
    print(f"Epoch {epoch}: Loss = {avg_loss:.4f}")

Evaluation

Basic Evaluation

test_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="test",
    horizon=192,
)

avg_loss, trues, preds, histories = model.evaluate(test_dataset)
print(f"Average Test Loss: {avg_loss}")

Custom Metrics

from samay.metric import mse, mae, mape

# Get predictions
avg_loss, trues, preds, histories = model.evaluate(test_dataset)

# Calculate custom metrics
import numpy as np
trues = np.array(trues)
preds = np.array(preds)

mse_score = mse(trues, preds)
mae_score = mae(trues, preds)
mape_score = mape(trues, preds)

print(f"MSE: {mse_score:.4f}")
print(f"MAE: {mae_score:.4f}")
print(f"MAPE: {mape_score:.4f}")

Tasks

1. Forecasting

Predict future values:

config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}
model = LPTMModel(config)

2. Anomaly Detection

Detect anomalies in time series:

config = {
    "task_name": "detection",
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}
model = LPTMModel(config)

dataset = LPTMDataset(
    name="ecg",
    datetime_col="date",
    path="./data/ECG5000.csv",
    mode="train",
    task_name="detection",
)
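
Fine-tuning then follows the same pattern as forecasting:

finetuned_model = model.finetune(dataset)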

3. Classification

Classify time series:

config = {
    "task_name": "classification",
    "num_classes": 5,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}
model = LPTMModel(config)

Advanced Usage

Handling Multivariate Time Series

LPTM naturally handles multivariate data:

# Your CSV with multiple columns
# date,sensor1,sensor2,sensor3,...
train_dataset = LPTMDataset(
    datetime_col="date",
    path="./data/multivariate.csv",
    mode="train",
    horizon=192,
)

Custom Data Splits

# Specify exact boundaries
train_dataset = LPTMDataset(
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=192,
    boundaries=[0, 10000, 15000],  # Train: 0-10000, Val: 10000-15000, Test: 15000-end
)
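
If you prefer fractional splits, the boundary indices can be derived from the file length with pandas. A sketch using a 70/10/20 split (adjust the ratios to your needs):

import pandas as pd

df = pd.read_csv("./data/ETTh1.csv")
n = len(df)

# Train: first 70%, Val: next 10%, Test: remaining 20%
boundaries = [0, int(n * 0.7), int(n * 0.8)]

train_dataset = LPTMDataset(
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=192,
    boundaries=boundaries,
)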

Denormalizing Predictions

# Get normalized predictions
avg_loss, trues, preds, histories = model.evaluate(test_dataset)

# Denormalize using the dataset's scaler
denormalized_preds = test_dataset._denormalize_data(preds)
denormalized_trues = test_dataset._denormalize_data(trues)
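
Metrics computed on the denormalized arrays are expressed in the original units of your data, which is often easier to interpret:

import numpy as np
from samay.metric import mae

denorm_trues = np.array(denormalized_trues)
denorm_preds = np.array(denormalized_preds)

# MAE in the original scale of the series
print(f"MAE (original scale): {mae(denorm_trues, denorm_preds):.4f}")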

Visualization

Plotting Forecasts

import matplotlib.pyplot as plt
import numpy as np

# Get predictions
avg_loss, trues, preds, histories = model.evaluate(test_dataset)

trues = np.array(trues)
preds = np.array(preds)
histories = np.array(histories)

# Plot a specific channel and time window
channel_idx = 0
time_index = 0

history = histories[time_index, channel_idx, :]
true = trues[time_index, channel_idx, :]
pred = preds[time_index, channel_idx, :]

plt.figure(figsize=(14, 5))
plt.plot(range(len(history)), history, label="History (512 steps)", linewidth=2)
plt.plot(
    range(len(history), len(history) + len(true)),
    true,
    label="Ground Truth (192 steps)",
    linestyle="--",
    linewidth=2,
)
plt.plot(
    range(len(history), len(history) + len(pred)),
    pred,
    label="Prediction (192 steps)",
    linewidth=2,
)
plt.axvline(x=len(history), color='gray', linestyle=':', alpha=0.5)
plt.legend()
plt.title("LPTM Time Series Forecasting")
plt.xlabel("Time Step")
plt.ylabel("Value")
plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()

Tips and Best Practices

1. Choose Appropriate Forecast Horizons

  • Short-term: 24-96 steps
  • Medium-term: 192-336 steps
  • Long-term: 720+ steps

2. Fine-Tuning Strategy

  • Start with frozen encoder and embedder
  • Only train the forecasting head
  • If results are unsatisfactory, gradually unfreeze layers, as sketched below
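
Sticking to the documented config flags, gradual unfreezing can be approximated by re-running fine-tuning with progressively fewer frozen components. Note that each stage below re-instantiates the model from the pre-trained weights rather than continuing from the previous stage:

# Stage 1: train the head only
config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}
model = LPTMModel(config)
model.finetune(train_dataset, epochs=5, learning_rate=1e-4)

# Stage 2: if results are still unsatisfactory, unfreeze everything
# and fine-tune again with a smaller learning rate
config.update({"freeze_encoder": False, "freeze_embedder": False})
model = LPTMModel(config)
model.finetune(train_dataset, epochs=3, learning_rate=1e-5)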

3. Batch Size

  • Larger batch sizes (32-64) for stable training
  • Smaller batch sizes (8-16) if GPU memory is limited

4. Data Preprocessing

  • LPTM handles normalization internally
  • Ensure datetime column is properly formatted
  • Handle missing values before loading (see the snippet below)
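
A quick pandas pass covers the last two points. A sketch, assuming the ETT column names used above:

import pandas as pd

df = pd.read_csv("./data/ETTh1.csv")

# Make sure the datetime column parses cleanly
df["date"] = pd.to_datetime(df["date"])

# Interpolate missing values in the value columns
value_cols = df.columns.drop("date")
df[value_cols] = df[value_cols].interpolate(method="linear")

df.to_csv("./data/ETTh1_clean.csv", index=False)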

Common Issues

Out of Memory

Reduce batch size or forecast horizon:

config = {
    "forecast_horizon": 96,  # Instead of 192
}

dataset = LPTMDataset(
    batchsize=8,  # Instead of 16
    # ...
)
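
The input window can also be shortened via seq_len (default 512, per the dataset parameters above), which reduces the amount of context processed per sample:

dataset = LPTMDataset(
    seq_len=256,  # instead of the 512 default
    # ...
)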

Poor Performance

Try full fine-tuning:

config = {
    "freeze_encoder": False,
    "freeze_embedder": False,
    "freeze_head": False,
    "head_dropout": 0.1,
    "weight_decay": 0.001,
}


API Reference

For detailed API documentation, see the API Reference section of the Samay documentation.


Examples

See the Examples page for complete working examples.