LPTM: Large Pre-trained Time Series Model¶
Overview¶
LPTM (Large Pre-trained Time Series Model) is a foundation model for general-purpose time-series analysis, including forecasting, classification, and anomaly detection. It uses a transformer-based architecture with an adaptive segmentation module that identifies patterns in time-series data.
Paper¶
Large Pre-trained time series models for cross-domain Time series analysis tasks
Key Features¶
- ✅ Pre-trained on large-scale time-series data
- ✅ Adaptive segmentation for pattern discovery
- ✅ Supports forecasting, classification, and anomaly detection
- ✅ Efficient fine-tuning with frozen encoders
- ✅ Handles multivariate time series
Quick Start¶
from samay.model import LPTMModel
from samay.dataset import LPTMDataset
# Configure the model
config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}
# Load model
model = LPTMModel(config)
# Load dataset
train_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=192,
)
# Fine-tune
finetuned_model = model.finetune(train_dataset)
# Evaluate
test_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="test",
    horizon=192,
)
avg_loss, trues, preds, histories = model.evaluate(test_dataset)
Configuration Parameters¶
Model Configuration¶
| Parameter | Type | Default | Description | 
|---|---|---|---|
| task_name | str | "forecasting" | Task type: "forecasting", "classification", or "detection" | 
| forecast_horizon | int | 192 | Number of time steps to predict | 
| freeze_encoder | bool | True | Whether to freeze the transformer encoder | 
| freeze_embedder | bool | True | Whether to freeze the patch embedding layer | 
| freeze_head | bool | False | Whether to freeze the forecasting head | 
| freeze_segment | bool | True | Whether to freeze the segmentation module | 
| head_dropout | float | 0.0 | Dropout rate for the forecasting head | 
| weight_decay | float | 0.0 | Weight decay for regularization | 
| max_patch | int | 16 | Maximum patch size for segmentation | 
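For reference, here is a configuration with every parameter from the table written out at its documented default; in practice you only need to supply the values you want to change:
# All documented parameters at their default values (illustrative only)
config = {
    "task_name": "forecasting",   # "forecasting", "classification", or "detection"
    "forecast_horizon": 192,      # number of future steps to predict
    "freeze_encoder": True,       # keep the transformer encoder frozen
    "freeze_embedder": True,      # keep the patch embedding layer frozen
    "freeze_head": False,         # train the forecasting head
    "freeze_segment": True,       # keep the segmentation module frozen
    "head_dropout": 0.0,          # dropout rate for the forecasting head
    "weight_decay": 0.0,          # weight decay for regularization
    "max_patch": 16,              # maximum patch size for segmentation
}
model = LPTMModel(config)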
Example Configurations¶
Zero-Shot Forecasting¶
config = {
    "task_name": "forecasting",
    "forecast_horizon": 96,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": True,  # Keep all layers frozen
}
Fine-Tuning for Domain Adaptation¶
config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,  # Only train the head
    "head_dropout": 0.1,
    "weight_decay": 0.001,
}
Full Fine-Tuning¶
config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "freeze_encoder": False,
    "freeze_embedder": False,
    "freeze_head": False,
    "head_dropout": 0.1,
}
Dataset¶
LPTMDataset Parameters¶
| Parameter | Type | Default | Description | 
|---|---|---|---|
| name | str | None | Dataset name (for metadata) | 
| datetime_col | str | None | Name of the datetime column | 
| path | str | Required | Path to CSV file | 
| mode | str | "train" | "train" or "test" | 
| horizon | int | 0 | Forecast horizon length | 
| batchsize | int | 16 | Batch size for training | 
| boundaries | list | [0, 0, 0] | Custom train/val/test split indices | 
| stride | int | 10 | Stride for sliding window | 
| seq_len | int | 512 | Input sequence length | 
| task_name | str | "forecasting" | Task type | 
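As a sketch, the windowing-related parameters from the table can be made explicit when constructing a dataset; the path and column names below follow the Quick Start example:
train_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=192,        # forecast horizon length
    batchsize=16,       # batch size for training
    seq_len=512,        # length of each input window
    stride=10,          # step between consecutive sliding windows
    task_name="forecasting",
)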
Data Format¶
Your CSV file should have:
- A datetime column (e.g., date)
- One or more value columns
Example:
date,HUFL,HULL,MUFL,MULL,LUFL,LULL,OT
2016-07-01 00:00:00,5.827,2.009,1.599,0.462,5.677,2.009,6.082
2016-07-01 01:00:00,5.693,2.076,1.492,0.426,5.485,1.942,5.947
...
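A quick sanity check with pandas (ordinary pandas, not part of the samay API) before building a dataset; the path follows the Quick Start example:
import pandas as pd
# The datetime column should parse cleanly and the value columns should be numeric
df = pd.read_csv("./data/ETTh1.csv")
df["date"] = pd.to_datetime(df["date"])
print(df.dtypes)
print(df.isna().sum())  # missing values should be handled before loading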
Training¶
Fine-Tuning¶
# Create training dataset
train_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=192,
    batchsize=16,
)
# Fine-tune the model
finetuned_model = model.finetune(
    train_dataset,
    epochs=5,
    learning_rate=1e-4,
)
Custom Training Loop¶
For more control, you can implement a custom training loop:
import torch
from torch.optim import Adam
# Get data loader
train_loader = train_dataset.get_data_loader()
# Setup optimizer
optimizer = Adam(model.parameters(), lr=1e-4)
# Training loop
for epoch in range(5):
    total_loss = 0
    for batch in train_loader:
        optimizer.zero_grad()
        # Forward pass
        loss = model.compute_loss(batch)
        # Backward pass
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
    avg_loss = total_loss / len(train_loader)
    print(f"Epoch {epoch}: Loss = {avg_loss:.4f}")
Evaluation¶
Basic Evaluation¶
test_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="test",
    horizon=192,
)
avg_loss, trues, preds, histories = model.evaluate(test_dataset)
print(f"Average Test Loss: {avg_loss}")
Custom Metrics¶
from samay.metric import mse, mae, mape
# Get predictions
avg_loss, trues, preds, histories = model.evaluate(test_dataset)
# Calculate custom metrics
import numpy as np
trues = np.array(trues)
preds = np.array(preds)
mse_score = mse(trues, preds)
mae_score = mae(trues, preds)
mape_score = mape(trues, preds)
print(f"MSE: {mse_score:.4f}")
print(f"MAE: {mae_score:.4f}")
print(f"MAPE: {mape_score:.4f}")
Tasks¶
1. Forecasting¶
Predict future values:
config = {
    "task_name": "forecasting",
    "forecast_horizon": 192,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}
model = LPTMModel(config)
2. Anomaly Detection¶
Detect anomalies in time series:
config = {
    "task_name": "detection",
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}
model = LPTMModel(config)
dataset = LPTMDataset(
    name="ecg",
    datetime_col="date",
    path="./data/ECG5000.csv",
    mode="train",
    task_name="detection",
)
3. Classification¶
Classify time series:
config = {
    "task_name": "classification",
    "num_classes": 5,
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}
model = LPTMModel(config)
Advanced Usage¶
Handling Multivariate Time Series¶
LPTM naturally handles multivariate data:
# Your CSV with multiple columns
# date,sensor1,sensor2,sensor3,...
train_dataset = LPTMDataset(
    datetime_col="date",
    path="./data/multivariate.csv",
    mode="train",
    horizon=192,
)
Custom Data Splits¶
# Specify exact boundaries
train_dataset = LPTMDataset(
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=192,
    boundaries=[0, 10000, 15000],  # Train: 0-10000, Val: 10000-15000, Test: 15000-end
)
Denormalizing Predictions¶
# Get normalized predictions
avg_loss, trues, preds, histories = model.evaluate(test_dataset)
# Denormalize using the dataset's scaler
denormalized_preds = test_dataset._denormalize_data(preds)
denormalized_trues = test_dataset._denormalize_data(trues)
Visualization¶
Plotting Forecasts¶
import matplotlib.pyplot as plt
import numpy as np
# Get predictions
avg_loss, trues, preds, histories = model.evaluate(test_dataset)
trues = np.array(trues)
preds = np.array(preds)
histories = np.array(histories)
# Plot a specific channel and time window
channel_idx = 0
time_index = 0
history = histories[time_index, channel_idx, :]
true = trues[time_index, channel_idx, :]
pred = preds[time_index, channel_idx, :]
plt.figure(figsize=(14, 5))
plt.plot(range(len(history)), history, label="History (512 steps)", linewidth=2)
plt.plot(
    range(len(history), len(history) + len(true)),
    true,
    label="Ground Truth (192 steps)",
    linestyle="--",
    linewidth=2,
)
plt.plot(
    range(len(history), len(history) + len(pred)),
    pred,
    label="Prediction (192 steps)",
    linewidth=2,
)
plt.axvline(x=len(history), color='gray', linestyle=':', alpha=0.5)
plt.legend()
plt.title("LPTM Time Series Forecasting")
plt.xlabel("Time Step")
plt.ylabel("Value")
plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()
Tips and Best Practices¶
1. Choose Appropriate Forecast Horizons¶
- Short-term: 24-96 steps
- Medium-term: 192-336 steps
- Long-term: 720+ steps
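Whichever range you target, keep the model's forecast_horizon and the dataset's horizon consistent; every example in this guide uses the same value for both, as in this short-term sketch:
config = {
    "task_name": "forecasting",
    "forecast_horizon": 96,   # short-term horizon
    "freeze_encoder": True,
    "freeze_embedder": True,
    "freeze_head": False,
}
model = LPTMModel(config)
train_dataset = LPTMDataset(
    name="ett",
    datetime_col="date",
    path="./data/ETTh1.csv",
    mode="train",
    horizon=96,               # matches forecast_horizon above
)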
2. Fine-Tuning Strategy¶
- Start with frozen encoder and embedder
- Only train the forecasting head
- If results are unsatisfactory, gradually unfreeze layers, as sketched below
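A minimal sketch of that progression using only the documented configuration flags: each stage re-creates the model from the pre-trained weights with fewer layers frozen, and you stop at the first stage whose test loss is acceptable (train_dataset and test_dataset are assumed to be defined as in the earlier sections):
# Try progressively less-frozen configurations, stopping once results are good enough
stages = [
    {"freeze_encoder": True,  "freeze_embedder": True,  "freeze_head": False},  # head only
    {"freeze_encoder": False, "freeze_embedder": True,  "freeze_head": False},  # + encoder
    {"freeze_encoder": False, "freeze_embedder": False, "freeze_head": False},  # full fine-tuning
]
for stage in stages:
    config = {"task_name": "forecasting", "forecast_horizon": 192, **stage}
    model = LPTMModel(config)
    model.finetune(train_dataset)
    avg_loss, trues, preds, histories = model.evaluate(test_dataset)
    print(stage, avg_loss)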
3. Batch Size¶
- Larger batch sizes (32-64) for stable training
- Smaller batch sizes (8-16) if GPU memory is limited
4. Data Preprocessing¶
- LPTM handles normalization internally
- Ensure datetime column is properly formatted
- Handle missing values before loading (see the sketch below)
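A generic pandas sketch for the last two points (ordinary pandas, not part of the samay API; the raw.csv and clean.csv paths are placeholders):
import pandas as pd
# Parse the datetime column and fill gaps before handing the file to LPTMDataset
df = pd.read_csv("./data/raw.csv")
df["date"] = pd.to_datetime(df["date"])       # properly formatted datetime column
df = df.sort_values("date")
value_cols = df.columns.drop("date")
df[value_cols] = df[value_cols].interpolate(method="linear")  # fill interior gaps
df[value_cols] = df[value_cols].ffill().bfill()               # fill edge gaps
df.to_csv("./data/clean.csv", index=False)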
Common Issues¶
Out of Memory¶
Reduce batch size or forecast horizon:
config = {
    "forecast_horizon": 96,  # Instead of 192
}
dataset = LPTMDataset(
    batchsize=8,  # Instead of 16
    # ...
)
Poor Performance¶
Try full fine-tuning:
config = {
    "freeze_encoder": False,
    "freeze_embedder": False,
    "freeze_head": False,
    "head_dropout": 0.1,
    "weight_decay": 0.001,
}
API Reference¶
For detailed API documentation, see the API reference for LPTMModel and LPTMDataset.
Examples¶
See the Examples page for complete working examples.