How to use NeuralForecast with PyTorch Lightning on an Intel GPU (XPU/torch.xpu)? Looking for the simplest workable approach for a single XPU device without IPEX

Ahua AIGC Lab

2026-4-27

Solution for Running NeuralForecast on Intel XPU (Without IPEX)

PyTorch already supports Intel XPU through torch.xpu, but PyTorch Lightning (PL) does not ship a built-in XPU accelerator yet. A minimal workaround is to bypass PL's device detection without modifying any library code. Here's how to do it for a single-XPU setup:

Step 1: Force Default Device to XPU

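First, set PyTorch's default device to your XPU at the very start of your code, before any model is built. Tensors that PyTorch creates without an explicit device (model parameters, for example) are then allocated directly on the XPU. Tensors that share memory with existing data, such as those from torch.from_numpy, are not affected: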

import torch
torch.set_default_device('xpu:0')  # Use 'xpu:0' for single device
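
Before making the XPU the default, it helps to check that PyTorch can actually see it. The following is a minimal sketch; pick_device is a hypothetical helper name, not part of PyTorch, and the fallback to CPU is an assumption about what you want when no XPU is found.

```python
import torch


def pick_device() -> str:
    """Return 'xpu:0' if an Intel XPU is usable, otherwise 'cpu' (hypothetical helper)."""
    xpu = getattr(torch, "xpu", None)  # builds without XPU support may lack torch.xpu
    if xpu is not None and xpu.is_available():
        return "xpu:0"
    return "cpu"  # assumption: falling back to CPU is acceptable


device = pick_device()
torch.set_default_device(device)
print("default device:", device)
```

If this prints cpu on a machine that has an Intel GPU, the PyTorch build or the driver installation is the first thing to look at.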

Step 2: Initialize NeuralForecast Models Normally

You don't need to move models to the XPU manually. Because the default device is already set, parameters created in the model constructor should be allocated on the XPU:

from neuralforecast import NeuralForecast
from neuralforecast.models import NBEATS

# Example model initialization (adjust params to your use case)
models = [NBEATS(input_size=24, h=12, max_steps=1000)]
nf = NeuralForecast(models=models, freq='H')
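
The idea that parameters follow the default device can be checked without XPU hardware. The sketch below uses PyTorch's "meta" device as a stand-in for the XPU (an assumption made purely so the example runs anywhere) and a plain nn.Linear instead of a NeuralForecast model. It also shows that a tensor created with an explicit device ignores the default.

```python
import torch
from torch import nn

# The "meta" device stands in for the XPU so this runs on any machine.
with torch.device("meta"):
    layer = nn.Linear(4, 2)                # no explicit device: follows the default
    pinned = torch.zeros(3, device="cpu")  # explicit device: the default is ignored

param_device = layer.weight.device.type
pinned_device = pinned.device.type
print(param_device, pinned_device)
```

The second case matters for training data: anything that a library places on a device explicitly will not be redirected by the default-device setting.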

Step 3: Configure PyTorch Lightning Trainer

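NeuralForecast builds its own PyTorch Lightning Trainer internally, so there is no Trainer object to create or hand over. Extra keyword arguments given to a model constructor are forwarded to that Trainer. Leaving out accelerator is the same as accelerator='auto', which resolves to CPU when Lightning detects no accelerator it supports, so the example only sets devices=1 and disables some optional features to reduce device-related warnings:

# Trainer options are forwarded from the model constructor to Lightning's Trainer
models = [
    NBEATS(
        input_size=24,
        h=12,
        max_steps=1000,              # NeuralForecast controls training length with max_steps
        devices=1,
        enable_model_summary=False,  # Optional: Avoids device-related summary errors
        logger=False,                # Optional: Simplifies debugging
    )
]
nf = NeuralForecast(models=models, freq='H')

Step 4: Run Training as Usual

Call fit with your training data. NeuralForecast.fit takes no trainer argument; the settings passed to the model are all it uses. If Lightning leaves the tensors where they are, training runs on the XPU:

nf.fit(df=your_training_dataframe)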

Verify It's Working

To confirm the model was created on the XPU, print the device of any model parameter before training. Repeating the check after fit shows whether it stayed there:

print(next(models[0].parameters()).device)  # Should output xpu:0
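
Checking a single parameter can miss tensors that ended up elsewhere. As a rough sketch, the hypothetical helper devices_in_use below collects the device of every parameter and buffer in a module. A small nn.Sequential stands in for the NeuralForecast model so the example runs on its own.

```python
import torch
from torch import nn


def devices_in_use(module: nn.Module) -> set[str]:
    """Collect the device of every parameter and buffer in a module (hypothetical helper)."""
    tensors = list(module.parameters()) + list(module.buffers())
    return {str(t.device) for t in tensors}


# Stand-in model; with NeuralForecast you would pass models[0] instead.
demo = nn.Sequential(nn.Linear(8, 4), nn.BatchNorm1d(4))
print(devices_in_use(demo))
```

A result with more than one entry, or with cpu where xpu:0 was expected, means part of the model is not on the XPU.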

Why This Works

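torch.set_default_device changes where PyTorch creates tensors that have no explicit device, so model parameters built after the call start out on the XPU.
Omitting the accelerator parameter leaves Lightning on accelerator='auto'. The workaround assumes Lightning then keeps tensors on the device they already occupy. Because Lightning can fall back to CPU when it detects no supported accelerator, confirm the parameter device after training as well.
NeuralForecast takes its device settings from PyTorch and from the Trainer arguments it is given, so no library modifications are needed.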

Notes

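Make sure your PyTorch build supports Intel XPU natively and that Intel's GPU drivers are installed. Native torch.xpu support arrived in upstream PyTorch after 2.1 (as a prototype in 2.4, with prebuilt XPU wheels from 2.5), so use a recent release.
This is a temporary workaround until PyTorch Lightning adds native XPU support.
No IPEX is needed: this uses PyTorch's native torch.xpu interface directly.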
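
The first note can be turned into a quick environment check. This is only a sketch: xpu_report is a hypothetical helper, and it reports what the installed build exposes without judging whether the version is new enough.

```python
import torch


def xpu_report() -> dict:
    """Summarize the XPU-related state of the installed PyTorch build (hypothetical helper)."""
    xpu = getattr(torch, "xpu", None)  # absent on builds without XPU support
    available = bool(xpu is not None and xpu.is_available())
    return {
        "torch_version": torch.__version__,
        "has_torch_xpu": xpu is not None,
        "xpu_available": available,
        "xpu_device_count": xpu.device_count() if available else 0,
    }


report = xpu_report()
for key, value in report.items():
    print(f"{key}: {value}")
```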

This question comes from Stack Exchange; asked by Marek Ozana.