A Comprehensive Approach to Time Series Forecasting with LSTM
In this article, we delve into a dependable approach for forecasting time series using Long Short-Term Memory (LSTM) networks. Previously, we examined how to prepare data for LSTM forecasting and introduced two effective techniques: Iterative Substitution and the Single Test Set method. While both methods yield reliable results, the latter tends to achieve superior accuracy.
We also touched on the various strategies for fine-tuning hyperparameters in LSTM models. Importantly, we clarified that while transforming a time series into a feature domain, the temporal data is preserved rather than lost. Relying on traditional methods, initially designed for deep learning, to maintain this temporal data can be ineffective. Further insights can be found here.
The Breakthrough
We are now ready to present a more robust technique for forecasting multiple future time steps in a time series. Although the Single Test Set method was effective, it struggled to capture the seasonality inherent in time series data. To tackle this issue, we developed a variant that allows the algorithm to recognize seasonality when it exists.
As with all LSTM applications, proper data preparation remains vital for constructing a resilient model. We will refine the Single Test Set method to enable the model to learn any present seasonality.
Below is the initial Single Test Set method:
Next, we will extend this to a 23-step time series, assuming a 3-step seasonal cycle.
Conditional formatting was employed to visualize the seasonality.
To effectively capture seasonality, we choose cyclical sets for our train/test split, ensuring that our training set maintains a cyclical synchronization with the test set. Let’s visualize this approach:
The following code implements this approach:
import numpy as np
train_ratio = 0.8
cycle = 12
in_out_offset = 1
def split_sequences_comparable_S(sequences, in_out_offset, train_ratio, cycle):
    # Drop the oldest observations so the series holds a whole number of cycles
    n_cycles = int(len(sequences) / cycle)
    start = len(sequences) - n_cycles * cycle
    sequences = sequences[start:]
    # Train-test split, measured in whole cycles
    train_cycles = int(train_ratio * n_cycles)
    train_size = train_cycles * cycle
    # Cycles: the output window spans the test cycles, the input window is longer by the offset
    cycles_out = n_cycles - train_cycles
    cycles_in = cycles_out + in_out_offset
    n_samples = train_cycles - cycles_in - cycles_out + 1
    X, y = list(), list()
    for i in range(n_samples):
        # find the end of this pattern (windows advance one full cycle at a time)
        end_ix = (i + cycles_in) * cycle
        out_end_ix = end_ix + cycles_out * cycle
        # gather input and output parts of the pattern
        seq_x, seq_y = sequences[i * cycle:end_ix], sequences[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    # End sets: inputs for predicting the test span and for forecasting beyond the data
    X_test = sequences[train_size - cycles_in * cycle:train_size]
    X_forecast = sequences[-cycles_in * cycle:]
    return np.array(X), np.array(y), np.array(X_test), np.array(X_forecast), train_size + start, start, cycles_in * cycle, cycles_out * cycle
In this code, users must specify the cycle length and the in_out_offset in addition to the train ratio. The cycle length should align with the nature of the data; for instance, a 12-month cycle is suitable for monthly passenger enplanement data. The train_ratio determines the number of cycles out, while users can modify the cycles in by defining the offset between cycles in and out. In the visual examples provided earlier, the cycle length is set to 3, with 2 cycles out and 3 cycles in, resulting in an in_out_offset of 1.
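To make the bookkeeping concrete, the cycle arithmetic inside the function can be reproduced on its own. The sketch below is illustrative only: the helper name cycle_counts is hypothetical, and it mirrors the integer arithmetic of split_sequences_comparable_S without touching any data.

```python
def cycle_counts(n_obs, cycle, train_ratio, in_out_offset):
    """Hypothetical helper: reproduce the cycle bookkeeping of
    split_sequences_comparable_S for a series of n_obs points."""
    n_cycles = n_obs // cycle                   # whole cycles available
    start = n_obs - n_cycles * cycle            # leading points that get dropped
    train_cycles = int(train_ratio * n_cycles)  # cycles used for training
    cycles_out = n_cycles - train_cycles        # forecast horizon, in cycles
    cycles_in = cycles_out + in_out_offset      # input window, in cycles
    n_samples = train_cycles - cycles_in - cycles_out + 1
    return {
        "start": start,
        "cycles_in": cycles_in,
        "cycles_out": cycles_out,
        "n_samples": n_samples,
        "n_steps_in": cycles_in * cycle,
        "n_steps_out": cycles_out * cycle,
    }

# The 23-step illustration with a 3-step cycle
toy = cycle_counts(n_obs=23, cycle=3, train_ratio=0.8, in_out_offset=1)
print(toy)

# A monthly series of 405 points with a 12-month cycle
monthly = cycle_counts(n_obs=405, cycle=12, train_ratio=0.8, in_out_offset=1)
print(monthly)
```

For the 23-step example this gives 2 dropped points, 3 cycles in, 2 cycles out, and a single training sample. If the series has the 405 monthly points implied by the date range used later in the plotting code, the same arithmetic yields 9 dropped points, a 96-step input window, an 84-step output window, and 12 training samples.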
Here’s how to execute the function and plot the outcomes:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import r2_score, mean_absolute_percentage_error, mean_absolute_error
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.optimizers import Adam
# ts_data is assumed to be a NumPy array of shape (n_observations, 1), loaded beforehand
X, y, X_test, X_forecast, train_size, start, n_steps_in, n_steps_out = split_sequences_comparable_S(ts_data, in_out_offset, train_ratio, cycle)
X = X.reshape(X.shape[0], X.shape[1])
X_test = X_test.reshape(X_test.shape[1], X_test.shape[0])
y = y.reshape(y.shape[0], y.shape[1])
y_test = ts_data[train_size:]
y_test = y_test.reshape(y_test.shape[1], y_test.shape[0])
# Create and train LSTM model
model = Sequential()
model.add(LSTM(units=72, activation='tanh', input_shape=(n_steps_in, 1)))
model.add(Dense(units=n_steps_out))
model.compile(loss='mean_squared_error', optimizer='Adam', metrics=['mape'])
model.fit(x=X, y=y, epochs=500, batch_size=18, verbose=2)
lstm_predictions = model.predict(X_test)
predictions = lstm_predictions.reshape(lstm_predictions.shape[1])
lstm_fitted = model.predict(X)
fits = n_steps_out + (lstm_fitted.shape[0]-1) * cycle
fitted = np.full((lstm_fitted.shape[0], train_size), np.nan)
for i in range(lstm_fitted.shape[0]):
    for j in range(n_steps_out):
        fitted[i, train_size - fits + i * cycle + j] = lstm_fitted[i, j]
# Average overlapping fitted values across windows, ignoring unfilled (NaN) cells
fitted = np.nanmean(fitted, axis=0)
mape = mean_absolute_percentage_error(y_test, lstm_predictions)
r2 = r2_score(ts_data[train_size - fits:train_size], fitted[train_size - fits:])
date_range = pd.date_range(start='1990-01-01', end='2023-09-30', freq='M')
# Plot actual, fits, and forecasts
plt.figure(figsize=(10, 6))
plt.plot(date_range, ts_data, label='Actual', color='blue')
plt.plot(date_range[:train_size], fitted, label='Fitted', color='green')
plt.plot(date_range[train_size:], predictions, label='Forecast', color='red')
plt.title('FSC - Short - Passengers\nCyclic LSTM Forecast')
plt.xlabel('Date')
plt.ylabel('Passengers')
plt.legend()
plt.text(0.05, 0.05, f'R2 = {r2*100:.2f}%\nMAPE = {mape*100:.2f}%', transform=plt.gca().transAxes, fontsize=12)
plt.grid(True)
plt.show()
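The fitted-value step in the script above is probably its least obvious part: each training sample predicts n_steps_out points, consecutive samples are shifted by one cycle, and so their outputs overlap and are averaged with np.nanmean. The toy sketch below isolates that averaging. The sizes and values are made up for illustration, and nothing is trained.

```python
import numpy as np

cycle = 3          # toy cycle length (assumption for illustration)
n_steps_out = 6    # each window predicts two cycles
train_size = 12    # toy length of the training span

# Stand-in for model.predict(X): 3 windows, each shifted by one cycle (made-up values)
window_preds = np.array([
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],
    [3.0, 3.0, 3.0, 3.0, 3.0, 3.0],
])

n_windows = window_preds.shape[0]
# Number of time steps covered by at least one window
fits = n_steps_out + (n_windows - 1) * cycle

# One row per window, NaN wherever that window made no prediction
grid = np.full((n_windows, train_size), np.nan)
for i in range(n_windows):
    offset = train_size - fits + i * cycle
    grid[i, offset:offset + n_steps_out] = window_preds[i]

# Column-wise mean over whichever windows covered each time step
fitted = np.nanmean(grid, axis=0)
print(fitted)
```

Time steps covered by two windows get the mean of both predictions, while steps covered by one window keep that single value. In the full script, the early training steps that no window predicts remain NaN (NumPy also emits a RuntimeWarning for those all-NaN columns), which is why R2 is computed only over the last fits steps.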
The Consideration
The effectiveness of this method is best illustrated by visualising the results:
This plot is vital because standard performance metrics may not fully capture the true accuracy of our model, a common issue when evaluating forecasting algorithms. For instance, a simple algorithm that predicts each next point as equal to the current one can achieve a low error on conventional metrics and thus appear to perform well. This is why scaled metrics, such as the Mean Absolute Scaled Error (MASE), are often employed in assessing forecasting algorithms.
MASE = MAE of Model Predictions / MAE of Naïve Forecast on Training Data
Where:
- MAE of Model Predictions is the mean absolute error of your predictions.
- MAE of Naïve Forecast is determined using the naïve forecasting approach on your training data, where each point predicts the next.
This is the code to calculate MASE:
naive_forecasts = ts_data[train_size - fits:train_size - 1] # all but the last
actuals = ts_data[train_size - fits + 1:train_size] # all but the first
mae_naive = mean_absolute_error(actuals, naive_forecasts)
mae_model = mean_absolute_error(y_test, lstm_predictions)
mase = mae_model / mae_naive
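As a rough sanity check on the idea behind MASE, the following self-contained sketch scores a persistence forecast (each point predicts the next) against a closer forecast on made-up numbers. The function names and the data are assumptions for illustration; unlike the snippet above, it works on plain Python lists and does not depend on the variables from the LSTM script.

```python
def mean_abs_error(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mase(train, test_actual, test_pred):
    """MAE of the predictions divided by the MAE of a one-step
    naive forecast (previous value) on the training data."""
    mae_naive = mean_abs_error(train[1:], train[:-1])
    return mean_abs_error(test_actual, test_pred) / mae_naive

# Made-up series, for illustration only
train = [10.0, 12.0, 11.0, 13.0, 12.0]
test_actual = [14.0, 13.0, 15.0]

# Persistence: every test point is predicted by the value just before it
persistence = [train[-1]] + test_actual[:-1]
# A hypothetical forecast that tracks the test data more closely
closer = [13.5, 13.5, 14.5]

mase_persistence = mase(train, test_actual, persistence)
mase_closer = mase(train, test_actual, closer)
print(f"persistence MASE = {mase_persistence:.3f}")
print(f"closer forecast MASE = {mase_closer:.3f}")
```

A MASE close to or above 1 means the model does no better than the naive forecast, whereas values well below 1 indicate a genuine improvement. That is what makes the metric harder to fool than a raw error score.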
Next, we will compare the two models and their MASE values.
This improvement is significant, and the results would likely have been even better without the disruptions caused by the 2020 pandemic, which severely impacted global air travel.
This success was achievable only by preparing the data in accordance with its seasonal patterns.
Stay tuned for the next article, where we will explore exogenous variables and their integration into LSTM and ARIMA models.