Time series use cases¶

Analysis¶

General analysis¶

Since time series use cases are regressions, you’ll find the same level of analytics as for their tabular counterparts.

Time gauge¶

We recall the selection criteria entered by the user on the time gauge:

Feature importance¶

The goal of time series modeling is to automatically find new temporal features that will increase the predictive power of the model. Temporal features are created based on statistical significance, using measures such as the autocorrelation function (ACF), the partial autocorrelation function (PACF), correlation with the TARGET, …

Created features can be found in the feature importance:

Their names are built from the name of the original feature, followed by a moving aggregate function:

featurename_lag_X = lag (offset) of featurename by X time steps
featurename_min_a_b = minimum of featurename between time steps a and b
featurename_max_a_b = maximum of featurename between time steps a and b
featurename_mean_a_b = mean (moving average) of featurename between time steps a and b
featurename_bollinger_upper_a_b = upper Bollinger band (~ moving average + sd) of featurename between time steps a and b
featurename_bollinger_lower_a_b = lower Bollinger band (~ moving average - sd) of featurename between time steps a and b

Please keep in mind that featurename can be the TARGET or any feature present in the dataset.
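The naming scheme above can be illustrated with a short pandas sketch. This is not the platform's implementation: the function `add_temporal_features` is hypothetical, and it assumes that "between a and b" means the values observed from a to b time steps before the current row, and that "sd" is the sample standard deviation over that window.

```python
import pandas as pd


def add_temporal_features(series: pd.Series, name: str, lag: int, a: int, b: int) -> pd.DataFrame:
    """Build lag and moving-aggregate features named like the ones listed above.

    Assumption: the window [a, b] covers values seen a..b steps before each row.
    """
    if not 0 < a <= b:
        raise ValueError("expected 0 < a <= b")
    out = pd.DataFrame(index=series.index)
    out[f"{name}_lag_{lag}"] = series.shift(lag)

    # Shift by a, then roll over (b - a + 1) rows: the window ends a steps back
    # and starts b steps back.
    window = series.shift(a).rolling(b - a + 1)
    mean = window.mean()
    sd = window.std()
    out[f"{name}_min_{a}_{b}"] = window.min()
    out[f"{name}_max_{a}_{b}"] = window.max()
    out[f"{name}_mean_{a}_{b}"] = mean
    out[f"{name}_bollinger_upper_{a}_{b}"] = mean + sd
    out[f"{name}_bollinger_lower_{a}_{b}"] = mean - sd
    return out


values = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
features = add_temporal_features(values, "TARGET", lag=1, a=1, b=3)
print(features)
```

The first rows are empty because the window is not yet complete, which is one reason a sufficiently long history is needed at prediction time.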

Predictions¶

When forecasting, you must send a historical dataset at least as long as the interval between the two boundaries of the historical window. This dataset must contain the actual data (including the target) for the whole history, followed by the rows to be forecasted, in which:

The target is absent: Prevision.io detects the period to be predicted from the moment the target ceases to be known
A priori data is filled in
Non a priori data is missing

The output of this step will be a file (time, value) filled over the forecasted period. In addition, if the historical period is longer than the length of the window, forecasts will be made using this data and will allow a test score to be calculated directly in the application.
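As an illustration, a forecast input file could be assembled as follows. This is a sketch with made-up data: the column names `time`, `TARGET`, `promo` (a priori) and `temperature` (non a priori), as well as the helper `forecast_start`, are assumptions chosen for the example and not part of the product.

```python
import numpy as np
import pandas as pd

# History: everything is known, including the target.
history = pd.DataFrame({
    "time": pd.date_range("2023-01-01", periods=6, freq="D"),
    "TARGET": [10.0, 12.0, 11.0, 13.0, 14.0, 15.0],
    "promo": [0, 1, 0, 0, 1, 0],                      # a priori feature
    "temperature": [5.1, 4.8, 6.0, 7.2, 6.5, 5.9],    # non a priori feature
})

# Period to forecast: target and non a priori data are missing,
# a priori data is filled in.
future = pd.DataFrame({
    "time": pd.date_range("2023-01-07", periods=2, freq="D"),
    "TARGET": [np.nan, np.nan],
    "promo": [1, 0],
    "temperature": [np.nan, np.nan],
})

forecast_input = pd.concat([history, future], ignore_index=True)


def forecast_start(df, time_col="time", target_col="TARGET"):
    """Return the first timestamp after the last known target, or None."""
    ordered = df.sort_values(time_col)
    known_times = ordered.loc[ordered[target_col].notna(), time_col]
    if known_times.empty:
        to_forecast = ordered
    else:
        to_forecast = ordered[ordered[time_col] > known_times.max()]
    return None if to_forecast.empty else to_forecast[time_col].iloc[0]


print(forecast_start(forecast_input))
```

Running `forecast_start` on a file before uploading it is a quick way to confirm that there is actually a period left to predict.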

In case of problems¶

During training¶

Given the complexity of time series modeling, it is essential that the data set respects the following constraints during the learning phase:

Check that the target is numeric
Check the constraints on the temporal window and the history window
Check that a time column is filled in using ISO 8601 format (or a classic format such as DD/MM/YYYY or DD-MM-YYYY hh:mm)
Check that the time spacing is consistent for at least 80% of the data (e.g. you send a one-day series at an hourly step: if more than 5 data points are missing, the calculation will not succeed)
Check, when there is a group, that the columns designated as such identify a unique time series (i.e. at most one value per timestamp)
Check, when there is a group, that the time step is consistent between the groups
Check that the time steps and the number of missing data respect the rules mentioned above, including all intersections induced by the possible presence of groups
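Several of these checks can be run locally before uploading a dataset. The sketch below is one possible way to do so with pandas; the function `check_training_data` and its messages are hypothetical, and the way the platform measures spacing consistency may differ from the "share of gaps equal to the most frequent gap" rule assumed here.

```python
import pandas as pd


def check_training_data(df, time_col, target_col, group_cols=None, min_consistent=0.8):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if not pd.api.types.is_numeric_dtype(df[target_col]):
        problems.append("target is not numeric")

    times = pd.to_datetime(df[time_col], errors="coerce")
    if times.isna().any():
        problems.append("some timestamps could not be parsed")
    data = df.assign(**{time_col: times}).dropna(subset=[time_col])

    keys = list(group_cols or [])
    if data.duplicated(subset=keys + [time_col]).any():
        problems.append("several rows share the same timestamp within a series")

    # One series per group, or a single series when there is no group.
    groups = data.groupby(keys) if keys else [("all", data)]
    steps = {}
    for key, part in groups:
        deltas = part[time_col].sort_values().diff().dropna()
        if deltas.empty:
            continue
        step = deltas.mode().iloc[0]  # most frequent gap, taken as the time step
        steps[key] = step
        if (deltas == step).mean() < min_consistent:
            problems.append(f"irregular time spacing in series {key}")
    if len(set(steps.values())) > 1:
        problems.append("time step differs between groups")
    return problems


regular = pd.DataFrame({
    "time": pd.date_range("2023-01-01", periods=10, freq="D"),
    "TARGET": range(10),
})
irregular = pd.DataFrame({
    "time": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-04",
                            "2023-01-07", "2023-01-11"]),
    "TARGET": [1, 2, 3, 4, 5],
})
print(check_training_data(regular, "time", "TARGET"))
print(check_training_data(irregular, "time", "TARGET"))
```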

Remarks:

Evaluation is performed on a time split cross validation
In case of multiple lines on the same timestamp, only the first event is kept
In case of missing timestamps, the last known value is propagated to the next known timestamp
Each group must contain at least 3 observations. If this is not the case, the group will be deleted from the dataset
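To preview the effect of the last three remarks on a dataset, the same rules can be approximated with pandas. This is only a sketch: the function `clean_series` is hypothetical, and the order in which the platform applies these rules is an assumption.

```python
import pandas as pd


def clean_series(df, time_col, group_col, freq):
    """Approximate the remarks above: drop small groups, keep the first row
    per timestamp, and forward-fill missing timestamps."""
    # Groups with fewer than 3 observations are removed.
    sizes = df.groupby(group_col)[time_col].transform("size")
    df = df[sizes >= 3]

    cleaned = []
    for _, part in df.groupby(group_col):
        # A stable sort preserves the original row order among duplicates,
        # so keep="first" keeps the first event of each timestamp.
        part = part.sort_values(time_col, kind="mergesort")
        part = part.drop_duplicates(subset=time_col, keep="first")
        # Insert the missing timestamps, then propagate the last known values.
        part = part.set_index(time_col).asfreq(freq).ffill()
        cleaned.append(part.rename_axis(time_col).reset_index())
    return pd.concat(cleaned, ignore_index=True) if cleaned else df.iloc[0:0]


raw = pd.DataFrame({
    "time": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-02",
                            "2023-01-04", "2023-01-01", "2023-01-02"]),
    "store": ["A", "A", "A", "A", "B", "B"],
    "TARGET": [1.0, 2.0, 99.0, 4.0, 7.0, 8.0],
})
clean = clean_series(raw, "time", "store", "D")
print(clean)
```

In this example, store B is dropped because it has only two observations, the duplicate row for store A on 2023-01-02 is discarded, and the missing 2023-01-03 row is filled with the values of the previous day.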

During forecast¶

I have a file containing 0 forecasts¶

Make sure you have provided a dataset with a missing target starting from a given timestamp. If the target column is still filled, we cannot extend the forecast, especially if your use case contains a priori groups and features.

The prediction returns inconsistent results¶

Check that the a priori features in particular are correctly filled in for the values to be forecasted.

Check that all the labelled data corresponding to the history window is filled in. Missing data will be imputed with the mean of the target, which can skew the results.

Check that the gap between the time of training and the time of prediction is not too large. Time series may require more frequent re-training than other use cases because of natural target drift.

The prediction returns an error¶

In general, check that you provide a sufficient history consistent with the definition of your use case.

If your dataset contains groups:

Check that the groups are temporally consistent, i.e. each group has as many time steps as the others
Check that no new groups appear at the time of the forecast
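Both group checks are easy to run before sending a forecast file. The following sketch assumes a single group column and uses a hypothetical helper, `check_forecast_groups`, that compares the forecast file with the training dataset.

```python
import pandas as pd


def check_forecast_groups(train_df, forecast_df, group_col, time_col):
    """Return a list of group-related problems in the forecast dataset."""
    problems = []

    # Every group should cover the same number of time steps.
    counts = forecast_df.groupby(group_col)[time_col].nunique()
    if counts.nunique() > 1:
        problems.append(f"groups have different numbers of time steps: {counts.to_dict()}")

    # No group should appear that was absent from the training data.
    new_groups = set(forecast_df[group_col]) - set(train_df[group_col])
    if new_groups:
        problems.append(f"groups unseen during training: {sorted(new_groups)}")
    return problems


days = pd.date_range("2023-01-01", periods=3, freq="D")
train = pd.DataFrame({"time": list(days) * 2, "store": ["A"] * 3 + ["B"] * 3})
forecast_ok = train.copy()
forecast_bad = pd.DataFrame({
    "time": list(days) + list(days[:2]),
    "store": ["A"] * 3 + ["C"] * 2,
})
print(check_forecast_groups(train, forecast_ok, "store", "time"))
print(check_forecast_groups(train, forecast_bad, "store", "time"))
```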