What does a higher smoothing constant mean?

(Fourth in a series)

In last week’s Forecast Friday post, we discussed moving average forecasting methods, both simple and weighted. When a time series is stationary, that is, exhibits no discernable trend or seasonality and is subject only to the randomness of everyday existence, then moving average methods – or even a simple average of the entire series – are useful for forecasting the next few periods. However, most time series are anything but stationary: retail sales have trend, seasonal, and cyclical elements, while public utilities have trend and seasonal components that impact the usage of electricity and heat. Hence, moving average forecasting approaches may provide less than desirable results. Moreover, the most recent sales figures typically are more indicative of future sales, so there is often a need to have a forecasting system that places greater weight on more recent observations. Enter exponential smoothing.

Unlike moving average models, which use a fixed number of the most recent values in the time series for smoothing and forecasting, exponential smoothing incorporates all values in the time series, placing the heaviest weight on the most recent data and assigning weights to older observations that diminish exponentially over time. Because each forecast is built from the previous forecast, which in turn reflects all earlier periods, the exponential smoothing model is recursive. When a time series exhibits no strong or discernible seasonality or trend, the simplest form of exponential smoothing, single exponential smoothing, can be applied. The formula for single exponential smoothing is:

$$\hat{Y}_{t+1} = \alpha Y_t + (1-\alpha)\,\hat{Y}_t$$

In this equation, $\hat{Y}_{t+1}$ represents the forecast value for period t + 1; $Y_t$ is the actual value of the current period, t; $\hat{Y}_t$ is the forecast value for the current period, t; and α is the smoothing constant, or alpha, a number between 0 and 1. Alpha is the weight you assign to the most recent observation in your time series. Essentially, you are basing your forecast for the next period on the actual value for this period, and the value you forecasted for this period, which in turn was based on forecasts for periods before that.
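Substituting the formula into itself makes the exponentially diminishing weights explicit. The expansion below is a worked illustration under the same definitions; the index k (number of periods back) is introduced here only for convenience, and $\hat{Y}_1$ denotes the initial forecast.

$$
\begin{aligned}
\hat{Y}_{t+1} &= \alpha Y_t + (1-\alpha)\,\hat{Y}_t \\
&= \alpha Y_t + \alpha(1-\alpha)\,Y_{t-1} + (1-\alpha)^2\,\hat{Y}_{t-1} \\
&= \sum_{k=0}^{t-1} \alpha(1-\alpha)^k\,Y_{t-k} + (1-\alpha)^t\,\hat{Y}_1
\end{aligned}
$$

The observation k periods back therefore carries weight α(1-α)^k, and the initial forecast carries weight (1-α)^t, which shrinks toward zero as the series grows.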

Let’s assume you’ve been in business for 10 weeks and want to forecast sales for the 11th week. Sales for those first 10 weeks are:

| Week (t) | Sales ($Y_t$) |
|---|---|
| 1 | 200 |
| 2 | 215 |
| 3 | 210 |
| 4 | 220 |
| 5 | 230 |
| 6 | 220 |
| 7 | 235 |
| 8 | 215 |
| 9 | 220 |
| 10 | 210 |

From the equation above, you know that in order to come up with a forecast for week 11, you need forecasted values for weeks 10, 9, and all the way down to week 1. You also know that week 1 does not have any preceding period, so it cannot be forecast from the formula. Finally, you need to determine the smoothing constant, or alpha, to use for your forecasts.

Determining the Initial Forecast

The first step in constructing your exponential smoothing model is to generate a forecast value for the first period in your time series. The most common practice is to set the forecasted value of week 1 equal to the actual value, 200, which we will do in our example. Alternatively, if you have sales data from before week 1 that you are not using to construct the model, you might average a couple of the immediately prior periods and use that as the initial forecast. How you determine your initial forecast is subjective.
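To get a feel for how much the initial forecast matters, here is a minimal Python sketch. The function name `next_forecast`, the alternative starting value of 193 (standing in for an average of earlier weeks that are not shown in this post), and the choice of α = 0.5 are all assumptions made for illustration.

```python
def next_forecast(sales, alpha, initial):
    """Forecast for the period after the last observation."""
    forecast = initial  # forecast for period 1
    for actual in sales:
        # Single exponential smoothing update.
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast


sales = [200, 215, 210, 220, 230, 220, 235, 215, 220, 210]
alpha = 0.5  # illustrative choice, not prescribed at this point in the post

from_first_actual = next_forecast(sales, alpha, initial=sales[0])
from_prior_average = next_forecast(sales, alpha, initial=193)  # hypothetical value

# The gap between the two starting points shrinks by (1 - alpha) each period.
gap = from_first_actual - from_prior_average
print(f"{from_first_actual:.3f} vs {from_prior_average:.3f}, gap = {gap:.4f}")
```

After ten periods the two starting points lead to week 11 forecasts that differ by far less than one unit. The gap closes more slowly when α is small or the series is short, which is when the initial choice deserves more care.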

How Big Should Alpha Be?

This too is a judgment call, and finding the appropriate alpha is subject to trial and error. Generally, if your time series is very stable, a small α is appropriate. Visual inspection of your sales on a graph is also useful in trying to pinpoint an alpha to start with. Why is the size of α important? Because the closer α is to 1, the more weight that is assigned to the most recent value in determining your forecast, the more rapidly your forecast adjusts to patterns in your time series and the less smoothing that occurs. Likewise, the closer α is to 0, the more weight that is placed on earlier observations in determining the forecast, the more slowly your forecast adjusts to patterns in the time series, and the more smoothing that occurs. Let’s visually inspect the 10 weeks of sales:

[Graph: weekly sales, weeks 1-10]
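The trade-off described in this section can be made concrete by listing the weight each past observation receives. Unrolling the smoothing formula, the observation k periods back carries weight α(1-α)^k. The sketch below is illustrative only; the function name `observation_weights` and the choice of five lags are assumptions.

```python
def observation_weights(alpha, n_lags):
    """Weight on the observation k periods back, for k = 0 .. n_lags - 1."""
    return [alpha * (1 - alpha) ** k for k in range(n_lags)]


for alpha in (0.3, 0.5, 0.7):
    weights = observation_weights(alpha, 5)
    print(alpha, [round(w, 3) for w in weights])
```

With a large α nearly all the weight sits on the last one or two observations, while a small α spreads it across many periods, which is what produces the heavier smoothing.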

The Exponential Smoothing Process

The sales appear somewhat jagged, oscillating between 200 and 235. Let’s start with an alpha of 0.5. That gives us the following table:

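| Week (t) | Sales ($Y_t$) | Forecast for This Period ($\hat{Y}_t$) |
|---|---|---|
| 1 | 200 | 200.0 |
| 2 | 215 | 200.0 |
| 3 | 210 | 207.5 |
| 4 | 220 | 208.8 |
| 5 | 230 | 214.4 |
| 6 | 220 | 222.2 |
| 7 | 235 | 221.1 |
| 8 | 215 | 228.0 |
| 9 | 220 | 221.5 |
| 10 | 210 | 220.8 |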

Notice how, even though your forecasts aren’t precise, when your actual value for a particular week is higher than what you forecasted (weeks 2 through 5, for example), your forecasts for each of the subsequent weeks (weeks 3 through 6) adjust upward; when your actual values are lower than your forecast (e.g., weeks 6, 8, 9, and 10), your forecasts for the following weeks adjust downward. Also notice that, as you move to later periods, your earlier forecasts play less and less of a role in your later forecasts, as their weight diminishes exponentially. Just by looking at the table above, you know that the forecast for week 11 will be lower than 220.8, your forecast for week 10:

$$
\begin{aligned}
\hat{Y}_{11} &= 0.5\,Y_{10} + (1-0.5)\,\hat{Y}_{10} \\
&= 0.5(210) + 0.5(220.8) \\
&= 105 + 110.4 \\
&= 215.4
\end{aligned}
$$

So, based on our alpha and our past sales, our best guess is that sales in week 11 will be 215.4. Take a look at the graph of actual vs. forecasted sales for weeks 1-10:

[Graph: actual vs. forecasted sales (α = 0.5), weeks 1-10]

Notice that the forecasted sales are smoother than actual, and you can see how the forecasted sales line adjusts to spikes and dips in the actual sales time series.
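The forecasts for weeks 1 through 10 and the week 11 forecast can be reproduced with a short Python sketch. The function name `ses_forecasts` and its `initial` parameter are assumptions made for illustration; the data, the alpha of 0.5, and the convention of setting the first forecast equal to the first actual come from this post.

```python
def ses_forecasts(sales, alpha, initial=None):
    """Return forecasts for periods 1 .. n + 1 (single exponential smoothing).

    forecasts[i] is the forecast for period i + 1, so the last element is
    the forecast for the period right after the data ends.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    # Convention used in this post: first forecast equals first actual.
    forecasts = [sales[0] if initial is None else initial]
    for actual in sales:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


sales = [200, 215, 210, 220, 230, 220, 235, 215, 220, 210]
forecasts = ses_forecasts(sales, alpha=0.5)

for week, (actual, forecast) in enumerate(zip(sales, forecasts), start=1):
    print(f"week {week:2d}  actual {actual}  forecast {forecast:.1f}")
print(f"week 11 forecast: {forecasts[-1]:.1f}")  # 215.4
```

Keeping the unrounded forecasts internally and rounding only for display avoids small rounding drift from one week to the next.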

What If We Had Used a Smaller or Larger Alpha?

We’ll demonstrate by using both an alpha of .30 and one of .70. That gives us the following table and graph:

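| Week (t) | Sales ($Y_t$) | Forecast α=0.50 | Forecast α=0.30 | Forecast α=0.70 |
|---|---|---|---|---|
| 1 | 200 | 200.0 | 200.0 | 200.0 |
| 2 | 215 | 200.0 | 200.0 | 200.0 |
| 3 | 210 | 207.5 | 204.5 | 210.5 |
| 4 | 220 | 208.8 | 206.2 | 210.2 |
| 5 | 230 | 214.4 | 210.3 | 217.0 |
| 6 | 220 | 222.2 | 216.2 | 226.1 |
| 7 | 235 | 221.1 | 217.3 | 221.8 |
| 8 | 215 | 228.0 | 222.6 | 231.1 |
| 9 | 220 | 221.5 | 220.4 | 219.8 |
| 10 | 210 | 220.8 | 220.2 | 219.9 |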

[Graph: actual sales vs. forecasts with α = 0.30, 0.50, and 0.70, weeks 1-10]

As you can see, the smaller the α, the smoother the curve for forecasted sales; the larger the α, the bumpier the curve, as the progression from .30 to .50 to .70 shows. Notice how much faster an α of .70 adjusts to the actual sales than the smaller alphas do. The forecasts for week 11 would be 217.2 with α = .30 and 213.0 with α = .70.
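The week 11 forecasts for the three alphas can be checked with a few lines of Python. This is a minimal sketch; the function name `next_forecast` is an assumption, and the first forecast is set equal to the first actual as elsewhere in this post.

```python
def next_forecast(sales, alpha):
    """Forecast for the period after the last observation."""
    forecast = sales[0]  # forecast for week 1 equals the week 1 actual
    for actual in sales:
        forecast = alpha * actual + (1 - alpha) * forecast
    return forecast


sales = [200, 215, 210, 220, 230, 220, 235, 215, 220, 210]

for alpha in (0.3, 0.5, 0.7):
    print(f"alpha = {alpha:.2f}: week 11 forecast = {next_forecast(sales, alpha):.1f}")
```

Because week 10 sales (210) fell below all three week 10 forecasts, the largest alpha pulls the week 11 forecast down the furthest.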

Which α is best?

As with moving average models, the Mean Absolute Deviation (MAD) can be used to determine which alpha best fits the data. The MADs for each alpha are computed below:

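| Week | Absolute deviation, α=.30 | Absolute deviation, α=.50 | Absolute deviation, α=.70 |
|---|---|---|---|
| 1 | | | |
| 2 | 15.0 | 15.0 | 15.0 |
| 3 | 5.5 | 2.5 | 0.5 |
| 4 | 13.9 | 11.3 | 9.8 |
| 5 | 19.7 | 15.6 | 13.0 |
| 6 | 3.8 | 2.2 | 6.1 |
| 7 | 17.7 | 13.9 | 13.2 |
| 8 | 7.6 | 13.0 | 16.1 |
| 9 | 0.4 | 1.5 | 0.2 |
| 10 | 10.2 | 10.8 | 9.9 |
| MAD | 9.4 | 8.6 | 8.4 |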

Using an alpha of 0.70, we end up with the lowest MAD of the three constants. Keep in mind that judging the dependability of forecasts isn’t always about minimizing MAD. MAD, after all, is an average of deviations. Notice how dramatically the absolute deviations for each of the alphas change from week to week. Forecasts might be more reliable using an alpha that produces a higher MAD, but has less variance among its individual deviations.
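The MAD comparison can be sketched in Python as follows. The function name `absolute_deviations` is an assumption. One detail worth flagging: the MAD figures in this post are reproduced when week 1 (whose deviation is zero because its forecast was set equal to its actual) is counted, so the sum is divided by all 10 weeks. The spread of the deviations is reported with the population standard deviation, which is just one possible way to quantify the "variance among individual deviations" mentioned above.

```python
from statistics import mean, pstdev


def absolute_deviations(sales, alpha):
    """Absolute forecast error for every week, including week 1 (zero)."""
    forecast = sales[0]  # forecast for week 1 equals the week 1 actual
    deviations = []
    for actual in sales:
        deviations.append(abs(actual - forecast))
        forecast = alpha * actual + (1 - alpha) * forecast
    return deviations


sales = [200, 215, 210, 220, 230, 220, 235, 215, 220, 210]

for alpha in (0.3, 0.5, 0.7):
    devs = absolute_deviations(sales, alpha)
    # mean() divides by all 10 weeks, matching the figures in this post.
    print(f"alpha = {alpha:.2f}: MAD = {mean(devs):.1f}, spread = {pstdev(devs):.1f}")
```

Dividing by 9 (weeks 2 through 10 only) would raise each MAD but leave the ranking of the three alphas unchanged.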

Limits on Exponential Smoothing

Exponential smoothing is not intended for long-term forecasting. Usually it is used to predict one or two, and rarely more than three, periods ahead. Also, if there is a sudden drastic change in the level of sales or values, and the time series continues at that new level, then the algorithm will be slow to catch up with the sudden change. Hence, there will be greater forecasting error. In situations like that, it would be best to ignore the periods before the change and begin the exponential smoothing process with the new level. Finally, this post discussed single exponential smoothing, which is used when there is no noticeable seasonality or trend in the data. When there is a noticeable trend or seasonal pattern in the data, single exponential smoothing will yield significant forecast error. Double exponential smoothing is needed to adjust for trend (seasonal patterns generally call for a further extension, often called triple exponential smoothing). We will cover double exponential smoothing in next week’s Forecast Friday post.
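The slow catch-up after a level shift is easy to see in a small Python sketch. Everything here is hypothetical: the series (ten periods at 100 followed by ten at 150), the alpha of 0.3, and the function name `ses_forecasts` are made up for illustration and are not data from this post.

```python
def ses_forecasts(series, alpha):
    """One-step-ahead forecasts; forecasts[i] is the forecast for series[i]."""
    forecasts = [series[0]]  # first forecast equals first actual
    for actual in series[:-1]:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


# Hypothetical series: level 100 for ten periods, then a jump to 150.
series = [100] * 10 + [150] * 10
alpha = 0.3  # illustrative choice

forecasts = ses_forecasts(series, alpha)
errors_after_shift = [a - f for a, f in zip(series[10:], forecasts[10:])]
print([round(e, 1) for e in errors_after_shift])

# Restarting the process at the new level, ignoring the earlier periods.
restarted = ses_forecasts(series[10:], alpha)
print([round(f, 1) for f in restarted[:3]])
```

In this setup the error shrinks by a factor of (1 - α) each period, so a small alpha leaves the forecast trailing the new level for many periods, whereas the restarted forecasts sit at the new level from the start.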

Still don’t know why our Forecast Friday posts appear on Thursday? Find out at: http://tinyurl.com/26cm6ma

 

What is the best smoothing constant?

α is the smoothing constant, a value from 0 to 1. When α is close to zero, the forecast responds more slowly to new observations, which means heavier smoothing. A common criterion is to choose the value of α that results in the smallest mean squared error (MSE).
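A simple way to apply the MSE criterion is to try a grid of candidate values. The sketch below is one possible approach, not a prescribed procedure: the function name `mean_squared_error`, the grid of 0.01 to 0.99 in steps of 0.01, and the reuse of the 10 weeks of sales from earlier in this post are all assumptions. Note that it fits and evaluates on the same ten points, so it measures in-sample fit only.

```python
def mean_squared_error(sales, alpha):
    """In-sample MSE of one-step-ahead forecasts (first forecast = first actual)."""
    forecast = sales[0]
    squared_errors = []
    for actual in sales:
        squared_errors.append((actual - forecast) ** 2)
        forecast = alpha * actual + (1 - alpha) * forecast
    return sum(squared_errors) / len(squared_errors)


sales = [200, 215, 210, 220, 230, 220, 235, 215, 220, 210]

# Coarse grid of candidate alphas (an assumption; finer grids or an
# optimizer could be used instead).
candidates = [i / 100 for i in range(1, 100)]
best_alpha = min(candidates, key=lambda a: mean_squared_error(sales, a))

print(f"best alpha on this grid: {best_alpha:.2f}")
print(f"MSE at best alpha: {mean_squared_error(sales, best_alpha):.1f}")
```

With so few observations the MSE curve can be fairly flat, so neighbouring alphas may perform almost as well as the grid minimum.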

How do you interpret exponential smoothing?

Forecasts produced using exponential smoothing methods are weighted averages of past observations, with the weights decaying exponentially as the observations get older. In other words, the more recent the observation the higher the associated weight.

What does a higher alpha mean in exponential smoothing?

Alpha is a numeric value between 0 and 1 that controls the calculation. A smaller value (closer to 0) creates a smoother, more slowly changing line, similar to a moving average with a large number of periods. A high value for alpha tracks the data more closely by giving more weight to recent data.

What does the smoothing constant α represent?

ALPHA is the smoothing parameter that defines the weighting and should be greater than 0 and less than 1. An ALPHA equal to 0 sets the current smoothed point to the previous smoothed value, and an ALPHA equal to 1 sets the current smoothed point to the current point (i.e., the smoothed series is the original series).
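The two boundary cases can be illustrated in the forecasting form used in this post, where the smoothed value serves as the next period's forecast. This is a minimal sketch; the function name `ses_forecasts` is an assumption, and the data are the 10 weeks of sales from earlier.

```python
def ses_forecasts(sales, alpha):
    """One-step-ahead forecasts; forecasts[i] is the forecast for sales[i]."""
    forecasts = [sales[0]]  # first forecast equals first actual
    for actual in sales[:-1]:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


sales = [200, 215, 210, 220, 230, 220, 235, 215, 220, 210]

# alpha = 1: no smoothing, each forecast is simply the previous week's actual.
print(ses_forecasts(sales, 1.0))

# alpha = 0: new data are ignored, every forecast stays at the initial value.
print(ses_forecasts(sales, 0.0))
```

Neither extreme is useful in practice, which is why alpha is normally kept strictly between 0 and 1.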