Regression analysis, as described in the previous post, can be used to quantify relationships between variables. However, data collection can be a problem if the regression model includes a large number of independent variables. When changes in a variable show discernable patterns over time, time-series analysis is an alternative method for forecasting future values.
The focus of time-series analysis is to identify the components of change in the data. Traditionally, these components are divided into four categories:
- Trends
- Seasonality
- Cyclical patterns
- Random fluctuations
A trend is a long-term increase or decrease in the variable. For example, the time series of population in India exhibits an upward trend, while the trend for endangered species, such as the tiger, is downward. The seasonal component represents changes that occur at regular intervals. A large increase in sales of umbrellas during the monsoon would be an example of seasonality.
Analysis of a time series may suggest that there are cyclical patterns, defined as sustained periods of high values followed by low values. Business cycles fit this category. Finally, the remaining variation in a variable that does not follow any discernable pattern is due to random fluctuations. Various methods can be used to determine trends, seasonality, and any cyclical patterns in time-series data. However, by definition, changes in the variable due to random factors are not predictable. The larger the random component of a time series, the less accurate the forecasts based on those data.
One of the most commonly used forecasting techniques is trend projection. As the name suggests, this approach is based on the assumption that there is an identifiable trend in a time series of data. Trend projection can also be used as the starting point for identifying seasonal and cyclical variations.
Table-3 is a time series of a firm’s quarterly sales over a three-year time span. These data are used to illustrate graphical and statistical trend projection and also to describe a method for making seasonal adjustments to a forecast.
Statistical Curve Fitting: This involves using the ordinary least-squares method developed earlier to estimate the parameters of a trend equation. Suppose that an analyst decides to make a forecast assuming a constant rate of change in sales from one period to the next. That is, the firm’s sales will change by the same amount between any two consecutive periods. The time-series data of Table-3 are to be used to estimate that rate of change.
Statistically, this involves estimating the parameters of the equation
S_t = S_0 + bt
where S denotes sales and t indicates the time period. The two parameters to be estimated are S_0 and b. The value of S_0 corresponds to the vertical intercept of the line, and the parameter b is the constant rate of change and corresponds to the slope. Many hand calculators can estimate the parameters of this equation. Specific procedures vary from model to model, but usually the only requirement is that the user input the data and push one or two designated keys. The machine then returns the estimated parameters. For the data of Table-3, the quarters would have to be entered as sequential numbers starting with 1. That is, 1996: I would be entered as 1, 1996: II would be entered as 2, and so forth. Based on the data from the table, the equation is estimated as
S_t = 281.394 + 12.811t
The interpretation of the equation is that the estimated constant rate of increase in sales per quarter is Rs. 12.811 lakhs. A forecast of sales for any future quarter, S_t, can be obtained by substituting in the appropriate value for t. For example, the third quarter of 1999 is the 15th observation of the time series. Thus, the estimated sales for that quarter would be 281.394 + 12.811(15), or Rs. 473.56 lakhs.
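The same least-squares fit can be sketched in Python with NumPy instead of a hand calculator. This is only an illustration: Table-3 is not reproduced here, so the fit is demonstrated on a synthetic series that lies exactly on a line, and the function names `fit_linear_trend` and `forecast_linear` are hypothetical. Only the coefficients 281.394 and 12.811 are taken from the text.

```python
import numpy as np

def fit_linear_trend(sales):
    """Estimate S_0 and b in S_t = S_0 + b*t by ordinary least squares.

    Observations are numbered t = 1, 2, ..., n, as described in the text.
    """
    t = np.arange(1, len(sales) + 1)
    b, s0 = np.polyfit(t, sales, deg=1)  # polyfit returns slope first, then intercept
    return s0, b

def forecast_linear(s0, b, t):
    """Forecast sales for period t from the fitted line."""
    return s0 + b * t

# Synthetic series (NOT the Table-3 data): exactly S_t = 100 + 5t for t = 1..12.
synthetic = [100 + 5 * t for t in range(1, 13)]
s0_hat, b_hat = fit_linear_trend(synthetic)

# Forecast for the 15th quarter using the coefficients reported in the text.
text_forecast = forecast_linear(281.394, 12.811, 15)
print(round(text_forecast, 2))  # 473.56
```

Numbering the periods from 1 matters: the intercept is the fitted value at t = 0, one period before the first observation.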
Now suppose that the individual responsible for the forecast wants to estimate a percentage rate of change in sales. That is, it is assumed that sales will increase by a constant percent each period. This relationship can be expressed mathematically as
S_t = S_{t-1}(1 + g)
S_{t-1} = S_{t-2}(1 + g)
where g is the constant percentage rate of change, or the growth rate. These two equations imply that
S_t = S_{t-2}(1 + g)^2
and, in general,
S_t = S_0(1 + g)^t
As written, the parameters of this equation cannot be estimated using ordinary least squares. The problem is that the equation is not linear. However, there is a simple transformation of the equation that allows it to be estimated using ordinary least squares.
Taking natural logarithms of both sides gives
ln S_t = ln[S_0(1 + g)^t]
But the logarithm of a product is just the sum of the logarithms. Thus
ln S_t = ln S_0 + ln[(1 + g)^t]
The right-hand side of the equation can be further simplified by noting that
ln[(1 + g)^t] = t[ln(1 + g)]
Therefore
ln S_t = ln S_0 + t[ln(1 + g)]
This equation is linear in form. This can be seen by making the following substitutions:
Y_t = ln S_t
Y_0 = ln S_0
b = ln(1 + g)
Thus the new equation is
Y_t = Y_0 + bt
which is linear.
The parameters of this equation can easily be estimated using a hand calculator. The key is to recognize that the sales data have been transformed into logarithms. Thus, instead of S_t, it is ln S_t that must be entered as data. However, note that the t values have not been transformed. Hence for the first quarter of 1996, the data to be entered are ln 300 = 5.704 and 1; for the second quarter, ln 305 = 5.720 and 2; and so forth. The transformed data are provided in Table-4.
Using the ordinary least-squares method, the estimated parameters of the equation based on the data from Table-4 are
Y_t = 5.6623 + 0.0353t
But these parameters are generated from the logarithms of the data. Thus, for interpretation in terms of the original data, they must be converted back based on the relationships ln S_0 = Y_0 = 5.6623 and ln(1 + g) = b = 0.0353. Taking the antilogs yields S_0 = 287.810 and 1 + g = 1.0359. Substituting these values for S_0 and 1 + g back into the original equation S_t = S_0(1 + g)^t gives
S_t = 287.810(1.0359)^t
where 287.810 is sales (in lakhs of rupees) in period 0 and the estimated growth rate, g, is 0.0359 or 3.59 per cent.
To forecast sales in a future quarter, the appropriate value of t is substituted into the equation. For example, predicted sales in the third quarter of 1999 (i.e., the fifteenth quarter) would be 287.810(1.0359)^15, or Rs. 488.51 lakhs.
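The log transformation and the conversion back to the original units can be sketched as follows. This is a minimal illustration, not the procedure used to produce Table-4: the series is synthetic (exactly 4 percent growth per period), and the names `fit_growth_trend` and `forecast_growth` are hypothetical. The values 5.6623, 0.0353, 287.810 and 0.0359 come from the text.

```python
import numpy as np

def fit_growth_trend(sales):
    """Estimate S_0 and g in S_t = S_0 * (1 + g)**t.

    Regresses ln(S_t) on t = 1..n by ordinary least squares, then takes
    antilogs of the intercept and slope.
    """
    t = np.arange(1, len(sales) + 1)
    b, y0 = np.polyfit(t, np.log(sales), deg=1)  # slope and intercept on the log scale
    s0 = np.exp(y0)       # S_0 = antilog of Y_0
    g = np.exp(b) - 1.0   # 1 + g = antilog of b
    return s0, g

def forecast_growth(s0, g, t):
    """Forecast sales for period t from the constant-growth equation."""
    return s0 * (1.0 + g) ** t

# Synthetic series (NOT the Table-3 data): S_0 = 200 growing 4 percent per period.
synthetic = [200.0 * 1.04 ** t for t in range(1, 13)]
s0_hat, g_hat = fit_growth_trend(synthetic)

# Converting the coefficients reported in the text back to the original units.
s0_text = np.exp(5.6623)        # about 287.81
g_text = np.exp(0.0353) - 1.0   # about 0.0359

# Forecast for the 15th quarter with the rounded values used in the text.
forecast_q15 = forecast_growth(287.810, 0.0359, 15)
```

Only the sales values are logged; the period numbers t enter the regression unchanged, as the text notes.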
Seasonal Variation in Time-Series Data
Seasonal fluctuations in time-series data are not uncommon. In particular, a large increase in sales for the fourth quarter is a characteristic of certain industries. Indeed, some retailing firms make large amounts of their total sales during the Diwali period. Other business activities have their own seasonal sales patterns. Electric companies serving hot, humid areas have distinct peak sales periods during the summer months because of the extensive use of air conditioning. Similarly, demand for accountants’ services increases in the first quarter as income tax deadlines approach.
A close examination of the data in Table-4 indicates that the quarterly sales increases are not uniformly distributed over the year. The increases from the first quarter to the second, and from the fourth quarter to the first, tend to be small, while the fourth-quarter increase is consistently larger than that of other quarters. That is, the data exhibit seasonal fluctuations.
Pronounced seasonal variations can cause serious errors in forecasts based on time-series data. For example, Table-4 indicates that actual sales for the fourth quarter of 1998 were Rs. 445 lakhs. But if the estimated equation is used to predict sales for that period (using the constant rate of change model), the predicted total is 281.394 + 12.811(12), or Rs. 435.13 lakhs. The large difference between actual and predicted sales occurs because the equation does not take into account the fourth-quarter sales jump. Rather, the predicted value from the equation represents an averaging of individual quarters. Thus, sales will be underestimated for the strong fourth quarter. Conversely, the predicting equation may overestimate sales for other quarters.
The accuracy of the forecast can be improved by seasonally adjusting the data. Probably the most common method of adjustment is the ratio-to-trend approach. Its use can be illustrated using the data from Table-4. Based on the predicting equation
S_t = 281.394 + 12.811t
actual and calculated fourth-quarter sales are shown in Table 6.5. The final column of the table is the ratio of actual to predicted sales for the fourth quarter. This ratio is a measure of the seasonal error in the forecast.
As shown, for the three-year period, average actual sales for the fourth quarter were 102 percent of the average forecasted sales for that quarter. The factor 1.02 can be used to adjust future fourth-quarter sales estimates. For example, if the objective is to predict sales for the fourth quarter of 1998, the predicting equation generates an estimate of Rs. 435.13 lakhs. Multiplying this number by the 1.020 adjustment factor, the forecast is increased to Rs. 443.8 lakhs, which is close to the actual sales of Rs. 445 lakhs for that quarter. A similar technique could be used to make a downward adjustment for predicted sales in other quarters.
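The ratio-to-trend adjustment can be sketched in a few lines. The fourth-quarter table is not reproduced here, so the factor is demonstrated on toy numbers, and the sketch assumes the factor is the simple average of the per-period ratios of actual to predicted sales. The function names are hypothetical; the trend coefficients and the 1.02 factor come from the text.

```python
def seasonal_factor(actual, predicted):
    """Average ratio of actual to trend-predicted values for one season.

    Assumption: the factor is the mean of the individual ratios.
    """
    ratios = [a / p for a, p in zip(actual, predicted)]
    return sum(ratios) / len(ratios)

def trend(t, s0=281.394, b=12.811):
    """Linear trend equation estimated in the text."""
    return s0 + b * t

# Toy numbers (NOT from the article's tables): actual sales run 2 percent above trend.
toy_factor = seasonal_factor(actual=[102.0, 204.0], predicted=[100.0, 200.0])

# Applying the 1.02 factor from the text to the trend forecast for
# the fourth quarter of 1998 (t = 12).
unadjusted = trend(12)          # about 435.13
adjusted = unadjusted * 1.02    # about 443.8
```

A factor below 1 computed the same way for a weak quarter would scale that quarter's trend forecast downward.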
Seasonal adjustment can improve forecasts based on trend projection. However, trend projection still has some shortcomings. One is that it is primarily limited to short-term predictions. If the trend is extrapolated much beyond the last data point, the accuracy of the forecast diminishes rapidly. Another limitation is that factors such as changes in relative prices and fluctuations in the rate of economic growth are not considered. Rather, the trend projection approach assumes that historical relationships will not change.
Trend projection is actually just regression analysis where the only independent variable is time. One characteristic of this method is that each observation has the same weight. That is, the effect of the initial data point on the estimated coefficients is just as great as the last data point. If there has been little or no change in the pattern over the entire time series, this is not a problem. However, in some cases, more recent observations will contain more accurate information about the future than those at the beginning of the series. For example, the sales history of the last three months may be more relevant in forecasting future sales than data for sales 10 years in the past.
Exponential smoothing is a technique of time-series forecasting that gives greater weight to more recent observations. The first step is to choose a smoothing constant, a, where 0 < a < 1.0. If there are n observations in a time series, the forecast for the next period (i.e., n + 1) is calculated as a weighted average of the observed value of the series at period n and the forecasted value for that same period. That is,
F_{n+1} = aX_n + (1 - a)F_n
where F_{n+1} is the forecast value for the next period, X_n is the observed value for the last observation, and F_n is the forecast for the last period in the time series. The forecasted values for F_n and all the earlier periods are calculated in the same manner. Specifically,
F_t = aX_{t-1} + (1 - a)F_{t-1}
starting with the second observation (i.e., t = 2) and going to the last (i.e., t = n). Note that this equation cannot be used to compute F_1 because there is no X_0 or F_0. This problem is usually solved by assuming that the forecast for the first period is equal to the observed value for that period. That is, F_1 = X_1. Substituting this into the equation gives F_2 = aX_1 + (1 - a)X_1 = X_1, so the second-period forecast is just the observed value for the first period.
The exponential smoothing constant chosen determines the weight that is given to different observations in the time series. As a approaches 1.0, more recent observations are given greater weight. For example, if a = 1.0, then (1- a) = 0 and the equations indicate that the forecast is determined only by the actual observation for the last period. In contrast, lower values for a give greater weight to observations from previous periods.
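To see how the smoothing constant distributes weight across observations, the recursion can be unrolled: the forecast for period n + 1 places weight a on X_n, a(1 - a) on X_{n-1}, a(1 - a)^2 on X_{n-2}, and so on, with X_1 picking up the remainder when F_1 = X_1. The short sketch below computes these weights; the function name `implied_weights` is hypothetical.

```python
def implied_weights(a, n):
    """Weights that F_{n+1} places on X_n, X_{n-1}, ..., X_1.

    Obtained by unrolling F_t = a*X_{t-1} + (1 - a)*F_{t-1} with F_1 = X_1.
    """
    weights = [a * (1 - a) ** k for k in range(n - 1)]  # X_n down to X_2
    weights.append((1 - a) ** (n - 1))                  # X_1 takes the remainder
    return weights

# Compare a low and a high smoothing constant over five observations.
for a in (0.2, 0.8):
    print(a, [round(w, 4) for w in implied_weights(a, 5)])
```

The weights always sum to 1, and they decline geometrically as the observations get older, which is where the term "exponential" comes from.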
Assume that a firm’s sales over the last 10 weeks are as shown in Table-6. By assumption, F_2 = F_1 = X_1. If a = 0.20, then
F_3 = 0.20(430) + 0.80(400) = 406.0 and
F_4 = 0.20(420) + 0.80(406) = 408.8
The forecasted values for four different values of a are provided in Table-6. The table also shows forecasted sales for the next period after the end of the time series data, or week 11. Using a = 0.20, the forecasted sales value for the 11th week is computed to be
F_{11} = 0.20(420) + 0.80(435.7) = 432.56
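The recursion is straightforward to express in code. The sketch below is one possible implementation, with the hypothetical function name `exponential_smoothing`. Table-6 is not reproduced here, so only the first three weekly values implied by the worked example (400, 430, 420) are used, together with the week-10 figures quoted for the week-11 forecast.

```python
def exponential_smoothing(x, a):
    """Return forecasts [F_1, ..., F_{n+1}] for observations x = [X_1, ..., X_n].

    F_1 is set equal to X_1; after that F_t = a*X_{t-1} + (1 - a)*F_{t-1}.
    """
    forecasts = [x[0]]  # F_1 = X_1 by assumption
    for observed in x:
        # Combine the latest observation with the forecast made for that same period.
        forecasts.append(a * observed + (1 - a) * forecasts[-1])
    return forecasts

# X_1, X_2, X_3 as implied by the worked example.
first_weeks = [400, 430, 420]
forecasts = exponential_smoothing(first_weeks, a=0.20)  # F_1, F_2, F_3, F_4

# Updating needs only the last observation and the last forecast:
# week 11 from X_10 = 420 and F_10 = 435.7.
f11 = 0.20 * 420 + 0.80 * 435.7
```

Because each forecast depends only on the previous forecast and the previous observation, the full history never has to be reprocessed when a new week arrives.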
Table-6 suggests why this method is referred to as a smoothing technique. Consider the forecasts based on a = 0.20. Note that the smoothed data show much less fluctuation than the original sales data. Note also that as a increases, the fluctuations in F_t increase, because the forecasts give more weight to the last observed value in the time series.
Choice of a Smoothing Constant
Any value of a could be used as the smoothing constant. One criterion for selecting this value might be the analyst’s intuitive judgment regarding the weight that should be given to more recent data points. But there is also an empirical basis for selecting the value of a. Remember that the coefficients of a regression equation are chosen to minimize the sum of squared deviations between observed and predicted values. This same method can be used to determine the smoothing constant.
The term (X_t - F_t)^2 is the square of the deviation between the actual time-series value and the forecast for the same period. Thus, by adding these values for each observation, the sum of the squared deviations can be computed as
\sum_{t=1}^{n} (X_t - F_t)^2
These results suggest that, of the four values of the smoothing constant, a = 0.60 provides the best forecasts using these data. However, it should be noted that there may be values of a between 0.60 and 0.80 or between 0.40 and 0.60 that yield even better results.
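Selecting the smoothing constant by minimizing the sum of squared deviations can be sketched as a simple search over candidate values. The function names are hypothetical, and the `toy_sales` series is made up for illustration (it is not the Table-6 data), so the constant it selects says nothing about the article's result.

```python
def smoothing_sse(x, a):
    """Sum of (X_t - F_t)^2 over t = 1..n, with F_1 = X_1."""
    forecast = x[0]
    sse = 0.0
    for observed in x:
        sse += (observed - forecast) ** 2
        forecast = a * observed + (1 - a) * forecast  # forecast for the next period
    return sse

def best_smoothing_constant(x, candidates):
    """Return the candidate with the smallest sum of squared deviations."""
    return min(candidates, key=lambda a: smoothing_sse(x, a))

# Made-up series, for illustration only.
toy_sales = [10, 12, 11, 13, 12, 14, 13, 15]
candidates = [0.2, 0.4, 0.6, 0.8]
best = best_smoothing_constant(toy_sales, candidates)

# A finer grid can check values between the original candidates.
fine_grid = [i / 100 for i in range(1, 100)]
best_fine = best_smoothing_constant(toy_sales, fine_grid)
```

The finer grid reflects the point made in the text: the best of four candidate values need not be the best value overall.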
Evaluation of Exponential Smoothing
One advantage of exponential smoothing is that it allows more recent data to be given greater weight in analyzing time-series data. Another is that, as additional observations become available, it is easy to update the forecasts. There is no need to re-estimate the equations, as would be required with trend projection.
The primary disadvantage of exponential smoothing is that it does not provide very accurate forecasts if there is a significant trend in the data. If the time trend is positive, forecasts based on exponential smoothing are likely to be too low, while a negative time trend will result in estimates that are too high. Simple exponential smoothing works best when there is no discernable time trend in the data. There are, however, more sophisticated forms of exponential smoothing that allow both trends and seasonality to be accounted for in making forecasts.
Barometric forecasting is based on the observed relationships between different economic indicators. It is used to give the decision maker an insight into the direction of likely future demand changes, although it cannot usually be used to quantify them.
Five different types of indicators may be used. Firstly, there are leading indicators which run in advance of changes in demand for a particular product. An example of these might be an increase in the number of building permits granted which would lead to an increase in demand for building-related products such as wood, concrete and so on. Secondly, there are coincident indicators which occur alongside changes in demand. Retail sales would fall into this category, as an increase in sales would generate an increase in demand for the manufacturers of the goods concerned. Thirdly, there are lagging indicators which run behind changes in demand. New industrial investment by firms is often said to fall into this category. In this case it is argued that firms will only invest in new production facilities when demand is already firmly established. Thus increased investment is a sign, or confirmation, that an initial increase in demand has already taken place. This may well indicate that the economy is improving, for example, so that further changes in the level of demand can be expected in the near future.
One particular problem with each of these three types of indicator is that single indicators do not always prove to be accurate in predicting changes in demand. For this reason, groups of indicators may be used instead. The fourth and fifth types of indicator fall into this category. These are composite indices and diffusion indices respectively. Composite indices are made up of weighted averages of several leading indicators which demonstrate an overall trend. Diffusion indices are groups of leading indicators whose directional shifts are analysed separately. If more than half of the leading indicators included within them are rising, demand is forecast to rise, and vice versa. Again, it is important to note that it is the direction of change that is the basis of the prediction; the actual size of the change cannot be measured. In addition, the situation is complicated by the fact that there may be variations in the length of the lead time between the various indicators. This means that the accuracy of predictions may be reduced.
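The two group indicators can be illustrated with a small sketch. Everything here is hypothetical: the function names, the indicator values and the weights are invented for illustration. The text does not say what to conclude when exactly half of the indicators are rising, so treating that case as "unclear" is an assumption of this sketch.

```python
def composite_index(values, weights):
    """Weighted average of several leading indicators."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def diffusion_index(changes):
    """Share of indicators whose latest change is positive (i.e., rising)."""
    rising = sum(1 for c in changes if c > 0)
    return rising / len(changes)

def diffusion_signal(changes):
    """Direction of the demand forecast implied by the diffusion index."""
    share = diffusion_index(changes)
    if share > 0.5:
        return "rise"
    if share < 0.5:
        return "fall"
    return "unclear"  # assumption: an even split gives no signal

# Invented period-on-period changes in five leading indicators.
changes = [1.2, -0.4, 0.3, 0.8, -0.1]
signal = diffusion_signal(changes)  # three of five are rising

# Invented indicator levels and weights for a composite index.
composite = composite_index([100.0, 110.0], weights=[1, 3])
```

As the text stresses, the signal gives only a direction; nothing in the diffusion index measures the size of the expected change.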
FORECASTING METHODS: REGRESSION MODELS
You have seen how regression analysis is used in the estimating process. In this part you will see several applications of multiple regression analysis to the forecasting process. In this section we shall forecast demand by using data for Big Sky Foods (BSF), a company selling groceries.
Using the OLS method of estimation available in Excel or any standard statistical package, the demand function we estimated was
Q = 15.939 – 9.057P + .009INC + 5.092PC
where Q = sales; P = BSF’s price; INC= income; PC = price charged by BSF’s major competitor. This model can be used to forecast sales, assuming that forecasts of the independent variables are available.
Big Sky Foods has access to forecasts from one of the macro-econometric service firms that provide a good estimate of the income variable by quarter for one year ahead. In addition, BSF has had reasonable success using a simple exponential smoothing model (with w = .8) to predict the competitor’s price one quarter in advance. And, of course, BSF controls its own price.
Assume that BSF plans to price at 5.85 next quarter, that the competitor’s price is forecast to be 4.99, and that income is forecast to be 4800. Sales for BSF can then be forecast as follows:
Q = 15.939 – 9.057(5.85) + .009(4800) + 5.092(4.99)
Q = 31.565
Notice that, in making this forecast, BSF starts with an economic forecast that provides a projection for income and an exponential smoothing model that provides a projected value for the competitor’s price. These are then combined with the multiple regression model of demand and BSF’s own pricing plan to arrive at a forecast for sales. BSF can then use this procedure to experiment with the effect of different prices or to make forecasts based on differing forecasts of the other independent variables.
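The forecast and the price experiments described above can be sketched as a small function. The coefficients and the input values come from the text; the function name `forecast_sales` and the alternative prices 5.65 and 6.05 are hypothetical choices made for illustration.

```python
def forecast_sales(price, income, competitor_price):
    """Demand equation estimated in the text for Big Sky Foods."""
    return 15.939 - 9.057 * price + 0.009 * income + 5.092 * competitor_price

# Forecast for next quarter with the values given in the text.
base = forecast_sales(price=5.85, income=4800, competitor_price=4.99)
print(round(base, 3))  # 31.565

# What-if analysis: vary BSF's own price, holding the other forecasts fixed.
scenarios = {p: forecast_sales(p, 4800, 4.99) for p in (5.65, 5.85, 6.05)}
for p, q in scenarios.items():
    print(p, round(q, 3))
```

Because the model is linear, each 0.20 change in BSF's price shifts forecast sales by 9.057 × 0.20, or about 1.81 units, in the opposite direction.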