# REGRESSION ANALYSIS

A manager often needs to determine the total cost of producing various levels of output. The relation between total cost (C) and quantity (Q) is:

**C = a + bQ + cQ^{2} + dQ^{3}**

where a, b, c and d are the parameters of the equation. Parameters are coefficients in an equation that determine the exact mathematical relation among the variables in the equation. If the numerical values of the parameters are determined, the manager knows the quantitative relation between output and total cost. If the values of the parameters of the cost equation are calculated to be a = 1262, b = 1, c = –0.03 and d = 0.005, the equation becomes:

**C = 1262 + Q – 0.03Q^{2} + 0.005Q^{3}**

This equation can be used to compute the total cost of producing various levels of output. If, for example, the manager wishes to produce 30 units of output, the total cost can be calculated as

**C = 1262 + 30 – 0.03(30)^{2} + 0.005(30)^{3} = Rs. 1400**

Thus, in order for the cost function to be useful for decision making, the manager must know the numerical value of the parameters.
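The substitution above can be checked with a few lines of Python. This is a minimal sketch: the function name `total_cost` is chosen here for illustration, and the default parameter values are the ones quoted in the text.

```python
def total_cost(q, a=1262.0, b=1.0, c=-0.03, d=0.005):
    """Cubic total cost C = a + bQ + cQ^2 + dQ^3 with the text's parameter values."""
    return a + b * q + c * q**2 + d * q**3


# Total cost of producing 30 units, as in the worked example
cost_at_30 = total_cost(30)
print(f"Total cost of 30 units: Rs. {cost_at_30:.2f}")
```

Changing the argument gives the cost at any other output level, which is exactly why the manager needs numerical values for a, b, c and d.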

The values of the parameters are often obtained by using a technique called regression analysis. It determines the mathematical relation between a dependent variable and one or more explanatory variables.

**Dependent variable:** The variable whose variation is to be explained.

**Explanatory variables:** The variables that are thought to cause the dependent variable to take on different values.

In the simple regression model, the dependent variable Y is related to only one explanatory variable X, and the relation between X and Y is linear:

**Y = a + bX**

**Figure-1 : Relation between Sales and Advertising expenditure**

This is the equation for a straight line, with X plotted along the horizontal axis and Y along the vertical axis. The parameter a is called the intercept parameter because it gives the value of Y at the point where the regression line crosses the Y-axis (X is equal to zero at this point). The parameter b is called the slope parameter because it gives the slope of the regression line. The slope of a line measures the rate of change in Y as X changes (ΔY/ΔX); it is therefore the change in Y per unit change in X.

**Intercept parameter:** The parameter that gives the value of Y at the point where the regression line crosses the Y-axis.

**Slope parameter:** The slope of the regression line, b = ΔY/ΔX, or the change in Y associated with a one-unit change in X.

Because Y and X are linearly related in this regression model, the effect of a change in X on the value of Y is constant: a one-unit change in X causes Y to change by b units.
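The text does not say how a and b are estimated. A common choice is ordinary least squares, which picks the line that minimises the sum of squared vertical distances between the data and the line. The sketch below assumes that method; the function name `fit_simple_regression` and the five observations are hypothetical and are not taken from the text.

```python
def fit_simple_regression(xs, ys):
    """Ordinary least-squares estimates of a and b in Y = a + bX."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of X and Y divided by the variation of X
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("X must vary across observations")
    b = sxy / sxx
    # Intercept: the fitted line passes through the point of means
    a = mean_y - b * mean_x
    return a, b


# Hypothetical observations, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [5.1, 6.9, 9.2, 10.8, 13.0]
a_hat, b_hat = fit_simple_regression(xs, ys)
print(f"intercept a = {a_hat:.2f}, slope b = {b_hat:.2f}")
```

For these made-up numbers the fitted slope is close to 2, so each one-unit increase in X is associated with roughly a two-unit increase in Y.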

The figure shows the true relation between sales (S) and advertising expenditure (A). If a firm chooses to spend nothing on advertising, its sales are expected to be 100 crores per month. If the firm spends 30 crores on advertising, it can expect sales of 250 crores (= 100 + 5 × 30). Here ΔS/ΔA = 5; that is, for every one-unit increase in advertising, the firm can expect a five-unit increase in sales.

Regression involves identifying and calculating specific relationships between the independent variables and the dependent variable. It involves a number of stages, which are described in another section.
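The sales figures quoted for Figure-1 follow from the line S = 100 + 5A. A short sketch, with the hypothetical function name `expected_sales`, reproduces them and confirms the slope:

```python
def expected_sales(advertising, intercept=100.0, slope=5.0):
    """Expected sales (crores per month) from S = 100 + 5A, the line in Figure-1."""
    return intercept + slope * advertising


sales_without_ads = expected_sales(0)    # intercept: sales when A = 0
sales_with_ads = expected_sales(30)      # sales when A = 30 crores
# Slope recovered as the change in S divided by the change in A
slope_check = (sales_with_ads - sales_without_ads) / (30 - 0)
print(sales_without_ads, sales_with_ads, slope_check)
```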

**SPECIFYING THE REGRESSION EQUATION**

The first thing that the organisation carrying out the regression analysis needs to do is to determine the range of variables which may affect demand for the product concerned. For example, the own price of a good might reasonably be expected to be a determinant of demand for most products, as would any advertising being done by the firm. The question of whether there are any substitute or complementary goods which need to be taken into account could then be raised. In the case of an expensive consumer durable good, the cost and availability of credit might be a consideration. Any special ‘other’ factors affecting demand could then be identified, and so on. This choice of variables has to be made before it is possible to progress to the next stage.

**Data Collection**

Once the relevant variables have been identified, quantitative data need to be assembled for each of them. This will be easier for some of the variables than for others. In dealing with an established product, for example, the firm might reasonably be expected to have access to a range of information regarding the variables which it controls, such as own price and advertising. What may be more difficult to obtain, however, is information about competitors’ products. On this front, price data can be obtained by observing retail prices, as this information is by definition in the public domain and cannot be hidden. This requires continued market observation, perhaps over a long period of time. Likewise, information about product design changes can be obtained by buying the competitors’ product(s), but this may be expensive if there are many on the market. Confidential, commercially sensitive information, such as actual advertising expenditure by competitors and their proposed new products, presents much more difficult problems of access and may have to be left out of the process altogether.

Data on levels of disposable income, population variables, interest rates and credit availability are easier to obtain, for example from government statistics, but other variables are more problematic. How can things like expectations and tastes be measured, for instance? In these cases the available data, perhaps resulting from market surveys, may be qualitative rather than quantitative. Some means of conversion needs to be found if they are to be included in the regression analysis at all. These are the things which the decision maker needs to keep in mind while collecting and selecting data on the relevant variables.

Once the first two steps have been completed, the next stage is to specify the likely form of the regression equation. There are two main forms which are used in practice: the linear demand function and the non-linear or power function. Both treat the demand for the product as the dependent variable, while the independent variables are those which have previously been identified as having an effect on demand. If, for example, the firm had decided that the only variables affecting demand for a particular product were its own price and advertising levels, then the linear demand function would be written as:

**Q = a + bP + cA**

Alternatively, under these conditions the exponential (power) function would be written as:

**Q = aP^{b}A^{c}**

In the linear form, the ‘a’ term represents **the intercept** of the line drawn from the equation with the vertical axis; in the power form it is a constant multiplier. The b and c terms represent the **regression coefficients** with respect to own price and advertising respectively. These show the impact of each of these variables on product demand. Once they have been estimated, it is possible to predict the level of demand for any set of values of the independent variables simply by substituting them into the equation. The exponential form of the equation has the advantage that it can be rewritten to give direct estimates of the **respective elasticities of demand** for the independent variables. This is done by taking the log-linear form of the equation, which in this case would be:

**log Q = log a + b log P + c log A**

where b and c are the own price and advertising elasticities of demand respectively. This is a much easier approach than calculating elasticities through use of the linear form, which involves using the equation:

**E_{D} = b × (P / Q)**

to calculate the elasticities in each case (with c and A in place of b and P for the advertising elasticity). Here values of P and Q need to be obtained from the data set. Usually average values are substituted into the above equation to estimate elasticities.
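The two routes to elasticities can be compared in a short sketch. It assumes ordinary least squares as the estimation method, solved through the normal equations in plain Python. It also uses the standard point-elasticity expression for a linear demand function (coefficient × mean of the variable ÷ mean quantity). The helper names (`solve_linear_system`, `ols`, `mean`) and the data are hypothetical; the data are generated from a power function with price elasticity –1.5 and advertising elasticity 0.4, purely for illustration.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear; system is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (aug[i][n] - known) / aug[i][i]
    return solution


def ols(rows, y):
    """Least-squares coefficients, obtained from the normal equations (X'X)beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def mean(values):
    return sum(values) / len(values)


# Hypothetical data generated exactly from Q = 200 * P^-1.5 * A^0.4
prices = [2.0, 3.0, 4.0, 5.0, 6.0, 8.0]
adverts = [10.0, 40.0, 15.0, 30.0, 20.0, 25.0]
quantities = [200.0 * p ** -1.5 * ad ** 0.4 for p, ad in zip(prices, adverts)]

# Route 1: log-linear form; the coefficients are the elasticities themselves
log_rows = [[1.0, math.log(p), math.log(ad)] for p, ad in zip(prices, adverts)]
log_a, b_log, c_log = ols(log_rows, [math.log(q) for q in quantities])

# Route 2: linear form; convert coefficients to elasticities at the sample means
lin_rows = [[1.0, p, ad] for p, ad in zip(prices, adverts)]
a_lin, b_lin, c_lin = ols(lin_rows, quantities)
price_elasticity_linear = b_lin * mean(prices) / mean(quantities)
advert_elasticity_linear = c_lin * mean(adverts) / mean(quantities)

print(f"log-linear form: price {b_log:.3f}, advertising {c_log:.3f}")
print(f"linear form at means: price {price_elasticity_linear:.3f}, "
      f"advertising {advert_elasticity_linear:.3f}")
```

Because the data were built from a power function, the log-linear route recovers –1.5 and 0.4 up to floating-point error, while the linear route gives only an approximation that depends on the point at which it is evaluated.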

Which of the two forms of equation is chosen depends upon the expected relationship between the variables being included. In practice, however, the actual relationship between them may not be known in advance. In this case, the decision maker may experiment with both forms of equation in order to find the one which most closely fits the data.
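One way to carry out this experiment is to fit both forms to the same data and compare how closely each reproduces the observed quantities. The sketch below makes several simplifying assumptions that are not in the text: price is the only explanatory variable, both forms are fitted by ordinary least squares, and fit is judged by the sum of squared errors measured in units of Q. The observations are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - slope * mean_x, slope


def sum_squared_errors(actual, predicted):
    return sum((q - q_hat) ** 2 for q, q_hat in zip(actual, predicted))


# Hypothetical price and quantity observations, for illustration only
prices = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
quantities = [98.0, 52.0, 33.0, 24.0, 21.0, 16.0]

# Linear form: Q = a + bP
a_lin, b_lin = fit_line(prices, quantities)
linear_pred = [a_lin + b_lin * p for p in prices]

# Power form: fit log Q = log a + b log P, then convert predictions back to Q
log_a, b_pow = fit_line([math.log(p) for p in prices],
                        [math.log(q) for q in quantities])
power_pred = [math.exp(log_a) * p ** b_pow for p in prices]

sse_linear = sum_squared_errors(quantities, linear_pred)
sse_power = sum_squared_errors(quantities, power_pred)
better_form = "power" if sse_power < sse_linear else "linear"
print(f"SSE linear = {sse_linear:.1f}, SSE power = {sse_power:.1f}, "
      f"closer fit: {better_form}")
```

For these made-up observations quantity falls steeply at low prices and flattens out at high prices, so the power form fits more closely. With other data the linear form may well come out ahead, which is the point of trying both.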