# Least-Squares Regression Method

Least-squares linear regression is a statistical technique that can be used to estimate the total cost at a given level of activity (units, labor hours, machine hours, etc.) based on past cost data. It fits a straight cost line through a scatter chart of activity and total-cost pairs in such a way that the sum of the squared vertical distances between the plotted points and the line is minimized. This minimization is what the term least squares refers to.

Assuming that cost is plotted along the y-axis and activity level along the x-axis, the cost line can be written as the following equation:

y = a + bx

In the above equation, *a* is the y-intercept of the line and approximates the total fixed cost, which is the same at every level of activity, while *b* is the slope of the line and equals the average variable cost per unit of activity.
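To make the fitting criterion concrete, the quantity being minimized can be written out as below. The indexed notation (*x<sub>i</sub>*, *y<sub>i</sub>* for the *i*-th activity and cost pair, and *S* for the sum of squares) is introduced here only for illustration and is not used elsewhere in this article:

$$ S(a,b)=\sum_{i=1}^{n}\left(y_i-\left(a+bx_i\right)\right)^2 $$

Each term is the squared vertical distance between an observed cost and the cost the line predicts at the same activity level.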

## Formulas

By using mathematical techniques beyond the scope of this article, the following formulas to calculate *a* and *b* may be derived:

$$ Unit\ Variable\ Cost\ (b)=\frac{n\sum xy-\left(\sum x\right)\left(\sum y\right)}{n\sum x^2-\left(\sum x\right)^2} $$

$$ Total\ Fixed\ Cost\ (a)=\frac{\sum y-b\sum x}{n} $$

Where,

*n* is the number of (units, total cost) pairs used in the calculation;

*Σy* is the sum of total costs of all data pairs;

*Σx* is the sum of units of all data pairs;

*Σxy* is the sum of the products of cost and units of all data pairs; and

*Σx²* is the sum of squares of units of all data pairs.
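The two formulas translate directly into code. The following is a minimal sketch rather than a definitive implementation; the function name `fit_cost_line` and its parameter names are hypothetical and chosen only for this illustration.

```python
from typing import Sequence, Tuple


def fit_cost_line(units: Sequence[float], costs: Sequence[float]) -> Tuple[float, float]:
    """Return (a, b): estimated total fixed cost and unit variable cost."""
    n = len(units)
    if n != len(costs) or n < 2:
        raise ValueError("need at least two (units, cost) pairs of equal length")

    sum_x = sum(units)                                  # Σx
    sum_y = sum(costs)                                  # Σy
    sum_xy = sum(x * y for x, y in zip(units, costs))   # Σxy
    sum_x2 = sum(x * x for x in units)                  # Σx²

    denominator = n * sum_x2 - sum_x ** 2
    if denominator == 0:
        # All activity levels are identical, so no slope can be fitted.
        raise ValueError("activity levels must not all be equal")

    b = (n * sum_xy - sum_x * sum_y) / denominator      # unit variable cost
    a = (sum_y - b * sum_x) / n                         # total fixed cost
    return a, b


# Toy check: three points lying exactly on y = 3 + 2x.
a, b = fit_cost_line([1, 2, 3], [5, 7, 9])
print(a, b)  # 3.0 2.0
```

The guard on the denominator is an addition of this sketch: the formula for *b* is undefined when every data pair has the same number of units.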

The following example, which uses the same data as the high-low method example, illustrates how the least-squares linear regression method splits a mixed cost into its fixed and variable components:

## Example

Based on the following data of number of units produced and the corresponding total cost, estimate the total cost of producing 4,000 units. Use the least-squares linear regression method.

Month | Units | Cost ($) |
---|---|---|
1 | 1,520 | 36,375 |
2 | 1,250 | 38,000 |
3 | 1,750 | 41,750 |
4 | 1,600 | 42,360 |
5 | 2,350 | 55,080 |
6 | 2,100 | 48,100 |
7 | 3,000 | 59,000 |
8 | 2,750 | 56,800 |

**Solution:**

x (units) | y (cost, $) | x² | xy |
---|---|---|---|
1,520 | 36,375 | 2,310,400 | 55,290,000 |
1,250 | 38,000 | 1,562,500 | 47,500,000 |
1,750 | 41,750 | 3,062,500 | 73,062,500 |
1,600 | 42,360 | 2,560,000 | 67,776,000 |
2,350 | 55,080 | 5,522,500 | 129,438,000 |
2,100 | 48,100 | 4,410,000 | 101,010,000 |
3,000 | 59,000 | 9,000,000 | 177,000,000 |
2,750 | 56,800 | 7,562,500 | 156,200,000 |
**Σx = 16,320** | **Σy = 377,465** | **Σx² = 35,990,400** | **Σxy = 807,276,500** |

We have,

n = 8;

Σx = 16,320;

Σy = 377,465;

Σx² = 35,990,400; and

Σxy = 807,276,500
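These totals can be reproduced with a few lines of code. This is a minimal sketch; the variable names are illustrative and not part of the original example.

```python
# Monthly data from the example: units produced and total cost ($).
units = [1520, 1250, 1750, 1600, 2350, 2100, 3000, 2750]
costs = [36375, 38000, 41750, 42360, 55080, 48100, 59000, 56800]

n = len(units)
sum_x = sum(units)                                  # Σx
sum_y = sum(costs)                                  # Σy
sum_x2 = sum(x * x for x in units)                  # Σx²
sum_xy = sum(x * y for x, y in zip(units, costs))   # Σxy

print(n, sum_x, sum_y, sum_x2, sum_xy)
# prints: 8 16320 377465 35990400 807276500
```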

Calculating the average variable cost per unit:

$$ b=\frac{8 \times 807,276,500 - 16,320 \times 377,465}{8 \times 35,990,400 - 16,320^2} = \frac{297,983,200}{21,580,800} \approx 13.8078 \approx 13.8 $$

Calculating the approximate total fixed cost:

$$ a=\frac{377,465 - 13.8078 \times 16,320}{8} \approx 19,015 $$
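The same arithmetic can be checked in code by substituting the totals into the two formulas. This is a sketch under the assumption that the sums are entered by hand; the variable names are illustrative.

```python
# Totals taken from the solution table.
n = 8
sum_x, sum_y = 16_320, 377_465
sum_x2, sum_xy = 35_990_400, 807_276_500

# Unit variable cost (slope), then total fixed cost (intercept).
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n

print(f"b = {b:.4f}")   # b = 13.8078
print(f"a = {a:,.0f}")  # a = 19,015
```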

The cost-volume formula now becomes:

y = 19,015 + 13.8x

At an activity level of 4,000 units, the estimated total cost is $74,215 [= 19,015 + 13.8 × 4,000].
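The estimate depends slightly on how far the slope is rounded before it is multiplied by 4,000. The sketch below compares the figure obtained with the rounded coefficients against one computed from the unrounded coefficients; the helper `estimate_total_cost` is a hypothetical name used only here.

```python
def estimate_total_cost(units: float, fixed_cost: float, unit_variable_cost: float) -> float:
    """Evaluate the cost line y = a + b*x at the given activity level."""
    return fixed_cost + unit_variable_cost * units


# Estimate with the rounded coefficients used in the cost-volume formula.
rounded_estimate = estimate_total_cost(4000, 19_015, 13.8)

# Estimate with unrounded coefficients recomputed from the table totals.
n = 8
sum_x, sum_y = 16_320, 377_465
sum_x2, sum_xy = 35_990_400, 807_276_500
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
a = (sum_y - b * sum_x) / n
unrounded_estimate = estimate_total_cost(4000, a, b)

print(f"{rounded_estimate:,.0f}")    # 74,215
print(f"{unrounded_estimate:,.0f}")  # 74,246
```

The gap of about $31 comes entirely from rounding the slope to 13.8 before multiplying by 4,000 units.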

Written by Irfanullah Jan and last revised on