Sunday, June 9, 2019

A Simple Derivation of the Factors in the Least Squares Fit Error Formulas


The most difficult part of finding expressions for the expected values of the slope and intercept, as functions of the random errors involved in least squares fits, is getting a handle on the problem. One can start by writing the x value as the sum of an expected value, ⟨x⟩, a displacement along the x-axis, Δx, and a random error, δx, and doing the same for the y value. To simplify the derivation we start with two variables, u and v, instead of x and y, combine the expressions to get an expression for the product, and then determine the expected values of the terms. The result simplifies considerably since the expected values of the individual differences are zero, as are the expected values of products of uncorrelated differences.
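Written out, the decomposition and the resulting product formula are as follows (a LaTeX sketch of the steps described above; it assumes the displacements and random errors all have zero expected value and that the random errors are uncorrelated with the displacements and with each other):

u = \langle u \rangle + \Delta u + \delta u, \qquad v = \langle v \rangle + \Delta v + \delta v

uv = \langle u \rangle \langle v \rangle + \langle u \rangle (\Delta v + \delta v) + \langle v \rangle (\Delta u + \delta u) + \Delta u \, \Delta v + \Delta u \, \delta v + \delta u \, \Delta v + \delta u \, \delta v

\langle uv \rangle - \langle u \rangle \langle v \rangle = \langle \Delta u \, \Delta v \rangle + \langle \delta u \, \delta v \rangle

Taking the expected value of the product kills every term containing a single zero-mean difference, leaving only the two correlated product terms.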


Setting both u and v equal to x gives the formula for the variance of x as a function of the variance of the errors in x, and similarly for y. When u = x and v = y we have to be a little more careful, since Δx and Δy are correlated through their dependence on the equation of the line. As the "diagnostic" showed, the numerator of the ordinary least squares formula for the slope is essentially a constant, with some error due to the expected values of the product terms not being exactly zero. Evaluating this factor using the estimated expected values and the exact values gives results that agree fairly well, as the comparison below shows.
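The two special cases, written out under the same assumptions (here s is the slope of the line, \sigma_{x0}^2 = \langle \Delta x^2 \rangle is the variance of the true x displacements, and \Delta y = s \, \Delta x expresses the correlation through the equation of the line):

\langle x^2 \rangle - \langle x \rangle^2 = \langle \Delta x^2 \rangle + \langle \delta x^2 \rangle = \sigma_{x0}^2 + \sigma_{\delta x}^2

\langle xy \rangle - \langle x \rangle \langle y \rangle = \langle \Delta x \, \Delta y \rangle + \langle \delta x \, \delta y \rangle = s \, \sigma_{x0}^2 + \langle \delta x \, \delta y \rangle \approx s \, \sigma_{x0}^2

The last approximation drops \langle \delta x \, \delta y \rangle, which vanishes when the random errors in x and y are uncorrelated.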

⟨xy⟩ − ⟨x⟩⟨y⟩ = 0.1797 vs. sσx0² = 0.1812.
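A short numerical check of this kind of comparison (a sketch with made-up parameters rather than the data behind the numbers above; the true line, the point spacing, and the error sizes are all assumptions):

import numpy as np

rng = np.random.default_rng(1)
n = 100
s, b = 0.5, 1.0                                    # assumed slope and intercept
dx = np.linspace(-1.0, 1.0, n)                     # displacements Δx about ⟨x⟩
x = 2.0 + dx + rng.normal(0.0, 0.05, n)            # x = ⟨x⟩ + Δx + δx
y = b + s * (2.0 + dx) + rng.normal(0.0, 0.05, n)  # y on the true line, plus δy

lhs = (x * y).mean() - x.mean() * y.mean()  # ⟨xy⟩ − ⟨x⟩⟨y⟩
rhs = s * dx.var()                          # s·σx0²
print(lhs, rhs)

The two printed values differ only by the product terms whose expected values are not exactly zero in a finite sample.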

With errors present the expected values of the errors in the slope and intercept of the fits are not zero, contrary to what one might expect, but their deviation from zero appears to be negligible for small random errors.
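A Monte Carlo sketch of that claim, again with illustrative parameters: repeating the fit many times with fresh random errors and averaging the fitted slope and intercept estimates their expected errors.

import numpy as np

rng = np.random.default_rng(2)
n, trials = 100, 20000
s, b = 0.5, 1.0                        # true slope and intercept (assumed)
x0 = 2.0 + np.linspace(-1.0, 1.0, n)   # error-free x values

slopes = np.empty(trials)
intercepts = np.empty(trials)
for k in range(trials):
    x = x0 + rng.normal(0.0, 0.05, n)          # measured x with error δx
    y = b + s * x0 + rng.normal(0.0, 0.05, n)  # measured y with error δy
    slopes[k], intercepts[k] = np.polyfit(x, y, 1)  # ordinary least squares

print(slopes.mean() - s)      # expected slope error: small but nonzero
print(intercepts.mean() - b)  # expected intercept error

The nonzero averages come from the random errors in x, which ordinary least squares does not account for; shrinking the error sizes shrinks the deviations.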

Supplemental (Jun 12): Note that a systematic error in a linear fit might raise some doubts about the validity of the Gauss-Markov theorem, whose proof assumes zero-mean errors and error-free x values.
