CHD's Memory: Least Squares Fit

Sunday, January 24, 2016

Least Squares Fit

I could have sworn that I'd created a post about the process of understanding how Tom used least squares fit to determine the slope of a set of data points. But I can't find it, so here is where I will go over the process in order.

First off, the concept of finding a least-squares fit can be understood simply as finding a linear function that most closely fits a set of points in (x,y) where for all (x,y), the sum of the squares of the differences of y=f(x) to the corresponding point (which is to say the errors) is a minimum.

This turns out to be something natural to solve using matrices, because the set of errors which is y - (mx + b) for all y's and x's turns out to be just a set of matrices:

[y0] - (m[x0] + b) = [e0]
[y1] - (m[x1] + b) = [e1]
. . . .
. . . .
[yn] - (m[xn] + b) = [en]

Some more links:

An example of a C program to compute standard deviation: http://alstatr.blogspot.com/2013/09/cc-variance-and-standard-deviation.html

An example of C code for computing the mean (as well as median and mode): http://stackoverflow.com/questions/20636914/c-mean-median-and-mode

A fairly obtuse presentation of least-squares-fitting basic concepts: http://mathworld.wolfram.com/LeastSquaresFitting.html