Moving Linear Regression

Linear regression is one of my favorite models. While some people tout these models' interpretability, I love linear models because of their flexibility and simplicity. Some linear models use multiple predictor variables; others use just one. The usual picture of regression looks like this:

Linear regression is loved by many

The time rate of change of a signal can be a useful predictor for many real-world phenomena. Time derivatives can indicate which part of a cycle the data comes from. Predictive models that use the derivative directly can behave erratically if the derivative estimate is not smooth. Good old linear regression, swept over windows of a time series, can produce a smooth estimate of the series' rate of change.

Simulated position of a bouncing ball with noise

A first instinct for estimating the derivative might be to divide the differences between consecutive terms in the series by the time step. When there is noise in the signal, this finite-difference estimate of the derivative is not very good, because differencing amplifies the noise.
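To make the amplification concrete, here is a minimal sketch using only the Python standard library. The sine-wave signal, the time step `dt`, and the noise level `sigma` are assumptions chosen for illustration; they are not the bouncing-ball data shown in the figures.

```python
import math
import random

def finite_difference(y, dt):
    """Forward first difference of y divided by the time step."""
    return [(y[i + 1] - y[i]) / dt for i in range(len(y) - 1)]

random.seed(0)
dt = 0.01      # assumed time step
sigma = 0.05   # assumed noise standard deviation

t = [i * dt for i in range(1000)]
clean = [math.sin(ti) for ti in t]                       # stand-in signal
noisy = [c + random.gauss(0.0, sigma) for c in clean]

d_clean = finite_difference(clean, dt)
d_noisy = finite_difference(noisy, dt)

# RMS error of the noisy derivative estimate relative to the clean one.
rms_error = math.sqrt(
    sum((a - b) ** 2 for a, b in zip(d_noisy, d_clean)) / len(d_clean)
)

# For independent noise, theory suggests an error near sigma * sqrt(2) / dt,
# which is far larger than the noise in the signal itself.
print(rms_error, sigma * math.sqrt(2) / dt)
```

Dividing by a small time step is what hurts here: the noise in each difference stays about the same size while the true change between neighboring samples shrinks.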

Bouncing ball position and first difference

One way to help with this is to smooth the signal itself before calculating the finite differences. For example, here is the first-difference derivative estimate of the exponentially weighted moving average of the data.
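A rough sketch of the smooth-then-difference idea follows. The recursive form of the exponentially weighted average, the smoothing factor `alpha`, and the sine-wave test signal are all assumptions made for illustration, not necessarily the settings behind the figure.

```python
import math
import random

def ewma(y, alpha):
    """Exponentially weighted moving average, seeded with the first sample."""
    smoothed = [y[0]]
    for value in y[1:]:
        smoothed.append(alpha * value + (1.0 - alpha) * smoothed[-1])
    return smoothed

def finite_difference(y, dt):
    """Forward first difference of y divided by the time step."""
    return [(y[i + 1] - y[i]) / dt for i in range(len(y) - 1)]

def rms_difference(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - z) ** 2 for x, z in zip(a, b)) / len(a))

random.seed(1)
dt = 0.01      # assumed time step
sigma = 0.05   # assumed noise standard deviation
alpha = 0.1    # assumed smoothing factor (smaller means smoother)

t = [i * dt for i in range(1000)]
clean = [math.sin(ti) for ti in t]
noisy = [c + random.gauss(0.0, sigma) for c in clean]

d_true = finite_difference(clean, dt)
err_raw = rms_difference(finite_difference(noisy, dt), d_true)
err_smooth = rms_difference(finite_difference(ewma(noisy, alpha), dt), d_true)

print(err_raw, err_smooth)
```

In this setup the smoothed estimate should come out much closer to the true derivative, at the cost of a small lag introduced by the averaging.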

Derivative of EWM smoothed data is smoother

Local linear regression provides a noise-robust way to estimate the derivative, and it has some advantages. The main one is simplicity: the derivative estimate is a linear function of the data in the window. It is a dot product of the data vector and a fixed weight vector.
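The weight vector follows from the ordinary least-squares slope formula: each weight is the centered time value divided by the sum of squared centered time values. Here is a small sketch, assuming evenly spaced samples; the helper names `slope_weights` and `dot` are mine, chosen for illustration.

```python
def slope_weights(n, dt):
    """Weights w such that dot(w, y) is the least-squares slope of a
    window of n evenly spaced samples with spacing dt (requires n >= 2)."""
    times = [i * dt for i in range(n)]
    t_mean = sum(times) / n
    centered = [ti - t_mean for ti in times]
    denom = sum(c * c for c in centered)
    return [c / denom for c in centered]

def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

# Check on a noise-free line y = 3 t + 2: the estimate should recover 3.
dt = 0.5
weights = slope_weights(5, dt)
window = [3.0 * (i * dt) + 2.0 for i in range(5)]
slope = dot(weights, window)
print(weights)
print(slope)
```

The weights sum to zero, so any constant offset in the data drops out of the slope estimate.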

Bouncing ball position and derivative estimates, two methods

This regression uses time values as its predictors, its X. The naive approach to moving linear regression would be to update the predictor values at each time step as the window moves over the time series. Fortunately, in linear regression with a constant term, the slope estimate is not affected by adding a constant to the predictor. Adding or subtracting a constant changes the estimate of the Y-intercept but does not affect the slope, so the same set of time offsets can be reused for every window.
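A quick numerical check of the shift invariance, as a sketch: the helper `fit_line` and the sample values are made up for illustration. The same data is fit once against absolute time values and once against window-local offsets starting at zero.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

y = [1.0, 2.1, 2.9, 4.2, 5.1]                      # made-up window of data
x_absolute = [100.0, 101.0, 102.0, 103.0, 104.0]   # absolute time stamps
x_local = [0.0, 1.0, 2.0, 3.0, 4.0]                # same window, shifted by 100

slope_abs, intercept_abs = fit_line(x_absolute, y)
slope_local, intercept_local = fit_line(x_local, y)

# The slopes agree; only the intercepts differ (by slope * shift).
print(slope_abs, slope_local)
print(intercept_abs, intercept_local)
```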

Whenever I need to estimate the slope of a noisy signal, I use windowed linear regression because it is fast, can be expressed as a convolution, and provides noise reduction. The width of the window controls the amount of noise reduction and can be tuned to the noise level of the data.
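As a sketch of the whole procedure, the function below slides one fixed set of regression weights along the series. It assumes evenly spaced samples; the name `moving_slope` and the test values are my own for illustration. The sliding dot product is a correlation, which is the same as a convolution with the weights reversed.

```python
import math
import random

def moving_slope(y, width, dt):
    """Least-squares slope in each length-`width` window of y.

    Output element i is the slope for y[i:i + width], so it lines up
    with the center of that window. Requires width >= 2.
    """
    half = (width - 1) / 2.0
    centered = [(k - half) * dt for k in range(width)]
    denom = sum(c * c for c in centered)
    weights = [c / denom for c in centered]
    # Sliding dot product (equivalent to convolving with reversed weights).
    return [
        sum(w * v for w, v in zip(weights, y[i:i + width]))
        for i in range(len(y) - width + 1)
    ]

def rms(values):
    """Root-mean-square of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

dt = 0.01  # assumed time step

# On a noise-free line y = 2 t + 1, every window should report a slope of 2.
line = [2.0 * (i * dt) + 1.0 for i in range(50)]
line_slopes = moving_slope(line, width=5, dt=dt)

# On pure noise the true slope is zero, so the RMS output measures how much
# noise leaks through. A wider window lets less through.
random.seed(2)
noise = [random.gauss(0.0, 0.05) for _ in range(2000)]
leak_narrow = rms(moving_slope(noise, width=5, dt=dt))
leak_wide = rms(moving_slope(noise, width=21, dt=dt))
print(leak_narrow, leak_wide)
```

Widening the window trades responsiveness for smoothness: a wide window suppresses more noise but also blurs rapid changes in the true slope, such as the bounces in the ball example.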

Dan Snyder

Data vis for my hobbies: vinyl records, plants, computers
