Least Squares Moving Average (LSMA)

Haiyue

I. What is LSMA

The Least Squares Moving Average (LSMA) is a moving average based on linear regression, also known as the Linear Regression Value or End Point Moving Average. Within a sliding window, it fits a best-fit straight line to the price data using the least squares method, then takes the value of that line at the rightmost point (most recent time) as the indicator output.

Historical Background

The mathematical foundation of LSMA, the least squares method, was independently discovered by Carl Friedrich Gauss and Adrien-Marie Legendre in the early 19th century. The application of linear regression to financial time series matured in the second half of the 20th century. LSMA became practical as a technical indicator after computers became widely available, since a naive implementation refits a complete linear regression at every time point, which is more demanding than computing a simple average.

Indicator Classification

  • Type: Overlay indicator, plotted on the price chart
  • Category: Moving Averages
  • Default Parameters: Period $n = 20$, data source is closing price (Close)

The Essence of LSMA

LSMA is not an “average” in the conventional sense, but rather the endpoint value of a linear regression. It assumes that prices within the window follow a linear trend and outputs the best estimate of that trend at the current moment. This gives LSMA a natural forward-looking quality in trending markets.


II. Mathematical Principles and Calculation

Linear Regression Basics

Within the window $[t-n+1, t]$, LSMA fits a linear model to the price series:

$$\hat{P}_i = a + b \cdot x_i$$

Where $x_i = i$ is the time index within the window, $i = 0, 1, \ldots, n-1$, and $P_i$ is the price at index $i$ (index $0$ is the oldest bar).

Using the least squares method, the slope $b$ and intercept $a$ are:

$$b = \frac{\sum_{i=0}^{n-1} (x_i - \bar{x})(P_i - \bar{P})}{\sum_{i=0}^{n-1} (x_i - \bar{x})^2}$$

$$a = \bar{P} - b \cdot \bar{x}$$

Where $\bar{x} = \frac{n-1}{2}$ and $\bar{P} = \frac{1}{n}\sum_{i=0}^{n-1} P_i$.

LSMA Value

LSMA takes the fitted value at the rightmost endpoint of the window ($x = n - 1$):

$$LSMA_t = a + b \cdot (n - 1)$$
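As a small worked example of the formulas above, the following sketch computes $b$, $a$ and the endpoint value for a single window in plain Python. The function name `lsma_endpoint` is chosen here for illustration only and does not come from any library.

```python
def lsma_endpoint(prices):
    """Fit P = a + b*x over x = 0..n-1 and return (a, b, endpoint value)."""
    n = len(prices)
    if n < 2:
        raise ValueError("need at least two prices to fit a line")
    x_mean = (n - 1) / 2
    p_mean = sum(prices) / n
    # Numerator and denominator of the least squares slope
    s_xp = sum((i - x_mean) * (p - p_mean) for i, p in enumerate(prices))
    s_xx = sum((i - x_mean) ** 2 for i in range(n))
    b = s_xp / s_xx
    a = p_mean - b * x_mean
    return a, b, a + b * (n - 1)


# A perfectly linear window: the endpoint value coincides with the last price
a, b, value = lsma_endpoint([10.0, 11.0, 12.0, 13.0, 14.0])
print(a, b, value)
```

For a window that is not perfectly linear, such as `[0.0, 0.0, 1.0]`, the endpoint value (5/6) differs from the last price because the line is a compromise across all points in the window.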

Simplified Formula

Through derivation, LSMA can be expressed as a weighted average of prices, $LSMA_t = \sum_{i=0}^{n-1} w_i P_i$, with the weight formula:

$$w_i = \frac{3(2i - n + 1)}{n(n + 1)} + \frac{1}{n}$$

Where $i = 0, 1, \ldots, n-1$ (0 being the oldest data). These weights sum to 1 and increase linearly with $i$, assigning the highest weight to the newest data and the lowest weight to the oldest data. The weight is negative whenever $i < \frac{n-2}{3}$, which covers roughly the oldest third of the window.
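The weights can be obtained by substituting the expressions for $a$ and $b$ into the endpoint formula. The steps below are a sketch of that derivation; $S_{xx}$ is shorthand introduced here for the denominator of the slope, and the second line uses the fact that $\sum_i (x_i - \bar{x}) = 0$.

$$
\begin{aligned}
LSMA_t &= a + b\,(n-1) = \bar{P} + b\,\bigl((n-1) - \bar{x}\bigr) = \bar{P} + b \cdot \frac{n-1}{2} \\
b &= \frac{\sum_{i=0}^{n-1} (x_i - \bar{x})\,P_i}{S_{xx}}, \qquad S_{xx} = \sum_{i=0}^{n-1} (x_i - \bar{x})^2 = \frac{n(n^2-1)}{12} \\
w_i &= \frac{1}{n} + \frac{n-1}{2} \cdot \frac{i - \frac{n-1}{2}}{S_{xx}} = \frac{1}{n} + \frac{3(2i - n + 1)}{n(n+1)}
\end{aligned}
$$

For $n = 3$ this gives the weights $-\tfrac{1}{6}, \tfrac{1}{3}, \tfrac{5}{6}$, which sum to 1.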

Negative Weights

LSMA weights can be negative! For the oldest data points in the window, negative weights mean that old data makes a reverse contribution to the LSMA value: a high old price pulls the LSMA down, and a low old price pushes it up. This distinguishes LSMA from SMA, EMA, and WMA, whose weights are all non-negative, and it is the mathematical reason LSMA can “look ahead.”
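The effect of negative weights can be seen on a tiny window. The sketch below builds the weight vector directly from the regression formulas rather than from a closed-form expression, then applies it to a window where price jumps and stalls. The helper name `endpoint_weights` is an assumption made for this example.

```python
import numpy as np

def endpoint_weights(n):
    """Weights w_i such that sum(w_i * P_i) equals the regression endpoint (n >= 2)."""
    x = np.arange(n, dtype=float)
    x_mean = x.mean()
    s_xx = np.sum((x - x_mean) ** 2)
    # The mean price contributes 1/n to every weight; the slope term adds
    # (x_end - x_mean) * (x_i - x_mean) / S_xx
    return 1.0 / n + (x[-1] - x_mean) * (x - x_mean) / s_xx

w = endpoint_weights(3)
window = np.array([0.0, 1.0, 1.0])   # price jumps, then stalls
value = float(w @ window)

print(w)                     # the oldest weight is negative
print(value, window.max())   # the endpoint value lies above the window's highest price
```

Because the oldest bar carries a negative weight, the fitted endpoint ends up above every price in the window, which is the overshoot behaviour described above.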

Relationship with Other Regression Indicators

| Indicator | Meaning |
| --- | --- |
| LSMA (endpoint value) | Value of the regression line at $x = n-1$ |
| Linear Regression Slope | Slope $b$, reflecting trend strength |
| Linear Regression Intercept | Intercept $a$ |
| R-Squared | Goodness of fit, reflecting the reliability of the linear trend |
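All four quantities in the table come out of the same fit. The following is a minimal sketch for a single window, assuming NumPy is available; the function name `linreg_stats` and the dictionary keys are illustrative choices, not an established API.

```python
import numpy as np

def linreg_stats(window):
    """Return endpoint value, slope, intercept and R-squared for one window."""
    p = np.asarray(window, dtype=float)
    n = p.size
    if n < 2:
        raise ValueError("need at least two prices to fit a line")
    x = np.arange(n, dtype=float)
    slope, intercept = np.polyfit(x, p, 1)
    fitted = intercept + slope * x
    ss_res = np.sum((p - fitted) ** 2)     # variation left unexplained by the line
    ss_tot = np.sum((p - p.mean()) ** 2)   # total variation around the mean
    # R-squared is undefined for a completely flat window; report NaN there
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {
        "lsma": fitted[-1],
        "slope": slope,
        "intercept": intercept,
        "r_squared": r2,
    }

stats = linreg_stats([100.0, 101.5, 101.0, 103.0, 104.5])
print(stats)
```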

III. Python Implementation

```python
import numpy as np
import pandas as pd

def lsma(close: pd.Series, period: int = 20) -> pd.Series:
    """
    Calculate the Least Squares Moving Average (LSMA)

    Parameters
    ----------
    close : pd.Series
        Closing price series
    period : int
        Calculation period, default is 20

    Returns
    -------
    pd.Series
        LSMA value series (linear regression endpoint value)
    """
    def _linreg_endpoint(window):
        x = np.arange(period)
        # np.polyfit returns coefficients from highest degree to lowest
        slope, intercept = np.polyfit(x, window, 1)
        return intercept + slope * (period - 1)

    return close.rolling(window=period, min_periods=period).apply(
        _linreg_endpoint, raw=True
    )


def lsma_numpy(close: np.ndarray, period: int = 20) -> tuple[np.ndarray, np.ndarray]:
    """
    Manual LSMA implementation using numpy.

    Returns a tuple (lsma_values, slopes); the first period - 1 entries
    of both arrays are NaN.
    """
    n = len(close)
    result = np.full(n, np.nan)
    slopes = np.full(n, np.nan)

    # The x values are the same for every window, so compute them once
    x = np.arange(period, dtype=float)
    x_mean = x.mean()
    x_var = np.sum((x - x_mean) ** 2)

    for i in range(period - 1, n):
        window = close[i - period + 1 : i + 1]
        p_mean = window.mean()

        # Least squares method
        b = np.sum((x - x_mean) * (window - p_mean)) / x_var
        a = p_mean - b * x_mean

        result[i] = a + b * (period - 1)
        slopes[i] = b

    return result, slopes


def lsma_weights(period: int = 20) -> np.ndarray:
    """
    Calculate the equivalent weight vector for LSMA (index 0 = oldest data)
    """
    n = period
    i = np.arange(n)
    # w_i = 1/n + ((n - 1) / 2) * (i - (n - 1) / 2) / S_xx, with S_xx = n(n^2 - 1) / 12
    weights = 3 * (2 * i - n + 1) / (n * (n + 1)) + 1 / n
    return weights


# ========== Usage Example ==========
if __name__ == "__main__":
    np.random.seed(42)
    dates = pd.date_range("2024-01-01", periods=120, freq="D")
    price = 100 + np.cumsum(np.random.randn(120) * 0.8)

    df = pd.DataFrame({
        "date": dates,
        "open":  price + np.random.randn(120) * 0.3,
        "high":  price + np.abs(np.random.randn(120) * 0.6),
        "low":   price - np.abs(np.random.randn(120) * 0.6),
        "close": price,
        "volume": np.random.randint(1000, 10000, size=120),
    })
    df.set_index("date", inplace=True)

    # LSMA compared with other moving averages
    df["SMA_20"]  = df["close"].rolling(20).mean()
    df["EMA_20"]  = df["close"].ewm(span=20, adjust=False).mean()
    df["LSMA_20"] = lsma(df["close"], period=20)

    print("=== SMA vs EMA vs LSMA Comparison ===")
    print(df[["close", "SMA_20", "EMA_20", "LSMA_20"]].tail(10))

    # Get slope information
    lsma_vals, slope_vals = lsma_numpy(df["close"].values, period=20)
    df["slope"] = slope_vals
    # NaN slopes (warm-up rows) fail both comparisons and are labelled "FLAT"
    df["trend"] = np.where(slope_vals > 0, "UP",
                  np.where(slope_vals < 0, "DOWN", "FLAT"))

    print("\n=== LSMA Slope and Trend Direction ===")
    print(df[["close", "LSMA_20", "slope", "trend"]].tail(10))

    # Display LSMA weights
    w = lsma_weights(20)
    print("\n=== LSMA(20) Weight Distribution ===")
    print(f"Oldest data weight: {w[0]:.4f}")
    print(f"Middle data weight: {w[10]:.4f}")
    print(f"Newest data weight: {w[19]:.4f}")
    print(f"Sum of weights: {w.sum():.6f}")
    print(f"Number of negative weights: {(w < 0).sum()}")
```

IV. Problems the Indicator Solves

1. Quantitative Trend Direction Assessment

LSMA’s slope directly provides the direction and strength of the trend:

  • Slope > 0: Price exhibits an uptrend within the window
  • Slope < 0: Price exhibits a downtrend within the window
  • Absolute value of slope: Strength of the trend

2. Low-Lag Trend Tracking

Since LSMA takes the endpoint value of the regression line (not the center value), it inherently has an “extrapolation” effect, making it closer to the current price and lower in lag than SMA during trends.
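The difference in lag is easiest to see on a perfectly linear price ramp. In the sketch below, `rolling_lsma` is a hypothetical helper written for this example, and the ramp parameters are arbitrary.

```python
import numpy as np
import pandas as pd

def rolling_lsma(close, period):
    """Rolling regression endpoint (hypothetical helper for this example)."""
    x = np.arange(period, dtype=float)

    def endpoint(window):
        slope, intercept = np.polyfit(x, window, 1)
        return intercept + slope * (period - 1)

    return close.rolling(period).apply(endpoint, raw=True)

period, step = 20, 0.5
ramp = pd.Series(100.0 + step * np.arange(60))   # perfectly linear uptrend

sma_gap = (ramp - ramp.rolling(period).mean()).iloc[-1]
lsma_gap = (ramp - rolling_lsma(ramp, period)).iloc[-1]

print(sma_gap)    # SMA trails price by step * (period - 1) / 2
print(lsma_gap)   # LSMA gap is zero up to floating point error
```

On real data the trend is never perfectly linear, so the LSMA gap is not zero, but the same mechanism keeps it smaller than the SMA gap while a trend persists.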

3. Trend Quality Assessment

When used in combination with R-Squared (coefficient of determination), trend “quality” can be assessed:

  • R-Squared close to 1: Price is highly linear; trend is reliable
  • R-Squared close to 0: No clear linear trend; signals are unreliable
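A rolling R-Squared can be sketched with pandas alone. For a regression on a single variable, R-Squared equals the squared Pearson correlation between price and the time index, and correlation is unaffected by shifting the index, so a global bar counter can stand in for the per-window index $0, \ldots, n-1$. The function name `rolling_r_squared` and the synthetic series are assumptions made for this example.

```python
import numpy as np
import pandas as pd

def rolling_r_squared(close, period=20):
    """Rolling R-squared of a straight-line fit of price against time."""
    # Squared correlation between price and a bar counter over each window
    t = pd.Series(np.arange(len(close), dtype=float), index=close.index)
    return close.rolling(period).corr(t) ** 2

rng = np.random.default_rng(0)
trend = pd.Series(100.0 + 0.5 * np.arange(100) + rng.normal(0, 0.2, 100))
noise = pd.Series(100.0 + rng.normal(0, 1.0, 100))

print(rolling_r_squared(trend).iloc[-1])   # close to 1 for a clean trend
print(rolling_r_squared(noise).iloc[-1])   # typically much lower for pure noise
```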

4. Regression Channels

LSMA can be combined with Standard Error to form a Linear Regression Channel, similar to Bollinger Bands but based on regression rather than simple averages.
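One possible rolling variant of such a channel is sketched below: for each bar, the band is the LSMA value plus or minus a multiple of the standard error of that window's regression. The function name `regression_channel` and the `width` parameter are hypothetical, and charting platforms may define their channels differently.

```python
import numpy as np
import pandas as pd

def regression_channel(close, period=20, width=2.0):
    """Return a DataFrame with LSMA and bands at +/- width standard errors."""
    if period < 3:
        raise ValueError("period must be at least 3 to estimate the standard error")
    x = np.arange(period, dtype=float)

    def fit(window):
        slope, intercept = np.polyfit(x, window, 1)
        resid = window - (intercept + slope * x)
        # Standard error of the regression: two fitted parameters, so n - 2
        se = np.sqrt(np.sum(resid ** 2) / (period - 2))
        return intercept + slope * (period - 1), se

    values = close.to_numpy(dtype=float)
    mid = np.full(len(values), np.nan)
    err = np.full(len(values), np.nan)
    for i in range(period - 1, len(values)):
        mid[i], err[i] = fit(values[i - period + 1 : i + 1])

    return pd.DataFrame(
        {"lsma": mid, "upper": mid + width * err, "lower": mid - width * err},
        index=close.index,
    )

rng = np.random.default_rng(1)
close = pd.Series(100.0 + np.cumsum(rng.normal(0, 0.8, 80)))
print(regression_channel(close).tail(3))
```

The channel narrows when prices in the window sit close to the fitted line and widens when they scatter around it.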

5. Mean Reversion Strategies

Mean-reversion strategies assume that when price deviates significantly from LSMA, it tends to move back toward it. Where that assumption holds, the deviation can be used for:

  • Overbought/oversold assessment
  • Pairs trading spread analysis
  • Dynamic stop-loss/take-profit

Combination Strategy

LSMA and R-Squared can be combined into a simple trend-trading framework:

  1. R-Squared > 0.5 -> Trend is valid
  2. LSMA slope > 0 -> Uptrend, consider going long
  3. Price pulls back to near LSMA from above -> Entry point
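The three conditions above can be sketched as a boolean filter. This is one interpretation rather than a definitive rule set: "near LSMA from above" is read here as price sitting at or above LSMA by no more than a relative tolerance `tol`, and `trend_pullback_signal`, `r2_min`, and `tol` are names assumed for this example.

```python
import numpy as np
import pandas as pd

def trend_pullback_signal(close, period=20, r2_min=0.5, tol=0.01):
    """Boolean Series: True where all three conditions of the framework hold."""
    x = np.arange(period, dtype=float)
    x_c = x - x.mean()
    s_xx = np.sum(x_c ** 2)

    def slope(window):
        return np.sum(x_c * (window - window.mean())) / s_xx

    roll = close.rolling(period)
    b = roll.apply(slope, raw=True)
    # Endpoint value: window mean plus slope times the distance from the
    # window centre to its right edge, (period - 1) / 2
    lsma = roll.mean() + b * (period - 1) / 2
    t = pd.Series(np.arange(len(close), dtype=float), index=close.index)
    r2 = roll.corr(t) ** 2

    trend_valid = r2 > r2_min                     # 1. trend is valid
    uptrend = b > 0                               # 2. slope is positive
    gap = (close - lsma) / lsma
    near_from_above = (gap >= 0) & (gap <= tol)   # 3. price just above LSMA
    # Comparisons against NaN are False, so warm-up bars never signal
    return trend_valid & uptrend & near_from_above

rng = np.random.default_rng(2)
close = pd.Series(100.0 + 0.3 * np.arange(150) + rng.normal(0, 1.0, 150))
signal = trend_pullback_signal(close)
print(int(signal.sum()), "bars satisfy all three conditions")
```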

V. Advantages, Disadvantages, and Use Cases

Advantages

| Advantage | Description |
| --- | --- |
| Low lag | Endpoint extrapolation effect gives LSMA lower lag than SMA |
| Trend quantification | Slope directly quantifies trend direction and strength |
| Theoretically rigorous | Based on statistical least squares method with solid mathematical foundation |
| Multi-dimensional information | A single regression yields endpoint value, slope, and R-Squared simultaneously |

Disadvantages

| Disadvantage | Description |
| --- | --- |
| Negative weights | Negative weights on old data may cause LSMA values to exceed the price range |
| Fails with nonlinear trends | Assumes linear relationship; fits poorly in curving trends |
| High computation | A naive implementation refits the regression at every time point, costing $O(n)$ per bar for window length $n$ |
| Overshoot risk | May overshoot at trend endpoints due to the extrapolation effect |
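The per-bar cost can be reduced to constant time by maintaining two running sums, $\sum P_i$ and $\sum i \cdot P_i$, and updating them as the window slides. The sketch below is one way to do this; the function name `lsma_running_sums` is an assumption, and on very long series the running sums may accumulate floating point drift, so periodic recomputation may be worthwhile.

```python
import numpy as np

def lsma_running_sums(close, period=20):
    """LSMA via running sums: O(1) work per bar after the first window."""
    p = np.asarray(close, dtype=float)
    n = period
    out = np.full(p.size, np.nan)
    if n < 2 or p.size < n:
        return out

    x_mean = (n - 1) / 2
    s_xx = n * (n * n - 1) / 12          # sum of (i - x_mean)^2 for i = 0..n-1
    s_p = p[:n].sum()                    # sum of prices in the window
    s_xp = np.dot(np.arange(n), p[:n])   # sum of i * P_i in the window

    for t in range(n - 1, p.size):
        if t >= n:
            old, new = p[t - n], p[t]
            # Dropping the oldest bar shifts every remaining index down by one,
            # so s_xp loses (s_p - old) and gains (n - 1) * new
            s_xp += old - s_p + (n - 1) * new
            s_p += new - old
        b = (s_xp - x_mean * s_p) / s_xx
        out[t] = s_p / n + b * x_mean    # mean price plus slope * (n - 1) / 2
    return out

rng = np.random.default_rng(3)
prices = 100.0 + np.cumsum(rng.normal(0, 0.8, 200))
print(lsma_running_sums(prices, period=20)[-3:])
```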

Use Cases

  • Trend trading: Use slope to determine trend direction and strength
  • Mean reversion: Trade based on price deviation from LSMA
  • Regression channels: Combine with standard error to build dynamic channels
  • Quantitative research: As a fundamental tool for linear trend analysis

Comparison with Similar Indicators

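| Feature | SMA | EMA | LSMA | HMA |
| --- | --- | --- | --- | --- |
| Lag | High | Medium | Low | Very low |
| Smoothness | High | Medium | Medium | High |
| Can overshoot | No | No | Yes | Yes |
| Extra information | None | None | Slope / $R^2$ | None |
| Weight characteristics | Equal | Exponential | Includes negative weights | Linear |

Notes
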
  1. LSMA may frequently cross the price line in ranging markets and should not be used as a standalone entry signal.
  2. When R-Squared is very low (< 0.2), both the LSMA endpoint value and slope are unreliable; LSMA-based trading should be paused.
  3. LSMA’s negative weight property means its value may briefly exceed the highest price or fall below the lowest price within the window — this is a mathematical characteristic, not a bug.