Linear Regression Curve (LRC)
- Category: Trend — Overlay Indicator
- Default Parameters: period=20
- Output: Fitted linear regression value at each time point (connected into a curve)
- Applicable Markets: Stocks, Futures, Forex, Cryptocurrencies
I. What is the Linear Regression Curve
The Linear Regression Curve (LRC) is a technical indicator that applies ordinary least squares (OLS) linear regression, a classic statistical method, to financial time series. At each time point, it fits a straight line to the most recent $n$ closing prices (the period, default 20) and takes the fitted value at the current bar as its output. Connecting these fitted values produces a smooth trend curve.
The least squares method underlying it was developed independently by Adrien-Marie Legendre (published 1805) and Carl Friedrich Gauss (published 1809) in the early 19th century. Its application in technical analysis developed alongside the proliferation of computers.
Relationship to LSMA
The Linear Regression Curve is numerically equivalent to the Least Squares Moving Average (LSMA). Both produce identical results because both take the fitted value at the end of a rolling regression window. The difference lies mainly in naming and conceptual emphasis:
- LSMA: Emphasizes the “moving average” concept, highlighting it as a smoothing method with reduced lag
- LRC: Emphasizes the “regression fitting” concept, focusing on trend slope and goodness of fit
Linear regression-based technical indicators include: Linear Regression Curve (LRC), Linear Regression Slope (LR Slope), Linear Regression Channel (LR Channel), Standard Error Bands (SE Bands), and R-Squared. This article focuses on LRC.
II. Mathematical Principles and Calculation
Ordinary Least Squares
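Consider a rolling window of length $n$ with price sequence $y_1, y_2, \dots, y_n$ and corresponding independent variable (time index) $x_1, x_2, \dots, x_n$ (typically $x_i = i$).
The linear regression model is:
$$\hat{y}_i = a + b\,x_i$$
The least squares solution for slope $b$ and intercept $a$:
$$b = \frac{n\sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{n\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}, \qquad a = \bar{y} - b\,\bar{x}$$
Where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$.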
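As a quick sanity check of the slope and intercept formulas, the sketch below evaluates them on a small made-up window and compares the result with NumPy's own least-squares fit. The five prices are illustrative values, not data from this article.

```python
import numpy as np

# Hypothetical 5-bar window of closing prices (illustrative values only)
y = np.array([10.0, 11.0, 12.5, 12.0, 14.0])
n = len(y)
x = np.arange(1, n + 1, dtype=float)  # time index x_i = i

# Closed-form OLS slope and intercept
b = (n * np.sum(x * y) - x.sum() * y.sum()) / (n * np.sum(x ** 2) - x.sum() ** 2)
a = y.mean() - b * x.mean()

# Cross-check against NumPy's degree-1 polynomial fit (returns slope, then intercept)
b_ref, a_ref = np.polyfit(x, y, 1)

print(f"slope={b:.4f}, intercept={a:.4f}")  # slope=0.9000, intercept=9.2000
print(np.isclose(b, b_ref), np.isclose(a, a_ref))
```

Both approaches should agree to floating-point precision; the closed form is what the rolling implementation relies on.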
Linear Regression Curve Value
The LRC output at time $t$ is the regression line's value at the window's endpoint:
$$\text{LRC}_t = a + b\,x_n = a + b\,n$$
That is, after fitting a line on the window $[t-n+1,\ t]$, take the value at $x = n$.
Simplified Formulas
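Since $x_i = i$, we can pre-compute:
$$\sum_{i=1}^{n} x_i = \frac{n(n+1)}{2}, \qquad \sum_{i=1}^{n} x_i^2 = \frac{n(n+1)(2n+1)}{6}$$
so that the denominator becomes $n\sum x_i^2 - (\sum x_i)^2 = \frac{n^2(n^2-1)}{12}$. The slope can therefore be simplified to:
$$b = \frac{12\sum_{i=1}^{n} i\,y_i - 6(n+1)\sum_{i=1}^{n} y_i}{n(n^2-1)}$$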
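The simplified slope can be checked numerically. The following is a minimal sketch, assuming $x_i = 1, \dots, n$ and $n \ge 2$; the helper name `lrc_endpoint` and the sample window are made up for illustration.

```python
import numpy as np

def lrc_endpoint(y: np.ndarray) -> float:
    """Endpoint fitted value a + b*n for one window, using x_i = 1..n (n >= 2)."""
    n = len(y)
    i = np.arange(1, n + 1, dtype=float)
    sum_y = y.sum()
    sum_iy = np.sum(i * y)
    # Simplified slope: the denominator n*sum(x^2) - sum(x)^2 reduces to n^2 (n^2 - 1) / 12
    b = (12.0 * sum_iy - 6.0 * (n + 1) * sum_y) / (n * (n ** 2 - 1))
    a = sum_y / n - b * (n + 1) / 2.0  # mean of x is (n + 1) / 2
    return a + b * n

window = np.array([10.0, 11.0, 12.5, 12.0, 14.0])  # illustrative prices
value = lrc_endpoint(window)

# Reference: generic least-squares fit evaluated at x = n
reference = np.polyval(np.polyfit(np.arange(1, 6), window, 1), 5)
print(value, reference)
```

Because the $x$ sums depend only on $n$, they never need to be recomputed per window; only $\sum y_i$ and $\sum i\,y_i$ change as the window slides.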
Significance of the Slope
The slope directly reflects trend direction and strength:
- $b > 0$: Price exhibits an uptrend within the window
- $b < 0$: Price exhibits a downtrend within the window
- The larger $|b|$, the steeper the trend
Calculation Steps
- Set the regression window length $n$ (default 20)
- For each time point $t$, take the closing prices in the interval $[t-n+1,\ t]$
- Fit $\hat{y} = a + b\,x$ using least squares
- Output $\text{LRC}_t = a + b\,n$ (the fitted value at the window's endpoint)
- Optionally output the slope and R-Squared for supplementary analysis
III. Python Implementation
import numpy as np
import pandas as pd


def linear_regression_curve(df: pd.DataFrame,
                            period: int = 20,
                            column: str = 'close') -> pd.DataFrame:
    """
    Calculate the Linear Regression Curve (LRC) and related statistics.

    Parameters
    ----------
    df : DataFrame, must contain the specified price column
    period : Regression window length, default 20 (must be at least 2)
    column : Column name for regression, default 'close'

    Returns
    -------
    DataFrame with 'lrc' (regression value), 'slope', 'intercept',
    and 'r_squared' (R^2) columns
    """
    if period < 2:
        raise ValueError("period must be at least 2")

    prices = df[column].to_numpy(dtype=float)
    n = len(prices)

    lrc = np.full(n, np.nan)
    slope = np.full(n, np.nan)
    intercept = np.full(n, np.nan)
    r_squared = np.full(n, np.nan)

    # Pre-compute x-related constants (identical for every window)
    x = np.arange(1, period + 1, dtype=float)
    sum_x = x.sum()
    sum_x2 = (x ** 2).sum()
    mean_x = x.mean()
    denominator = period * sum_x2 - sum_x ** 2

    for i in range(period - 1, n):
        y = prices[i - period + 1: i + 1]
        mean_y = y.mean()
        sum_xy = np.sum(x * y)
        sum_y = y.sum()

        # Slope and intercept
        b = (period * sum_xy - sum_x * sum_y) / denominator
        a = mean_y - b * mean_x

        # LRC value (fitted value at window endpoint)
        lrc[i] = a + b * period
        slope[i] = b
        intercept[i] = a

        # R-Squared
        y_hat = a + b * x
        ss_res = np.sum((y - y_hat) ** 2)
        ss_tot = np.sum((y - mean_y) ** 2)
        if ss_tot > 0:
            r_squared[i] = 1 - ss_res / ss_tot
        else:
            # All values are identical: the flat line fits exactly,
            # so R^2 is set to 1.0 by convention
            r_squared[i] = 1.0

    result = pd.DataFrame({
        'lrc': lrc,
        'slope': slope,
        'intercept': intercept,
        'r_squared': r_squared
    }, index=df.index)
    return result


def linear_regression_curve_vectorized(df: pd.DataFrame,
                                       period: int = 20,
                                       column: str = 'close') -> pd.Series:
    """
    Compact version using pandas rolling + apply.
    Returns only the LRC values. The callback still runs once per window,
    so this is shorter to write rather than truly vectorized.
    """
    x = np.arange(1, period + 1, dtype=float)
    sum_x = x.sum()
    sum_x2 = (x ** 2).sum()
    mean_x = x.mean()
    denominator = period * sum_x2 - sum_x ** 2

    def _fit(y):
        # y is a NumPy array of length `period` because raw=True
        sum_xy = np.sum(x * y)
        sum_y = y.sum()
        b = (period * sum_xy - sum_x * sum_y) / denominator
        a = y.mean() - b * mean_x
        return a + b * period

    return df[column].rolling(window=period).apply(_fit, raw=True)


# ========== Usage Example ==========
if __name__ == "__main__":
    np.random.seed(42)
    n = 100
    dates = pd.date_range('2024-01-01', periods=n, freq='B')

    # Simulate price data with trend
    trend = np.linspace(0, 20, n)
    cycle = 5 * np.sin(np.linspace(0, 6 * np.pi, n))
    noise = np.cumsum(np.random.randn(n) * 0.3)
    price = 100 + trend + cycle + noise

    df = pd.DataFrame({
        'open': price + np.random.uniform(-0.5, 0.5, n),
        'high': price + np.abs(np.random.randn(n)) * 1.2,
        'low': price - np.abs(np.random.randn(n)) * 1.2,
        'close': price + np.random.uniform(-0.3, 0.3, n),
        'volume': np.random.randint(1000, 5000, n)
    }, index=dates)

    # Calculate LRC for multiple periods
    lrc_20 = linear_regression_curve(df, period=20)
    lrc_50 = linear_regression_curve(df, period=50)

    merged = df[['close']].copy()
    merged['lrc_20'] = lrc_20['lrc']
    merged['slope_20'] = lrc_20['slope']
    merged['r_sq_20'] = lrc_20['r_squared']
    merged['lrc_50'] = lrc_50['lrc']

    print("Linear Regression Curve data for the last 15 periods:")
    print(merged.tail(15).to_string())

    # Trend assessment example
    latest = merged.dropna().iloc[-1]
    if latest['slope_20'] > 0 and latest['r_sq_20'] > 0.7:
        print(f"\nCurrently in a strong uptrend (slope={latest['slope_20']:.4f}, R2={latest['r_sq_20']:.3f})")
    elif latest['slope_20'] < 0 and latest['r_sq_20'] > 0.7:
        print(f"\nCurrently in a strong downtrend (slope={latest['slope_20']:.4f}, R2={latest['r_sq_20']:.3f})")
    else:
        print(f"\nNo clear trend (slope={latest['slope_20']:.4f}, R2={latest['r_sq_20']:.3f})")

    # Compare lag with SMA
    merged['sma_20'] = df['close'].rolling(20).mean()
    merged['lrc_minus_sma'] = merged['lrc_20'] - merged['sma_20']
    print(f"\nAverage difference between LRC and SMA: {merged['lrc_minus_sma'].dropna().mean():.4f}")
    print("(Positive means LRC is above SMA, indicating LRC tracks price more closely in uptrends)")

The window-by-window loop version has $O(N \cdot n)$ complexity for $N$ bars and window length $n$, which can be slow for large datasets, for example backtests on short-timeframe bars. The version using pandas rolling is slightly faster but fundamentally remains $O(N \cdot n)$, because the Python callback still runs once per window. For higher performance, consider using Numba or C extensions.
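One NumPy-only alternative, sketched here under the assumption $x_i = 1, \dots, n$, uses the fact that the endpoint value $a + b\,n$ is a fixed weighted sum of the window's prices, with weights $w_i = \frac{2(3i - n - 1)}{n(n+1)}$ that follow from substituting the closed-form $a$ and $b$. The function name `lrc_numpy` is hypothetical, and `sliding_window_view` requires NumPy 1.20 or newer. The arithmetic is still proportional to $N \cdot n$, but it runs in compiled code rather than in a Python loop.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def lrc_numpy(prices, period: int = 20) -> np.ndarray:
    """Rolling LRC endpoint values computed as a weighted sum per window."""
    prices = np.asarray(prices, dtype=float)
    out = np.full(prices.shape[0], np.nan)
    if period < 2 or prices.shape[0] < period:
        return out  # not enough data for a single regression window

    # Endpoint weights w_i = 2(3i - n - 1) / (n(n + 1)) for i = 1..n; they sum to 1
    i = np.arange(1, period + 1, dtype=float)
    w = 2.0 * (3.0 * i - period - 1.0) / (period * (period + 1.0))

    # Each row of `windows` is one rolling window (oldest price first)
    windows = sliding_window_view(prices, period)
    out[period - 1:] = windows @ w
    return out

# Illustrative usage on a synthetic random walk
rng = np.random.default_rng(0)
demo_prices = 100 + np.cumsum(rng.normal(size=60))
demo_lrc = lrc_numpy(demo_prices, period=20)
```

The first `period - 1` outputs stay NaN, matching the behaviour of the loop-based implementation.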
IV. Problems the Indicator Solves
1. Reducing Moving Average Lag
A standard SMA has approximately $(n-1)/2$ periods of lag. LRC fits a straight line and takes the endpoint value, so it tracks the current price more closely in trends and reduces lag.
| Indicator | Position in Uptrend | Position in Downtrend |
|---|---|---|
| SMA | Relatively far below price | Relatively far above price |
| LRC | Close to price or slightly ahead | Close to price or slightly ahead |
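The lag difference is easiest to see on an idealised, perfectly linear price ramp. The sketch below uses a made-up series rising 0.5 per bar: the SMA trails the price by `step * (n - 1) / 2`, while the regression endpoint reproduces the price up to floating-point error.

```python
import numpy as np
import pandas as pd

n = 20        # window length
step = 0.5    # hypothetical price rise per bar
price = pd.Series(100.0 + step * np.arange(60))

sma = price.rolling(n).mean()

x = np.arange(1, n + 1, dtype=float)

def endpoint_fit(y):
    # Fit a line over the window and evaluate it at the last point (x = n)
    b, a = np.polyfit(x, y, 1)
    return a + b * n

lrc = price.rolling(n).apply(endpoint_fit, raw=True)

sma_gap = (price - sma).iloc[-1]   # step * (n - 1) / 2 = 4.75
lrc_gap = (price - lrc).iloc[-1]   # approximately 0
print(f"price - SMA = {sma_gap:.4f}, price - LRC = {lrc_gap:.2e}")
```

Real prices are never perfectly linear, so the LRC gap will not be exactly zero in practice; the example only isolates the structural lag.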
2. Trend Direction and Strength Quantification
- Slope $b > 0$: Uptrend; the larger $b$, the steeper the rise
- Slope $b < 0$: Downtrend; the larger $|b|$, the steeper the decline
- Slope near 0: Sideways consolidation
3. Trend Reliability Assessment
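$R^2$ (coefficient of determination) measures the quality of the linear fit on a scale from 0 to 1:
- $R^2$ close to 1 (the usage example in section III treats $R^2 > 0.7$ as a strong trend): Highly linear price movement, strong and reliable trend
- Intermediate $R^2$: Some trend present, but with significant fluctuations
- $R^2$ close to 0: No clear linear trend, market is in a ranging state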
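To make the contrast concrete, the following sketch computes $R^2$ for two synthetic 20-bar windows that share the same noise: one with a steady drift and one without. The helper name `r_squared`, the drift, and the noise scale are illustrative assumptions.

```python
import numpy as np

def r_squared(y) -> float:
    """R^2 of a straight-line fit over one window, with x_i = 1..n."""
    y = np.asarray(y, dtype=float)
    x = np.arange(1, len(y) + 1, dtype=float)
    ss_tot = np.sum((y - y.mean()) ** 2)
    if ss_tot == 0:
        return 1.0  # flat window: same convention as the implementation in section III
    b, a = np.polyfit(x, y, 1)
    ss_res = np.sum((y - (a + b * x)) ** 2)
    return 1.0 - ss_res / ss_tot

rng = np.random.default_rng(1)
noise = rng.normal(scale=0.5, size=20)

trending = 100 + 1.0 * np.arange(20) + noise  # steady rise plus small noise
ranging = 100 + noise                          # no drift, same noise

r2_trend = r_squared(trending)
r2_range = r_squared(ranging)
print(f"trending R^2 = {r2_trend:.3f}, ranging R^2 = {r2_range:.3f}")
```

The trending window should score close to 1, while the driftless window scores much lower because time explains little of its variance.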
4. Crossover Signals
- Price crosses above LRC: Potential buy signal
- Price crosses below LRC: Potential sell signal
- Short-period LRC crosses above long-period LRC: Trend turns bullish
Only trust LRC crossover signals when $R^2$ is sufficiently high (the usage example in section III uses 0.7 as its cutoff); this significantly reduces false signals during choppy markets.
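A minimal sketch of such a filter is shown below. It marks a bar with +1 when price crosses above the LRC, -1 when it crosses below, and keeps the mark only if that bar's $R^2$ meets a threshold. The function names, the `r2_min` parameter, and its default of 0.7 are assumptions for illustration, not a tested trading rule.

```python
import numpy as np
import pandas as pd

def lrc_and_r2(close: pd.Series, period: int = 20) -> pd.DataFrame:
    """Rolling LRC endpoint value and R^2, with x_i = 1..period."""
    x = np.arange(1, period + 1, dtype=float)

    def _lrc(y):
        b, a = np.polyfit(x, y, 1)
        return a + b * period

    def _r2(y):
        ss_tot = np.sum((y - y.mean()) ** 2)
        if ss_tot == 0:
            return 1.0  # flat window convention
        b, a = np.polyfit(x, y, 1)
        return 1.0 - np.sum((y - (a + b * x)) ** 2) / ss_tot

    roll = close.rolling(period)
    return pd.DataFrame({"lrc": roll.apply(_lrc, raw=True),
                         "r2": roll.apply(_r2, raw=True)})

def filtered_cross_signals(close: pd.Series, period: int = 20,
                           r2_min: float = 0.7) -> pd.Series:
    """+1 / -1 on price-vs-LRC crossovers, kept only where R^2 >= r2_min."""
    stats = lrc_and_r2(close, period)
    above = close > stats["lrc"]
    prev_above = above.shift(1, fill_value=False)
    # Require a valid LRC on both the current and the previous bar
    valid = stats["lrc"].notna() & stats["lrc"].shift(1).notna()
    trusted = stats["r2"] >= r2_min

    signal = pd.Series(0, index=close.index)
    signal[valid & trusted & above & ~prev_above] = 1    # crossed above
    signal[valid & trusted & ~above & prev_above] = -1   # crossed below
    return signal

# Illustrative usage on a synthetic random walk
rng = np.random.default_rng(7)
close = pd.Series(100 + np.cumsum(rng.normal(size=200)))
signals = filtered_cross_signals(close, period=20, r2_min=0.7)
print(signals.value_counts())
```

Lowering `r2_min` admits more crossovers; raising it leaves only those that occur while the window is close to linear.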
V. Advantages, Disadvantages, and Use Cases
Advantages
- Low lag: On a perfectly linear trend the endpoint fit reproduces the price exactly, so LRC lags less than an SMA or EMA of the same period
- Solid mathematical foundation: Based on the classic least squares method with strong theoretical support
- Multi-dimensional information: Simultaneously provides trend value, slope, and goodness of fit (R^2)
- Excellent in trending markets: Reflects direction changes faster than SMA/EMA in clear trends
- Highly extensible: Naturally extends to linear regression channels, standard error bands, and other derivatives
Disadvantages
- Overfitting in ranging markets: In trendless markets, LRC fluctuates frequently with price, producing many false signals
- Sensitive to endpoint data: The last few data points in the window disproportionately influence the fit; a single outlier can significantly alter the slope
- Higher computational cost: Per-window regression is more expensive than SMA and EMA
- “Leading” may be “overshooting”: At trend endpoints, LRC may extrapolate excessively, generating more extreme signals than warranted
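Endpoint sensitivity and overshooting can both be read off the weights that the regression implicitly assigns to each bar. Assuming $x_i = 1, \dots, n$, the endpoint value equals $\sum w_i y_i$ with $w_i = \frac{2(3i - n - 1)}{n(n+1)}$; the sketch below (helper name `lrc_weights` is hypothetical) compares the effect of a one-point spike on the newest bar for LRC and SMA.

```python
import numpy as np

def lrc_weights(n: int) -> np.ndarray:
    """Weights w_i such that the LRC endpoint value equals sum(w_i * y_i)."""
    i = np.arange(1, n + 1, dtype=float)
    return 2.0 * (3.0 * i - n - 1.0) / (n * (n + 1.0))

n = 20
w = lrc_weights(n)

flat = np.full(n, 100.0)
spiked = flat.copy()
spiked[-1] += 1.0  # hypothetical one-point spike on the newest bar

lrc_shift = w @ spiked - w @ flat          # equals w[-1] = 78/420, about 0.186
sma_shift = spiked.mean() - flat.mean()    # equals 1/n = 0.05

print(f"newest weight = {w[-1]:.4f}, oldest weight = {w[0]:.4f}")
print(f"LRC moves by {lrc_shift:.4f}, SMA moves by {sma_shift:.4f}")
```

For $n = 20$ the newest bar carries almost four times the weight it has in an SMA, and the oldest bars carry negative weights. Those negative weights are what let the curve extrapolate beyond the recent price range near turning points.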
Use Cases
| Scenario | Suitability | Notes |
|---|---|---|
| Trend identification and confirmation | High | Slope + R^2 combination assesses trend quality |
| Replacing traditional MA as a signal line | High | Reduces lag-induced delayed entries |
| Mean reversion strategies | Medium | Price reverts when deviating too far from LRC |
| High-frequency / tick data | Medium | Beware of computational performance |
| Pure ranging strategies | Low | LRC signals are unstable in sideways markets |
Comparison with Similar Indicators
| Dimension | LRC/LSMA | SMA | EMA | HMA |
|---|---|---|---|---|
| Lag | Lowest | Highest | Medium | Low |
| Smoothness | Medium | High | Medium | Medium |
| Endpoint Sensitivity | High | Low | Medium | Medium-High |
| Computational Complexity | High | Low | Low | Medium |
| Additional Information | Slope, R^2 | None | None | None |
Practical Tips
- Use the LRC + R-Squared combination: Only trust trend signals when R^2 is sufficiently high
- Combine with a linear regression channel (LRC +/- k standard errors) to build a complete trend trading system
- Short-period LRC (e.g., 10 periods) captures short swings; long-period LRC (e.g., 50 periods) confirms the larger trend
- Monitor the rate of change in slope: A positive slope that shrinks toward zero is an early sign of trend weakening, and a flip from positive to negative means the fitted trend has turned down
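The channel mentioned above can be sketched as follows. This version assumes the band width is `k` times the standard error of the regression, $\sqrt{SS_{res}/(n-2)}$; charting platforms differ in the exact definition, so treat the formula, the function name `lr_channel`, and the default `k=2.0` as assumptions.

```python
import numpy as np
import pandas as pd

def lr_channel(close: pd.Series, period: int = 20, k: float = 2.0) -> pd.DataFrame:
    """LRC with upper/lower bands at +/- k standard errors (period >= 3)."""
    if period < 3:
        raise ValueError("period must be at least 3 to estimate the standard error")

    y_all = close.to_numpy(dtype=float)
    x = np.arange(1, period + 1, dtype=float)
    mid = np.full(len(y_all), np.nan)
    se = np.full(len(y_all), np.nan)

    for t in range(period - 1, len(y_all)):
        y = y_all[t - period + 1: t + 1]
        b, a = np.polyfit(x, y, 1)
        resid = y - (a + b * x)
        mid[t] = a + b * period                              # endpoint fitted value
        se[t] = np.sqrt(np.sum(resid ** 2) / (period - 2))   # standard error of the fit

    return pd.DataFrame({"lrc": mid,
                         "upper": mid + k * se,
                         "lower": mid - k * se}, index=close.index)

# Illustrative usage on a synthetic random walk
rng = np.random.default_rng(3)
series = pd.Series(50 + np.cumsum(rng.normal(size=80)))
channel = lr_channel(series, period=20, k=2.0)
print(channel.tail())
```

A wide channel means the window fits a line poorly, which tends to coincide with the low R^2 readings discussed earlier.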