Home Linear Regression
Post
Cancel

Linear Regression

Given a dataset $\mathcal D = \lbrace(\boldsymbol {x_i},y_i)\rbrace_{i=1}^N$, where $\boldsymbol x \in \mathbb R^d$. we try to get a weight vector $(w_1,w_2,…,w_d)^\text T$ such that

\[y_i \approx \sum_{i=1}^d w_ix_i + b \tag{4.1}\]

Set $\boldsymbol w =(b,w_1,…,w_d)^\text T$, $\boldsymbol x=(1,x_1,…,x_d)^\text T$. Then the linear combination of $\boldsymbol w$ and $\boldsymbol x$ is a hypothesis recorded as $h(\boldsymbol x)$.

Based on least square method, we have squared error $E(\hat y,y)=(\hat y-y)^2$. And our goal is to minimize it. Next we do some calculations as follows:

\[\begin{split} E(\boldsymbol w) &= \frac{1}{N}\sum_{n=1}^N \left(\boldsymbol w^\text T \boldsymbol x_n - y_n\right)^2 \\ \\ &=\frac{1}{N}\left|\left| \begin{array}{ccc} \boldsymbol x_1^{\text T} \boldsymbol w - y_1 \\ \boldsymbol x_2^{\text T} \boldsymbol w - y_2 \\ \vdots \\ \boldsymbol x_N^{\text T} \boldsymbol w - y_N \\ \end{array} \right|\right|^2 \\ \\ &=\frac{1}{N}||\boldsymbol {Xw-y}||^2 \end{split} \tag{4.2}\]

where $\boldsymbol X=(\boldsymbol x_1^\text{T};\boldsymbol x_2^\text{T};…;\boldsymbol x_N^\text{T})$, $\boldsymbol y=(y_1,y_2,…,y_N)^\text T$.

Then our goal is

\[\min_{\boldsymbol w}\frac{1}{N}||\boldsymbol {Xw-y}||^2 \tag{4.3}\]

The target function $E(\boldsymbol w)$ is continuous, differentiable and convex. So we have the neccessary condition of optimal $\boldsymbol w$ :

\[\nabla E(\boldsymbol w)=\boldsymbol 0 \tag{4.4}\]

After derivation, we have

\[\frac{2}{N}(\boldsymbol {X}^\text T\boldsymbol {Xw}-\boldsymbol {X}^\text T\boldsymbol y)=\boldsymbol 0 \tag{4.5}\]

If matrix $\boldsymbol {X}^\text T\boldsymbol {X}$ is invertible, it’s easy to get the solution of $(4.5)$:

\[\boldsymbol w_{\text{lin}}=(\boldsymbol {X}^\text T\boldsymbol {X})^{-1}\boldsymbol {X}^\text T\boldsymbol y \tag{4.6}\]

Even if $\boldsymbol {X}^\text T\boldsymbol {X}$ is a singular matrix, No need to worry too much because most of programs computing inverse matrix can deal with it easily.

This post is licensed under CC BY 4.0 by the author.