Linear Regression

Given a dataset $D = \{(x_i, y_i)\}_{i=1}^{N}$, where $x_i \in \mathbb{R}^d$, we try to find a weight vector $(w_1, w_2, \dots, w_d)^T$ such that

$$y_i \approx \sum_{j=1}^{d} w_j x_{ij} + b \tag{4.1}$$

Set $w = (b, w_1, \dots, w_d)^T$ and $x = (1, x_1, \dots, x_d)^T$. Then the linear combination of $w$ and $x$ is our hypothesis, denoted $h(x) = w^T x$.
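
As a concrete illustration of this augmentation trick, here is a minimal NumPy sketch; the values and variable names (`x_raw`, `w`) are invented for this example:

```python
import numpy as np

# A single raw input with d = 3 features (toy values for illustration).
x_raw = np.array([2.0, -1.0, 0.5])

# Augmented weight vector w = (b, w_1, ..., w_d)^T, again toy values.
w = np.array([0.3, 1.0, -2.0, 0.7])  # b = 0.3

# Prepend the constant 1 so that x = (1, x_1, ..., x_d)^T.
x = np.concatenate(([1.0], x_raw))

# Hypothesis h(x) = w^T x: the bias b is absorbed into the inner product.
h = w @ x
print(h)  # 0.3 + 2.0 + 2.0 + 0.35 = 4.65
```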

Based on the least squares method, we use the squared error $E(\hat{y}, y) = (\hat{y} - y)^2$, and our goal is to minimize it. Next we do some calculations as follows:

$$E(w) = \frac{1}{N}\sum_{n=1}^{N}\left(w^T x_n - y_n\right)^2 = \frac{1}{N}\left\|\begin{pmatrix} x_1^T w - y_1 \\ x_2^T w - y_2 \\ \vdots \\ x_N^T w - y_N \end{pmatrix}\right\|^2 = \frac{1}{N}\left\|Xw - y\right\|^2 \tag{4.2}$$

where $X = (x_1^T; x_2^T; \dots; x_N^T)$ is the $N \times (d+1)$ matrix whose rows are the augmented inputs, and $y = (y_1, y_2, \dots, y_N)^T$.
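
To make the matrix form concrete, the following sketch builds a toy $X$ and $y$ (random data, hypothetical names) and checks that the per-sample sum in (4.2) matches the compact form $\frac{1}{N}\|Xw - y\|^2$:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 5, 3

# Rows x_n^T with a leading 1, stacked into the N x (d+1) matrix X.
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
y = rng.normal(size=N)
w = rng.normal(size=d + 1)

# Sum form of (4.2): average of per-sample squared errors.
E_sum = np.mean([(w @ X[n] - y[n]) ** 2 for n in range(N)])

# Matrix form of (4.2): (1/N) * ||Xw - y||^2.
E_mat = np.sum((X @ w - y) ** 2) / N

print(np.isclose(E_sum, E_mat))  # True
```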

Then our goal is

$$\min_{w} \frac{1}{N}\left\|Xw - y\right\|^2 \tag{4.3}$$

The target function $E(w)$ is continuous, differentiable, and convex, so we have the following necessary condition for the optimal $w$ (which, by convexity, is also sufficient):

$$\nabla E(w) = 0 \tag{4.4}$$

After differentiation, we have

$$\frac{2}{N}\left(X^T X w - X^T y\right) = 0 \tag{4.5}$$
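
As a sanity check on (4.5), one can compare the closed-form gradient $\frac{2}{N}(X^T X w - X^T y)$ against a central finite-difference approximation; a small sketch with toy data:

```python
import numpy as np

def grad_E(w, X, y):
    """Closed-form gradient from (4.5): (2/N) * (X^T X w - X^T y)."""
    N = len(y)
    return 2.0 / N * (X.T @ X @ w - X.T @ y)

def grad_E_numeric(w, X, y, eps=1e-6):
    """Central finite-difference approximation of the gradient of E."""
    N = len(y)
    E = lambda v: np.sum((X @ v - y) ** 2) / N
    g = np.zeros_like(w)
    for j in range(len(w)):
        e = np.zeros_like(w)
        e[j] = eps
        g[j] = (E(w + e) - E(w - e)) / (2 * eps)
    return g

rng = np.random.default_rng(1)
X = np.hstack([np.ones((5, 1)), rng.normal(size=(5, 3))])
y = rng.normal(size=5)
w = rng.normal(size=4)
print(np.allclose(grad_E(w, X, y), grad_E_numeric(w, X, y)))  # True
```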

If the matrix $X^T X$ is invertible, it is easy to get the solution of (4.5):

$$w_{\mathrm{lin}} = \left(X^T X\right)^{-1} X^T y \tag{4.6}$$
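
A minimal sketch of (4.6) on synthetic data; in practice the normal equations are solved directly (here with `np.linalg.solve`) rather than by forming the inverse explicitly, which is cheaper and numerically safer:

```python
import numpy as np

rng = np.random.default_rng(2)
N, d = 100, 3

# Synthetic data from a known linear model plus a little noise.
X = np.hstack([np.ones((N, 1)), rng.normal(size=(N, d))])
w_true = np.array([0.5, 1.0, -2.0, 3.0])
y = X @ w_true + 0.01 * rng.normal(size=N)

# (4.6): w_lin = (X^T X)^{-1} X^T y, computed via a linear solve.
w_lin = np.linalg.solve(X.T @ X, X.T @ y)
print(w_lin)  # close to w_true
```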

Even if $X^T X$ is a singular matrix, there is no need to worry too much, because most numerical libraries fall back on the pseudoinverse (or an equivalent least-squares routine) and deal with this case easily.
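
For the singular case, here is a sketch of two common NumPy fallbacks, the Moore-Penrose pseudoinverse and a least-squares solver, on a deliberately rank-deficient design:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 50

# A rank-deficient design: the third column duplicates the second,
# so X^T X is singular and (4.6) cannot be applied directly.
col = rng.normal(size=(N, 1))
X = np.hstack([np.ones((N, 1)), col, col])
y = 2.0 * col[:, 0] + 0.1 * rng.normal(size=N)

# Option 1: Moore-Penrose pseudoinverse, w = X^+ y.
w_pinv = np.linalg.pinv(X) @ y

# Option 2: least-squares solver; rcond=None uses
# machine-precision-based rank detection.
w_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(X @ w_pinv, X @ w_lstsq))  # same fitted values
```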

This post is licensed under CC BY 4.0 by the author.