Example code for implementing linear regression in Python
1 Linear regression
1.1 Simple linear regression

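In simple linear regression, the linear relationship from x to y is fitted by adjusting the values of two parameters, a and b, in the model y = a*x + b. The objective optimized during fitting is the MSE (Mean Squared Error) with the averaging step (division by m) omitted, that is, the sum of squared differences between each true y and its prediction a*x + b.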

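Simple linear regression has only two parameters, a and b. Finding the extremum of the MSE objective (the least squares method) gives the optimal a and b in closed form; these are the expressions computed in the fit method of the code below. Training a simple linear regression model therefore only requires solving for these two values from the data.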
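The formula images from the original page are not reproduced here. As a reconstruction consistent with the fit method further down (where a is a ratio of two dot products and b = y_mean - a * x_mean), the least-squares derivation can be sketched as follows. The index notation with m samples is an assumption of this sketch.

```latex
\begin{align}
J(a, b) &= \sum_{i=1}^{m} \left( y^{(i)} - a x^{(i)} - b \right)^2 \\
\frac{\partial J}{\partial b} = 0
  &\;\Rightarrow\; b = \bar{y} - a \bar{x} \\
\frac{\partial J}{\partial a} = 0
  &\;\Rightarrow\; a = \frac{\sum_{i=1}^{m} \left( x^{(i)} - \bar{x} \right)\left( y^{(i)} - \bar{y} \right)}
                            {\sum_{i=1}^{m} \left( x^{(i)} - \bar{x} \right)^2}
\end{align}
```

Here \bar{x} and \bar{y} are the sample means of x and y.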

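Below, simple linear regression is run on the feature RM (average number of rooms per dwelling), which sits at index 5 of the Boston housing dataset. The evaluation metrics used are the mean squared error (MSE), the root mean squared error (RMSE), the mean absolute error (MAE) and R squared.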
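The metric formulas were images in the original page and are missing here. The following is a minimal sketch of the standard definitions, matching how the script below computes them (in particular R squared as 1 - MSE / Var(y)). The helper name regression_metrics and the four sample values are hypothetical and only serve as illustration.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R squared for two 1-D arrays (illustrative helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    mse = np.mean(residual ** 2)          # mean squared error
    rmse = np.sqrt(mse)                   # root mean squared error
    mae = np.mean(np.abs(residual))       # mean absolute error
    r2 = 1.0 - mse / np.var(y_true)       # R squared: 1 - MSE / variance of y
    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# made-up values, not taken from the Boston data
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.0, 5.0, 8.0, 9.0])
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

An R squared of 1 means perfect predictions, while 0 means the model does no better than always predicting the mean of y.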





# Wrap the simple linear regression algorithm in an sklearn-style class
import numpy as np
import sklearn.datasets as datasets
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error, mean_absolute_error

np.random.seed(123)


class SimpleLinearRegression:
    def __init__(self):
        """Initialize model parameters."""
        self.a_ = None
        self.b_ = None

    def fit(self, x_train, y_train):
        """
        Train the model parameters.

        Parameters
        ----------
        x_train: training x, shape [N,]
        y_train: training y, shape [N,]
        """
        assert x_train.ndim == 1 and y_train.ndim == 1, \
            "Simple Linear Regression model can only solve single feature training data"
        assert len(x_train) == len(y_train), \
            "the size of x_train must be equal to y_train"
        x_mean = np.mean(x_train)
        y_mean = np.mean(y_train)
        # least-squares solution: a = sum((x - x_mean) * (y - y_mean)) / sum((x - x_mean) ** 2)
        self.a_ = np.vdot(x_train - x_mean, y_train - y_mean) / np.vdot(x_train - x_mean, x_train - x_mean)
        self.b_ = y_mean - self.a_ * x_mean
        return self

    def predict(self, input_x):
        """
        Make predictions for a batch of data.

        input_x: shape [N,]
        """
        assert input_x.ndim == 1, \
            "Simple Linear Regression model can only solve single feature data"
        return np.array([self.pred_(x) for x in input_x])

    def pred_(self, x):
        """Give a prediction for a single input x."""
        return self.a_ * x + self.b_

    def __repr__(self):
        return "SimpleLinearRegressionModel"


if __name__ == '__main__':
    # NOTE: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
    # so this script needs an older scikit-learn release to run as written.
    boston_data = datasets.load_boston()
    x = boston_data['data'][:, 5]  # all x data, shape (506,)
    y = boston_data['target']  # all y data, shape (506,)
    # keep only the samples whose target value is below 50
    x = x[y < 50]  # shape (490,)
    y = y[y < 50]  # shape (490,)
    plt.scatter(x, y)
    plt.show()
    # train size: (343,), test size: (147,)
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3)
    regs = SimpleLinearRegression()
    regs.fit(x_train, y_train)
    y_hat = regs.predict(x_test)
    rmse = np.sqrt(np.sum((y_hat - y_test) ** 2) / len(x_test))
    mse = mean_squared_error(y_test, y_hat)
    mae = mean_absolute_error(y_test, y_hat)
    # note: R squared = 1 - MSE / Var(y_test)
    R_squared_Error = 1 - mse / np.var(y_test)
    print('mean squared error:%.2f' % (mse))
    print('root mean squared error:%.2f' % (rmse))
    print('mean absolute error:%.2f' % (mae))
    print('R squared Error:%.2f' % (R_squared_Error))

Output:
mean squared error:26.74
root mean squared error:5.17
mean absolute error:3.85
R squared Error:0.50
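Since the Boston dataset is no longer shipped with recent scikit-learn releases, the closed-form estimates can still be sanity-checked without it. The sketch below uses synthetic data (the slope 8.0, intercept -30.0 and noise level are made-up values, not properties of the RM feature) and compares the same dot-product formulas with NumPy's np.polyfit, which fits a degree-1 polynomial by least squares.

```python
import numpy as np

# synthetic single-feature data; all constants here are illustrative assumptions
rng = np.random.default_rng(0)
x = rng.uniform(3.0, 9.0, size=200)
y = 8.0 * x - 30.0 + rng.normal(0.0, 2.0, size=200)

# closed-form least-squares estimates, as in SimpleLinearRegression.fit
x_mean, y_mean = x.mean(), y.mean()
a = np.vdot(x - x_mean, y - y_mean) / np.vdot(x - x_mean, x - x_mean)
b = y_mean - a * x_mean

# np.polyfit returns coefficients from the highest degree down: [slope, intercept]
slope, intercept = np.polyfit(x, y, deg=1)

print('closed form: a=%.4f b=%.4f' % (a, b))
print('np.polyfit : a=%.4f b=%.4f' % (slope, intercept))
```

Both approaches minimize the same sum of squared residuals, so the two pairs of numbers should agree up to floating-point error.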
Data visualization: [figure: scatter plot of RM (x) against the house price target (y)]

1.2 Multiple linear regression

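In multiple linear regression, a single sample x has several features, written with subscripts as x_1, ..., x_n.
The model can be expressed as a vector product: y_hat = theta_0 + theta_1*x_1 + ... + theta_n*x_n.
To simplify the computation, a constant feature equal to 1 is usually prepended to x, so that the intercept (bias) theta_0 is handled by the same product: y_hat = x_b . theta, with x_b = (1, x_1, ..., x_n).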
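As a small illustration of the constant-1 feature, the sketch below builds the augmented matrix with np.hstack (the same call used in the class further down) and checks that one matrix product reproduces "weights times features plus intercept". The numbers in X and theta are arbitrary, hypothetical values.

```python
import numpy as np

# three samples with two features each (made-up values)
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
# parameter vector laid out as [intercept, w1, w2] (made-up values)
theta = np.array([0.5, 2.0, -1.0])

# prepend a column of ones so that the intercept becomes an ordinary weight
X_b = np.hstack([np.ones((len(X), 1)), X])

y_hat = X_b.dot(theta)                        # single product including the bias
y_hat_explicit = X.dot(theta[1:]) + theta[0]  # weights and intercept handled separately

print(X_b)
print(y_hat, y_hat_explicit)
```

Both expressions give the same predictions; the augmented form just keeps the intercept inside the parameter vector.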


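The optimization objective of multiple linear regression is the same as that of simple linear regression: minimize the sum of squared errors between the true and predicted y.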

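Setting the matrix derivative of this objective to zero gives the normal equation solution theta = (X_b^T X_b)^-1 X_b^T y. It is exact, but expensive to compute: inverting X_b^T X_b costs roughly O(n^3) in the number of features n.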
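A minimal sketch of the normal equation on synthetic data follows. The data, the parameter vector true_theta and the noise level are assumptions made for illustration. Instead of forming the inverse explicitly, it solves the linear system with np.linalg.solve, and it cross-checks the result against np.linalg.lstsq, NumPy's general least-squares routine.

```python
import numpy as np

# synthetic data: 100 samples, 3 features (all values are illustrative)
rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
true_theta = np.array([1.0, 2.0, -3.0, 0.5])  # [intercept, w1, w2, w3]

X_b = np.hstack([np.ones((100, 1)), X])       # add the constant-1 feature
y = X_b @ true_theta + rng.normal(0.0, 0.01, size=100)

# normal equation: solve (X_b^T X_b) theta = X_b^T y
theta_normal = np.linalg.solve(X_b.T @ X_b, X_b.T @ y)

# reference solution from NumPy's least-squares solver
theta_lstsq, *_ = np.linalg.lstsq(X_b, y, rcond=None)

print(theta_normal)
print(theta_lstsq)
```

Solving the system rather than inverting the matrix is a common numerical choice; it computes the same theta whenever X_b^T X_b is invertible.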

Below, the normal equation solution is used to run multiple linear regression on all features of the Boston housing dataset.
import numpy as np
import sklearn.datasets as datasets
from sklearn.model_selection import train_test_split
# The original imported r2_score and root_mean_squared_error from the author's
# local PlayML.metrics package, which is not shown in the article. The sklearn
# equivalents are swapped in here so that the script is self-contained.
from sklearn.metrics import r2_score, mean_squared_error

np.random.seed(123)


def root_mean_squared_error(y_true, y_pred):
    """RMSE, standing in for PlayML.metrics.root_mean_squared_error."""
    return np.sqrt(mean_squared_error(y_true, y_pred))


class LinearRegression:
    def __init__(self):
        self.coef_ = None  # coefficients
        self.intercept_ = None  # intercept
        self.theta_ = None

    def fit_normal(self, x_train, y_train):
        """
        Use the normal equation solution of multiple linear regression
        as the model parameters.

        theta = (X^T * X)^-1 * X^T * y
        """
        assert x_train.shape[0] == y_train.shape[0], \
            "size of the x_train must be equal to y_train"
        X_b = np.hstack([np.ones((len(x_train), 1)), x_train])
        self.theta_ = np.linalg.inv(X_b.T.dot(X_b)).dot(X_b.T).dot(y_train)  # shape (n_features + 1,)
        self.coef_ = self.theta_[1:]
        self.intercept_ = self.theta_[0]

    def predict(self, x_pred):
        """Given the dataset x_pred to predict on, return the vector of predictions."""
        assert self.intercept_ is not None and self.coef_ is not None, \
            "must fit before predict!"
        assert x_pred.shape[1] == len(self.coef_), \
            "the feature number of x_pred must be equal to x_train"
        X_b = np.hstack([np.ones((len(x_pred), 1)), x_pred])
        return X_b.dot(self.theta_)

    def score(self, x_test, y_test):
        """
        Calculate the evaluation score (R squared).

        x_test: x test data
        y_test: true label y for the x test data
        """
        y_pred = self.predict(x_test)
        return r2_score(y_test, y_pred)

    def __repr__(self):
        return "LinearRegression"


if __name__ == '__main__':
    # use the Boston house price dataset for testing
    # NOTE: load_boston was removed in scikit-learn 1.2; an older release is needed.
    boston_data = datasets.load_boston()
    x = boston_data['data']  # all x data, shape (506, 13)
    y = boston_data['target']  # all y data, shape (506,)
    # keep only the samples whose target value is below 50
    x = x[y < 50]  # shape (490, 13)
    y = y[y < 50]  # shape (490,)
    # train size: 343 samples, test size: 147 samples
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=123)
    regs = LinearRegression()
    regs.fit_normal(x_train, y_train)
    # compute the errors
    score = regs.score(x_test, y_test)
    rmse = root_mean_squared_error(y_test, regs.predict(x_test))
    print('R squared error:%.2f' % (score))
    print('Root mean squared error:%.2f' % (rmse))

Output:
R squared error:0.79
Root mean squared error:3.36
1.3 Using the linear regression model in sklearn
import sklearn.datasets as datasets
from sklearn.linear_model import LinearRegression
import numpy as np
from sklearn.model_selection import train_test_split
# The original imported root_mean_squared_error from the author's local
# PlayML.metrics package (not shown in the article); it is rebuilt here from
# sklearn's mean_squared_error so that the script is self-contained.
from sklearn.metrics import mean_squared_error

np.random.seed(123)


def root_mean_squared_error(y_true, y_pred):
    """RMSE, standing in for PlayML.metrics.root_mean_squared_error."""
    return np.sqrt(mean_squared_error(y_true, y_pred))


if __name__ == '__main__':
    # use the Boston house price dataset
    # NOTE: load_boston was removed in scikit-learn 1.2; an older release is needed.
    boston_data = datasets.load_boston()
    x = boston_data['data']  # all x data, shape (506, 13)
    y = boston_data['target']  # all y data, shape (506,)
    # keep only the samples whose target value is below 50
    x = x[y < 50]  # shape (490, 13)
    y = y[y < 50]  # shape (490,)
    # train size: 343 samples, test size: 147 samples
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=123)
    regs = LinearRegression()
    regs.fit(x_train, y_train)
    # compute the errors
    score = regs.score(x_test, y_test)
    rmse = root_mean_squared_error(y_test, regs.predict(x_test))
    print('R squared error:%.2f' % (score))
    print('Root mean squared error:%.2f' % (rmse))
    print('coeffient:', regs.coef_.shape)
    print('interception:', regs.intercept_.shape)

Output:
R squared error:0.79
Root mean squared error:3.36
coeffient: (13,)
interception: ()
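The identical scores in sections 1.2 and 1.3 suggest that both approaches reach the same least-squares solution. The sketch below illustrates this on synthetic data; the feature count, weights, intercept and noise level are made-up assumptions, chosen so that the example runs without the Boston dataset. It fits sklearn's LinearRegression and compares coef_ and intercept_ with a direct least-squares solve on the augmented matrix.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# synthetic data: 300 samples, 4 features (all constants are illustrative)
rng = np.random.default_rng(123)
X = rng.normal(size=(300, 4))
y = X @ np.array([1.5, -2.0, 0.0, 3.0]) + 4.0 + rng.normal(0.0, 0.1, size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=123)

# sklearn model: fit() learns coef_ (one weight per feature) and intercept_
model = LinearRegression().fit(X_train, y_train)

# direct least-squares solve on the matrix with a leading column of ones
X_b = np.hstack([np.ones((len(X_train), 1)), X_train])
theta, *_ = np.linalg.lstsq(X_b, y_train, rcond=None)

r2 = model.score(X_test, y_test)  # score() returns R squared for regressors
print('sklearn  :', model.intercept_, model.coef_)
print('lstsq    :', theta[0], theta[1:])
print('R squared: %.4f' % r2)
```

On well-conditioned data like this the two parameter vectors should match up to floating-point error, with theta[0] playing the role of intercept_ and theta[1:] that of coef_.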

