
Gradient Descent, Overfitting, and Normalization

2018-09-10 01:06:49  Source: cnblogs (博客园)

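Good courses deserve to be shared with more people: see the AI video list from 尚学堂 (Shangxuetang). Open any of the videos and you will find a Baidu Netdisk download link that bundles the whole course series, including videos, code, and materials, all free and of good quality. Of course, a lot is shared these days: MOOCs, blogs, forums and so on make it easy to find all kinds of knowledge, and how far we get is up to us. I hope I can keep at it. Keep going!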

Gradient Descent

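For background, see this article on Jianshu (简书): 深入浅出--梯度下降法及其实现 (Gradient Descent and Its Implementation, Explained Simply).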

Batch Gradient Descent

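　　· Initialize W: pick random starting values.

　　· Iterate along the negative gradient direction, so that (for a suitably small step) each updated W gives a smaller loss J(w).

　　· If W has only a few hundred dimensions, solving directly with SVD is also feasible; beyond a few hundred dimensions, gradient descent is generally used.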
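
The update described in this list can be written out explicitly. The form below is a sketch that assumes a squared-error loss with a 1/(2m) factor, which is the convention that yields a gradient of the form (1/m) X^T (Xw - y). The symbols η (learning rate), m (number of samples), and X (data matrix with a leading column of ones) are notation chosen here for illustration; w plays the role of W above.

```latex
J(w) = \frac{1}{2m}\,\lVert Xw - y \rVert^{2}, \qquad
\nabla J(w) = \frac{1}{m}\,X^{\top}(Xw - y), \qquad
w \leftarrow w - \eta\,\nabla J(w)
```

Dropping the factor 1/2 from the loss only multiplies the gradient by 2, which can be absorbed into η.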

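# Batch gradient descent
import numpy as np

# Generate synthetic data: y = 4 + 3x + Gaussian noise
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X]   # prepend a column of ones for the bias term

learning_rate = 0.1     # learning rate; step = learning rate x gradient
n_iterations = 1000     # number of iterations; usually no convergence threshold is set, only this hyperparameter
m = 100     # number of samples

theta = np.random.randn(2, 1)   # randomly initialize the parameters theta (w0, ..., wn)
count = 0   # iteration counter

for iteration in range(n_iterations):
    count += 1
    # Compute the gradient over all m samples
    gradients = 1/m * X_b.T.dot(X_b.dot(theta)-y)
    # Update theta along the negative gradient
    theta = theta - learning_rate * gradients
    # print(count, theta)

print(count, theta)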
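
As a sanity check on the loop above, the direct solve mentioned in the list can be compared with the gradient-descent result. This is only a minimal sketch: `np.linalg.pinv` computes the pseudo-inverse through an SVD, while the fixed seed and the names `theta_direct` and `theta_gd` are assumptions added for illustration. The data follows the same y = 4 + 3x + noise recipe.

```python
import numpy as np

# Fixed seed so the run is repeatable (an assumption; the original code is unseeded)
rng = np.random.default_rng(0)

m = 100
X = 2 * rng.random((m, 1))
y = 4 + 3 * X + rng.standard_normal((m, 1))
X_b = np.c_[np.ones((m, 1)), X]   # column of ones for the bias term

# Direct least-squares solution via the SVD-based pseudo-inverse
theta_direct = np.linalg.pinv(X_b).dot(y)

# Batch gradient descent with the same hyperparameters as the loop above
learning_rate = 0.1
n_iterations = 1000
theta_gd = rng.standard_normal((2, 1))
for _ in range(n_iterations):
    gradients = 1 / m * X_b.T.dot(X_b.dot(theta_gd) - y)
    theta_gd = theta_gd - learning_rate * gradients

print("direct:          ", theta_direct.ravel())
print("gradient descent:", theta_gd.ravel())
print("agree:", np.allclose(theta_direct, theta_gd, atol=1e-3))
```

Both estimates should land near the true values 4 and 3, differing from them only because of the noise in y. For a problem this small the direct solve is the simpler option; the iterative loop pays off as the number of dimensions grows.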

Stochastic Gradient Descent

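　　· Stochastic gradient descent is usually the preferred choice.

　　· The noise in its updates sometimes lets stochastic gradient descent escape a local minimum.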

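import numpy as np

# Synthetic data: y = 4 + 3x + Gaussian noise
X = 2 * np.random.rand(100, 1)
y = 4 + 3 * X + np.random.randn(100, 1)
X_b = np.c_[np.ones((100, 1)), X]   # prepend a column of ones for the bias term

n_epochs = 500      # number of passes over the data
t0, t1 = 5, 50      # learning-schedule hyperparameters
m = 100             # number of samples

def learning_schedule(t):
    # The learning rate decays as the step count t grows
    return t0/(t + t1)

# Randomly initialize the parameters
theta = np.random.randn(2, 1)

for epoch in range(n_epochs):
    for i in range(m):
        # Pick one sample at random (with replacement)
        random_index = np.random.randint(m)
        xi = X_b[random_index:random_index+1]
        yi = y[random_index:random_index+1]
        # Gradient of the squared error on that single sample
        gradients = 2*xi.T.dot(xi.dot(theta)-yi)
        learning_rate = learning_schedule(epoch*m + i)
        theta = theta - learning_rate * gradients

print(theta)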
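
To see what the schedule in the code above does to the step size, it can help to evaluate it at a few step counts. The sketch below reuses `t0`, `t1`, `n_epochs`, and `m` from that code; the names `first_step` and `last_step` and the particular step counts printed are chosen here purely for illustration.

```python
t0, t1 = 5, 50            # same schedule hyperparameters as above
n_epochs, m = 500, 100    # same epoch and sample counts as above

def learning_schedule(t):
    # The learning rate decays as the step count t grows
    return t0 / (t + t1)

first_step = learning_schedule(0)                  # very first update
last_step = learning_schedule(n_epochs * m - 1)    # very last update

# Print the learning rate at a few representative step counts
for t in (0, 50, 450, n_epochs * m - 1):
    print(t, learning_schedule(t))
```

The rate starts at t0/t1 = 0.1 and has shrunk by roughly three orders of magnitude by the final update. Large early steps let the parameters move quickly, while small late steps damp the noise from single-sample gradients so that theta can settle.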