Chainer Tutorial 1.2 & 1.3

Posted on Sun 03 June 2018 in MachineLearning • 2 min read

Installation and Demo

This section briefly covers how to install Chainer and how to use it to build a simple linear regression model.

Installation

Installing Chainer is straightforward. Since CuPy was split out of Chainer in version 2.0, a CPU-only installation is extremely fast because it does not require CuPy. The install commands are as follows:

    pip install chainer; pip install cupy

The installation process looks like this:

    Collecting chainer
      Downloading https://files.pythonhosted.org/packages/12/ed/8b923bc28345c5b3e53358ba7e5e09b02142fc612378fd90986cf40073ef/chainer-5.4.0.tar.gz (525kB)
        100% |████████████████████████████████| 532kB 109kB/s 
    Requirement already satisfied: filelock in /home/r08ust/anaconda3/lib/python3.7/site-packages (from chainer) (3.0.10)
    Requirement already satisfied: numpy>=1.9.0 in /home/r08ust/anaconda3/lib/python3.7/site-packages (from chainer) (1.15.4)
    Requirement already satisfied: protobuf>=3.0.0 in /home/r08ust/anaconda3/lib/python3.7/site-packages (from chainer) (3.6.1)
    Requirement already satisfied: six>=1.9.0 in /home/r08ust/anaconda3/lib/python3.7/site-packages (from chainer) (1.12.0)
    Requirement already satisfied: setuptools in /home/r08ust/anaconda3/lib/python3.7/site-packages (from protobuf>=3.0.0->chainer) (40.6.3)
    Building wheels for collected packages: chainer
      Running setup.py bdist_wheel for chainer ... done
      Stored in directory: /home/r08ust/.cache/pip/wheels/eb/18/d2/5e85cbd7f32026e5e72cc466a5a17fd1939e99ffeeaaea267b
    Successfully built chainer
    Installing collected packages: chainer
    Successfully installed chainer-5.4.0
    Collecting cupy
      Downloading https://files.pythonhosted.org/packages/cd/d6/532e5da87f3b513cd0b98bcbf9a58fb6758598039944c42cb93d13b71a5f/cupy-5.4.0.tar.gz (2.5MB)
        100% |████████████████████████████████| 2.5MB 90kB/s 
    Requirement already satisfied: numpy>=1.9.0 in /home/r08ust/anaconda3/lib/python3.7/site-packages (from cupy) (1.15.4)
    Requirement already satisfied: six>=1.9.0 in /home/r08ust/anaconda3/lib/python3.7/site-packages (from cupy) (1.12.0)
    Collecting fastrlock>=0.3 (from cupy)
      Using cached https://files.pythonhosted.org/packages/89/50/e2ca3f37b783975a7e1f4e7d81a62d6fe269254cdfc46610d8fe42a3f38f/fastrlock-0.4-cp37-cp37m-manylinux1_x86_64.whl
    Building wheels for collected packages: cupy
      Running setup.py bdist_wheel for cupy ... done
      Stored in directory: /home/r08ust/.cache/pip/wheels/77/44/1f/138aca0bd8e7dd12d307eb06b595b15de4f92f2223cf95645d
    Successfully built cupy
    Installing collected packages: fastrlock, cupy
    Successfully installed cupy-5.4.0 fastrlock-0.4
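
To confirm the installation, a quick optional sanity check (print_runtime_info is available in Chainer 4 and later):

    import chainer
    chainer.print_runtime_info()  # prints the Chainer, NumPy, and CuPy versions
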
In [1]:
# Imports used throughout this tutorial
import chainer
import chainer.links as L        # trainable layers
import chainer.functions as F    # parameter-free operations and loss functions
import chainer.optimizers        # optimizers such as SGD
import numpy as np
import matplotlib.pyplot as plt

Chainer's neural-network layers are defined in chainer.links, which includes many common layers such as Convolution, BatchNormalization, Linear, RNN, and LSTM.

chainer.functions defines common operations (including some layers that have no trainable parameters), such as pooling and element-wise add, as well as the usual loss functions.

chainer.optimizers, as the name suggests, defines a large number of optimizers.
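
As a minimal sketch of how the three modules fit together (the layer sizes here are arbitrary, chosen only for illustration; it assumes the imports above):

    fc = L.Linear(3, 2)                    # a trainable layer from chainer.links
    h = F.relu(fc(np.zeros((1, 3), dtype=np.float32)))  # a parameter-free op from chainer.functions
    opt = chainer.optimizers.SGD(lr=0.01)  # an optimizer from chainer.optimizers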

In [2]:
#Get data and define the network
# Noisy, roughly linear training data
y=np.array([[0],[3*4],[2*4],[4*4],[5*4],[7*4],[8*4],[9*4],[12*4],[15*4]]).astype(np.float32)
x=np.array([[0],[1],[2],[3],[4],[5],[6],[7],[8],[9]]).astype(np.float32)
class demo(chainer.Chain):
    def __init__(self):
        super(demo,self).__init__()
        with self.init_scope():   # links created in this scope are registered as trainable
            self.fc=L.Linear(1,1) # one input feature, one output
    def forward(self,x):
        return self.fc(x)
net=demo()

Chainer supports two ways of defining a custom network. One, shown above, wraps everything in a class; calling an instance of that class runs the forward pass. The other skips the class wrapper entirely, as in the following sketch:

net2=chainer.Chain(l1=L.Linear(1,1),
                   l2=L.Linear(1,1))
def forward(x,model):
    y=model.l1(x)
    y=model.l2(y)
    return y

Overall, though, there is no significant difference between the two approaches.
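
Calling the functional version is then a plain function call (using the net2 and forward defined just above):

    yhat2 = forward(x, net2)  # runs the two linear layers in sequence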

In [3]:
#Plot the predictions before training
yhat=net(x) # forward pass only (no training yet)
yhat=yhat.data.squeeze() # reshape yhat's shape from (10, 1) to (10,)
plt.plot(x,yhat) # draw the (untrained) regression line
plt.scatter(x,y) # draw the data points
Out[3]:
<matplotlib.collections.PathCollection at 0x7fe502e0f320>
In [4]:
#Define the optimizer and train the model
opt=chainer.optimizers.SGD(lr=0.01) # plain stochastic gradient descent
opt.setup(net) # attach the optimizer to the model's parameters
epoch=5
for _ in range(epoch):
    yhat=net(x)
    loss=F.mean_squared_error(y,yhat) # compute the loss
    net.cleargrads() # zero out previously accumulated gradients
    loss.backward() # compute gradients by backpropagation
    opt.update() # update the weights
    print('loss:{}'.format(loss.data))
loss:892.8599853515625
loss:167.76364135742188
loss:42.485374450683594
loss:20.833375930786133
loss:17.08420181274414

A few common deep-learning patterns and terms that appear here:

epoch: the number of full passes the model makes over the training data. Because deep learning usually optimizes the model by gradient descent, a single pass is not enough to learn the dataset's structure well, so training runs for multiple epochs. The appropriate number of epochs generally depends on the complexity of the data.

lr: the learning rate. The gradient-descent update can be summarized as $\theta=\theta-\alpha\frac{\partial L}{\partial\theta}$, where $\alpha$ is the learning rate. For a detailed treatment, the second lecture of Andrew Ng's CS229 explains gradient descent thoroughly.
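
To make the update rule concrete, here is a tiny sketch of gradient descent on the squared loss $L=(y-\theta x)^2$ for a single made-up data point (illustration only, not part of the tutorial's code):

    theta = 0.0          # initial parameter guess
    alpha = 0.01         # learning rate
    x_i, y_i = 2.0, 8.0  # one data point; the true slope is 4
    for _ in range(100):
        grad = -2 * x_i * (y_i - theta * x_i)  # dL/dtheta for L = (y - theta*x)^2
        theta = theta - alpha * grad           # the update rule above
    print(theta)  # approaches 4.0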

Parameters: the weights of the model. In the simplest single-variable linear regression $y=\alpha x+\beta$, $\alpha$ and $\beta$ are the parameters; reasonable values for them can be computed automatically by optimization.
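
In the demo above, the learned parameters can be read directly off the link; W and b are the parameter names L.Linear uses for the weight and bias:

    print(net.fc.W.data)  # the learned slope
    print(net.fc.b.data)  # the learned intercept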

Hyperparameters: the human-set variables that cannot be learned. This simple demo really has only two hyperparameters, epoch and lr, but as models grow more sophisticated the number of hyperparameters keeps increasing, and they can only be tuned by repeated manual trial and error, as sketched below.
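
In practice, such a manual sweep might look like the following sketch (it reuses the demo class from above; the candidate learning rates are arbitrary):

    for lr in (0.001, 0.01, 0.1):  # candidate learning rates; too large a value can make the loss diverge
        model = demo()             # fresh parameters for each trial
        opt = chainer.optimizers.SGD(lr=lr)
        opt.setup(model)
        for _ in range(5):         # the epoch count is the other hyperparameter here
            loss = F.mean_squared_error(y, model(x))
            model.cleargrads()
            loss.backward()
            opt.update()
        print(lr, loss.data)       # pick the setting with the lowest loss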

Alchemy: the deep-learning workflow is often joked about as alchemy. You choose a suitable furnace (model architecture), load in the raw materials (data), adjust the feng shui (hyperparameters), and refine one elixir after another (predictions), each one different, all without knowing how the ingredients actually combine inside (the black-box nature of neural networks). Later posts will cover refining elixirs in different furnaces, such as the convolution furnace (CNN), the recurrence furnace (RNN), and the generative-adversarial furnace (GAN).

In [5]:
#Plot the predictions after training
yhat=net(x)
yhat=yhat.data.squeeze()
plt.plot(x,yhat) # the fitted regression line
plt.scatter(x,y) # the data points
Out[5]:
<matplotlib.collections.PathCollection at 0x7fe502d3b908>