Chainer Tutorial - Preface

Posted on Wed 30 May 2018 in MachineLearning • 1 min read

Why Chainer

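No special reason: it is simply the tool I know best. I first came across Chainer around the end of 2016, when the mainstream frameworks were still Caffe and TensorFlow. Some of the work in our lab involved processing large amounts of variable-length data, which TensorFlow itself handled poorly. I later learned that Chainer supports variable-length input well, and I have used it ever since.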

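Here is a brief introduction to Chainer. Chainer is an open-source project developed by the Japanese startup Preferred Networks. It is written in Python and consists of two main parts: the neural network framework Chainer, and CuPy, a GPU counterpart to NumPy. (Since the release of Chainer 2.0, CuPy has been split out of Chainer and must be installed separately.)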

Chainer and PyTorch

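Chainer and PyTorch are essentially the same in how they design their neural network API (PyTorch's design was inspired in part by Chainer), so migrating between the two frameworks is quite convenient.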

Chainer defines a network architecture as follows:

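import chainer
import chainer.functions as F
import chainer.links as L


class Net(chainer.Chain):
    def __init__(self):
        super(Net, self).__init__()
        with self.init_scope():
            # in_channels=None: inferred from the first batch that passes through
            self.conv1 = L.Convolution2D(in_channels=None,
                                         out_channels=6, ksize=3, pad=1)
            # in_size=None: the flattened input size is inferred as well
            self.fc = L.Linear(None, 10)

    # Chainer v5+ dispatches calls to forward(); older versions need __call__
    def forward(self, x):
        h = self.conv1(x)
        h = F.relu(h)
        h = F.max_pooling_2d(h, 2, 2)
        y = self.fc(h)
        # Return both the class scores and the pooled feature map
        return y, h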

PyTorch defines a network architecture as follows:

import torch
import torch.nn as nn
import torch.nn.functional as F


class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # Arguments: in_channels, out_channels, kernel_size, stride
        self.conv1 = nn.Conv1d(1, 128, 3, 1)
        self.pool1 = nn.MaxPool1d(3)
        # in_features must be given explicitly and match conv1's 128 channels
        self.fc1 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.leaky_relu(x)
        x = self.pool1(x)
        # (N, 128, L) -> (N, L, 128) so that Linear acts on the channel axis
        x = x.permute(0, 2, 1)
        x = self.fc1(x)
        return F.log_softmax(x, dim=2)

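As the two listings show, building a network looks much the same in both frameworks, and both expose a NumPy-like API for manipulating data. In my view, the biggest difference going forward lies in what each framework treats as its primary object: Chainer is organized around the Chain, whereas PyTorch is organized around the Variable. It is hard to say which is better, but for problems that optimize specific variables, PyTorch is more convenient. Chainer's advantage is ease of use: for example, you do not have to work out each layer's input size by hand (especially for fully connected (FC) layers).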
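To make the last point concrete, here is a minimal sketch of the arithmetic that passing `None` as the input size to `L.Linear` spares you. It computes by hand how many inputs the fully connected layer of the Chainer network above receives. The helper names `conv_out_size` and `flattened_features` and the 28x28 input are hypothetical, chosen only for illustration, and the sketch assumes floor rounding for output sizes. Chainer's pooling uses `cover_all=True` by default, which can round up, so results may differ for input sizes that do not divide evenly.

```python
def conv_out_size(size, ksize, stride=1, pad=0):
    """Output length of one spatial axis after a convolution or pooling
    window (floor rounding, an assumption of this sketch)."""
    return (size + 2 * pad - ksize) // stride + 1


def flattened_features(height, width):
    """Inputs seen by the final fully connected layer for a network with
    a 3x3 convolution (pad 1, 6 output channels) followed by 2x2 max
    pooling with stride 2."""
    out_channels = 6
    # The 3x3 convolution with pad 1 and stride 1 keeps the spatial size
    h = conv_out_size(height, ksize=3, stride=1, pad=1)
    w = conv_out_size(width, ksize=3, stride=1, pad=1)
    # The 2x2 max pooling with stride 2 halves each axis
    h = conv_out_size(h, ksize=2, stride=2)
    w = conv_out_size(w, ksize=2, stride=2)
    return out_channels * h * w


# Hypothetical 28x28 input: 6 channels * 14 * 14
print(flattened_features(28, 28))  # 1176
```

In a framework that requires an explicit input size, this number has to be recomputed whenever the input resolution or an earlier layer changes.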