Standard Link implementations¶
Chainer provides many Link
implementations in the
chainer.links
package.
Note
Some of the links are originally defined in the chainer.functions
namespace. They are still left in the namespace for backward compatibility,
though it is strongly recommended to use them via the chainer.links
package.
Learnable connections¶
chainer.links.Bias 
Broadcasted elementwise summation with learnable parameters. 
chainer.links.Bilinear 
Bilinear layer that performs tensor multiplication. 
chainer.links.Convolution2D 
Two-dimensional convolutional layer. 
chainer.links.ConvolutionND 
N-dimensional convolution layer. 
chainer.links.Deconvolution2D 
Two-dimensional deconvolution function. 
chainer.links.DeconvolutionND 
N-dimensional deconvolution function. 
chainer.links.DepthwiseConvolution2D 
Two-dimensional depthwise convolutional layer. 
chainer.links.DilatedConvolution2D 
Two-dimensional dilated convolutional layer. 
chainer.links.EmbedID 
Efficient linear layer for one-hot input. 
chainer.links.GRU 
Stateful Gated Recurrent Unit function (GRU). 
chainer.links.Highway 
Highway module. 
chainer.links.Inception 
Inception module of GoogLeNet. 
chainer.links.InceptionBN 
Inception module of the new GoogLeNet with BatchNormalization. 
chainer.links.Linear 
Linear layer (a.k.a. fully-connected layer). 
chainer.links.LSTM 
Fully-connected LSTM layer. 
chainer.links.MLPConvolution2D 
Two-dimensional MLP convolution layer of Network in Network. 
chainer.links.NStepBiGRU 
Stacked Bidirectional GRU for sequences. 
chainer.links.NStepBiLSTM 
Stacked Bidirectional LSTM for sequences. 
chainer.links.NStepBiRNNReLU 
Stacked Bidirectional RNN for sequences. 
chainer.links.NStepBiRNNTanh 
Stacked Bidirectional RNN for sequences. 
chainer.links.NStepGRU 
Stacked Unidirectional GRU for sequences. 
chainer.links.NStepLSTM 
Stacked Unidirectional LSTM for sequences. 
chainer.links.NStepRNNReLU 
Stacked Unidirectional RNN for sequences. 
chainer.links.NStepRNNTanh 
Stacked Unidirectional RNN for sequences. 
chainer.links.Scale 
Broadcasted elementwise product with learnable parameters. 
chainer.links.StatefulGRU 
Stateful Gated Recurrent Unit function (GRU). 
chainer.links.StatefulPeepholeLSTM 
Fully-connected LSTM layer with peephole connections. 
chainer.links.StatelessLSTM 
Stateless LSTM layer. 
ChildSumTreeLSTM¶

class
chainer.links.
ChildSumTreeLSTM
(in_size, out_size)¶ ChildSum TreeLSTM unit.
This is a ChildSum TreeLSTM unit as a chain. This link is a variable-argument function, which compounds the states of all child nodes into the new states of the current (parent) node. Here, the states are the cell state, \(c\), and the output, \(h\), which are produced by this link. This link does not keep cell and hidden states internally.
For example, this link is called as
func(c1, c2, h1, h2, x)
if the number of child nodes is 2, while it is called as
func(c1, c2, c3, h1, h2, h3, x)
if that number is 3. This function is independent of the order of child nodes. Thus, the returns of
func(c1, c2, h1, h2, x)
equal those of
func(c2, c1, h2, h1, x)
.
Parameters: 
 in_size (int) – Dimension of input vectors.
 out_size (int) – Dimension of cell state and output vectors.
Variables: 
 W_x (chainer.links.Linear) – Linear layer of connections from input vectors.
 W_h_aio (chainer.links.Linear) – Linear layer of connections between (\(a\), \(i\), \(o\)) and the summation of children’s output vectors. \(a\), \(i\) and \(o\) denote the input compound, input gate and output gate, respectively. \(a\), the input compound, equals \(u\) in the paper by Tai et al.
 W_h_f (chainer.links.Linear) – Linear layer of connections between the forget gate \(f\) and the output of each child.
See the paper for details: Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks.

__call__
(*cshsx)¶ Returns new cell state and output of ChildSum TreeLSTM.
Parameters: cshsx (list of Variable) – Variable-length arguments which include all cell vectors and all output vectors of the variable children, and an input vector.
Returns: \((c_{new}, h_{new})\), where \(c_{new}\) represents the new cell state vector, and \(h_{new}\) is the new output vector.
Return type: tuple of ~chainer.Variable
NaryTreeLSTM¶

class
chainer.links.
NaryTreeLSTM
(in_size, out_size, n_ary=2)¶ N-ary TreeLSTM unit.
This is an N-ary TreeLSTM unit as a chain. This link is a fixed-length-argument function, which compounds the states of all child nodes into the new states of the current (parent) node. Here, the states are the cell state, \(c\), and the output, \(h\), which are produced by this link. This link does not keep cell and hidden states internally.
For example, this link is called as
func(c1, c2, h1, h2, x)
if the number of child nodes is set to 2 (n_ary = 2
), while it is called as
func(c1, c2, c3, h1, h2, h3, x)
if that number is set to 3 (n_ary = 3
). Unlike ChildSum TreeLSTM, this function depends on the order of child nodes. Thus, the returns of
func(c1, c2, h1, h2, x)
differ from those of
func(c2, c1, h2, h1, x)
.
Parameters: 
 in_size (int) – Dimension of input vectors.
 out_size (int) – Dimension of cell state and output vectors.
 n_ary (int) – The number of child nodes in a tree structure.
Variables: 
 W_x (chainer.links.Linear) – Linear layer of connections from input vectors.
 W_h (chainer.links.Linear) – Linear layer of connections between (\(a\), \(i\), \(o\), all \(f\)) and the output of each child. \(a\), \(i\), \(o\) and \(f\) denote the input compound, input gate, output gate and forget gate, respectively. \(a\), the input compound, equals \(u\) in the paper by Tai et al.
See the papers for details: Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks, and A Fast Unified Model for Parsing and Sentence Understanding.
Tai et al.’s N-ary TreeLSTM is slightly extended in Bowman et al., and this link is based on the variant by Bowman et al. Specifically, eq. 10 in Tai et al. has only one \(W\) matrix to be applied to \(x\), consistently for all children. On the other hand, Bowman et al.’s model has multiple matrices, each of which affects the forget gate for each child’s cell individually.

__call__
(*cshsx)¶ Returns new cell state and output of N-ary TreeLSTM.
Parameters: cshsx (list of Variable) – Arguments which include all cell vectors and all output vectors of the fixed-length children, and an input vector. The number of arguments must equal n_ary * 2 + 1.
Returns: \((c_{new}, h_{new})\), where \(c_{new}\) represents the new cell state vector, and \(h_{new}\) is the new output vector.
Return type: tuple of ~chainer.Variable
Activation/loss/normalization functions with parameters¶
chainer.links.BatchNormalization 
Batch normalization layer on outputs of linear or convolution functions. 
chainer.links.LayerNormalization 
Layer normalization layer on outputs of linear functions. 
chainer.links.BinaryHierarchicalSoftmax 
Hierarchical softmax layer over binary tree. 
chainer.links.BlackOut 
BlackOut loss layer. 
chainer.links.CRF1d 
Linear-chain conditional random field loss layer. 
chainer.links.SimplifiedDropconnect 
Fully-connected layer with simplified dropconnect regularization. 
chainer.links.PReLU 
Parametric ReLU function as a link. 
chainer.links.Maxout 
Fully-connected maxout layer. 
chainer.links.NegativeSampling 
Negative sampling loss layer. 
Machine learning models¶
chainer.links.Classifier 
A simple classifier model. 
Pretrained models¶
Pretrained models are mainly used to achieve good performance with a small
dataset, or to extract a semantic feature vector. Although CaffeFunction
automatically loads a pretrained model released as a caffemodel,
the following link models provide an interface for automatically converting
caffemodels, and easily extracting semantic feature vectors.
For example, to extract the feature vectors with VGG16Layers, which is
a common pretrained model in the field of image recognition,
users only need to write the following few lines:
from chainer.links import VGG16Layers
from PIL import Image
model = VGG16Layers()
img = Image.open("path/to/image.jpg")
feature = model.extract([img], layers=["fc7"])["fc7"]
where fc7
denotes a layer before the last fully-connected layer.
Unlike the usual links, these classes automatically load all the
parameters from the pretrained models during initialization.
VGG16Layers¶
chainer.links.VGG16Layers 
A pretrained CNN model with 16 layers provided by VGG team. 
chainer.links.model.vision.vgg.prepare 
Converts the given image to the numpy array for VGG models. 
GoogLeNet¶
chainer.links.GoogLeNet 
A pretrained GoogLeNet model provided by BVLC. 
chainer.links.model.vision.googlenet.prepare 
Converts the given image to the numpy array for GoogLeNet. 
Residual Networks¶
chainer.links.model.vision.resnet.ResNetLayers 
A pretrained CNN model provided by MSRA. 
chainer.links.ResNet50Layers 
A pretrained CNN model with 50 layers provided by MSRA. 
chainer.links.ResNet101Layers 
A pretrained CNN model with 101 layers provided by MSRA. 
chainer.links.ResNet152Layers 
A pretrained CNN model with 152 layers provided by MSRA. 
chainer.links.model.vision.resnet.prepare 
Converts the given image to the numpy array for ResNets. 