chainer.links.NStepLSTM

class chainer.links.NStepLSTM(self, n_layers, in_size, out_size, dropout)[source]

Stacked Uni-directional LSTM for sequences.

This link is a stacked version of Uni-directional LSTM for sequences. It calculates the hidden and cell states of all layers at end-of-string, and all hidden states of the last layer for each time step.
Unlike chainer.functions.n_step_lstm(), this function automatically sorts inputs in descending order by length and transposes the sequences. Users just need to call the link with a list of chainer.Variable holding sequences.

Warning

The use_cudnn argument is not supported anymore since v2. Instead, use chainer.using_config('use_cudnn', use_cudnn). See chainer.using_config().
Parameters:
- n_layers (int) – Number of layers.
- in_size (int) – Dimensionality of input vectors.
- out_size (int) – Dimensionality of hidden states and output vectors.
- dropout (float) – Dropout ratio.
- initialW (initializer) – Initializer to initialize the weight. When it is numpy.ndarray, its ndim should be 2.
- initial_bias (initializer) – Initializer to initialize the bias. If None, the bias will be initialized to zero. When it is numpy.ndarray, its ndim should be 1.
See also

chainer.functions.n_step_lstm()
Methods
__call__(self, hx, cx, xs)[source]

Calculate all hidden states and cell states.
Warning

The train argument is not supported anymore since v2. Instead, use chainer.using_config('train', train). See chainer.using_config().
Parameters:
- hx (Variable or None) – Initial hidden states. If None is specified, a zero-vector is used. Its shape is (S, B, N) for uni-directional LSTM and (2S, B, N) for bi-directional LSTM, where S is the number of layers and is equal to n_layers, B is the mini-batch size, and N is the dimension of the hidden units.
- cx (Variable or None) – Initial cell states. If None is specified, a zero-vector is used. It has the same shape as hx.
- xs (list of ~chainer.Variable) – List of input sequences. Each element xs[i] is a chainer.Variable holding a sequence. Its shape is (L_t, I), where L_t is the length of a sequence for time t, and I is the size of the input and is equal to in_size.
Returns: This function returns a tuple containing three elements, hy, cy and ys. hy is the updated hidden states whose shape is the same as hx. cy is the updated cell states whose shape is the same as cx. ys is a list of Variable. Each element ys[t] holds hidden states of the last layer corresponding to an input xs[t]. Its shape is (L_t, N) for uni-directional LSTM and (L_t, 2N) for bi-directional LSTM, where L_t is the length of a sequence for time t, and N is the size of the hidden units.
Return type: tuple
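Example

A minimal sketch of constructing the link and calling it with variable-length sequences. The layer count, sizes, and random data below are chosen purely for illustration:

    import numpy as np
    import chainer
    import chainer.links as L

    # 2 stacked layers, 3-dimensional inputs, 4-dimensional hidden states.
    lstm = L.NStepLSTM(n_layers=2, in_size=3, out_size=4, dropout=0.0)

    # A mini-batch of three sequences with different lengths (5, 2, 3);
    # the link sorts and transposes them internally.
    xs = [chainer.Variable(np.random.rand(length, 3).astype(np.float32))
          for length in (5, 2, 3)]

    # Passing None uses zero-vectors for the initial hidden and cell states.
    hy, cy, ys = lstm(None, None, xs)

    print(hy.shape)               # (2, 3, 4) = (n_layers, batch, out_size)
    print(cy.shape)               # (2, 3, 4)
    print([y.shape for y in ys])  # [(5, 4), (2, 4), (3, 4)]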
__getitem__(index)[source]

Returns the child at given index.

Parameters: index (int) – Index of the child in the list.
Returns: The index-th child link.
Return type: Link
add_link(link)[source]

Registers a child link and adds it to the tail of the list.

Parameters: link (Link) – The link object to be registered.
add_param(name, shape=None, dtype=<class 'numpy.float32'>, initializer=None)[source]

Registers a parameter to the link.

Deprecated since version v2.0.0: Assign a Parameter object directly to an attribute within init_scope() instead. For example, the following code

    link.add_param('W', shape=(5, 3))

can be replaced by the following assignment.

    with link.init_scope():
        link.W = chainer.Parameter(None, (5, 3))

The latter is easier for IDEs to keep track of the attribute's type.

Parameters:
- name (str) – Name of the parameter. This name is also used as the attribute name.
- shape (int or tuple of ints) – Shape of the parameter array. If it is omitted, the parameter variable is left uninitialized.
- dtype – Data type of the parameter array.
- initializer – If it is not None, the data is initialized with the given initializer. If it is an array, the data is directly initialized by it. If it is callable, it is used as a weight initializer. Note that in these cases, the dtype argument is ignored.
add_persistent(name, value)[source]

Registers a persistent value to the link.
The registered value is saved and loaded on serialization and deserialization. The value is set to an attribute of the link.
Parameters:
- name (str) – Name of the persistent value. This name is also used for the attribute name.
- value – Value to be registered.
addgrads(link)[source]

Accumulates gradient values from the given link.

This method adds each gradient array of the given link to the corresponding gradient array of this link. The accumulation is even done across the host and different devices.
Parameters: link (Link) – Source link object.
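Example

A hedged sketch of the accumulation pattern this method supports: gradients computed on a deep copy of the link (for instance, on another device) are added back into the original. The sizes and the stand-in loss are illustrative only:

    import numpy as np
    import chainer
    import chainer.functions as F
    import chainer.links as L

    master = L.NStepLSTM(1, 3, 4, 0.0)
    worker = master.copy(mode='copy')  # deep copy: same values, separate arrays

    # Compute gradients independently on both links.
    for link in (master, worker):
        xs = [chainer.Variable(np.random.rand(4, 3).astype(np.float32))]
        hy, cy, ys = link(None, None, xs)
        link.cleargrads()
        F.sum(ys[0]).backward()        # stand-in scalar loss

    # Fold the worker's gradients into the master's.
    master.addgrads(worker)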
append(link)[source]

Registers a child link and adds it to the tail of the list.

This is equivalent to add_link(). This method has been added to emulate the list interface.

Parameters: link (Link) – The link object to be registered.
children()[source]

Returns a generator of all child links.
Returns: A generator object that generates all child links.
cleargrads()[source]

Clears all gradient arrays.
This method should be called before the backward computation at every iteration of the optimization.
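Example

A minimal sketch of one optimization step, with gradients cleared before the backward pass. The optimizer choice and the stand-in loss are illustrative, not prescribed by this method:

    import numpy as np
    import chainer
    import chainer.functions as F
    import chainer.links as L
    from chainer import optimizers

    lstm = L.NStepLSTM(1, 3, 4, 0.0)
    optimizer = optimizers.SGD()
    optimizer.setup(lstm)

    xs = [chainer.Variable(np.random.rand(5, 3).astype(np.float32))]

    lstm.cleargrads()                  # clear gradients from the previous iteration
    hy, cy, ys = lstm(None, None, xs)
    loss = F.sum(ys[0])                # stand-in scalar loss
    loss.backward()
    optimizer.update()                 # apply update rules using the fresh gradients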
copy(mode='share')[source]

Copies the link hierarchy to a new one.

The whole hierarchy rooted by this link is copied. There are three modes to perform the copy; please see the documentation of the mode argument below.

The name of the link is reset on the copy, since the copied instance does not belong to the original parent chain (even if it exists).

Parameters: mode (str) – It should be either init, copy, or share. init means parameter variables under the returned link object are re-initialized by calling their initialize() method, so that all the parameters may have different initial values from the original link. copy means that the link object is deeply copied, so that its parameters are not re-initialized but are also deeply copied. Thus, all parameters have the same initial values but can be changed independently. share means that the link is shallowly copied, so that its parameters' arrays are shared with the original one. Thus, their values are changed synchronously. The default mode is share.

Returns: Copied link object.
Return type: Link
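Example

A small sketch contrasting the share and copy modes; the identity checks below rely only on the sharing semantics described above, and the sizes are arbitrary:

    import chainer.links as L

    lstm = L.NStepLSTM(1, 3, 4, 0.0)

    shared = lstm.copy(mode='share')  # shallow copy: parameter arrays are shared
    deep = lstm.copy(mode='copy')     # deep copy: parameters own their arrays

    p_orig = next(lstm.params())
    print(next(shared.params()).data is p_orig.data)  # True: same underlying array
    print(next(deep.params()).data is p_orig.data)    # False: independent array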
copyparams(link)[source]

Copies all parameters from the given link.
This method copies data arrays of all parameters in the hierarchy. The copy is even done across the host and devices. Note that this method does not copy the gradient arrays.
Parameters: link (Link) – Source link object.
count_params()[source]

Counts the total number of parameters.

This method counts the total number of scalar values included in all the Parameters held by this link and its descendants.

If the link contains uninitialized parameters, this method raises a warning.
Returns: The total size of parameters (int)
disable_update()[source]

Disables update rules of all parameters under the link hierarchy.

This method sets the enabled flag of the update rule of each parameter variable to False.
enable_update()[source]

Enables update rules of all parameters under the link hierarchy.

This method sets the enabled flag of the update rule of each parameter variable to True.
init_scope()[source]

Creates an initialization scope.

This method returns a context manager object that enables registration of parameters (and links for Chain) by an assignment. A Parameter object can be automatically registered by assigning it to an attribute under this context manager.

Example

In most cases, the parameter registration is done in the initializer method. Using the init_scope method, we can simply assign a Parameter object to register it to the link.

    class MyLink(chainer.Link):
        def __init__(self):
            super().__init__()
            with self.init_scope():
                self.W = chainer.Parameter(0, (10, 5))
                self.b = chainer.Parameter(0, (5,))
links(skipself=False)[source]

Returns a generator of all links under the hierarchy.

Parameters: skipself (bool) – If True, then the generator skips this link and starts with the first child link.
Returns: A generator object that generates all links.
namedlinks(skipself=False)[source]

Returns a generator of all (path, link) pairs under the hierarchy.

Parameters: skipself (bool) – If True, then the generator skips this link and starts with the first child link.
Returns: A generator object that generates all (path, link) pairs.
namedparams(include_uninit=True)[source]

Returns a generator of all (path, param) pairs under the hierarchy.

Parameters: include_uninit (bool) – If True, it also generates uninitialized parameters.
Returns: A generator object that generates all (path, parameter) pairs. The paths are relative from this link.
params(include_uninit=True)[source]

Returns a generator of all parameters under the link hierarchy.

Parameters: include_uninit (bool) – If True, it also generates uninitialized parameters.
Returns: A generator object that generates all parameters.
register_persistent(name)[source]

Registers an attribute of a given name as a persistent value.

This is a convenient method to register an existing attribute as a persistent value. If name has already been registered as a parameter, this method removes it from the list of parameter names and re-registers it as a persistent value.

Parameters: name (str) – Name of the attribute to be registered.
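Example

A hedged sketch of the typical pattern; the RunningMean class and its attribute name are made up for illustration:

    import numpy as np
    import chainer

    class RunningMean(chainer.Link):
        def __init__(self, size):
            super(RunningMean, self).__init__()
            # An ordinary attribute at first; the link does not track it yet.
            self.mean = np.zeros(size, dtype=np.float32)
            # Register it so it is saved and loaded together with the link.
            self.register_persistent('mean')

After this, serializing the link (for example with chainer.serializers.save_npz) also stores the mean array.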
repeat(n_repeat, mode='init')[source]

Repeats this link multiple times to make a Sequential.

This method returns a Sequential object which contains the same Link multiple times repeatedly. The mode argument determines how this link is copied for each repetition.

Example

You can repeat the same link multiple times to create a longer Sequential block like this:

    class ConvBNReLU(chainer.Chain):
        def __init__(self):
            super(ConvBNReLU, self).__init__()
            with self.init_scope():
                self.conv = L.Convolution2D(
                    None, 64, 3, 1, 1, nobias=True)
                self.bn = L.BatchNormalization(64)

        def __call__(self, x):
            return F.relu(self.bn(self.conv(x)))

    net = ConvBNReLU().repeat(16, mode='init')

The net object contains 16 blocks, each of which is ConvBNReLU. And the mode was init, so each block is re-initialized with different parameters. If you give copy to this argument, each block has the same values for its parameters, but its object ID is different from the others. If it is share, each block is the same as the others in terms of not only parameters but also object IDs, because the blocks are shallow-copied, so that when a parameter of one block is changed, the parameters of all the others also change.

Parameters:
- n_repeat (int) – Number of times to repeat.
- mode (str) – It should be either init, copy, or share. init means the parameters of each repeated element in the returned Sequential will be re-initialized, so that all elements have different initial parameters. copy means that the parameters will not be re-initialized, but the object itself will be deep-copied, so that all elements have the same initial parameters but can be changed independently. share means all the elements that make up the resulting Sequential object are the same object because they are shallow-copied, so that all parameters of the elements are shared with each other.
serialize(serializer)[source]

Serializes the link object.
Parameters: serializer (AbstractSerializer) – Serializer object.
to_cpu()[source]

Copies parameter variables and persistent values to CPU.
This method does not handle non-registered attributes. If some of such attributes must be copied to CPU, the link implementation must override this method to do so.
Returns: self
to_gpu(device=None)[source]

Copies parameter variables and persistent values to GPU.
This method does not handle non-registered attributes. If some of such attributes must be copied to GPU, the link implementation must override this method to do so.
Parameters: device – Target device specifier. If omitted, the current device is used.
Returns: self
zerograds()[source]

Initializes all gradient arrays by zero.

This method can be used for the same purpose as cleargrads, but it is less efficient. This method is left for backward compatibility.

Deprecated since version v1.15: Use cleargrads() instead.
Attributes
n_cells

Returns the number of cells.
This function must be implemented in a child class.
n_weights = 8
update_enabled

True if at least one parameter has an update rule enabled.
use_bi_direction = False
within_init_scope

True if the current code is inside of an initialization scope.

See init_scope() for the details of the initialization scope.