chainer.functions.swish

chainer.functions.swish(x, beta)

Swish activation function.

\[f(x, \beta) = x \cdot \sigma(\beta x),\]

where \(\sigma(\cdot)\) is the sigmoid function. It has the following properties:

\[\begin{split}f(x, 0) &= \frac{x}{2}, \\ \lim_{\beta \to \infty} f(x, \beta) &= \max(0, x).\end{split}\]
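The two properties can be checked numerically. The following is a minimal scalar sketch in plain Python, not Chainer's implementation; the helper names `sigmoid` and `swish` are defined here for illustration only, and a finite value of 100 stands in for the limit \(\beta \to \infty\).

```python
import math


def sigmoid(z):
    """Numerically stable logistic sigmoid for a Python float."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swish(x, beta):
    """Scalar reference for f(x, beta) = x * sigmoid(beta * x)."""
    return x * sigmoid(beta * x)


for x in (-3.0, 0.5, 3.0):
    # beta = 0 gives sigmoid(0) = 1/2, so f(x, 0) = x / 2.
    assert swish(x, 0.0) == x / 2
    # For a large beta, f(x, beta) is close to max(0, x), i.e. ReLU.
    assert abs(swish(x, 100.0) - max(0.0, x)) < 1e-9
```

For \(x > 0\) the factor \(\sigma(\beta x)\) tends to 1 as \(\beta\) grows, and for \(x < 0\) it tends to 0, which is why the limit is \(\max(0, x)\).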
Parameters
  • x (Variable or N-dimensional array) – Input variable of shape \((s_B, s_1, s_2, ..., s_N)\), where \(s_B\) is assumed to be the minibatch dimension.

  • beta (Variable or N-dimensional array) – Parameter variable \(\beta\) of shape \((s_1, s_2, ..., s_M)\), where \(M\) is any integer satisfying \(0 \leq M \leq N\). To match the number of dimensions of x, beta is reshaped to \((1, s_1, ..., s_M, 1, ..., 1)\); beta and x are then multiplied element-wise.
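The reshape rule for beta can be sketched as a small shape computation. This is an illustration in plain Python, not Chainer's code: `beta_broadcast_shape` is a hypothetical helper, and the explicit shape validation is this sketch's own choice rather than documented behavior.

```python
def beta_broadcast_shape(x_shape, beta_shape):
    """Return the shape beta is reshaped to before multiplying with x.

    x_shape is (s_B, s_1, ..., s_N); beta_shape is (s_1, ..., s_M).
    """
    x_shape = tuple(x_shape)
    beta_shape = tuple(beta_shape)
    n = len(x_shape) - 1  # N: number of non-batch dimensions of x
    m = len(beta_shape)   # M: number of dimensions of beta
    if not 0 <= m <= n:
        raise ValueError("beta must have between 0 and N dimensions")
    if beta_shape != x_shape[1:1 + m]:
        raise ValueError("beta shape must equal the leading non-batch dims of x")
    # One leading 1 for the batch axis, then N - M trailing 1s.
    return (1,) + beta_shape + (1,) * (n - m)


# Per-channel beta for an image batch of shape (batch, channels, height, width).
print(beta_broadcast_shape((8, 3, 32, 32), (3,)))  # (1, 3, 1, 1)
```

With \(M = 0\) beta is a single scalar shared by every element, and with \(M = N\) each non-batch element of x has its own \(\beta\).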

Returns

Output variable of the same shape as x.

Return type

Variable

Warning

\(\beta\) is a trainable parameter in the original paper (https://arxiv.org/abs/1710.05941). To train \(\beta\), use chainer.links.Swish instead.

See also

chainer.links.Swish to manage the model parameter beta.